[2022-10-11 01:17:07 swin_tiny_patch4_window7_224] (main.py 301): INFO Full config saved to output/swin_tiny_patch4_window7_224/fix_graph_fp32/config.json
[2022-10-11 01:17:07 swin_tiny_patch4_window7_224] (main.py 304): INFO AMP_OPT_LEVEL: ''
AUG:
  AUTO_AUGMENT: rand-m9-mstd0.5-inc1
  COLOR_JITTER: 0.4
  CUTMIX: 1.0
  CUTMIX_MINMAX: null
  MIXUP: 0.8
  MIXUP_MODE: batch
  MIXUP_PROB: 1.0
  MIXUP_SWITCH_PROB: 0.5
  RECOUNT: 1
  REMODE: pixel
  REPROB: 0.25
BASE:
- ''
DATA:
  BATCH_SIZE: 128
  CACHE_MODE: part
  DATASET: imagenet
  DATA_PATH: /data/ImageNet/extract/
  IMG_SIZE: 224
  INTERPOLATION: bicubic
  NUM_WORKERS: 8
  PIN_MEMORY: true
  ZIP_MODE: false
EVAL_MODE: false
LOCAL_RANK: 0
MODEL:
  DROP_PATH_RATE: 0.2
  DROP_RATE: 0.0
  LABEL_SMOOTHING: 0.1
  NAME: swin_tiny_patch4_window7_224
  NUM_CLASSES: 1000
  PRETRAINED: ''
  RESUME: ''
  SWIN:
    APE: false
    DEPTHS:
    - 2
    - 2
    - 6
    - 2
    EMBED_DIM: 96
    IN_CHANS: 3
    MLP_RATIO: 4.0
    NUM_HEADS:
    - 3
    - 6
    - 12
    - 24
    PATCH_NORM: true
    PATCH_SIZE: 4
    QKV_BIAS: true
    QK_SCALE: null
    WINDOW_SIZE: 7
  SWIN_MLP:
    APE: false
    DEPTHS:
    - 2
    - 2
    - 6
    - 2
    EMBED_DIM: 96
    IN_CHANS: 3
    MLP_RATIO: 4.0
    NUM_HEADS:
    - 3
    - 6
    - 12
    - 24
    PATCH_NORM: true
    PATCH_SIZE: 4
    WINDOW_SIZE: 7
  TYPE: swin
OUTPUT: output/swin_tiny_patch4_window7_224/fix_graph_fp32
PRINT_FREQ: 100
SAVE_FREQ: 10
SEED: 0
TAG: fix_graph_fp32
TEST:
  CROP: true
  SEQUENTIAL: false
THROUGHPUT_MODE: false
TRAIN:
  ACCUMULATION_STEPS: 0
  AUTO_RESUME: false
  BASE_LR: 0.001
  CLIP_GRAD: 5.0
  EPOCHS: 300
  LR_SCHEDULER:
    DECAY_EPOCHS: 30
    DECAY_RATE: 0.1
    NAME: cosine
  MIN_LR: 1.0e-05
  OPTIMIZER:
    BETAS:
    - 0.9
    - 0.999
    EPS: 1.0e-08
    MOMENTUM: 0.9
    NAME: adamw
  START_EPOCH: 0
  USE_CHECKPOINT: false
  WARMUP_EPOCHS: 20
  WARMUP_LR: 1.0e-06
  WEIGHT_DECAY: 0.05

[2022-10-11 01:17:10 swin_tiny_patch4_window7_224] (main.py 74): INFO Creating model:swin/swin_tiny_patch4_window7_224
[2022-10-11 01:17:13 swin_tiny_patch4_window7_224] (main.py 76): INFO SwinTransformer( (patch_embed): PatchEmbed( (proj): Conv2d(3, 96, kernel_size=(4, 4), stride=(4, 4)) (norm): LayerNorm((96,), eps=1e-05,
elementwise_affine=True) ) (pos_drop): Dropout(p=0.0, inplace=False) (layers): ModuleList( (0): BasicLayer( (blocks): ModuleList( (0): SwinTransformerBlock( (norm1): LayerNorm((96,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=96, out_features=288, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=96, out_features=96, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): Identity() (norm2): LayerNorm((96,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=96, out_features=384, bias=True) (act): GELU() (fc2): Linear(in_features=384, out_features=96, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): SwinTransformerBlock( (norm1): LayerNorm((96,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=96, out_features=288, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=96, out_features=96, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((96,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=96, out_features=384, bias=True) (act): GELU() (fc2): Linear(in_features=384, out_features=96, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) (downsample): PatchMerging( (reduction): Linear(in_features=384, out_features=192, bias=False) (norm): LayerNorm((384,), eps=1e-05, elementwise_affine=True) ) ) (1): BasicLayer( (blocks): ModuleList( (0): SwinTransformerBlock( (norm1): LayerNorm((192,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=192, out_features=576, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=192, out_features=192, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((192,), eps=1e-05, 
elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=192, out_features=768, bias=True) (act): GELU() (fc2): Linear(in_features=768, out_features=192, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): SwinTransformerBlock( (norm1): LayerNorm((192,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=192, out_features=576, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=192, out_features=192, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((192,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=192, out_features=768, bias=True) (act): GELU() (fc2): Linear(in_features=768, out_features=192, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) (downsample): PatchMerging( (reduction): Linear(in_features=768, out_features=384, bias=False) (norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) ) ) (2): BasicLayer( (blocks): ModuleList( (0): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) 
(softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (2): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (3): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (4): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, 
elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (5): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) (downsample): PatchMerging( (reduction): Linear(in_features=1536, out_features=768, bias=False) (norm): LayerNorm((1536,), eps=1e-05, elementwise_affine=True) ) ) (3): BasicLayer( (blocks): ModuleList( (0): SwinTransformerBlock( (norm1): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): SwinTransformerBlock( (norm1): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) 
(softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) ) ) (norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (avgpool): AdaptiveAvgPool1d() (head): Linear(in_features=768, out_features=1000, bias=True) ) [2022-10-11 01:17:13 swin_tiny_patch4_window7_224] (main.py 107): INFO Start training [2022-10-11 01:17:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [0/300][0/1251] eta 10:15:04 lr 0.000001 time 29.5001 (29.5001) loss 6.9829 (6.9829) grad_norm 0.0000 (0.0000) [2022-10-11 01:18:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [0/300][100/1251] eta 0:11:50 lr 0.000001 time 0.3359 (0.6172) loss 6.9250 (6.9488) grad_norm 0.0000 (0.0000) [2022-10-11 01:18:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [0/300][200/1251] eta 0:08:17 lr 0.000001 time 0.3194 (0.4730) loss 6.9130 (6.9321) grad_norm 0.0000 (0.0000) [2022-10-11 01:19:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [0/300][300/1251] eta 0:06:43 lr 0.000001 time 0.3198 (0.4243) loss 6.8900 (6.9206) grad_norm 0.0000 (0.0000) [2022-10-11 01:19:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [0/300][400/1251] eta 0:05:40 lr 0.000001 time 0.3676 (0.4004) loss 6.8782 (6.9111) grad_norm 0.0000 (0.0000) [2022-10-11 01:20:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [0/300][500/1251] eta 0:04:49 lr 0.000001 time 0.3260 (0.3858) loss 6.8623 (6.9029) grad_norm 0.0000 (0.0000) [2022-10-11 01:20:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [0/300][600/1251] eta 0:04:04 lr 0.000001 time 0.3138 (0.3763) loss 6.8619 (6.8951) grad_norm 0.0000 (0.0000) [2022-10-11 01:21:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [0/300][700/1251] eta 0:03:23 lr 0.000001 time 
0.3782 (0.3696) loss 6.8311 (6.8874) grad_norm 0.0000 (0.0000) [2022-10-11 01:22:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [0/300][800/1251] eta 0:02:44 lr 0.000001 time 0.3320 (0.3644) loss 6.8269 (6.8793) grad_norm 0.0000 (0.0000) [2022-10-11 01:22:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [0/300][900/1251] eta 0:02:06 lr 0.000001 time 0.3182 (0.3603) loss 6.7913 (6.8708) grad_norm 0.0000 (0.0000) [2022-10-11 01:23:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [0/300][1000/1251] eta 0:01:29 lr 0.000001 time 0.3211 (0.3571) loss 6.8068 (6.8618) grad_norm 0.0000 (0.0000) [2022-10-11 01:23:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [0/300][1100/1251] eta 0:00:53 lr 0.000001 time 0.3108 (0.3544) loss 6.7765 (6.8527) grad_norm 0.0000 (0.0000) [2022-10-11 01:24:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [0/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3208 (0.3522) loss 6.7370 (6.8435) grad_norm 0.0000 (0.0000) [2022-10-11 01:24:32 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 0 training takes 0:07:19 [2022-10-11 01:24:32 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_0 saving...... [2022-10-11 01:24:32 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_0 saved !!! 
[2022-10-11 01:24:39 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 6.552 (6.552) Loss 6.3644 (6.3644) Acc@1 1.465 (1.465) Acc@5 6.152 (6.152) [2022-10-11 01:24:50 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 2.098 Acc@5 7.112 [2022-10-11 01:24:50 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 2.1% [2022-10-11 01:24:50 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 2.10% [2022-10-11 01:24:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [1/300][0/1251] eta 1:03:08 lr 0.000001 time 3.0284 (3.0284) loss 6.7273 (6.7273) grad_norm 0.0000 (0.0000) [2022-10-11 01:25:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [1/300][100/1251] eta 0:06:57 lr 0.000001 time 0.3288 (0.3630) loss 6.6816 (6.6957) grad_norm 0.0000 (0.0000) [2022-10-11 01:25:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [1/300][200/1251] eta 0:06:03 lr 0.000001 time 0.3630 (0.3458) loss 6.6589 (6.6842) grad_norm 0.0000 (0.0000) [2022-10-11 01:26:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [1/300][300/1251] eta 0:05:23 lr 0.000001 time 0.3343 (0.3398) loss 6.5684 (6.6710) grad_norm 0.0000 (0.0000) [2022-10-11 01:27:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [1/300][400/1251] eta 0:04:46 lr 0.000001 time 0.3215 (0.3370) loss 6.5898 (6.6578) grad_norm 0.0000 (0.0000) [2022-10-11 01:27:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [1/300][500/1251] eta 0:04:12 lr 0.000001 time 0.3342 (0.3357) loss 6.5581 (6.6465) grad_norm 0.0000 (0.0000) [2022-10-11 01:28:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [1/300][600/1251] eta 0:03:37 lr 0.000001 time 0.3354 (0.3346) loss 6.6389 (6.6362) grad_norm 0.0000 (0.0000) [2022-10-11 01:28:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [1/300][700/1251] eta 0:03:03 lr 0.000001 time 0.3318 (0.3339) loss 6.5034 (6.6240) grad_norm 0.0000 (0.0000) [2022-10-11 
01:29:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [1/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3241 (0.3333) loss 6.5645 (6.6139) grad_norm 0.0000 (0.0000) [2022-10-11 01:29:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [1/300][900/1251] eta 0:01:56 lr 0.000001 time 0.3271 (0.3329) loss 6.4875 (6.6035) grad_norm 0.0000 (0.0000) [2022-10-11 01:30:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [1/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3340 (0.3329) loss 6.5452 (6.5933) grad_norm 0.0000 (0.0000) [2022-10-11 01:30:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [1/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3494 (0.3328) loss 6.5075 (6.5829) grad_norm 0.0000 (0.0000) [2022-10-11 01:31:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [1/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.2994 (0.3327) loss 6.4074 (6.5734) grad_norm 0.0000 (0.0000) [2022-10-11 01:31:46 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 1 training takes 0:06:55 [2022-10-11 01:31:48 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 2.801 (2.801) Loss 5.5543 (5.5543) Acc@1 6.934 (6.934) Acc@5 18.359 (18.359) [2022-10-11 01:32:00 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 6.228 Acc@5 18.160 [2022-10-11 01:32:00 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 6.2% [2022-10-11 01:32:00 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 6.23% [2022-10-11 01:32:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [2/300][0/1251] eta 1:09:37 lr 0.000001 time 3.3396 (3.3396) loss 6.4446 (6.4446) grad_norm 0.0000 (0.0000) [2022-10-11 01:32:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [2/300][100/1251] eta 0:06:59 lr 0.000001 time 0.3149 (0.3644) loss 6.2716 (6.4215) grad_norm 0.0000 (0.0000) [2022-10-11 01:33:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [2/300][200/1251] eta 0:06:04 lr 
0.000001 time 0.3159 (0.3468) loss 6.4608 (6.4229) grad_norm 0.0000 (0.0000) [2022-10-11 01:33:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [2/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3344 (0.3413) loss 6.3744 (6.4097) grad_norm 0.0000 (0.0000) [2022-10-11 01:34:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [2/300][400/1251] eta 0:04:47 lr 0.000001 time 0.3192 (0.3384) loss 6.3362 (6.4012) grad_norm 0.0000 (0.0000) [2022-10-11 01:34:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [2/300][500/1251] eta 0:04:12 lr 0.000001 time 0.3590 (0.3368) loss 6.3998 (6.3920) grad_norm 0.0000 (0.0000) [2022-10-11 01:35:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [2/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3279 (0.3355) loss 6.3195 (6.3841) grad_norm 0.0000 (0.0000) [2022-10-11 01:35:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [2/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3376 (0.3346) loss 6.3506 (6.3756) grad_norm 0.0000 (0.0000) [2022-10-11 01:36:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [2/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3315 (0.3339) loss 6.3245 (6.3665) grad_norm 0.0000 (0.0000) [2022-10-11 01:37:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [2/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3288 (0.3336) loss 6.2940 (6.3575) grad_norm 0.0000 (0.0000) [2022-10-11 01:37:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [2/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3422 (0.3335) loss 6.2133 (6.3474) grad_norm 0.0000 (0.0000) [2022-10-11 01:38:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [2/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3010 (0.3333) loss 6.2848 (6.3388) grad_norm 0.0000 (0.0000) [2022-10-11 01:38:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [2/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3382 (0.3331) loss 6.1643 (6.3300) grad_norm 0.0000 (0.0000) [2022-10-11 01:38:57 
swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 2 training takes 0:06:56 [2022-10-11 01:39:00 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.377 (3.377) Loss 4.9387 (4.9387) Acc@1 11.914 (11.914) Acc@5 30.371 (30.371) [2022-10-11 01:39:12 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 12.426 Acc@5 29.444 [2022-10-11 01:39:12 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 12.4% [2022-10-11 01:39:12 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 12.43% [2022-10-11 01:39:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [3/300][0/1251] eta 1:13:36 lr 0.000001 time 3.5307 (3.5307) loss 6.2463 (6.2463) grad_norm 0.0000 (0.0000) [2022-10-11 01:39:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [3/300][100/1251] eta 0:06:58 lr 0.000001 time 0.3332 (0.3640) loss 6.2672 (6.2218) grad_norm 0.0000 (0.0000) [2022-10-11 01:40:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [3/300][200/1251] eta 0:06:04 lr 0.000001 time 0.3124 (0.3473) loss 6.3133 (6.1991) grad_norm 0.0000 (0.0000) [2022-10-11 01:40:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [3/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3235 (0.3410) loss 6.1063 (6.1867) grad_norm 0.0000 (0.0000) [2022-10-11 01:41:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [3/300][400/1251] eta 0:04:47 lr 0.000001 time 0.3131 (0.3376) loss 6.1597 (6.1803) grad_norm 0.0000 (0.0000) [2022-10-11 01:42:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [3/300][500/1251] eta 0:04:12 lr 0.000001 time 0.3204 (0.3360) loss 6.1560 (6.1718) grad_norm 0.0000 (0.0000) [2022-10-11 01:42:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [3/300][600/1251] eta 0:03:37 lr 0.000001 time 0.3416 (0.3347) loss 6.0454 (6.1639) grad_norm 0.0000 (0.0000) [2022-10-11 01:43:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [3/300][700/1251] eta 0:03:04 lr 
0.000001 time 0.3315 (0.3341) loss 6.1219 (6.1580) grad_norm 0.0000 (0.0000) [2022-10-11 01:43:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [3/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3168 (0.3337) loss 6.0722 (6.1509) grad_norm 0.0000 (0.0000) [2022-10-11 01:44:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [3/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3132 (0.3335) loss 6.0982 (6.1435) grad_norm 0.0000 (0.0000) [2022-10-11 01:44:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [3/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3529 (0.3334) loss 6.0538 (6.1338) grad_norm 0.0000 (0.0000) [2022-10-11 01:45:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [3/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3526 (0.3331) loss 6.1925 (6.1256) grad_norm 0.0000 (0.0000) [2022-10-11 01:45:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [3/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3059 (0.3328) loss 6.0571 (6.1170) grad_norm 0.0000 (0.0000) [2022-10-11 01:46:08 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 3 training takes 0:06:55 [2022-10-11 01:46:11 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.502 (3.502) Loss 4.3591 (4.3591) Acc@1 19.238 (19.238) Acc@5 41.699 (41.699) [2022-10-11 01:46:23 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 18.668 Acc@5 39.916 [2022-10-11 01:46:23 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 18.7% [2022-10-11 01:46:23 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 18.67% [2022-10-11 01:46:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [4/300][0/1251] eta 1:08:27 lr 0.000001 time 3.2837 (3.2837) loss 5.8596 (5.8596) grad_norm 0.0000 (0.0000) [2022-10-11 01:46:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [4/300][100/1251] eta 0:06:58 lr 0.000001 time 0.3067 (0.3639) loss 6.1583 (5.9999) grad_norm 0.0000 (0.0000) [2022-10-11 
01:47:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [4/300][200/1251] eta 0:06:04 lr 0.000001 time 0.3308 (0.3471) loss 6.1426 (5.9918) grad_norm 0.0000 (0.0000) [2022-10-11 01:48:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [4/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3082 (0.3407) loss 5.9257 (5.9894) grad_norm 0.0000 (0.0000) [2022-10-11 01:48:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [4/300][400/1251] eta 0:04:47 lr 0.000001 time 0.3336 (0.3380) loss 5.9491 (5.9791) grad_norm 0.0000 (0.0000) [2022-10-11 01:49:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [4/300][500/1251] eta 0:04:12 lr 0.000001 time 0.3313 (0.3364) loss 5.9796 (5.9755) grad_norm 0.0000 (0.0000) [2022-10-11 01:49:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [4/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3235 (0.3352) loss 5.7174 (5.9671) grad_norm 0.0000 (0.0000) [2022-10-11 01:50:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [4/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3361 (0.3343) loss 5.9557 (5.9602) grad_norm 0.0000 (0.0000) [2022-10-11 01:50:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [4/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3088 (0.3338) loss 5.7429 (5.9500) grad_norm 0.0000 (0.0000) [2022-10-11 01:51:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [4/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3043 (0.3334) loss 5.9048 (5.9449) grad_norm 0.0000 (0.0000) [2022-10-11 01:51:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [4/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3265 (0.3332) loss 5.9826 (5.9383) grad_norm 0.0000 (0.0000) [2022-10-11 01:52:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [4/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3522 (0.3332) loss 5.6590 (5.9298) grad_norm 0.0000 (0.0000) [2022-10-11 01:53:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [4/300][1200/1251] eta 0:00:16 lr 0.000001 time 
0.3290 (0.3331) loss 5.8034 (5.9214) grad_norm 0.0000 (0.0000) [2022-10-11 01:53:19 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 4 training takes 0:06:56 [2022-10-11 01:53:22 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 2.699 (2.699) Loss 3.9029 (3.9029) Acc@1 25.195 (25.195) Acc@5 48.828 (48.828) [2022-10-11 01:53:34 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 25.314 Acc@5 48.666 [2022-10-11 01:53:34 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 25.3% [2022-10-11 01:53:34 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 25.31% [2022-10-11 01:53:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [5/300][0/1251] eta 1:15:22 lr 0.000001 time 3.6149 (3.6149) loss 5.9839 (5.9839) grad_norm 0.0000 (0.0000) [2022-10-11 01:54:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [5/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3154 (0.3651) loss 5.6921 (5.8203) grad_norm 0.0000 (0.0000) [2022-10-11 01:54:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [5/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3311 (0.3477) loss 5.6897 (5.8210) grad_norm 0.0000 (0.0000) [2022-10-11 01:55:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [5/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3233 (0.3424) loss 5.9106 (5.8091) grad_norm 0.0000 (0.0000) [2022-10-11 01:55:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [5/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3234 (0.3389) loss 6.0412 (5.8028) grad_norm 0.0000 (0.0000) [2022-10-11 01:56:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [5/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3269 (0.3370) loss 5.8172 (5.7920) grad_norm 0.0000 (0.0000) [2022-10-11 01:56:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [5/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3390 (0.3359) loss 5.6388 (5.7838) grad_norm 0.0000 (0.0000) [2022-10-11 01:57:29 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [5/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3285 (0.3351) loss 5.6718 (5.7787) grad_norm 0.0000 (0.0000) [2022-10-11 01:58:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [5/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3009 (0.3344) loss 5.9269 (5.7695) grad_norm 0.0000 (0.0000) [2022-10-11 01:58:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [5/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3179 (0.3340) loss 5.6175 (5.7654) grad_norm 0.0000 (0.0000) [2022-10-11 01:59:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [5/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3491 (0.3336) loss 5.7517 (5.7612) grad_norm 0.0000 (0.0000) [2022-10-11 01:59:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [5/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3269 (0.3334) loss 5.5838 (5.7550) grad_norm 0.0000 (0.0000) [2022-10-11 02:00:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [5/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3201 (0.3333) loss 5.8308 (5.7501) grad_norm 0.0000 (0.0000) [2022-10-11 02:00:31 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 5 training takes 0:06:56 [2022-10-11 02:00:34 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 2.890 (2.890) Loss 3.5192 (3.5192) Acc@1 30.078 (30.078) Acc@5 55.859 (55.859) [2022-10-11 02:00:46 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 30.630 Acc@5 55.342 [2022-10-11 02:00:46 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 30.6% [2022-10-11 02:00:46 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 30.63% [2022-10-11 02:00:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [6/300][0/1251] eta 1:07:26 lr 0.000001 time 3.2345 (3.2345) loss 5.6725 (5.6725) grad_norm 0.0000 (0.0000) [2022-10-11 02:01:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [6/300][100/1251] eta 0:06:59 lr 
0.000001 time 0.3434 (0.3646) loss 5.7472 (5.6522) grad_norm 0.0000 (0.0000) [2022-10-11 02:01:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [6/300][200/1251] eta 0:06:04 lr 0.000001 time 0.3299 (0.3464) loss 5.6465 (5.6493) grad_norm 0.0000 (0.0000) [2022-10-11 02:02:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [6/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3337 (0.3409) loss 5.7016 (5.6539) grad_norm 0.0000 (0.0000) [2022-10-11 02:03:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [6/300][400/1251] eta 0:04:47 lr 0.000001 time 0.3038 (0.3380) loss 5.5833 (5.6542) grad_norm 0.0000 (0.0000) [2022-10-11 02:03:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [6/300][500/1251] eta 0:04:12 lr 0.000001 time 0.3256 (0.3366) loss 5.6706 (5.6470) grad_norm 0.0000 (0.0000) [2022-10-11 02:04:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [6/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3475 (0.3355) loss 5.8014 (5.6453) grad_norm 0.0000 (0.0000) [2022-10-11 02:04:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [6/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3235 (0.3350) loss 5.5304 (5.6387) grad_norm 0.0000 (0.0000) [2022-10-11 02:05:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [6/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3273 (0.3346) loss 5.2512 (5.6313) grad_norm 0.0000 (0.0000) [2022-10-11 02:05:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [6/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3364 (0.3341) loss 5.7076 (5.6266) grad_norm 0.0000 (0.0000) [2022-10-11 02:06:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [6/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3289 (0.3339) loss 5.5304 (5.6216) grad_norm 0.0000 (0.0000) [2022-10-11 02:06:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [6/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3426 (0.3336) loss 5.7531 (5.6173) grad_norm 0.0000 (0.0000) [2022-10-11 02:07:26 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [6/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3384 (0.3333) loss 5.6792 (5.6126) grad_norm 0.0000 (0.0000) [2022-10-11 02:07:43 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 6 training takes 0:06:56 [2022-10-11 02:07:45 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 2.818 (2.818) Loss 3.2021 (3.2021) Acc@1 33.398 (33.398) Acc@5 61.035 (61.035) [2022-10-11 02:07:58 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 34.448 Acc@5 59.996 [2022-10-11 02:07:58 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 34.4% [2022-10-11 02:07:58 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 34.45% [2022-10-11 02:08:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [7/300][0/1251] eta 1:12:23 lr 0.000001 time 3.4720 (3.4720) loss 5.6100 (5.6100) grad_norm 0.0000 (0.0000) [2022-10-11 02:08:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [7/300][100/1251] eta 0:06:59 lr 0.000001 time 0.3048 (0.3643) loss 5.2376 (5.5368) grad_norm 0.0000 (0.0000) [2022-10-11 02:09:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [7/300][200/1251] eta 0:06:04 lr 0.000001 time 0.3322 (0.3466) loss 5.8791 (5.5214) grad_norm 0.0000 (0.0000) [2022-10-11 02:09:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [7/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3359 (0.3408) loss 5.4260 (5.5101) grad_norm 0.0000 (0.0000) [2022-10-11 02:10:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [7/300][400/1251] eta 0:04:47 lr 0.000001 time 0.3321 (0.3377) loss 5.6536 (5.5087) grad_norm 0.0000 (0.0000) [2022-10-11 02:10:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [7/300][500/1251] eta 0:04:12 lr 0.000001 time 0.3012 (0.3360) loss 5.5526 (5.5037) grad_norm 0.0000 (0.0000) [2022-10-11 02:11:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [7/300][600/1251] eta 0:03:38 lr 
0.000001 time 0.3192 (0.3351) loss 5.6266 (5.4963) grad_norm 0.0000 (0.0000) [2022-10-11 02:11:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [7/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3337 (0.3346) loss 5.5216 (5.4952) grad_norm 0.0000 (0.0000) [2022-10-11 02:12:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [7/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3298 (0.3341) loss 5.3818 (5.4882) grad_norm 0.0000 (0.0000) [2022-10-11 02:12:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [7/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3500 (0.3336) loss 5.4569 (5.4848) grad_norm 0.0000 (0.0000) [2022-10-11 02:13:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [7/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3627 (0.3333) loss 5.7665 (5.4836) grad_norm 0.0000 (0.0000) [2022-10-11 02:14:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [7/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3043 (0.3330) loss 5.3714 (5.4794) grad_norm 0.0000 (0.0000) [2022-10-11 02:14:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [7/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3216 (0.3329) loss 5.6330 (5.4740) grad_norm 0.0000 (0.0000) [2022-10-11 02:14:54 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 7 training takes 0:06:56 [2022-10-11 02:14:57 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.244 (3.244) Loss 2.9887 (2.9887) Acc@1 37.305 (37.305) Acc@5 62.988 (62.988) [2022-10-11 02:15:09 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 37.682 Acc@5 63.422 [2022-10-11 02:15:09 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 37.7% [2022-10-11 02:15:09 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 37.68% [2022-10-11 02:15:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [8/300][0/1251] eta 1:12:55 lr 0.000001 time 3.4980 (3.4980) loss 5.4075 (5.4075) grad_norm 0.0000 (0.0000) [2022-10-11 
02:15:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [8/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3131 (0.3651) loss 5.3743 (5.3935) grad_norm 0.0000 (0.0000) [2022-10-11 02:16:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [8/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3113 (0.3473) loss 5.4679 (5.3919) grad_norm 0.0000 (0.0000) [2022-10-11 02:16:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [8/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3625 (0.3424) loss 5.1443 (5.3911) grad_norm 0.0000 (0.0000) [2022-10-11 02:17:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [8/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3289 (0.3391) loss 5.4895 (5.3903) grad_norm 0.0000 (0.0000) [2022-10-11 02:17:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [8/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3453 (0.3373) loss 5.4050 (5.3889) grad_norm 0.0000 (0.0000) [2022-10-11 02:18:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [8/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3169 (0.3357) loss 5.4454 (5.3843) grad_norm 0.0000 (0.0000) [2022-10-11 02:19:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [8/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3508 (0.3347) loss 5.0422 (5.3815) grad_norm 0.0000 (0.0000) [2022-10-11 02:19:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [8/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3327 (0.3340) loss 5.5196 (5.3773) grad_norm 0.0000 (0.0000) [2022-10-11 02:20:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [8/300][900/1251] eta 0:01:57 lr 0.000001 time 0.2991 (0.3336) loss 5.1306 (5.3769) grad_norm 0.0000 (0.0000) [2022-10-11 02:20:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [8/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3090 (0.3333) loss 5.4305 (5.3738) grad_norm 0.0000 (0.0000) [2022-10-11 02:21:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [8/300][1100/1251] eta 0:00:50 lr 0.000001 time 
0.3258 (0.3334) loss 5.4561 (5.3684) grad_norm 0.0000 (0.0000) [2022-10-11 02:21:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [8/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3411 (0.3332) loss 5.2332 (5.3623) grad_norm 0.0000 (0.0000) [2022-10-11 02:22:06 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 8 training takes 0:06:56 [2022-10-11 02:22:09 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.481 (3.481) Loss 2.8628 (2.8628) Acc@1 40.527 (40.527) Acc@5 64.941 (64.941) [2022-10-11 02:22:21 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 41.440 Acc@5 67.112 [2022-10-11 02:22:21 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 41.4% [2022-10-11 02:22:21 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 41.44% [2022-10-11 02:22:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [9/300][0/1251] eta 1:08:52 lr 0.000001 time 3.3034 (3.3034) loss 5.3393 (5.3393) grad_norm 0.0000 (0.0000) [2022-10-11 02:22:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [9/300][100/1251] eta 0:06:58 lr 0.000001 time 0.3198 (0.3637) loss 5.4579 (5.3287) grad_norm 0.0000 (0.0000) [2022-10-11 02:23:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [9/300][200/1251] eta 0:06:03 lr 0.000001 time 0.3277 (0.3463) loss 5.0650 (5.3273) grad_norm 0.0000 (0.0000) [2022-10-11 02:24:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [9/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3207 (0.3408) loss 5.3393 (5.3247) grad_norm 0.0000 (0.0000) [2022-10-11 02:24:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [9/300][400/1251] eta 0:04:47 lr 0.000001 time 0.3187 (0.3380) loss 5.3967 (5.3159) grad_norm 0.0000 (0.0000) [2022-10-11 02:25:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [9/300][500/1251] eta 0:04:12 lr 0.000001 time 0.3340 (0.3361) loss 5.1900 (5.3089) grad_norm 0.0000 (0.0000) [2022-10-11 02:25:42 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [9/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3392 (0.3355) loss 5.2656 (5.3067) grad_norm 0.0000 (0.0000) [2022-10-11 02:26:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [9/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3404 (0.3348) loss 5.5055 (5.3057) grad_norm 0.0000 (0.0000) [2022-10-11 02:26:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [9/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3568 (0.3341) loss 5.0024 (5.2972) grad_norm 0.0000 (0.0000) [2022-10-11 02:27:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [9/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3161 (0.3338) loss 5.2472 (5.2902) grad_norm 0.0000 (0.0000) [2022-10-11 02:27:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [9/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3582 (0.3336) loss 5.5043 (5.2858) grad_norm 0.0000 (0.0000) [2022-10-11 02:28:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [9/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3413 (0.3333) loss 5.4192 (5.2808) grad_norm 0.0000 (0.0000) [2022-10-11 02:29:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [9/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3270 (0.3331) loss 5.2774 (5.2763) grad_norm 0.0000 (0.0000) [2022-10-11 02:29:17 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 9 training takes 0:06:56 [2022-10-11 02:29:20 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.060 (3.060) Loss 2.5571 (2.5571) Acc@1 44.922 (44.922) Acc@5 71.289 (71.289) [2022-10-11 02:29:32 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 44.206 Acc@5 69.780 [2022-10-11 02:29:32 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 44.2% [2022-10-11 02:29:32 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 44.21% [2022-10-11 02:29:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [10/300][0/1251] eta 1:10:42 lr 
0.000001 time 3.3914 (3.3914) loss 4.9902 (4.9902) grad_norm 0.0000 (0.0000) [2022-10-11 02:30:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [10/300][100/1251] eta 0:06:59 lr 0.000001 time 0.3402 (0.3645) loss 5.2583 (5.2154) grad_norm 0.0000 (0.0000) [2022-10-11 02:30:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [10/300][200/1251] eta 0:06:04 lr 0.000001 time 0.3208 (0.3471) loss 5.2568 (5.2162) grad_norm 0.0000 (0.0000) [2022-10-11 02:31:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [10/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3548 (0.3407) loss 5.4180 (5.2152) grad_norm 0.0000 (0.0000) [2022-10-11 02:31:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [10/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3351 (0.3386) loss 5.3491 (5.2070) grad_norm 0.0000 (0.0000) [2022-10-11 02:32:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [10/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3234 (0.3370) loss 5.4629 (5.2034) grad_norm 0.0000 (0.0000) [2022-10-11 02:32:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [10/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3138 (0.3359) loss 5.0208 (5.1983) grad_norm 0.0000 (0.0000) [2022-10-11 02:33:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [10/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3517 (0.3353) loss 5.3995 (5.1964) grad_norm 0.0000 (0.0000) [2022-10-11 02:34:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [10/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3505 (0.3350) loss 4.7374 (5.1958) grad_norm 0.0000 (0.0000) [2022-10-11 02:34:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [10/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3524 (0.3346) loss 5.2553 (5.1947) grad_norm 0.0000 (0.0000) [2022-10-11 02:35:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [10/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3347 (0.3342) loss 5.1527 (5.1910) grad_norm 0.0000 (0.0000) [2022-10-11 02:35:40 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [10/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3075 (0.3339) loss 5.3443 (5.1854) grad_norm 0.0000 (0.0000) [2022-10-11 02:36:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [10/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3357 (0.3336) loss 5.0991 (5.1822) grad_norm 0.0000 (0.0000) [2022-10-11 02:36:29 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 10 training takes 0:06:56 [2022-10-11 02:36:29 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_10 saving...... [2022-10-11 02:36:29 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_10 saved !!! [2022-10-11 02:36:33 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.237 (3.237) Loss 2.5201 (2.5201) Acc@1 43.066 (43.066) Acc@5 70.801 (70.801) [2022-10-11 02:36:44 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 47.044 Acc@5 72.470 [2022-10-11 02:36:44 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 47.0% [2022-10-11 02:36:44 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 47.04% [2022-10-11 02:36:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [11/300][0/1251] eta 1:14:03 lr 0.000001 time 3.5521 (3.5521) loss 5.2435 (5.2435) grad_norm 0.0000 (0.0000) [2022-10-11 02:37:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [11/300][100/1251] eta 0:06:59 lr 0.000001 time 0.3421 (0.3648) loss 5.1912 (5.0960) grad_norm 0.0000 (0.0000) [2022-10-11 02:37:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [11/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3329 (0.3479) loss 4.9676 (5.1026) grad_norm 0.0000 (0.0000) [2022-10-11 02:38:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [11/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3186 (0.3426) loss 5.2434 (5.0934) grad_norm 0.0000 (0.0000) [2022-10-11 
02:39:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [11/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3414 (0.3394) loss 5.4799 (5.1066) grad_norm 0.0000 (0.0000) [2022-10-11 02:39:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [11/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3187 (0.3375) loss 5.0382 (5.1076) grad_norm 0.0000 (0.0000) [2022-10-11 02:40:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [11/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3132 (0.3365) loss 5.1051 (5.1031) grad_norm 0.0000 (0.0000) [2022-10-11 02:40:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [11/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3175 (0.3359) loss 5.1776 (5.1042) grad_norm 0.0000 (0.0000) [2022-10-11 02:41:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [11/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3224 (0.3355) loss 5.0655 (5.0998) grad_norm 0.0000 (0.0000) [2022-10-11 02:41:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [11/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3227 (0.3349) loss 5.0714 (5.0978) grad_norm 0.0000 (0.0000) [2022-10-11 02:42:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [11/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3226 (0.3347) loss 5.0749 (5.0955) grad_norm 0.0000 (0.0000) [2022-10-11 02:42:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [11/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3614 (0.3344) loss 4.8839 (5.0922) grad_norm 0.0000 (0.0000) [2022-10-11 02:43:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [11/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3253 (0.3342) loss 5.1311 (5.0883) grad_norm 0.0000 (0.0000) [2022-10-11 02:43:42 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 11 training takes 0:06:57 [2022-10-11 02:43:45 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.065 (3.065) Loss 2.4495 (2.4495) Acc@1 47.363 (47.363) Acc@5 72.949 (72.949) [2022-10-11 02:43:57 
swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 48.580 Acc@5 73.864 [2022-10-11 02:43:57 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 48.6% [2022-10-11 02:43:57 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 48.58% [2022-10-11 02:44:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [12/300][0/1251] eta 1:05:41 lr 0.000001 time 3.1509 (3.1509) loss 4.9582 (4.9582) grad_norm 0.0000 (0.0000) [2022-10-11 02:44:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [12/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3443 (0.3669) loss 4.9618 (5.0472) grad_norm 0.0000 (0.0000) [2022-10-11 02:45:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [12/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3180 (0.3489) loss 5.2511 (5.0486) grad_norm 0.0000 (0.0000) [2022-10-11 02:45:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [12/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3486 (0.3430) loss 5.0165 (5.0493) grad_norm 0.0000 (0.0000) [2022-10-11 02:46:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [12/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3492 (0.3395) loss 4.9224 (5.0450) grad_norm 0.0000 (0.0000) [2022-10-11 02:46:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [12/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3256 (0.3377) loss 5.2400 (5.0366) grad_norm 0.0000 (0.0000) [2022-10-11 02:47:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [12/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3441 (0.3368) loss 5.1933 (5.0388) grad_norm 0.0000 (0.0000) [2022-10-11 02:47:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [12/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3814 (0.3363) loss 4.7080 (5.0367) grad_norm 0.0000 (0.0000) [2022-10-11 02:48:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [12/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3370 (0.3358) loss 4.7883 (5.0359) grad_norm 0.0000 (0.0000) 
[2022-10-11 02:48:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [12/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3335 (0.3356) loss 4.9094 (5.0330) grad_norm 0.0000 (0.0000) [2022-10-11 02:49:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [12/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3509 (0.3352) loss 5.1067 (5.0324) grad_norm 0.0000 (0.0000) [2022-10-11 02:50:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [12/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3559 (0.3350) loss 4.7882 (5.0300) grad_norm 0.0000 (0.0000) [2022-10-11 02:50:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [12/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3207 (0.3349) loss 5.0430 (5.0274) grad_norm 0.0000 (0.0000) [2022-10-11 02:50:56 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 12 training takes 0:06:58 [2022-10-11 02:50:59 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.057 (3.057) Loss 2.1866 (2.1866) Acc@1 52.637 (52.637) Acc@5 75.684 (75.684) [2022-10-11 02:51:11 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 50.386 Acc@5 75.310 [2022-10-11 02:51:11 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 50.4% [2022-10-11 02:51:11 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 50.39% [2022-10-11 02:51:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [13/300][0/1251] eta 1:12:51 lr 0.000001 time 3.4943 (3.4943) loss 4.9225 (4.9225) grad_norm 0.0000 (0.0000) [2022-10-11 02:51:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [13/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3350 (0.3656) loss 4.9339 (4.9566) grad_norm 0.0000 (0.0000) [2022-10-11 02:52:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [13/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3519 (0.3494) loss 4.7588 (4.9574) grad_norm 0.0000 (0.0000) [2022-10-11 02:52:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[13/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3034 (0.3429) loss 4.8682 (4.9651) grad_norm 0.0000 (0.0000) [2022-10-11 02:53:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [13/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3227 (0.3394) loss 4.9421 (4.9700) grad_norm 0.0000 (0.0000) [2022-10-11 02:54:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [13/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3337 (0.3377) loss 5.2065 (4.9684) grad_norm 0.0000 (0.0000) [2022-10-11 02:54:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [13/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3486 (0.3366) loss 5.0832 (4.9667) grad_norm 0.0000 (0.0000) [2022-10-11 02:55:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [13/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3261 (0.3359) loss 4.8327 (4.9643) grad_norm 0.0000 (0.0000) [2022-10-11 02:55:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [13/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3082 (0.3355) loss 4.7732 (4.9611) grad_norm 0.0000 (0.0000) [2022-10-11 02:56:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [13/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3500 (0.3352) loss 5.0446 (4.9588) grad_norm 0.0000 (0.0000) [2022-10-11 02:56:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [13/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3682 (0.3349) loss 4.9753 (4.9538) grad_norm 0.0000 (0.0000) [2022-10-11 02:57:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [13/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3047 (0.3347) loss 4.9687 (4.9508) grad_norm 0.0000 (0.0000) [2022-10-11 02:57:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [13/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3343 (0.3345) loss 4.8986 (4.9452) grad_norm 0.0000 (0.0000) [2022-10-11 02:58:09 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 13 training takes 0:06:57 [2022-10-11 02:58:12 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: 
[0/49] Time 3.243 (3.243) Loss 2.2164 (2.2164) Acc@1 52.148 (52.148) Acc@5 75.879 (75.879) [2022-10-11 02:58:24 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 52.184 Acc@5 77.088 [2022-10-11 02:58:24 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 52.2% [2022-10-11 02:58:24 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 52.18% [2022-10-11 02:58:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [14/300][0/1251] eta 1:12:31 lr 0.000001 time 3.4781 (3.4781) loss 4.8600 (4.8600) grad_norm 0.0000 (0.0000) [2022-10-11 02:59:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [14/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3156 (0.3672) loss 4.4905 (4.8842) grad_norm 0.0000 (0.0000) [2022-10-11 02:59:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [14/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3300 (0.3499) loss 4.7746 (4.9026) grad_norm 0.0000 (0.0000) [2022-10-11 03:00:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [14/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3217 (0.3436) loss 4.7679 (4.9032) grad_norm 0.0000 (0.0000) [2022-10-11 03:00:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [14/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3461 (0.3409) loss 5.0481 (4.9055) grad_norm 0.0000 (0.0000) [2022-10-11 03:01:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [14/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3079 (0.3387) loss 5.1082 (4.9017) grad_norm 0.0000 (0.0000) [2022-10-11 03:01:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [14/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3194 (0.3377) loss 5.1939 (4.9044) grad_norm 0.0000 (0.0000) [2022-10-11 03:02:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [14/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3162 (0.3366) loss 4.8877 (4.9030) grad_norm 0.0000 (0.0000) [2022-10-11 03:02:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[14/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3335 (0.3360) loss 4.9723 (4.9013) grad_norm 0.0000 (0.0000) [2022-10-11 03:03:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [14/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3224 (0.3356) loss 4.5674 (4.9006) grad_norm 0.0000 (0.0000) [2022-10-11 03:04:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [14/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3276 (0.3352) loss 4.8398 (4.8990) grad_norm 0.0000 (0.0000) [2022-10-11 03:04:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [14/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3628 (0.3349) loss 5.0234 (4.8971) grad_norm 0.0000 (0.0000) [2022-10-11 03:05:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [14/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3444 (0.3346) loss 4.8172 (4.8943) grad_norm 0.0000 (0.0000) [2022-10-11 03:05:22 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 14 training takes 0:06:58 [2022-10-11 03:05:25 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 2.811 (2.811) Loss 2.1061 (2.1061) Acc@1 54.785 (54.785) Acc@5 77.539 (77.539) [2022-10-11 03:05:38 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 53.646 Acc@5 78.438 [2022-10-11 03:05:38 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 53.6% [2022-10-11 03:05:38 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 53.65% [2022-10-11 03:05:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [15/300][0/1251] eta 1:12:46 lr 0.000001 time 3.4903 (3.4903) loss 5.0669 (5.0669) grad_norm 0.0000 (0.0000) [2022-10-11 03:06:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [15/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3346 (0.3665) loss 5.2263 (4.8414) grad_norm 0.0000 (0.0000) [2022-10-11 03:06:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [15/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3400 (0.3496) loss 4.9300 (4.8376) 
grad_norm 0.0000 (0.0000) [2022-10-11 03:07:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [15/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3175 (0.3437) loss 4.8490 (4.8468) grad_norm 0.0000 (0.0000) [2022-10-11 03:07:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [15/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3306 (0.3403) loss 4.6280 (4.8482) grad_norm 0.0000 (0.0000) [2022-10-11 03:08:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [15/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3330 (0.3387) loss 4.9119 (4.8527) grad_norm 0.0000 (0.0000) [2022-10-11 03:09:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [15/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3406 (0.3377) loss 4.7754 (4.8575) grad_norm 0.0000 (0.0000) [2022-10-11 03:09:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [15/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3570 (0.3368) loss 5.1528 (4.8516) grad_norm 0.0000 (0.0000) [2022-10-11 03:10:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [15/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3156 (0.3360) loss 4.7108 (4.8471) grad_norm 0.0000 (0.0000) [2022-10-11 03:10:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [15/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3234 (0.3355) loss 4.6406 (4.8441) grad_norm 0.0000 (0.0000) [2022-10-11 03:11:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [15/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3149 (0.3352) loss 4.9259 (4.8444) grad_norm 0.0000 (0.0000) [2022-10-11 03:11:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [15/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3373 (0.3350) loss 4.7066 (4.8429) grad_norm 0.0000 (0.0000) [2022-10-11 03:12:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [15/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.2999 (0.3348) loss 4.8505 (4.8403) grad_norm 0.0000 (0.0000) [2022-10-11 03:12:36 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 15 
training takes 0:06:58 [2022-10-11 03:12:39 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.221 (3.221) Loss 2.1004 (2.1004) Acc@1 50.977 (50.977) Acc@5 76.953 (76.953) [2022-10-11 03:12:51 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 54.940 Acc@5 79.510 [2022-10-11 03:12:51 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 54.9% [2022-10-11 03:12:51 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 54.94% [2022-10-11 03:12:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [16/300][0/1251] eta 1:07:38 lr 0.000001 time 3.2443 (3.2443) loss 4.7068 (4.7068) grad_norm 0.0000 (0.0000) [2022-10-11 03:13:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [16/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3428 (0.3672) loss 4.9516 (4.8081) grad_norm 0.0000 (0.0000) [2022-10-11 03:14:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [16/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3415 (0.3484) loss 5.0173 (4.7889) grad_norm 0.0000 (0.0000) [2022-10-11 03:14:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [16/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3459 (0.3425) loss 5.0604 (4.7894) grad_norm 0.0000 (0.0000) [2022-10-11 03:15:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [16/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3306 (0.3395) loss 4.3797 (4.7911) grad_norm 0.0000 (0.0000) [2022-10-11 03:15:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [16/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3262 (0.3377) loss 4.8701 (4.7991) grad_norm 0.0000 (0.0000) [2022-10-11 03:16:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [16/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3256 (0.3361) loss 4.5841 (4.7967) grad_norm 0.0000 (0.0000) [2022-10-11 03:16:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [16/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3369 (0.3354) loss 5.0939 (4.7999) 
grad_norm 0.0000 (0.0000) [2022-10-11 03:17:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [16/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3262 (0.3348) loss 4.5813 (4.7976) grad_norm 0.0000 (0.0000) [2022-10-11 03:17:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [16/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3495 (0.3343) loss 5.0498 (4.7973) grad_norm 0.0000 (0.0000) [2022-10-11 03:18:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [16/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3192 (0.3340) loss 4.9876 (4.7951) grad_norm 0.0000 (0.0000) [2022-10-11 03:18:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [16/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3300 (0.3338) loss 4.7547 (4.7930) grad_norm 0.0000 (0.0000) [2022-10-11 03:19:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [16/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3238 (0.3337) loss 4.9500 (4.7940) grad_norm 0.0000 (0.0000) [2022-10-11 03:19:49 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 16 training takes 0:06:57 [2022-10-11 03:19:52 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.117 (3.117) Loss 1.9906 (1.9906) Acc@1 55.957 (55.957) Acc@5 80.176 (80.176) [2022-10-11 03:20:04 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 56.514 Acc@5 80.694 [2022-10-11 03:20:04 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 56.5% [2022-10-11 03:20:04 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 56.51% [2022-10-11 03:20:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [17/300][0/1251] eta 1:04:26 lr 0.000001 time 3.0906 (3.0906) loss 4.8404 (4.8404) grad_norm 0.0000 (0.0000) [2022-10-11 03:20:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [17/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3169 (0.3657) loss 4.7050 (4.7849) grad_norm 0.0000 (0.0000) [2022-10-11 03:21:14 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [17/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3333 (0.3485) loss 4.7401 (4.7682) grad_norm 0.0000 (0.0000) [2022-10-11 03:21:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [17/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3556 (0.3426) loss 4.7558 (4.7699) grad_norm 0.0000 (0.0000) [2022-10-11 03:22:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [17/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3387 (0.3396) loss 4.6812 (4.7641) grad_norm 0.0000 (0.0000) [2022-10-11 03:22:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [17/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3331 (0.3377) loss 4.7990 (4.7640) grad_norm 0.0000 (0.0000) [2022-10-11 03:23:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [17/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3239 (0.3365) loss 4.4813 (4.7600) grad_norm 0.0000 (0.0000) [2022-10-11 03:23:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [17/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3315 (0.3357) loss 4.5961 (4.7617) grad_norm 0.0000 (0.0000) [2022-10-11 03:24:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [17/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3262 (0.3354) loss 4.7082 (4.7592) grad_norm 0.0000 (0.0000) [2022-10-11 03:25:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [17/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3293 (0.3351) loss 5.0381 (4.7587) grad_norm 0.0000 (0.0000) [2022-10-11 03:25:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [17/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3527 (0.3345) loss 4.6955 (4.7576) grad_norm 0.0000 (0.0000) [2022-10-11 03:26:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [17/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3332 (0.3341) loss 4.7514 (4.7582) grad_norm 0.0000 (0.0000) [2022-10-11 03:26:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [17/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3523 (0.3340) loss 4.5965 (4.7585) 
grad_norm 0.0000 (0.0000) [2022-10-11 03:27:01 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 17 training takes 0:06:57 [2022-10-11 03:27:05 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.856 (3.856) Loss 2.0308 (2.0308) Acc@1 56.445 (56.445) Acc@5 79.492 (79.492) [2022-10-11 03:27:17 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 57.306 Acc@5 81.226 [2022-10-11 03:27:17 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 57.3% [2022-10-11 03:27:17 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 57.31% [2022-10-11 03:27:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [18/300][0/1251] eta 1:10:58 lr 0.000001 time 3.4037 (3.4037) loss 5.1084 (5.1084) grad_norm 0.0000 (0.0000) [2022-10-11 03:27:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [18/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3048 (0.3682) loss 4.2910 (4.6952) grad_norm 0.0000 (0.0000) [2022-10-11 03:28:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [18/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3390 (0.3488) loss 4.7755 (4.7057) grad_norm 0.0000 (0.0000) [2022-10-11 03:29:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [18/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3312 (0.3426) loss 4.8030 (4.7158) grad_norm 0.0000 (0.0000) [2022-10-11 03:29:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [18/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3258 (0.3400) loss 4.8730 (4.7154) grad_norm 0.0000 (0.0000) [2022-10-11 03:30:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [18/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3277 (0.3377) loss 4.8355 (4.7159) grad_norm 0.0000 (0.0000) [2022-10-11 03:30:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [18/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3368 (0.3366) loss 5.0681 (4.7206) grad_norm 0.0000 (0.0000) [2022-10-11 03:31:12 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [18/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3345 (0.3359) loss 4.6347 (4.7193) grad_norm 0.0000 (0.0000) [2022-10-11 03:31:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [18/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3448 (0.3351) loss 4.9393 (4.7217) grad_norm 0.0000 (0.0000) [2022-10-11 03:32:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [18/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3119 (0.3346) loss 4.7649 (4.7153) grad_norm 0.0000 (0.0000) [2022-10-11 03:32:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [18/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3461 (0.3342) loss 4.7180 (4.7148) grad_norm 0.0000 (0.0000) [2022-10-11 03:33:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [18/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3166 (0.3339) loss 4.9052 (4.7167) grad_norm 0.0000 (0.0000) [2022-10-11 03:33:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [18/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3448 (0.3337) loss 4.8261 (4.7156) grad_norm 0.0000 (0.0000) [2022-10-11 03:34:14 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 18 training takes 0:06:57 [2022-10-11 03:34:17 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.277 (3.277) Loss 1.9372 (1.9372) Acc@1 57.520 (57.520) Acc@5 81.152 (81.152) [2022-10-11 03:34:29 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 57.954 Acc@5 81.754 [2022-10-11 03:34:29 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 58.0% [2022-10-11 03:34:29 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 57.95% [2022-10-11 03:34:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [19/300][0/1251] eta 1:19:20 lr 0.000001 time 3.8052 (3.8052) loss 4.4047 (4.4047) grad_norm 0.0000 (0.0000) [2022-10-11 03:35:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [19/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3361 (0.3660) loss 
4.7323 (4.6928) grad_norm 0.0000 (0.0000) [2022-10-11 03:35:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [19/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3143 (0.3487) loss 4.2903 (4.6908) grad_norm 0.0000 (0.0000) [2022-10-11 03:36:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [19/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3218 (0.3422) loss 4.5300 (4.6888) grad_norm 0.0000 (0.0000) [2022-10-11 03:36:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [19/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3670 (0.3393) loss 4.4589 (4.6920) grad_norm 0.0000 (0.0000) [2022-10-11 03:37:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [19/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3696 (0.3375) loss 4.8393 (4.6906) grad_norm 0.0000 (0.0000) [2022-10-11 03:37:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [19/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3539 (0.3361) loss 4.6529 (4.6917) grad_norm 0.0000 (0.0000) [2022-10-11 03:38:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [19/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3087 (0.3354) loss 4.8022 (4.6889) grad_norm 0.0000 (0.0000) [2022-10-11 03:38:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [19/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3302 (0.3349) loss 4.1814 (4.6858) grad_norm 0.0000 (0.0000) [2022-10-11 03:39:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [19/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3271 (0.3343) loss 4.8142 (4.6859) grad_norm 0.0000 (0.0000) [2022-10-11 03:40:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [19/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3382 (0.3339) loss 4.7378 (4.6870) grad_norm 0.0000 (0.0000) [2022-10-11 03:40:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [19/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3401 (0.3337) loss 5.0115 (4.6852) grad_norm 0.0000 (0.0000) [2022-10-11 03:41:10 swin_tiny_patch4_window7_224] (main.py 163): 
INFO Train: [19/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3108 (0.3335) loss 4.7193 (4.6846) grad_norm 0.0000 (0.0000) [2022-10-11 03:41:26 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 19 training takes 0:06:56 [2022-10-11 03:41:29 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.212 (3.212) Loss 1.8077 (1.8077) Acc@1 57.715 (57.715) Acc@5 82.715 (82.715) [2022-10-11 03:41:41 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 58.748 Acc@5 82.322 [2022-10-11 03:41:41 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 58.7% [2022-10-11 03:41:41 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 58.75% [2022-10-11 03:41:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [20/300][0/1251] eta 1:06:59 lr 0.000001 time 3.2130 (3.2130) loss 4.5440 (4.5440) grad_norm 0.0000 (0.0000) [2022-10-11 03:42:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [20/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3384 (0.3678) loss 4.5691 (4.6360) grad_norm 0.0000 (0.0000) [2022-10-11 03:42:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [20/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3103 (0.3499) loss 4.7473 (4.6599) grad_norm 0.0000 (0.0000) [2022-10-11 03:43:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [20/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3398 (0.3436) loss 4.7735 (4.6685) grad_norm 0.0000 (0.0000) [2022-10-11 03:43:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [20/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3151 (0.3407) loss 4.5057 (4.6624) grad_norm 0.0000 (0.0000) [2022-10-11 03:44:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [20/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3298 (0.3387) loss 4.4726 (4.6603) grad_norm 0.0000 (0.0000) [2022-10-11 03:45:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [20/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3594 (0.3374) loss 4.4241 
(4.6491) grad_norm 0.0000 (0.0000) [2022-10-11 03:45:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [20/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3105 (0.3364) loss 4.8565 (4.6509) grad_norm 0.0000 (0.0000) [2022-10-11 03:46:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [20/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3561 (0.3356) loss 4.8593 (4.6505) grad_norm 0.0000 (0.0000) [2022-10-11 03:46:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [20/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3476 (0.3350) loss 5.0192 (4.6462) grad_norm 0.0000 (0.0000) [2022-10-11 03:47:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [20/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3222 (0.3346) loss 4.7188 (4.6452) grad_norm 0.0000 (0.0000) [2022-10-11 03:47:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [20/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3560 (0.3344) loss 4.9021 (4.6443) grad_norm 0.0000 (0.0000) [2022-10-11 03:48:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [20/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3254 (0.3343) loss 4.8643 (4.6439) grad_norm 0.0000 (0.0000) [2022-10-11 03:48:39 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 20 training takes 0:06:57 [2022-10-11 03:48:39 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_20 saving...... [2022-10-11 03:48:39 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_20 saved !!! 
[2022-10-11 03:48:42 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.062 (3.062) Loss 1.7874 (1.7874) Acc@1 59.863 (59.863) Acc@5 83.301 (83.301) [2022-10-11 03:48:54 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 59.792 Acc@5 83.248 [2022-10-11 03:48:54 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 59.8% [2022-10-11 03:48:54 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 59.79% [2022-10-11 03:48:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [21/300][0/1251] eta 1:09:13 lr 0.000001 time 3.3202 (3.3202) loss 4.4081 (4.4081) grad_norm 0.0000 (0.0000) [2022-10-11 03:49:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [21/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3499 (0.3671) loss 4.6018 (4.5892) grad_norm 0.0000 (0.0000) [2022-10-11 03:50:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [21/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3476 (0.3507) loss 4.5731 (4.5978) grad_norm 0.0000 (0.0000) [2022-10-11 03:50:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [21/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3447 (0.3441) loss 4.7311 (4.5903) grad_norm 0.0000 (0.0000) [2022-10-11 03:51:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [21/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3424 (0.3415) loss 4.3621 (4.5956) grad_norm 0.0000 (0.0000) [2022-10-11 03:51:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [21/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3173 (0.3396) loss 4.5718 (4.5998) grad_norm 0.0000 (0.0000) [2022-10-11 03:52:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [21/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3114 (0.3382) loss 4.6419 (4.6009) grad_norm 0.0000 (0.0000) [2022-10-11 03:52:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [21/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3083 (0.3375) loss 4.4686 (4.5984) grad_norm 0.0000 (0.0000) 
[2022-10-11 03:53:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [21/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3454 (0.3372) loss 4.4326 (4.5963) grad_norm 0.0000 (0.0000) [2022-10-11 03:53:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [21/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3359 (0.3367) loss 4.6072 (4.5971) grad_norm 0.0000 (0.0000) [2022-10-11 03:54:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [21/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3321 (0.3364) loss 4.8055 (4.5994) grad_norm 0.0000 (0.0000) [2022-10-11 03:55:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [21/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3411 (0.3360) loss 4.6660 (4.5990) grad_norm 0.0000 (0.0000) [2022-10-11 03:55:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [21/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3410 (0.3358) loss 4.5077 (4.5988) grad_norm 0.0000 (0.0000) [2022-10-11 03:55:54 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 21 training takes 0:06:59 [2022-10-11 03:55:57 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.283 (3.283) Loss 1.7241 (1.7241) Acc@1 61.133 (61.133) Acc@5 84.082 (84.082) [2022-10-11 03:56:09 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 60.964 Acc@5 84.132 [2022-10-11 03:56:09 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 61.0% [2022-10-11 03:56:09 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 60.96% [2022-10-11 03:56:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [22/300][0/1251] eta 1:07:57 lr 0.000001 time 3.2593 (3.2593) loss 4.7472 (4.7472) grad_norm 0.0000 (0.0000) [2022-10-11 03:56:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [22/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3925 (0.3671) loss 4.5532 (4.5660) grad_norm 0.0000 (0.0000) [2022-10-11 03:57:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[22/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3376 (0.3499) loss 4.9570 (4.5753) grad_norm 0.0000 (0.0000) [2022-10-11 03:57:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [22/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3333 (0.3445) loss 4.5803 (4.5747) grad_norm 0.0000 (0.0000) [2022-10-11 03:58:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [22/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3143 (0.3418) loss 4.8713 (4.5582) grad_norm 0.0000 (0.0000) [2022-10-11 03:58:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [22/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3246 (0.3399) loss 4.6878 (4.5584) grad_norm 0.0000 (0.0000) [2022-10-11 03:59:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [22/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3177 (0.3388) loss 4.7139 (4.5615) grad_norm 0.0000 (0.0000) [2022-10-11 04:00:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [22/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3603 (0.3379) loss 4.5989 (4.5634) grad_norm 0.0000 (0.0000) [2022-10-11 04:00:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [22/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3399 (0.3374) loss 4.2368 (4.5621) grad_norm 0.0000 (0.0000) [2022-10-11 04:01:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [22/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3621 (0.3370) loss 4.6374 (4.5589) grad_norm 0.0000 (0.0000) [2022-10-11 04:01:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [22/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3580 (0.3368) loss 4.5076 (4.5545) grad_norm 0.0000 (0.0000) [2022-10-11 04:02:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [22/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3278 (0.3365) loss 4.6515 (4.5566) grad_norm 0.0000 (0.0000) [2022-10-11 04:02:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [22/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3273 (0.3361) loss 4.6547 (4.5554) grad_norm 0.0000 
(0.0000) [2022-10-11 04:03:09 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 22 training takes 0:07:00 [2022-10-11 04:03:12 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.194 (3.194) Loss 1.7022 (1.7022) Acc@1 62.891 (62.891) Acc@5 84.180 (84.180) [2022-10-11 04:03:24 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 61.634 Acc@5 84.474 [2022-10-11 04:03:24 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 61.6% [2022-10-11 04:03:24 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 61.63% [2022-10-11 04:03:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [23/300][0/1251] eta 1:06:55 lr 0.000001 time 3.2099 (3.2099) loss 4.5657 (4.5657) grad_norm 0.0000 (0.0000) [2022-10-11 04:04:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [23/300][100/1251] eta 0:06:59 lr 0.000001 time 0.3289 (0.3645) loss 4.1735 (4.5247) grad_norm 0.0000 (0.0000) [2022-10-11 04:04:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [23/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3202 (0.3491) loss 4.3309 (4.5207) grad_norm 0.0000 (0.0000) [2022-10-11 04:05:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [23/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3186 (0.3426) loss 4.4464 (4.5246) grad_norm 0.0000 (0.0000) [2022-10-11 04:05:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [23/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3120 (0.3400) loss 4.5465 (4.5231) grad_norm 0.0000 (0.0000) [2022-10-11 04:06:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [23/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3279 (0.3384) loss 4.7014 (4.5226) grad_norm 0.0000 (0.0000) [2022-10-11 04:06:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [23/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3253 (0.3373) loss 4.6995 (4.5240) grad_norm 0.0000 (0.0000) [2022-10-11 04:07:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[23/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3302 (0.3365) loss 4.6744 (4.5199) grad_norm 0.0000 (0.0000) [2022-10-11 04:07:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [23/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3157 (0.3357) loss 4.5840 (4.5210) grad_norm 0.0000 (0.0000) [2022-10-11 04:08:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [23/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3424 (0.3354) loss 4.7081 (4.5205) grad_norm 0.0000 (0.0000) [2022-10-11 04:08:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [23/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3089 (0.3350) loss 4.2364 (4.5208) grad_norm 0.0000 (0.0000) [2022-10-11 04:09:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [23/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3457 (0.3347) loss 4.5537 (4.5213) grad_norm 0.0000 (0.0000) [2022-10-11 04:10:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [23/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3357 (0.3346) loss 4.0516 (4.5188) grad_norm 0.0000 (0.0000) [2022-10-11 04:10:22 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 23 training takes 0:06:58 [2022-10-11 04:10:26 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.658 (3.658) Loss 1.6714 (1.6714) Acc@1 62.109 (62.109) Acc@5 84.668 (84.668) [2022-10-11 04:10:37 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 62.310 Acc@5 85.118 [2022-10-11 04:10:37 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 62.3% [2022-10-11 04:10:37 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 62.31% [2022-10-11 04:10:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [24/300][0/1251] eta 1:15:31 lr 0.000001 time 3.6221 (3.6221) loss 4.5806 (4.5806) grad_norm 0.0000 (0.0000) [2022-10-11 04:11:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [24/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3422 (0.3691) loss 4.4266 (4.5102) 
grad_norm 0.0000 (0.0000) [2022-10-11 04:11:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [24/300][200/1251] eta 0:06:09 lr 0.000001 time 0.3271 (0.3512) loss 3.9043 (4.5057) grad_norm 0.0000 (0.0000) [2022-10-11 04:12:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [24/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3255 (0.3450) loss 4.5501 (4.5059) grad_norm 0.0000 (0.0000) [2022-10-11 04:12:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [24/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3362 (0.3412) loss 4.7033 (4.5013) grad_norm 0.0000 (0.0000) [2022-10-11 04:13:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [24/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3218 (0.3392) loss 4.2883 (4.5048) grad_norm 0.0000 (0.0000) [2022-10-11 04:14:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [24/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3153 (0.3380) loss 4.2854 (4.5004) grad_norm 0.0000 (0.0000) [2022-10-11 04:14:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [24/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3416 (0.3371) loss 4.4147 (4.4966) grad_norm 0.0000 (0.0000) [2022-10-11 04:15:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [24/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3355 (0.3366) loss 4.7711 (4.4994) grad_norm 0.0000 (0.0000) [2022-10-11 04:15:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [24/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3098 (0.3361) loss 4.3629 (4.4965) grad_norm 0.0000 (0.0000) [2022-10-11 04:16:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [24/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3382 (0.3360) loss 4.6692 (4.4936) grad_norm 0.0000 (0.0000) [2022-10-11 04:16:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [24/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3260 (0.3357) loss 4.2744 (4.4942) grad_norm 0.0000 (0.0000) [2022-10-11 04:17:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[24/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3274 (0.3353) loss 4.4133 (4.4930) grad_norm 0.0000 (0.0000) [2022-10-11 04:17:37 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 24 training takes 0:06:59 [2022-10-11 04:17:40 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.500 (3.500) Loss 1.6121 (1.6121) Acc@1 63.379 (63.379) Acc@5 84.961 (84.961) [2022-10-11 04:17:52 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 62.772 Acc@5 85.464 [2022-10-11 04:17:52 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 62.8% [2022-10-11 04:17:52 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 62.77% [2022-10-11 04:17:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [25/300][0/1251] eta 1:00:48 lr 0.000001 time 2.9162 (2.9162) loss 4.6002 (4.6002) grad_norm 0.0000 (0.0000) [2022-10-11 04:18:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [25/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3466 (0.3675) loss 4.7913 (4.4751) grad_norm 0.0000 (0.0000) [2022-10-11 04:19:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [25/300][200/1251] eta 0:06:09 lr 0.000001 time 0.3103 (0.3511) loss 4.4386 (4.4655) grad_norm 0.0000 (0.0000) [2022-10-11 04:19:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [25/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3599 (0.3451) loss 4.4792 (4.4579) grad_norm 0.0000 (0.0000) [2022-10-11 04:20:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [25/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3244 (0.3417) loss 4.2626 (4.4561) grad_norm 0.0000 (0.0000) [2022-10-11 04:20:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [25/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3302 (0.3400) loss 4.7043 (4.4615) grad_norm 0.0000 (0.0000) [2022-10-11 04:21:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [25/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3317 (0.3386) loss 4.8417 (4.4620) 
grad_norm 0.0000 (0.0000) [2022-10-11 04:21:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [25/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3448 (0.3377) loss 4.3729 (4.4626) grad_norm 0.0000 (0.0000) [2022-10-11 04:22:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [25/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3376 (0.3370) loss 4.5442 (4.4591) grad_norm 0.0000 (0.0000) [2022-10-11 04:22:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [25/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3160 (0.3368) loss 4.3604 (4.4596) grad_norm 0.0000 (0.0000) [2022-10-11 04:23:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [25/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3413 (0.3364) loss 4.4877 (4.4599) grad_norm 0.0000 (0.0000) [2022-10-11 04:24:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [25/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3431 (0.3361) loss 4.3143 (4.4616) grad_norm 0.0000 (0.0000) [2022-10-11 04:24:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [25/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3215 (0.3358) loss 4.6912 (4.4613) grad_norm 0.0000 (0.0000) [2022-10-11 04:24:51 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 25 training takes 0:06:59 [2022-10-11 04:24:55 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.322 (3.322) Loss 1.5184 (1.5184) Acc@1 65.234 (65.234) Acc@5 87.012 (87.012) [2022-10-11 04:25:07 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 63.276 Acc@5 85.896 [2022-10-11 04:25:07 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 63.3% [2022-10-11 04:25:07 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 63.28% [2022-10-11 04:25:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [26/300][0/1251] eta 1:09:08 lr 0.000001 time 3.3161 (3.3161) loss 4.5092 (4.5092) grad_norm 0.0000 (0.0000) [2022-10-11 04:25:44 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [26/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3279 (0.3686) loss 4.4257 (4.4524) grad_norm 0.0000 (0.0000) [2022-10-11 04:26:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [26/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3325 (0.3499) loss 4.6088 (4.4473) grad_norm 0.0000 (0.0000) [2022-10-11 04:26:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [26/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3338 (0.3440) loss 4.5237 (4.4448) grad_norm 0.0000 (0.0000) [2022-10-11 04:27:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [26/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3171 (0.3412) loss 4.2494 (4.4344) grad_norm 0.0000 (0.0000) [2022-10-11 04:27:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [26/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3188 (0.3390) loss 4.3720 (4.4345) grad_norm 0.0000 (0.0000) [2022-10-11 04:28:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [26/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3202 (0.3379) loss 4.4127 (4.4343) grad_norm 0.0000 (0.0000) [2022-10-11 04:29:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [26/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3214 (0.3373) loss 4.2467 (4.4352) grad_norm 0.0000 (0.0000) [2022-10-11 04:29:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [26/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3498 (0.3367) loss 4.3034 (4.4357) grad_norm 0.0000 (0.0000) [2022-10-11 04:30:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [26/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3510 (0.3362) loss 4.4568 (4.4356) grad_norm 0.0000 (0.0000) [2022-10-11 04:30:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [26/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3293 (0.3360) loss 4.4387 (4.4334) grad_norm 0.0000 (0.0000) [2022-10-11 04:31:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [26/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3222 (0.3356) loss 4.5130 (4.4332) 
grad_norm 0.0000 (0.0000) [2022-10-11 04:31:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [26/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3236 (0.3354) loss 4.4951 (4.4307) grad_norm 0.0000 (0.0000) [2022-10-11 04:32:06 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 26 training takes 0:06:59 [2022-10-11 04:32:09 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.528 (3.528) Loss 1.4983 (1.4983) Acc@1 64.453 (64.453) Acc@5 86.133 (86.133) [2022-10-11 04:32:21 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 63.912 Acc@5 86.160 [2022-10-11 04:32:21 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 63.9% [2022-10-11 04:32:21 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 63.91% [2022-10-11 04:32:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [27/300][0/1251] eta 1:15:46 lr 0.000001 time 3.6342 (3.6342) loss 4.3012 (4.3012) grad_norm 0.0000 (0.0000) [2022-10-11 04:32:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [27/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3058 (0.3671) loss 4.4822 (4.4204) grad_norm 0.0000 (0.0000) [2022-10-11 04:33:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [27/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3367 (0.3501) loss 4.3357 (4.4029) grad_norm 0.0000 (0.0000) [2022-10-11 04:34:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [27/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3213 (0.3444) loss 4.5613 (4.4048) grad_norm 0.0000 (0.0000) [2022-10-11 04:34:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [27/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3276 (0.3410) loss 4.2385 (4.3971) grad_norm 0.0000 (0.0000) [2022-10-11 04:35:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [27/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3233 (0.3394) loss 4.2605 (4.3983) grad_norm 0.0000 (0.0000) [2022-10-11 04:35:44 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [27/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3324 (0.3379) loss 4.6263 (4.4014) grad_norm 0.0000 (0.0000) [2022-10-11 04:36:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [27/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3196 (0.3370) loss 4.4720 (4.4015) grad_norm 0.0000 (0.0000) [2022-10-11 04:36:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [27/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3564 (0.3364) loss 4.2753 (4.4036) grad_norm 0.0000 (0.0000) [2022-10-11 04:37:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [27/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3163 (0.3359) loss 4.3142 (4.4022) grad_norm 0.0000 (0.0000) [2022-10-11 04:37:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [27/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3397 (0.3356) loss 4.3868 (4.4049) grad_norm 0.0000 (0.0000) [2022-10-11 04:38:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [27/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3361 (0.3354) loss 4.2371 (4.4057) grad_norm 0.0000 (0.0000) [2022-10-11 04:39:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [27/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3276 (0.3351) loss 4.2607 (4.4038) grad_norm 0.0000 (0.0000) [2022-10-11 04:39:20 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 27 training takes 0:06:58 [2022-10-11 04:39:23 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.171 (3.171) Loss 1.6831 (1.6831) Acc@1 62.305 (62.305) Acc@5 84.863 (84.863) [2022-10-11 04:39:35 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 64.780 Acc@5 86.676 [2022-10-11 04:39:35 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 64.8% [2022-10-11 04:39:35 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 64.78% [2022-10-11 04:39:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [28/300][0/1251] eta 1:06:44 lr 0.000001 time 3.2013 (3.2013) loss 
4.4221 (4.4221) grad_norm 0.0000 (0.0000) [2022-10-11 04:40:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [28/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3526 (0.3677) loss 4.5417 (4.3834) grad_norm 0.0000 (0.0000) [2022-10-11 04:40:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [28/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3206 (0.3488) loss 4.4742 (4.3981) grad_norm 0.0000 (0.0000) [2022-10-11 04:41:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [28/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3119 (0.3431) loss 4.4848 (4.3921) grad_norm 0.0000 (0.0000) [2022-10-11 04:41:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [28/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3656 (0.3403) loss 4.5024 (4.3911) grad_norm 0.0000 (0.0000) [2022-10-11 04:42:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [28/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3239 (0.3388) loss 4.4161 (4.3881) grad_norm 0.0000 (0.0000) [2022-10-11 04:42:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [28/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3378 (0.3375) loss 4.0183 (4.3861) grad_norm 0.0000 (0.0000) [2022-10-11 04:43:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [28/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3076 (0.3368) loss 4.0587 (4.3836) grad_norm 0.0000 (0.0000) [2022-10-11 04:44:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [28/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3398 (0.3362) loss 4.4834 (4.3821) grad_norm 0.0000 (0.0000) [2022-10-11 04:44:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [28/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3463 (0.3358) loss 4.3808 (4.3793) grad_norm 0.0000 (0.0000) [2022-10-11 04:45:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [28/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3148 (0.3355) loss 4.4139 (4.3783) grad_norm 0.0000 (0.0000) [2022-10-11 04:45:44 swin_tiny_patch4_window7_224] (main.py 163): 
INFO Train: [28/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3302 (0.3352) loss 4.6412 (4.3794) grad_norm 0.0000 (0.0000)
[2022-10-11 04:46:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [28/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3655 (0.3351) loss 4.3488 (4.3809) grad_norm 0.0000 (0.0000)
[2022-10-11 04:46:34 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 28 training takes 0:06:58
[2022-10-11 04:46:37 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.650 (3.650) Loss 1.5497 (1.5497) Acc@1 63.477 (63.477) Acc@5 87.500 (87.500)
[2022-10-11 04:46:49 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 65.224 Acc@5 86.998
[2022-10-11 04:46:49 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 65.2%
[2022-10-11 04:46:49 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 65.22%
[2022-10-11 04:46:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [29/300][0/1251] eta 1:09:40 lr 0.000001 time 3.3418 (3.3418) loss 4.4106 (4.4106) grad_norm 0.0000 (0.0000)
[2022-10-11 04:47:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [29/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3427 (0.3658) loss 4.5858 (4.3377) grad_norm 0.0000 (0.0000)
[2022-10-11 04:47:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [29/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3146 (0.3484) loss 4.0666 (4.3300) grad_norm 0.0000 (0.0000)
[2022-10-11 04:48:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [29/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3423 (0.3424) loss 4.4329 (4.3434) grad_norm 0.0000 (0.0000)
[2022-10-11 04:49:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [29/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3192 (0.3391) loss 4.4881 (4.3480) grad_norm 0.0000 (0.0000)
[2022-10-11 04:49:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [29/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3410 (0.3372) loss 4.2008 (4.3504) grad_norm 0.0000 (0.0000)
[2022-10-11 04:50:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [29/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3212 (0.3363) loss 4.1775 (4.3498) grad_norm 0.0000 (0.0000)
[2022-10-11 04:50:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [29/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3160 (0.3354) loss 4.5253 (4.3515) grad_norm 0.0000 (0.0000)
[2022-10-11 04:51:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [29/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3153 (0.3347) loss 4.3988 (4.3518) grad_norm 0.0000 (0.0000)
[2022-10-11 04:51:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [29/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3189 (0.3344) loss 4.4717 (4.3552) grad_norm 0.0000 (0.0000)
[2022-10-11 04:52:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [29/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3206 (0.3343) loss 4.4946 (4.3523) grad_norm 0.0000 (0.0000)
[2022-10-11 04:52:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [29/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3388 (0.3342) loss 4.6865 (4.3513) grad_norm 0.0000 (0.0000)
[2022-10-11 04:53:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [29/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3430 (0.3340) loss 4.3336 (4.3536) grad_norm 0.0000 (0.0000)
[2022-10-11 04:53:47 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 29 training takes 0:06:57
[2022-10-11 04:53:50 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.043 (3.043) Loss 1.5648 (1.5648) Acc@1 65.430 (65.430) Acc@5 86.914 (86.914)
[2022-10-11 04:54:02 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 65.490 Acc@5 87.252
[2022-10-11 04:54:02 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 65.5%
[2022-10-11 04:54:02 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 65.49%
[2022-10-11 04:54:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [30/300][0/1251] eta 1:12:12 lr 0.000001 time 3.4636 (3.4636) loss 4.2071 (4.2071) grad_norm 0.0000 (0.0000)
[2022-10-11 04:54:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [30/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3396 (0.3677) loss 4.1374 (4.3181) grad_norm 0.0000 (0.0000)
[2022-10-11 04:55:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [30/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3316 (0.3496) loss 4.0722 (4.3160) grad_norm 0.0000 (0.0000)
[2022-10-11 04:55:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [30/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3124 (0.3437) loss 4.2955 (4.3243) grad_norm 0.0000 (0.0000)
[2022-10-11 04:56:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [30/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3480 (0.3402) loss 4.3323 (4.3160) grad_norm 0.0000 (0.0000)
[2022-10-11 04:56:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [30/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3214 (0.3385) loss 4.4480 (4.3158) grad_norm 0.0000 (0.0000)
[2022-10-11 04:57:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [30/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3161 (0.3373) loss 4.3054 (4.3182) grad_norm 0.0000 (0.0000)
[2022-10-11 04:57:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [30/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3287 (0.3366) loss 4.1191 (4.3246) grad_norm 0.0000 (0.0000)
[2022-10-11 04:58:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [30/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3517 (0.3360) loss 4.3212 (4.3270) grad_norm 0.0000 (0.0000)
[2022-10-11 04:59:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [30/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3306 (0.3354) loss 4.3144 (4.3282) grad_norm 0.0000 (0.0000)
[2022-10-11 04:59:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [30/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3261 (0.3349) loss 3.9196 (4.3307) grad_norm 0.0000 (0.0000)
[2022-10-11 05:00:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [30/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3235 (0.3347) loss 4.6646 (4.3281) grad_norm 0.0000 (0.0000)
[2022-10-11 05:00:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [30/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3620 (0.3346) loss 4.3980 (4.3276) grad_norm 0.0000 (0.0000)
[2022-10-11 05:01:00 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 30 training takes 0:06:58
[2022-10-11 05:01:00 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_30 saving......
[2022-10-11 05:01:01 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_30 saved !!!
[2022-10-11 05:01:04 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.308 (3.308) Loss 1.4731 (1.4731) Acc@1 65.332 (65.332) Acc@5 87.988 (87.988)
[2022-10-11 05:01:16 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 66.012 Acc@5 87.382
[2022-10-11 05:01:16 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 66.0%
[2022-10-11 05:01:16 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 66.01%
[2022-10-11 05:01:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [31/300][0/1251] eta 1:11:50 lr 0.000001 time 3.4456 (3.4456) loss 3.9615 (3.9615) grad_norm 0.0000 (0.0000)
[2022-10-11 05:01:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [31/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3325 (0.3649) loss 4.5483 (4.2937) grad_norm 0.0000 (0.0000)
[2022-10-11 05:02:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [31/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3424 (0.3481) loss 4.4755 (4.3076) grad_norm 0.0000 (0.0000)
[2022-10-11 05:02:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [31/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3618 (0.3425) loss 4.1966 (4.3113) grad_norm 0.0000 (0.0000)
[2022-10-11 05:03:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [31/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3299 (0.3392) loss 4.6342 (4.3200) grad_norm 0.0000 (0.0000)
[2022-10-11 05:04:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [31/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3031 (0.3374) loss 4.2427 (4.3222) grad_norm 0.0000 (0.0000)
[2022-10-11 05:04:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [31/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3119 (0.3361) loss 4.3333 (4.3243) grad_norm 0.0000 (0.0000)
[2022-10-11 05:05:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [31/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3358 (0.3354) loss 4.6102 (4.3270) grad_norm 0.0000 (0.0000)
[2022-10-11 05:05:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [31/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3292 (0.3347) loss 4.4064 (4.3274) grad_norm 0.0000 (0.0000)
[2022-10-11 05:06:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [31/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3364 (0.3341) loss 4.0883 (4.3274) grad_norm 0.0000 (0.0000)
[2022-10-11 05:06:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [31/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3379 (0.3337) loss 4.2287 (4.3237) grad_norm 0.0000 (0.0000)
[2022-10-11 05:07:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [31/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3446 (0.3333) loss 4.5102 (4.3218) grad_norm 0.0000 (0.0000)
[2022-10-11 05:07:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [31/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3215 (0.3330) loss 4.4570 (4.3222) grad_norm 0.0000 (0.0000)
[2022-10-11 05:08:12 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 31 training takes 0:06:56
[2022-10-11 05:08:15 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.275 (3.275) Loss 1.4645 (1.4645) Acc@1 65.234 (65.234) Acc@5 87.598 (87.598)
[2022-10-11 05:08:27 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 66.240 Acc@5 87.680
[2022-10-11 05:08:27 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 66.2%
[2022-10-11 05:08:27 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 66.24%
[2022-10-11 05:08:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [32/300][0/1251] eta 1:10:49 lr 0.000001 time 3.3967 (3.3967) loss 4.2201 (4.2201) grad_norm 0.0000 (0.0000)
[2022-10-11 05:09:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [32/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3184 (0.3669) loss 4.0189 (4.2742) grad_norm 0.0000 (0.0000)
[2022-10-11 05:09:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [32/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3150 (0.3493) loss 4.1413 (4.2939) grad_norm 0.0000 (0.0000)
[2022-10-11 05:10:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [32/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3349 (0.3430) loss 4.1591 (4.2910) grad_norm 0.0000 (0.0000)
[2022-10-11 05:10:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [32/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3196 (0.3398) loss 3.9090 (4.2971) grad_norm 0.0000 (0.0000)
[2022-10-11 05:11:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [32/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3299 (0.3379) loss 4.1115 (4.3059) grad_norm 0.0000 (0.0000)
[2022-10-11 05:11:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [32/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3452 (0.3369) loss 4.2631 (4.3067) grad_norm 0.0000 (0.0000)
[2022-10-11 05:12:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [32/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3430 (0.3363) loss 4.4286 (4.3091) grad_norm 0.0000 (0.0000)
[2022-10-11 05:12:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [32/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3346 (0.3358) loss 4.3869 (4.3035) grad_norm 0.0000 (0.0000)
[2022-10-11 05:13:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [32/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3163 (0.3354) loss 4.3094 (4.3029) grad_norm 0.0000 (0.0000)
[2022-10-11 05:14:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [32/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3379 (0.3349) loss 4.3157 (4.3017) grad_norm 0.0000 (0.0000)
[2022-10-11 05:14:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [32/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3406 (0.3347) loss 4.4992 (4.3007) grad_norm 0.0000 (0.0000)
[2022-10-11 05:15:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [32/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3327 (0.3345) loss 4.2477 (4.3015) grad_norm 0.0000 (0.0000)
[2022-10-11 05:15:26 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 32 training takes 0:06:58
[2022-10-11 05:15:29 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.438 (3.438) Loss 1.4103 (1.4103) Acc@1 66.406 (66.406) Acc@5 88.281 (88.281)
[2022-10-11 05:15:41 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 66.710 Acc@5 87.848
[2022-10-11 05:15:41 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 66.7%
[2022-10-11 05:15:41 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 66.71%
[2022-10-11 05:15:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [33/300][0/1251] eta 1:10:11 lr 0.000001 time 3.3662 (3.3662) loss 4.4140 (4.4140) grad_norm 0.0000 (0.0000)
[2022-10-11 05:16:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [33/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3290 (0.3684) loss 4.1979 (4.2563) grad_norm 0.0000 (0.0000)
[2022-10-11 05:16:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [33/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3144 (0.3499) loss 4.3606 (4.2852) grad_norm 0.0000 (0.0000)
[2022-10-11 05:17:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [33/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3287 (0.3435) loss 4.4484 (4.2790) grad_norm 0.0000 (0.0000)
[2022-10-11 05:17:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [33/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3114 (0.3403) loss 4.2010 (4.2718) grad_norm 0.0000 (0.0000)
[2022-10-11 05:18:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [33/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3093 (0.3385) loss 4.4888 (4.2713) grad_norm 0.0000 (0.0000)
[2022-10-11 05:19:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [33/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3323 (0.3371) loss 4.2049 (4.2721) grad_norm 0.0000 (0.0000)
[2022-10-11 05:19:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [33/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3324 (0.3360) loss 4.3387 (4.2786) grad_norm 0.0000 (0.0000)
[2022-10-11 05:20:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [33/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3310 (0.3355) loss 4.3340 (4.2760) grad_norm 0.0000 (0.0000)
[2022-10-11 05:20:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [33/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3471 (0.3350) loss 4.3369 (4.2773) grad_norm 0.0000 (0.0000)
[2022-10-11 05:21:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [33/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3523 (0.3347) loss 4.2768 (4.2772) grad_norm 0.0000 (0.0000)
[2022-10-11 05:21:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [33/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3452 (0.3346) loss 3.9116 (4.2785) grad_norm 0.0000 (0.0000)
[2022-10-11 05:22:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [33/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3384 (0.3344) loss 4.3883 (4.2794) grad_norm 0.0000 (0.0000)
[2022-10-11 05:22:39 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 33 training takes 0:06:57
[2022-10-11 05:22:42 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.316 (3.316) Loss 1.3753 (1.3753) Acc@1 67.773 (67.773) Acc@5 88.867 (88.867)
[2022-10-11 05:22:54 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 67.078 Acc@5 88.344
[2022-10-11 05:22:54 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 67.1%
[2022-10-11 05:22:54 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 67.08%
[2022-10-11 05:22:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [34/300][0/1251] eta 1:08:27 lr 0.000001 time 3.2835 (3.2835) loss 4.2756 (4.2756) grad_norm 0.0000 (0.0000)
[2022-10-11 05:23:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [34/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3311 (0.3687) loss 4.1203 (4.2211) grad_norm 0.0000 (0.0000)
[2022-10-11 05:24:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [34/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3227 (0.3501) loss 4.2035 (4.2499) grad_norm 0.0000 (0.0000)
[2022-10-11 05:24:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [34/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3124 (0.3438) loss 4.1520 (4.2577) grad_norm 0.0000 (0.0000)
[2022-10-11 05:25:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [34/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3237 (0.3410) loss 4.3231 (4.2512) grad_norm 0.0000 (0.0000)
[2022-10-11 05:25:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [34/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3676 (0.3394) loss 4.4298 (4.2527) grad_norm 0.0000 (0.0000)
[2022-10-11 05:26:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [34/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3367 (0.3381) loss 4.2201 (4.2548) grad_norm 0.0000 (0.0000)
[2022-10-11 05:26:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [34/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3185 (0.3372) loss 4.5089 (4.2582) grad_norm 0.0000 (0.0000)
[2022-10-11 05:27:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [34/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3410 (0.3366) loss 4.2684 (4.2631) grad_norm 0.0000 (0.0000)
[2022-10-11 05:27:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [34/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3251 (0.3359) loss 4.3928 (4.2631) grad_norm 0.0000 (0.0000)
[2022-10-11 05:28:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [34/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3168 (0.3355) loss 4.1042 (4.2642) grad_norm 0.0000 (0.0000)
[2022-10-11 05:29:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [34/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3327 (0.3354) loss 3.9674 (4.2651) grad_norm 0.0000 (0.0000)
[2022-10-11 05:29:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [34/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3451 (0.3352) loss 4.6652 (4.2665) grad_norm 0.0000 (0.0000)
[2022-10-11 05:29:54 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 34 training takes 0:06:59
[2022-10-11 05:29:57 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.431 (3.431) Loss 1.4202 (1.4202) Acc@1 65.820 (65.820) Acc@5 88.672 (88.672)
[2022-10-11 05:30:09 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 67.236 Acc@5 88.278
[2022-10-11 05:30:09 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 67.2%
[2022-10-11 05:30:09 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 67.24%
[2022-10-11 05:30:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [35/300][0/1251] eta 1:05:32 lr 0.000001 time 3.1438 (3.1438) loss 4.2320 (4.2320) grad_norm 0.0000 (0.0000)
[2022-10-11 05:30:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [35/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3811 (0.3687) loss 4.0488 (4.2795) grad_norm 0.0000 (0.0000)
[2022-10-11 05:31:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [35/300][200/1251] eta 0:06:09 lr 0.000001 time 0.3136 (0.3512) loss 3.9351 (4.2746) grad_norm 0.0000 (0.0000)
[2022-10-11 05:31:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [35/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3424 (0.3446) loss 4.2402 (4.2774) grad_norm 0.0000 (0.0000)
[2022-10-11 05:32:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [35/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3324 (0.3406) loss 4.5757 (4.2803) grad_norm 0.0000 (0.0000)
[2022-10-11 05:32:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [35/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3232 (0.3389) loss 3.8986 (4.2748) grad_norm 0.0000 (0.0000)
[2022-10-11 05:33:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [35/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3179 (0.3377) loss 4.5045 (4.2775) grad_norm 0.0000 (0.0000)
[2022-10-11 05:34:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [35/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3325 (0.3367) loss 4.3663 (4.2697) grad_norm 0.0000 (0.0000)
[2022-10-11 05:34:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [35/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3496 (0.3360) loss 3.9427 (4.2667) grad_norm 0.0000 (0.0000)
[2022-10-11 05:35:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [35/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3376 (0.3355) loss 3.9810 (4.2643) grad_norm 0.0000 (0.0000)
[2022-10-11 05:35:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [35/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3348 (0.3350) loss 4.2981 (4.2636) grad_norm 0.0000 (0.0000)
[2022-10-11 05:36:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [35/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3259 (0.3347) loss 4.4234 (4.2615) grad_norm 0.0000 (0.0000)
[2022-10-11 05:36:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [35/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3214 (0.3345) loss 4.0522 (4.2587) grad_norm 0.0000 (0.0000)
[2022-10-11 05:37:07 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 35 training takes 0:06:58
[2022-10-11 05:37:10 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.203 (3.203) Loss 1.3972 (1.3972) Acc@1 66.406 (66.406) Acc@5 89.355 (89.355)
[2022-10-11 05:37:22 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 67.514 Acc@5 88.334
[2022-10-11 05:37:22 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 67.5%
[2022-10-11 05:37:22 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 67.51%
[2022-10-11 05:37:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [36/300][0/1251] eta 1:11:18 lr 0.000001 time 3.4199 (3.4199) loss 4.3934 (4.3934) grad_norm 0.0000 (0.0000)
[2022-10-11 05:37:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [36/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3178 (0.3669) loss 4.1715 (4.2417) grad_norm 0.0000 (0.0000)
[2022-10-11 05:38:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [36/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3577 (0.3493) loss 4.4932 (4.2340) grad_norm 0.0000 (0.0000)
[2022-10-11 05:39:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [36/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3309 (0.3437) loss 4.1569 (4.2379) grad_norm 0.0000 (0.0000)
[2022-10-11 05:39:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [36/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3464 (0.3402) loss 4.4085 (4.2370) grad_norm 0.0000 (0.0000)
[2022-10-11 05:40:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [36/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3360 (0.3382) loss 4.3448 (4.2407) grad_norm 0.0000 (0.0000)
[2022-10-11 05:40:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [36/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3568 (0.3373) loss 4.1786 (4.2391) grad_norm 0.0000 (0.0000)
[2022-10-11 05:41:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [36/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3339 (0.3363) loss 4.0695 (4.2395) grad_norm 0.0000 (0.0000)
[2022-10-11 05:41:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [36/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3221 (0.3355) loss 4.1212 (4.2405) grad_norm 0.0000 (0.0000)
[2022-10-11 05:42:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [36/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3365 (0.3351) loss 4.1530 (4.2394) grad_norm 0.0000 (0.0000)
[2022-10-11 05:42:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [36/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3163 (0.3347) loss 3.9481 (4.2397) grad_norm 0.0000 (0.0000)
[2022-10-11 05:43:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [36/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3101 (0.3343) loss 3.9955 (4.2409) grad_norm 0.0000 (0.0000)
[2022-10-11 05:44:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [36/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3366 (0.3342) loss 4.6295 (4.2405) grad_norm 0.0000 (0.0000)
[2022-10-11 05:44:20 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 36 training takes 0:06:57
[2022-10-11 05:44:23 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 2.958 (2.958) Loss 1.4209 (1.4209) Acc@1 68.066 (68.066) Acc@5 86.719 (86.719)
[2022-10-11 05:44:36 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 67.660 Acc@5 88.576
[2022-10-11 05:44:36 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 67.7%
[2022-10-11 05:44:36 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 67.66%
[2022-10-11 05:44:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [37/300][0/1251] eta 1:10:02 lr 0.000001 time 3.3592 (3.3592) loss 4.5260 (4.5260) grad_norm 0.0000 (0.0000)
[2022-10-11 05:45:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [37/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3290 (0.3672) loss 4.5660 (4.2059) grad_norm 0.0000 (0.0000)
[2022-10-11 05:45:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [37/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3365 (0.3498) loss 4.2685 (4.2090) grad_norm 0.0000 (0.0000)
[2022-10-11 05:46:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [37/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3214 (0.3433) loss 3.9757 (4.2101) grad_norm 0.0000 (0.0000)
[2022-10-11 05:46:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [37/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3521 (0.3402) loss 4.3012 (4.2127) grad_norm 0.0000 (0.0000)
[2022-10-11 05:47:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [37/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3149 (0.3383) loss 4.0890 (4.2275) grad_norm 0.0000 (0.0000)
[2022-10-11 05:47:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [37/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3052 (0.3370) loss 4.3511 (4.2281) grad_norm 0.0000 (0.0000)
[2022-10-11 05:48:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [37/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3328 (0.3360) loss 4.4253 (4.2247) grad_norm 0.0000 (0.0000)
[2022-10-11 05:49:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [37/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3435 (0.3354) loss 4.3145 (4.2270) grad_norm 0.0000 (0.0000)
[2022-10-11 05:49:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [37/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3040 (0.3348) loss 4.4987 (4.2275) grad_norm 0.0000 (0.0000)
[2022-10-11 05:50:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [37/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3449 (0.3345) loss 4.2895 (4.2275) grad_norm 0.0000 (0.0000)
[2022-10-11 05:50:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [37/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3613 (0.3341) loss 4.1823 (4.2272) grad_norm 0.0000 (0.0000)
[2022-10-11 05:51:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [37/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3334 (0.3339) loss 4.1372 (4.2284) grad_norm 0.0000 (0.0000)
[2022-10-11 05:51:33 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 37 training takes 0:06:57
[2022-10-11 05:51:36 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.376 (3.376) Loss 1.2580 (1.2580) Acc@1 69.824 (69.824) Acc@5 90.234 (90.234)
[2022-10-11 05:51:48 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 68.024 Acc@5 88.736
[2022-10-11 05:51:48 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 68.0%
[2022-10-11 05:51:48 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 68.02%
[2022-10-11 05:51:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [38/300][0/1251] eta 1:12:26 lr 0.000001 time 3.4744 (3.4744) loss 4.1670 (4.1670) grad_norm 0.0000 (0.0000)
[2022-10-11 05:52:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [38/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3144 (0.3673) loss 4.2240 (4.2397) grad_norm 0.0000 (0.0000)
[2022-10-11 05:52:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [38/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3383 (0.3495) loss 4.3082 (4.2381) grad_norm 0.0000 (0.0000)
[2022-10-11 05:53:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [38/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3248 (0.3439) loss 3.9557 (4.2337) grad_norm 0.0000 (0.0000)
[2022-10-11 05:54:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [38/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3389 (0.3409) loss 4.0111 (4.2287) grad_norm 0.0000 (0.0000)
[2022-10-11 05:54:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [38/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3167 (0.3391) loss 3.9583 (4.2262) grad_norm 0.0000 (0.0000)
[2022-10-11 05:55:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [38/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3313 (0.3380) loss 4.5885 (4.2252) grad_norm 0.0000 (0.0000)
[2022-10-11 05:55:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [38/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3206 (0.3372) loss 4.2112 (4.2233) grad_norm 0.0000 (0.0000)
[2022-10-11 05:56:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [38/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3113 (0.3364) loss 4.1106 (4.2203) grad_norm 0.0000 (0.0000)
[2022-10-11 05:56:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [38/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3407 (0.3358) loss 4.3664 (4.2194) grad_norm 0.0000 (0.0000)
[2022-10-11 05:57:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [38/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3084 (0.3354) loss 4.3407 (4.2205) grad_norm 0.0000 (0.0000)
[2022-10-11 05:57:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [38/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3548 (0.3354) loss 4.0296 (4.2203) grad_norm 0.0000 (0.0000)
[2022-10-11 05:58:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [38/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3054 (0.3352) loss 4.0557 (4.2182) grad_norm 0.0000 (0.0000)
[2022-10-11 05:58:47 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 38 training takes 0:06:58
[2022-10-11 05:58:50 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.096 (3.096) Loss 1.3636 (1.3636) Acc@1 69.141 (69.141) Acc@5 89.258 (89.258)
[2022-10-11 05:59:03 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 68.390 Acc@5 89.014
[2022-10-11 05:59:03 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 68.4%
[2022-10-11 05:59:03 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 68.39%
[2022-10-11 05:59:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [39/300][0/1251] eta 1:11:30 lr 0.000001 time 3.4294 (3.4294) loss 4.0461 (4.0461) grad_norm 0.0000 (0.0000)
[2022-10-11 05:59:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [39/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3185 (0.3662) loss 4.2462 (4.1871) grad_norm 0.0000 (0.0000)
[2022-10-11 06:00:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [39/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3188 (0.3486) loss 4.4956 (4.1971) grad_norm 0.0000 (0.0000)
[2022-10-11 06:00:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [39/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3176 (0.3431) loss 4.6073 (4.1939) grad_norm 0.0000 (0.0000)
[2022-10-11 06:01:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [39/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3369 (0.3401) loss 4.4432 (4.1969) grad_norm 0.0000 (0.0000)
[2022-10-11 06:01:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [39/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3155 (0.3382) loss 4.2874 (4.1999) grad_norm 0.0000 (0.0000)
[2022-10-11 06:02:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [39/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3273 (0.3367) loss 4.1836 (4.2003) grad_norm 0.0000 (0.0000)
[2022-10-11 06:02:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [39/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3371 (0.3360) loss 4.0015 (4.1958) grad_norm 0.0000 (0.0000)
[2022-10-11 06:03:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [39/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3225 (0.3351) loss 4.0467 (4.1949) grad_norm 0.0000 (0.0000)
[2022-10-11 06:04:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [39/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3280 (0.3345) loss 4.4405 (4.1945) grad_norm 0.0000 (0.0000)
[2022-10-11 06:04:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [39/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3354 (0.3340) loss 4.1296 (4.1953) grad_norm 0.0000 (0.0000)
[2022-10-11 06:05:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [39/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3544 (0.3337) loss 4.2634 (4.1967) grad_norm 0.0000 (0.0000)
[2022-10-11 06:05:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [39/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3295 (0.3335) loss 4.0387 (4.2000) grad_norm 0.0000 (0.0000)
[2022-10-11 06:06:00 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 39 training takes 0:06:56
[2022-10-11 06:06:03 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.332 (3.332) Loss 1.2775 (1.2775) Acc@1 71.094 (71.094) Acc@5 90.918 (90.918)
[2022-10-11 06:06:15 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 68.438 Acc@5 88.760
[2022-10-11 06:06:15 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 68.4%
[2022-10-11 06:06:15 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 68.44%
[2022-10-11 06:06:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [40/300][0/1251] eta 1:07:09 lr 0.000001 time 3.2211 (3.2211) loss 4.0333 (4.0333) grad_norm 0.0000 (0.0000)
[2022-10-11 06:06:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [40/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3048 (0.3683) loss 4.1551 (4.1625) grad_norm 0.0000 (0.0000)
[2022-10-11 06:07:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [40/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3230 (0.3502) loss 4.3225 (4.1591) grad_norm 0.0000 (0.0000)
[2022-10-11 06:07:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [40/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3407 (0.3441) loss 4.4174 (4.1684) grad_norm 0.0000 (0.0000)
[2022-10-11 06:08:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [40/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3507 (0.3410) loss 4.4180 (4.1735) grad_norm 0.0000 (0.0000)
[2022-10-11 06:09:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [40/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3338 (0.3389) loss 4.4159 (4.1815) grad_norm 0.0000 (0.0000)
[2022-10-11 06:09:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [40/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3332 (0.3378) loss 3.7897 (4.1821) grad_norm 0.0000 (0.0000)
[2022-10-11 06:10:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [40/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3351 (0.3368) loss 4.1125 (4.1895) grad_norm 0.0000 (0.0000)
[2022-10-11 06:10:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [40/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3223 (0.3361) loss 3.9393 (4.1895) grad_norm 0.0000 (0.0000)
[2022-10-11 06:11:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [40/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3017 (0.3357) loss 4.2353 (4.1907) grad_norm 0.0000 (0.0000)
[2022-10-11 06:11:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [40/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3377 (0.3352) loss 4.0780 (4.1883) grad_norm 0.0000 (0.0000)
[2022-10-11 06:12:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [40/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3468 (0.3349) loss 4.2423 (4.1889) grad_norm 0.0000 (0.0000)
[2022-10-11 06:12:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [40/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3466 (0.3345) loss 4.1769 (4.1882) grad_norm 0.0000 (0.0000)
[2022-10-11 06:13:13 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 40 training takes 0:06:58
[2022-10-11 06:13:13 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_40 saving......
[2022-10-11 06:13:13 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_40 saved !!!
[2022-10-11 06:13:16 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 2.866 (2.866) Loss 1.3804 (1.3804) Acc@1 66.797 (66.797) Acc@5 88.867 (88.867) [2022-10-11 06:13:28 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 68.642 Acc@5 88.920 [2022-10-11 06:13:28 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 68.6% [2022-10-11 06:13:28 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 68.64% [2022-10-11 06:13:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [41/300][0/1251] eta 1:14:28 lr 0.000001 time 3.5723 (3.5723) loss 3.9211 (3.9211) grad_norm 0.0000 (0.0000) [2022-10-11 06:14:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [41/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3269 (0.3681) loss 4.1289 (4.1956) grad_norm 0.0000 (0.0000) [2022-10-11 06:14:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [41/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3348 (0.3504) loss 3.8696 (4.1911) grad_norm 0.0000 (0.0000) [2022-10-11 06:15:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [41/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3262 (0.3446) loss 4.3407 (4.1936) grad_norm 0.0000 (0.0000) [2022-10-11 06:15:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [41/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3145 (0.3411) loss 4.1685 (4.1898) grad_norm 0.0000 (0.0000) [2022-10-11 06:16:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [41/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3216 (0.3390) loss 4.3068 (4.1861) grad_norm 0.0000 (0.0000) [2022-10-11 06:16:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [41/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3285 (0.3376) loss 4.2969 (4.1847) grad_norm 0.0000 (0.0000) [2022-10-11 06:17:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [41/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3349 (0.3366) loss 4.4304 (4.1843) grad_norm 0.0000 (0.0000) 
[2022-10-11 06:17:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [41/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3434 (0.3359) loss 4.1766 (4.1815) grad_norm 0.0000 (0.0000) [2022-10-11 06:18:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [41/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3235 (0.3356) loss 4.1863 (4.1852) grad_norm 0.0000 (0.0000) [2022-10-11 06:19:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [41/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3298 (0.3351) loss 4.0815 (4.1869) grad_norm 0.0000 (0.0000) [2022-10-11 06:19:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [41/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3272 (0.3346) loss 4.2676 (4.1877) grad_norm 0.0000 (0.0000) [2022-10-11 06:20:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [41/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3137 (0.3344) loss 3.9277 (4.1872) grad_norm 0.0000 (0.0000) [2022-10-11 06:20:26 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 41 training takes 0:06:57 [2022-10-11 06:20:30 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.294 (3.294) Loss 1.3303 (1.3303) Acc@1 68.555 (68.555) Acc@5 88.770 (88.770) [2022-10-11 06:20:42 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 68.684 Acc@5 89.186 [2022-10-11 06:20:42 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 68.7% [2022-10-11 06:20:42 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 68.68% [2022-10-11 06:20:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [42/300][0/1251] eta 1:09:42 lr 0.000001 time 3.3437 (3.3437) loss 4.4410 (4.4410) grad_norm 0.0000 (0.0000) [2022-10-11 06:21:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [42/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3265 (0.3663) loss 4.3635 (4.1573) grad_norm 0.0000 (0.0000) [2022-10-11 06:21:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[42/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3140 (0.3495) loss 3.8496 (4.1360) grad_norm 0.0000 (0.0000) [2022-10-11 06:22:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [42/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3113 (0.3430) loss 4.0748 (4.1483) grad_norm 0.0000 (0.0000) [2022-10-11 06:22:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [42/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3340 (0.3407) loss 3.9615 (4.1487) grad_norm 0.0000 (0.0000) [2022-10-11 06:23:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [42/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3151 (0.3389) loss 4.4736 (4.1508) grad_norm 0.0000 (0.0000) [2022-10-11 06:24:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [42/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3047 (0.3374) loss 3.9001 (4.1515) grad_norm 0.0000 (0.0000) [2022-10-11 06:24:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [42/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3429 (0.3365) loss 4.3031 (4.1554) grad_norm 0.0000 (0.0000) [2022-10-11 06:25:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [42/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3160 (0.3358) loss 4.0752 (4.1579) grad_norm 0.0000 (0.0000) [2022-10-11 06:25:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [42/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3652 (0.3351) loss 4.2257 (4.1633) grad_norm 0.0000 (0.0000) [2022-10-11 06:26:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [42/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3415 (0.3345) loss 4.0542 (4.1631) grad_norm 0.0000 (0.0000) [2022-10-11 06:26:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [42/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3222 (0.3341) loss 4.1603 (4.1631) grad_norm 0.0000 (0.0000) [2022-10-11 06:27:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [42/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3211 (0.3340) loss 4.2794 (4.1617) grad_norm 0.0000 
(0.0000) [2022-10-11 06:27:39 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 42 training takes 0:06:57 [2022-10-11 06:27:43 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.454 (3.454) Loss 1.2630 (1.2630) Acc@1 71.680 (71.680) Acc@5 89.746 (89.746) [2022-10-11 06:27:55 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 68.662 Acc@5 89.254 [2022-10-11 06:27:55 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 68.7% [2022-10-11 06:27:55 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 68.68% [2022-10-11 06:27:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [43/300][0/1251] eta 1:15:50 lr 0.000001 time 3.6373 (3.6373) loss 4.0883 (4.0883) grad_norm 0.0000 (0.0000) [2022-10-11 06:28:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [43/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3015 (0.3664) loss 4.0549 (4.1705) grad_norm 0.0000 (0.0000) [2022-10-11 06:29:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [43/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3265 (0.3497) loss 4.3050 (4.1641) grad_norm 0.0000 (0.0000) [2022-10-11 06:29:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [43/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3309 (0.3434) loss 3.9316 (4.1651) grad_norm 0.0000 (0.0000) [2022-10-11 06:30:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [43/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3429 (0.3405) loss 4.2882 (4.1594) grad_norm 0.0000 (0.0000) [2022-10-11 06:30:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [43/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3530 (0.3386) loss 4.3722 (4.1647) grad_norm 0.0000 (0.0000) [2022-10-11 06:31:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [43/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3205 (0.3373) loss 4.1366 (4.1650) grad_norm 0.0000 (0.0000) [2022-10-11 06:31:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[43/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3454 (0.3363) loss 4.1235 (4.1627) grad_norm 0.0000 (0.0000) [2022-10-11 06:32:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [43/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3128 (0.3358) loss 4.2450 (4.1603) grad_norm 0.0000 (0.0000) [2022-10-11 06:32:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [43/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3490 (0.3351) loss 4.1354 (4.1589) grad_norm 0.0000 (0.0000) [2022-10-11 06:33:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [43/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3342 (0.3346) loss 4.1713 (4.1571) grad_norm 0.0000 (0.0000) [2022-10-11 06:34:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [43/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3607 (0.3343) loss 4.2126 (4.1595) grad_norm 0.0000 (0.0000) [2022-10-11 06:34:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [43/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3409 (0.3341) loss 3.8347 (4.1575) grad_norm 0.0000 (0.0000) [2022-10-11 06:34:52 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 43 training takes 0:06:57 [2022-10-11 06:34:56 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.543 (3.543) Loss 1.2258 (1.2258) Acc@1 69.434 (69.434) Acc@5 92.285 (92.285) [2022-10-11 06:35:08 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 69.088 Acc@5 89.368 [2022-10-11 06:35:08 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 69.1% [2022-10-11 06:35:08 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 69.09% [2022-10-11 06:35:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [44/300][0/1251] eta 1:13:13 lr 0.000001 time 3.5122 (3.5122) loss 3.9164 (3.9164) grad_norm 0.0000 (0.0000) [2022-10-11 06:35:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [44/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3235 (0.3674) loss 4.1502 (4.1638) 
grad_norm 0.0000 (0.0000) [2022-10-11 06:36:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [44/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3231 (0.3502) loss 3.8913 (4.1358) grad_norm 0.0000 (0.0000) [2022-10-11 06:36:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [44/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3427 (0.3439) loss 4.0757 (4.1350) grad_norm 0.0000 (0.0000) [2022-10-11 06:37:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [44/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3076 (0.3406) loss 4.3785 (4.1500) grad_norm 0.0000 (0.0000) [2022-10-11 06:37:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [44/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3323 (0.3387) loss 4.2776 (4.1497) grad_norm 0.0000 (0.0000) [2022-10-11 06:38:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [44/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3119 (0.3372) loss 4.0990 (4.1497) grad_norm 0.0000 (0.0000) [2022-10-11 06:39:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [44/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3488 (0.3363) loss 3.8594 (4.1461) grad_norm 0.0000 (0.0000) [2022-10-11 06:39:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [44/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3553 (0.3355) loss 4.1009 (4.1418) grad_norm 0.0000 (0.0000) [2022-10-11 06:40:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [44/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3245 (0.3348) loss 4.1676 (4.1417) grad_norm 0.0000 (0.0000) [2022-10-11 06:40:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [44/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3361 (0.3345) loss 4.1674 (4.1449) grad_norm 0.0000 (0.0000) [2022-10-11 06:41:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [44/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3458 (0.3341) loss 4.2899 (4.1460) grad_norm 0.0000 (0.0000) [2022-10-11 06:41:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[44/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3171 (0.3339) loss 4.0840 (4.1443) grad_norm 0.0000 (0.0000) [2022-10-11 06:42:05 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 44 training takes 0:06:57 [2022-10-11 06:42:08 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.415 (3.415) Loss 1.4513 (1.4513) Acc@1 66.699 (66.699) Acc@5 86.621 (86.621) [2022-10-11 06:42:21 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 69.304 Acc@5 89.322 [2022-10-11 06:42:21 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 69.3% [2022-10-11 06:42:21 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 69.30% [2022-10-11 06:42:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [45/300][0/1251] eta 1:20:51 lr 0.000001 time 3.8784 (3.8784) loss 4.3593 (4.3593) grad_norm 0.0000 (0.0000) [2022-10-11 06:42:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [45/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3442 (0.3685) loss 4.3593 (4.1565) grad_norm 0.0000 (0.0000) [2022-10-11 06:43:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [45/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3241 (0.3496) loss 4.0692 (4.1488) grad_norm 0.0000 (0.0000) [2022-10-11 06:44:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [45/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3302 (0.3432) loss 4.2885 (4.1455) grad_norm 0.0000 (0.0000) [2022-10-11 06:44:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [45/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3550 (0.3399) loss 4.2522 (4.1410) grad_norm 0.0000 (0.0000) [2022-10-11 06:45:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [45/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3568 (0.3381) loss 4.3340 (4.1458) grad_norm 0.0000 (0.0000) [2022-10-11 06:45:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [45/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3077 (0.3365) loss 3.9515 (4.1392) 
grad_norm 0.0000 (0.0000) [2022-10-11 06:46:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [45/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3238 (0.3357) loss 4.3858 (4.1429) grad_norm 0.0000 (0.0000) [2022-10-11 06:46:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [45/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3033 (0.3351) loss 4.2962 (4.1436) grad_norm 0.0000 (0.0000) [2022-10-11 06:47:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [45/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3258 (0.3347) loss 3.8334 (4.1408) grad_norm 0.0000 (0.0000) [2022-10-11 06:47:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [45/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3233 (0.3343) loss 4.1566 (4.1395) grad_norm 0.0000 (0.0000) [2022-10-11 06:48:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [45/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3189 (0.3340) loss 4.2984 (4.1380) grad_norm 0.0000 (0.0000) [2022-10-11 06:49:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [45/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3347 (0.3336) loss 4.1551 (4.1340) grad_norm 0.0000 (0.0000) [2022-10-11 06:49:18 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 45 training takes 0:06:56 [2022-10-11 06:49:21 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.344 (3.344) Loss 1.3874 (1.3874) Acc@1 68.164 (68.164) Acc@5 88.574 (88.574) [2022-10-11 06:49:33 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 69.176 Acc@5 89.574 [2022-10-11 06:49:33 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 69.2% [2022-10-11 06:49:33 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 69.30% [2022-10-11 06:49:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [46/300][0/1251] eta 1:10:58 lr 0.000001 time 3.4040 (3.4040) loss 4.0207 (4.0207) grad_norm 0.0000 (0.0000) [2022-10-11 06:50:10 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [46/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3543 (0.3679) loss 4.3762 (4.1330) grad_norm 0.0000 (0.0000) [2022-10-11 06:50:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [46/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3592 (0.3499) loss 4.2212 (4.1326) grad_norm 0.0000 (0.0000) [2022-10-11 06:51:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [46/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3389 (0.3435) loss 4.1339 (4.1237) grad_norm 0.0000 (0.0000) [2022-10-11 06:51:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [46/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3175 (0.3405) loss 3.9868 (4.1258) grad_norm 0.0000 (0.0000) [2022-10-11 06:52:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [46/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3359 (0.3386) loss 4.1311 (4.1306) grad_norm 0.0000 (0.0000) [2022-10-11 06:52:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [46/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3485 (0.3372) loss 4.1433 (4.1288) grad_norm 0.0000 (0.0000) [2022-10-11 06:53:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [46/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3319 (0.3364) loss 4.2238 (4.1292) grad_norm 0.0000 (0.0000) [2022-10-11 06:54:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [46/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3188 (0.3357) loss 3.9170 (4.1272) grad_norm 0.0000 (0.0000) [2022-10-11 06:54:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [46/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3285 (0.3350) loss 4.3995 (4.1261) grad_norm 0.0000 (0.0000) [2022-10-11 06:55:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [46/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3284 (0.3347) loss 4.0908 (4.1255) grad_norm 0.0000 (0.0000) [2022-10-11 06:55:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [46/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3746 (0.3344) loss 4.1123 (4.1265) 
grad_norm 0.0000 (0.0000) [2022-10-11 06:56:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [46/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3217 (0.3340) loss 4.2169 (4.1298) grad_norm 0.0000 (0.0000) [2022-10-11 06:56:30 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 46 training takes 0:06:57 [2022-10-11 06:56:34 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.317 (3.317) Loss 1.3026 (1.3026) Acc@1 70.801 (70.801) Acc@5 89.355 (89.355) [2022-10-11 06:56:46 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 69.558 Acc@5 89.662 [2022-10-11 06:56:46 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 69.6% [2022-10-11 06:56:46 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 69.56% [2022-10-11 06:56:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [47/300][0/1251] eta 1:11:20 lr 0.000001 time 3.4216 (3.4216) loss 4.2246 (4.2246) grad_norm 0.0000 (0.0000) [2022-10-11 06:57:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [47/300][100/1251] eta 0:07:05 lr 0.000001 time 0.3477 (0.3695) loss 4.0482 (4.1161) grad_norm 0.0000 (0.0000) [2022-10-11 06:57:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [47/300][200/1251] eta 0:06:09 lr 0.000001 time 0.3239 (0.3520) loss 4.0518 (4.1197) grad_norm 0.0000 (0.0000) [2022-10-11 06:58:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [47/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3323 (0.3457) loss 4.1936 (4.1165) grad_norm 0.0000 (0.0000) [2022-10-11 06:59:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [47/300][400/1251] eta 0:04:51 lr 0.000001 time 0.3317 (0.3421) loss 4.1218 (4.1198) grad_norm 0.0000 (0.0000) [2022-10-11 06:59:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [47/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3288 (0.3400) loss 3.9878 (4.1205) grad_norm 0.0000 (0.0000) [2022-10-11 07:00:09 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [47/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3373 (0.3386) loss 3.9836 (4.1201) grad_norm 0.0000 (0.0000) [2022-10-11 07:00:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [47/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3343 (0.3375) loss 4.2829 (4.1184) grad_norm 0.0000 (0.0000) [2022-10-11 07:01:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [47/300][800/1251] eta 0:02:31 lr 0.000001 time 0.4098 (0.3367) loss 4.3037 (4.1173) grad_norm 0.0000 (0.0000) [2022-10-11 07:01:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [47/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3269 (0.3360) loss 3.8873 (4.1160) grad_norm 0.0000 (0.0000) [2022-10-11 07:02:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [47/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3049 (0.3354) loss 4.2349 (4.1158) grad_norm 0.0000 (0.0000) [2022-10-11 07:02:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [47/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3359 (0.3352) loss 4.2063 (4.1188) grad_norm 0.0000 (0.0000) [2022-10-11 07:03:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [47/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3329 (0.3348) loss 4.1330 (4.1198) grad_norm 0.0000 (0.0000) [2022-10-11 07:03:44 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 47 training takes 0:06:58 [2022-10-11 07:03:48 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.701 (3.701) Loss 1.2924 (1.2924) Acc@1 69.238 (69.238) Acc@5 89.941 (89.941) [2022-10-11 07:03:59 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 69.778 Acc@5 89.752 [2022-10-11 07:03:59 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 69.8% [2022-10-11 07:03:59 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 69.78% [2022-10-11 07:04:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [48/300][0/1251] eta 1:13:50 lr 0.000001 time 3.5416 (3.5416) loss 
3.7991 (3.7991) grad_norm 0.0000 (0.0000) [2022-10-11 07:04:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [48/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3205 (0.3680) loss 4.1301 (4.0894) grad_norm 0.0000 (0.0000) [2022-10-11 07:05:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [48/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3216 (0.3493) loss 4.2973 (4.0963) grad_norm 0.0000 (0.0000) [2022-10-11 07:05:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [48/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3432 (0.3440) loss 4.2701 (4.1027) grad_norm 0.0000 (0.0000) [2022-10-11 07:06:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [48/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3089 (0.3407) loss 4.1340 (4.1133) grad_norm 0.0000 (0.0000) [2022-10-11 07:06:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [48/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3506 (0.3391) loss 4.3139 (4.1076) grad_norm 0.0000 (0.0000) [2022-10-11 07:07:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [48/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3486 (0.3377) loss 4.0715 (4.1130) grad_norm 0.0000 (0.0000) [2022-10-11 07:07:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [48/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3518 (0.3366) loss 4.2815 (4.1126) grad_norm 0.0000 (0.0000) [2022-10-11 07:08:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [48/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3212 (0.3359) loss 4.1268 (4.1126) grad_norm 0.0000 (0.0000) [2022-10-11 07:09:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [48/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3082 (0.3353) loss 4.3116 (4.1129) grad_norm 0.0000 (0.0000) [2022-10-11 07:09:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [48/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3321 (0.3348) loss 4.0470 (4.1152) grad_norm 0.0000 (0.0000) [2022-10-11 07:10:08 swin_tiny_patch4_window7_224] (main.py 163): 
INFO Train: [48/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3282 (0.3344) loss 4.0323 (4.1151) grad_norm 0.0000 (0.0000) [2022-10-11 07:10:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [48/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3222 (0.3341) loss 4.1936 (4.1137) grad_norm 0.0000 (0.0000) [2022-10-11 07:10:57 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 48 training takes 0:06:57 [2022-10-11 07:11:00 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.378 (3.378) Loss 1.2935 (1.2935) Acc@1 67.090 (67.090) Acc@5 90.039 (90.039) [2022-10-11 07:11:12 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 69.918 Acc@5 89.708 [2022-10-11 07:11:12 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 69.9% [2022-10-11 07:11:12 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 69.92% [2022-10-11 07:11:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [49/300][0/1251] eta 1:16:31 lr 0.000001 time 3.6702 (3.6702) loss 3.8665 (3.8665) grad_norm 0.0000 (0.0000) [2022-10-11 07:11:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [49/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3203 (0.3674) loss 3.9881 (4.0781) grad_norm 0.0000 (0.0000) [2022-10-11 07:12:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [49/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3341 (0.3504) loss 4.2344 (4.0865) grad_norm 0.0000 (0.0000) [2022-10-11 07:12:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [49/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3336 (0.3440) loss 3.6160 (4.0918) grad_norm 0.0000 (0.0000) [2022-10-11 07:13:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [49/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3216 (0.3412) loss 4.1967 (4.0940) grad_norm 0.0000 (0.0000) [2022-10-11 07:14:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [49/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3117 (0.3394) loss 4.2451 
(4.1015) grad_norm 0.0000 (0.0000) [2022-10-11 07:14:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [49/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3200 (0.3378) loss 4.2720 (4.0990) grad_norm 0.0000 (0.0000) [2022-10-11 07:15:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [49/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3781 (0.3367) loss 3.9630 (4.1004) grad_norm 0.0000 (0.0000) [2022-10-11 07:15:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [49/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3368 (0.3358) loss 4.0718 (4.1008) grad_norm 0.0000 (0.0000) [2022-10-11 07:16:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [49/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3294 (0.3353) loss 3.7161 (4.1025) grad_norm 0.0000 (0.0000) [2022-10-11 07:16:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [49/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3354 (0.3350) loss 4.2641 (4.1044) grad_norm 0.0000 (0.0000) [2022-10-11 07:17:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [49/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3326 (0.3347) loss 3.8090 (4.1056) grad_norm 0.0000 (0.0000) [2022-10-11 07:17:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [49/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3423 (0.3345) loss 4.3925 (4.1053) grad_norm 0.0000 (0.0000) [2022-10-11 07:18:11 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 49 training takes 0:06:58 [2022-10-11 07:18:14 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.356 (3.356) Loss 1.3739 (1.3739) Acc@1 67.480 (67.480) Acc@5 89.062 (89.062) [2022-10-11 07:18:26 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 70.106 Acc@5 89.926 [2022-10-11 07:18:26 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 70.1% [2022-10-11 07:18:26 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 70.11% [2022-10-11 07:18:29 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [50/300][0/1251] eta 1:12:07 lr 0.000001 time 3.4592 (3.4592) loss 4.0561 (4.0561) grad_norm 0.0000 (0.0000) [2022-10-11 07:19:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [50/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3504 (0.3652) loss 4.2352 (4.0838) grad_norm 0.0000 (0.0000) [2022-10-11 07:19:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [50/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3394 (0.3484) loss 4.1209 (4.0826) grad_norm 0.0000 (0.0000) [2022-10-11 07:20:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [50/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3359 (0.3427) loss 3.5601 (4.0769) grad_norm 0.0000 (0.0000) [2022-10-11 07:20:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [50/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3085 (0.3395) loss 4.3033 (4.0823) grad_norm 0.0000 (0.0000) [2022-10-11 07:21:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [50/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3335 (0.3373) loss 4.0803 (4.0878) grad_norm 0.0000 (0.0000) [2022-10-11 07:21:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [50/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3313 (0.3361) loss 4.0787 (4.0954) grad_norm 0.0000 (0.0000) [2022-10-11 07:22:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [50/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3185 (0.3352) loss 4.2314 (4.0928) grad_norm 0.0000 (0.0000) [2022-10-11 07:22:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [50/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3005 (0.3349) loss 3.7800 (4.0959) grad_norm 0.0000 (0.0000) [2022-10-11 07:23:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [50/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3257 (0.3346) loss 4.0001 (4.0973) grad_norm 0.0000 (0.0000) [2022-10-11 07:24:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [50/300][1000/1251] eta 0:01:23 lr 0.000001 time 
0.3328 (0.3342) loss 4.0392 (4.0972) grad_norm 0.0000 (0.0000) [2022-10-11 07:24:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [50/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3332 (0.3342) loss 4.3341 (4.0964) grad_norm 0.0000 (0.0000) [2022-10-11 07:25:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [50/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3439 (0.3340) loss 4.1601 (4.0987) grad_norm 0.0000 (0.0000) [2022-10-11 07:25:23 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 50 training takes 0:06:57 [2022-10-11 07:25:23 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_50 saving...... [2022-10-11 07:25:24 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_50 saved !!! [2022-10-11 07:25:27 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 2.963 (2.963) Loss 1.3194 (1.3194) Acc@1 68.164 (68.164) Acc@5 90.137 (90.137) [2022-10-11 07:25:39 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 70.040 Acc@5 89.916 [2022-10-11 07:25:39 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 70.0% [2022-10-11 07:25:39 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 70.11% [2022-10-11 07:25:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [51/300][0/1251] eta 1:11:44 lr 0.000001 time 3.4405 (3.4405) loss 3.8515 (3.8515) grad_norm 0.0000 (0.0000) [2022-10-11 07:26:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [51/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3269 (0.3681) loss 4.1915 (4.0692) grad_norm 0.0000 (0.0000) [2022-10-11 07:26:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [51/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3443 (0.3507) loss 4.1403 (4.0853) grad_norm 0.0000 (0.0000) [2022-10-11 07:27:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [51/300][300/1251] eta 0:05:27 lr 
0.000001 time 0.3350 (0.3447) loss 4.2395 (4.0804) grad_norm 0.0000 (0.0000) [2022-10-11 07:27:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [51/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3352 (0.3412) loss 4.2036 (4.0837) grad_norm 0.0000 (0.0000) [2022-10-11 07:28:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [51/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3349 (0.3390) loss 4.2186 (4.0815) grad_norm 0.0000 (0.0000) [2022-10-11 07:29:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [51/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3223 (0.3377) loss 4.0820 (4.0825) grad_norm 0.0000 (0.0000) [2022-10-11 07:29:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [51/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3282 (0.3367) loss 3.7410 (4.0831) grad_norm 0.0000 (0.0000) [2022-10-11 07:30:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [51/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3609 (0.3364) loss 3.9215 (4.0792) grad_norm 0.0000 (0.0000) [2022-10-11 07:30:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [51/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3244 (0.3358) loss 4.0510 (4.0804) grad_norm 0.0000 (0.0000) [2022-10-11 07:31:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [51/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3026 (0.3353) loss 4.1806 (4.0796) grad_norm 0.0000 (0.0000) [2022-10-11 07:31:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [51/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3206 (0.3351) loss 4.0414 (4.0806) grad_norm 0.0000 (0.0000) [2022-10-11 07:32:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [51/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3184 (0.3347) loss 4.0639 (4.0823) grad_norm 0.0000 (0.0000) [2022-10-11 07:32:37 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 51 training takes 0:06:58 [2022-10-11 07:32:41 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.302 (3.302) Loss 1.1604 
(1.1604) Acc@1 71.191 (71.191) Acc@5 92.090 (92.090) [2022-10-11 07:32:53 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 70.370 Acc@5 90.160 [2022-10-11 07:32:53 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 70.4% [2022-10-11 07:32:53 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 70.37% [2022-10-11 07:32:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [52/300][0/1251] eta 1:13:35 lr 0.000001 time 3.5295 (3.5295) loss 3.9925 (3.9925) grad_norm 0.0000 (0.0000) [2022-10-11 07:33:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [52/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3390 (0.3677) loss 4.2514 (4.0620) grad_norm 0.0000 (0.0000) [2022-10-11 07:34:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [52/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3336 (0.3497) loss 4.1661 (4.0674) grad_norm 0.0000 (0.0000) [2022-10-11 07:34:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [52/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3305 (0.3433) loss 4.1890 (4.0650) grad_norm 0.0000 (0.0000) [2022-10-11 07:35:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [52/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3143 (0.3401) loss 4.2416 (4.0729) grad_norm 0.0000 (0.0000) [2022-10-11 07:35:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [52/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3141 (0.3379) loss 4.0838 (4.0728) grad_norm 0.0000 (0.0000) [2022-10-11 07:36:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [52/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3084 (0.3368) loss 3.9393 (4.0735) grad_norm 0.0000 (0.0000) [2022-10-11 07:36:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [52/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3289 (0.3357) loss 4.1412 (4.0727) grad_norm 0.0000 (0.0000) [2022-10-11 07:37:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [52/300][800/1251] eta 0:02:31 lr 
0.000001 time 0.3361 (0.3349) loss 4.5789 (4.0726) grad_norm 0.0000 (0.0000) [2022-10-11 07:37:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [52/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3427 (0.3343) loss 4.1957 (4.0749) grad_norm 0.0000 (0.0000) [2022-10-11 07:38:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [52/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3534 (0.3340) loss 4.1738 (4.0759) grad_norm 0.0000 (0.0000) [2022-10-11 07:39:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [52/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3688 (0.3336) loss 4.2226 (4.0767) grad_norm 0.0000 (0.0000) [2022-10-11 07:39:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [52/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3142 (0.3333) loss 3.9024 (4.0747) grad_norm 0.0000 (0.0000) [2022-10-11 07:39:49 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 52 training takes 0:06:56 [2022-10-11 07:39:53 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.554 (3.554) Loss 1.2321 (1.2321) Acc@1 70.898 (70.898) Acc@5 91.309 (91.309) [2022-10-11 07:40:05 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 70.468 Acc@5 90.166 [2022-10-11 07:40:05 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 70.5% [2022-10-11 07:40:05 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 70.47% [2022-10-11 07:40:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [53/300][0/1251] eta 1:12:37 lr 0.000001 time 3.4832 (3.4832) loss 3.9606 (3.9606) grad_norm 0.0000 (0.0000) [2022-10-11 07:40:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [53/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3139 (0.3661) loss 4.1718 (4.0578) grad_norm 0.0000 (0.0000) [2022-10-11 07:41:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [53/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3291 (0.3487) loss 4.1628 (4.0418) grad_norm 0.0000 (0.0000) 
[2022-10-11 07:41:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [53/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3366 (0.3426) loss 3.8339 (4.0512) grad_norm 0.0000 (0.0000) [2022-10-11 07:42:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [53/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3404 (0.3401) loss 3.9410 (4.0609) grad_norm 0.0000 (0.0000) [2022-10-11 07:42:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [53/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3292 (0.3382) loss 4.1524 (4.0644) grad_norm 0.0000 (0.0000) [2022-10-11 07:43:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [53/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3117 (0.3368) loss 3.9975 (4.0675) grad_norm 0.0000 (0.0000) [2022-10-11 07:44:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [53/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3205 (0.3359) loss 4.1478 (4.0635) grad_norm 0.0000 (0.0000) [2022-10-11 07:44:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [53/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3496 (0.3352) loss 3.8368 (4.0654) grad_norm 0.0000 (0.0000) [2022-10-11 07:45:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [53/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3381 (0.3351) loss 3.9780 (4.0640) grad_norm 0.0000 (0.0000) [2022-10-11 07:45:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [53/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3528 (0.3345) loss 4.1536 (4.0643) grad_norm 0.0000 (0.0000) [2022-10-11 07:46:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [53/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3291 (0.3344) loss 3.8464 (4.0658) grad_norm 0.0000 (0.0000) [2022-10-11 07:46:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [53/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3547 (0.3343) loss 4.4099 (4.0665) grad_norm 0.0000 (0.0000) [2022-10-11 07:47:02 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 53 training takes 0:06:57 
[2022-10-11 07:47:06 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.220 (3.220) Loss 1.2872 (1.2872) Acc@1 71.875 (71.875) Acc@5 88.867 (88.867) [2022-10-11 07:47:18 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 70.508 Acc@5 90.210 [2022-10-11 07:47:18 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 70.5% [2022-10-11 07:47:18 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 70.51% [2022-10-11 07:47:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [54/300][0/1251] eta 1:04:46 lr 0.000001 time 3.1070 (3.1070) loss 3.8099 (3.8099) grad_norm 0.0000 (0.0000) [2022-10-11 07:47:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [54/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3055 (0.3663) loss 4.2830 (4.0559) grad_norm 0.0000 (0.0000) [2022-10-11 07:48:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [54/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3321 (0.3489) loss 3.9279 (4.0664) grad_norm 0.0000 (0.0000) [2022-10-11 07:49:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [54/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3466 (0.3433) loss 3.9495 (4.0730) grad_norm 0.0000 (0.0000) [2022-10-11 07:49:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [54/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3288 (0.3398) loss 4.3257 (4.0662) grad_norm 0.0000 (0.0000) [2022-10-11 07:50:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [54/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3336 (0.3377) loss 3.9787 (4.0615) grad_norm 0.0000 (0.0000) [2022-10-11 07:50:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [54/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3509 (0.3364) loss 3.9793 (4.0652) grad_norm 0.0000 (0.0000) [2022-10-11 07:51:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [54/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3194 (0.3352) loss 4.0402 (4.0672) grad_norm 0.0000 (0.0000) 
[2022-10-11 07:51:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [54/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3165 (0.3347) loss 4.2995 (4.0687) grad_norm 0.0000 (0.0000) [2022-10-11 07:52:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [54/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3172 (0.3342) loss 3.8996 (4.0684) grad_norm 0.0000 (0.0000) [2022-10-11 07:52:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [54/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3272 (0.3336) loss 4.0382 (4.0668) grad_norm 0.0000 (0.0000) [2022-10-11 07:53:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [54/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3581 (0.3333) loss 3.8892 (4.0695) grad_norm 0.0000 (0.0000) [2022-10-11 07:53:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [54/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3252 (0.3332) loss 3.9314 (4.0678) grad_norm 0.0000 (0.0000) [2022-10-11 07:54:14 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 54 training takes 0:06:56 [2022-10-11 07:54:18 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.503 (3.503) Loss 1.2974 (1.2974) Acc@1 70.605 (70.605) Acc@5 89.453 (89.453) [2022-10-11 07:54:29 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 70.394 Acc@5 90.260 [2022-10-11 07:54:29 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 70.4% [2022-10-11 07:54:29 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 70.51% [2022-10-11 07:54:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [55/300][0/1251] eta 1:07:25 lr 0.000001 time 3.2336 (3.2336) loss 4.0036 (4.0036) grad_norm 0.0000 (0.0000) [2022-10-11 07:55:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [55/300][100/1251] eta 0:06:58 lr 0.000001 time 0.3470 (0.3636) loss 4.1787 (4.0143) grad_norm 0.0000 (0.0000) [2022-10-11 07:55:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[55/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3390 (0.3478) loss 3.9934 (4.0519) grad_norm 0.0000 (0.0000) [2022-10-11 07:56:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [55/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3204 (0.3419) loss 4.1972 (4.0502) grad_norm 0.0000 (0.0000) [2022-10-11 07:56:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [55/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3419 (0.3392) loss 3.9885 (4.0485) grad_norm 0.0000 (0.0000) [2022-10-11 07:57:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [55/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3357 (0.3376) loss 4.0950 (4.0539) grad_norm 0.0000 (0.0000) [2022-10-11 07:57:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [55/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3401 (0.3365) loss 4.1094 (4.0597) grad_norm 0.0000 (0.0000) [2022-10-11 07:58:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [55/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3337 (0.3357) loss 4.0812 (4.0631) grad_norm 0.0000 (0.0000) [2022-10-11 07:58:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [55/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3566 (0.3351) loss 3.9005 (4.0650) grad_norm 0.0000 (0.0000) [2022-10-11 07:59:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [55/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3296 (0.3347) loss 4.1101 (4.0677) grad_norm 0.0000 (0.0000) [2022-10-11 08:00:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [55/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3129 (0.3341) loss 4.0656 (4.0682) grad_norm 0.0000 (0.0000) [2022-10-11 08:00:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [55/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3116 (0.3338) loss 3.9459 (4.0684) grad_norm 0.0000 (0.0000) [2022-10-11 08:01:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [55/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3308 (0.3335) loss 4.1759 (4.0675) grad_norm 0.0000 
(0.0000) [2022-10-11 08:01:26 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 55 training takes 0:06:56 [2022-10-11 08:01:29 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.334 (3.334) Loss 1.2279 (1.2279) Acc@1 70.312 (70.312) Acc@5 91.211 (91.211) [2022-10-11 08:01:41 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 70.668 Acc@5 90.462 [2022-10-11 08:01:41 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 70.7% [2022-10-11 08:01:41 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 70.67% [2022-10-11 08:01:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [56/300][0/1251] eta 1:07:48 lr 0.000001 time 3.2525 (3.2525) loss 4.3275 (4.3275) grad_norm 0.0000 (0.0000) [2022-10-11 08:02:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [56/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3399 (0.3671) loss 4.0044 (4.0455) grad_norm 0.0000 (0.0000) [2022-10-11 08:02:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [56/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3235 (0.3495) loss 4.2220 (4.0563) grad_norm 0.0000 (0.0000) [2022-10-11 08:03:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [56/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3295 (0.3447) loss 4.2606 (4.0628) grad_norm 0.0000 (0.0000) [2022-10-11 08:03:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [56/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3129 (0.3411) loss 4.0139 (4.0646) grad_norm 0.0000 (0.0000) [2022-10-11 08:04:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [56/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3540 (0.3392) loss 4.2609 (4.0664) grad_norm 0.0000 (0.0000) [2022-10-11 08:05:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [56/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3482 (0.3380) loss 4.0737 (4.0633) grad_norm 0.0000 (0.0000) [2022-10-11 08:05:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[56/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3299 (0.3370) loss 3.7949 (4.0665) grad_norm 0.0000 (0.0000) [2022-10-11 08:06:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [56/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3361 (0.3361) loss 3.9093 (4.0657) grad_norm 0.0000 (0.0000) [2022-10-11 08:06:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [56/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3177 (0.3355) loss 4.1913 (4.0659) grad_norm 0.0000 (0.0000) [2022-10-11 08:07:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [56/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3123 (0.3348) loss 4.2158 (4.0657) grad_norm 0.0000 (0.0000) [2022-10-11 08:07:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [56/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3299 (0.3346) loss 3.7811 (4.0645) grad_norm 0.0000 (0.0000) [2022-10-11 08:08:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [56/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3578 (0.3346) loss 4.0909 (4.0654) grad_norm 0.0000 (0.0000) [2022-10-11 08:08:40 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 56 training takes 0:06:58 [2022-10-11 08:08:43 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.400 (3.400) Loss 1.2396 (1.2396) Acc@1 70.801 (70.801) Acc@5 90.332 (90.332) [2022-10-11 08:08:55 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 70.596 Acc@5 90.280 [2022-10-11 08:08:55 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 70.6% [2022-10-11 08:08:55 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 70.67% [2022-10-11 08:08:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [57/300][0/1251] eta 1:15:48 lr 0.000001 time 3.6356 (3.6356) loss 3.8109 (3.8109) grad_norm 0.0000 (0.0000) [2022-10-11 08:09:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [57/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3270 (0.3668) loss 4.0983 (3.9888) 
grad_norm 0.0000 (0.0000) [2022-10-11 08:10:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [57/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3353 (0.3495) loss 3.7769 (4.0091) grad_norm 0.0000 (0.0000) [2022-10-11 08:10:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [57/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3262 (0.3430) loss 3.9916 (4.0258) grad_norm 0.0000 (0.0000) [2022-10-11 08:11:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [57/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3047 (0.3399) loss 3.9756 (4.0254) grad_norm 0.0000 (0.0000) [2022-10-11 08:11:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [57/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3288 (0.3377) loss 4.1372 (4.0256) grad_norm 0.0000 (0.0000) [2022-10-11 08:12:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [57/300][600/1251] eta 0:03:38 lr 0.000001 time 0.2967 (0.3364) loss 4.0524 (4.0300) grad_norm 0.0000 (0.0000) [2022-10-11 08:12:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [57/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3164 (0.3355) loss 4.2145 (4.0324) grad_norm 0.0000 (0.0000) [2022-10-11 08:13:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [57/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3521 (0.3348) loss 4.3180 (4.0364) grad_norm 0.0000 (0.0000) [2022-10-11 08:13:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [57/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3210 (0.3343) loss 3.9685 (4.0327) grad_norm 0.0000 (0.0000) [2022-10-11 08:14:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [57/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3425 (0.3338) loss 3.9956 (4.0377) grad_norm 0.0000 (0.0000) [2022-10-11 08:15:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [57/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3140 (0.3336) loss 4.1323 (4.0363) grad_norm 0.0000 (0.0000) [2022-10-11 08:15:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[57/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3397 (0.3332) loss 3.6471 (4.0362) grad_norm 0.0000 (0.0000) [2022-10-11 08:15:51 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 57 training takes 0:06:56 [2022-10-11 08:15:54 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 2.703 (2.703) Loss 1.1974 (1.1974) Acc@1 72.852 (72.852) Acc@5 91.016 (91.016) [2022-10-11 08:16:06 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 70.926 Acc@5 90.334 [2022-10-11 08:16:06 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 70.9% [2022-10-11 08:16:06 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 70.93% [2022-10-11 08:16:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [58/300][0/1251] eta 1:06:47 lr 0.000001 time 3.2034 (3.2034) loss 3.9405 (3.9405) grad_norm 0.0000 (0.0000) [2022-10-11 08:16:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [58/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3260 (0.3649) loss 4.3947 (4.0074) grad_norm 0.0000 (0.0000) [2022-10-11 08:17:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [58/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3273 (0.3476) loss 3.8423 (4.0458) grad_norm 0.0000 (0.0000) [2022-10-11 08:17:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [58/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3474 (0.3424) loss 3.9624 (4.0386) grad_norm 0.0000 (0.0000) [2022-10-11 08:18:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [58/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3262 (0.3396) loss 3.9968 (4.0418) grad_norm 0.0000 (0.0000) [2022-10-11 08:18:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [58/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3211 (0.3378) loss 3.9235 (4.0399) grad_norm 0.0000 (0.0000) [2022-10-11 08:19:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [58/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3132 (0.3364) loss 4.0252 (4.0416) 
grad_norm 0.0000 (0.0000) [2022-10-11 08:20:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [58/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3011 (0.3353) loss 3.9279 (4.0409) grad_norm 0.0000 (0.0000) [2022-10-11 08:20:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [58/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3382 (0.3346) loss 3.7854 (4.0432) grad_norm 0.0000 (0.0000) [2022-10-11 08:21:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [58/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3321 (0.3341) loss 3.8743 (4.0472) grad_norm 0.0000 (0.0000) [2022-10-11 08:21:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [58/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3316 (0.3337) loss 4.1934 (4.0465) grad_norm 0.0000 (0.0000) [2022-10-11 08:22:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [58/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3500 (0.3334) loss 3.7867 (4.0471) grad_norm 0.0000 (0.0000) [2022-10-11 08:22:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [58/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3266 (0.3332) loss 3.9952 (4.0458) grad_norm 0.0000 (0.0000) [2022-10-11 08:23:03 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 58 training takes 0:06:56 [2022-10-11 08:23:06 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.423 (3.423) Loss 1.1585 (1.1585) Acc@1 71.875 (71.875) Acc@5 91.406 (91.406) [2022-10-11 08:23:18 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 71.226 Acc@5 90.478 [2022-10-11 08:23:18 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 71.2% [2022-10-11 08:23:18 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 71.23% [2022-10-11 08:23:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [59/300][0/1251] eta 1:15:47 lr 0.000001 time 3.6348 (3.6348) loss 4.1725 (4.1725) grad_norm 0.0000 (0.0000) [2022-10-11 08:23:55 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [59/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3168 (0.3655) loss 4.0742 (4.0598) grad_norm 0.0000 (0.0000) [2022-10-11 08:24:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [59/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3131 (0.3475) loss 4.2017 (4.0521) grad_norm 0.0000 (0.0000) [2022-10-11 08:25:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [59/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3195 (0.3416) loss 4.0899 (4.0431) grad_norm 0.0000 (0.0000) [2022-10-11 08:25:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [59/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3599 (0.3388) loss 4.1988 (4.0348) grad_norm 0.0000 (0.0000) [2022-10-11 08:26:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [59/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3548 (0.3372) loss 3.7799 (4.0362) grad_norm 0.0000 (0.0000) [2022-10-11 08:26:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [59/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3355 (0.3363) loss 3.8194 (4.0381) grad_norm 0.0000 (0.0000) [2022-10-11 08:27:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [59/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3280 (0.3355) loss 4.3644 (4.0387) grad_norm 0.0000 (0.0000) [2022-10-11 08:27:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [59/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3085 (0.3347) loss 4.0312 (4.0422) grad_norm 0.0000 (0.0000) [2022-10-11 08:28:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [59/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3021 (0.3341) loss 4.0596 (4.0398) grad_norm 0.0000 (0.0000) [2022-10-11 08:28:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [59/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3490 (0.3339) loss 4.0682 (4.0426) grad_norm 0.0000 (0.0000) [2022-10-11 08:29:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [59/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3615 (0.3336) loss 3.9003 (4.0448) 
grad_norm 0.0000 (0.0000) [2022-10-11 08:29:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [59/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3550 (0.3332) loss 4.1898 (4.0455) grad_norm 0.0000 (0.0000) [2022-10-11 08:30:15 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 59 training takes 0:06:56 [2022-10-11 08:30:18 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.322 (3.322) Loss 1.1618 (1.1618) Acc@1 74.023 (74.023) Acc@5 90.625 (90.625) [2022-10-11 08:30:30 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 71.300 Acc@5 90.650 [2022-10-11 08:30:30 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 71.3% [2022-10-11 08:30:30 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 71.30% [2022-10-11 08:30:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [60/300][0/1251] eta 1:08:14 lr 0.000001 time 3.2728 (3.2728) loss 3.9456 (3.9456) grad_norm 0.0000 (0.0000) [2022-10-11 08:31:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [60/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3345 (0.3665) loss 4.0875 (4.0082) grad_norm 0.0000 (0.0000) [2022-10-11 08:31:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [60/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3172 (0.3489) loss 4.2240 (4.0210) grad_norm 0.0000 (0.0000) [2022-10-11 08:32:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [60/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3384 (0.3430) loss 4.2983 (4.0176) grad_norm 0.0000 (0.0000) [2022-10-11 08:32:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [60/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3083 (0.3397) loss 4.1330 (4.0270) grad_norm 0.0000 (0.0000) [2022-10-11 08:33:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [60/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3260 (0.3375) loss 3.9647 (4.0245) grad_norm 0.0000 (0.0000) [2022-10-11 08:33:52 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [60/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3224 (0.3368) loss 4.2314 (4.0273) grad_norm 0.0000 (0.0000) [2022-10-11 08:34:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [60/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3223 (0.3360) loss 4.1494 (4.0246) grad_norm 0.0000 (0.0000) [2022-10-11 08:34:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [60/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3217 (0.3354) loss 4.0221 (4.0267) grad_norm 0.0000 (0.0000) [2022-10-11 08:35:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [60/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3353 (0.3347) loss 4.3287 (4.0233) grad_norm 0.0000 (0.0000) [2022-10-11 08:36:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [60/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3532 (0.3343) loss 3.9378 (4.0259) grad_norm 0.0000 (0.0000) [2022-10-11 08:36:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [60/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3244 (0.3338) loss 4.0321 (4.0281) grad_norm 0.0000 (0.0000) [2022-10-11 08:37:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [60/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3576 (0.3337) loss 4.1499 (4.0286) grad_norm 0.0000 (0.0000) [2022-10-11 08:37:27 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 60 training takes 0:06:57 [2022-10-11 08:37:27 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_60 saving...... [2022-10-11 08:37:27 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_60 saved !!! 
[2022-10-11 08:37:30 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.087 (3.087) Loss 1.1713 (1.1713) Acc@1 71.777 (71.777) Acc@5 90.430 (90.430) [2022-10-11 08:37:42 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 71.296 Acc@5 90.644 [2022-10-11 08:37:42 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 71.3% [2022-10-11 08:37:42 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 71.30% [2022-10-11 08:37:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [61/300][0/1251] eta 1:12:37 lr 0.000001 time 3.4833 (3.4833) loss 3.9204 (3.9204) grad_norm 0.0000 (0.0000) [2022-10-11 08:38:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [61/300][100/1251] eta 0:06:59 lr 0.000001 time 0.3249 (0.3646) loss 3.9249 (4.0037) grad_norm 0.0000 (0.0000) [2022-10-11 08:38:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [61/300][200/1251] eta 0:06:03 lr 0.000001 time 0.3472 (0.3460) loss 3.8353 (4.0007) grad_norm 0.0000 (0.0000) [2022-10-11 08:39:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [61/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3246 (0.3407) loss 4.3152 (4.0146) grad_norm 0.0000 (0.0000) [2022-10-11 08:39:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [61/300][400/1251] eta 0:04:47 lr 0.000001 time 0.3191 (0.3380) loss 4.1932 (4.0209) grad_norm 0.0000 (0.0000) [2022-10-11 08:40:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [61/300][500/1251] eta 0:04:12 lr 0.000001 time 0.3325 (0.3365) loss 3.9345 (4.0236) grad_norm 0.0000 (0.0000) [2022-10-11 08:41:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [61/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3270 (0.3353) loss 3.7970 (4.0259) grad_norm 0.0000 (0.0000) [2022-10-11 08:41:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [61/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3325 (0.3345) loss 4.1137 (4.0267) grad_norm 0.0000 (0.0000) 
[2022-10-11 08:42:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [61/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3051 (0.3340) loss 3.7921 (4.0299) grad_norm 0.0000 (0.0000) [2022-10-11 08:42:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [61/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3371 (0.3336) loss 4.0823 (4.0255) grad_norm 0.0000 (0.0000) [2022-10-11 08:43:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [61/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3323 (0.3334) loss 3.9217 (4.0269) grad_norm 0.0000 (0.0000) [2022-10-11 08:43:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [61/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3698 (0.3332) loss 4.0829 (4.0263) grad_norm 0.0000 (0.0000) [2022-10-11 08:44:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [61/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3327 (0.3333) loss 3.8496 (4.0264) grad_norm 0.0000 (0.0000) [2022-10-11 08:44:39 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 61 training takes 0:06:56 [2022-10-11 08:44:42 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.230 (3.230) Loss 1.2045 (1.2045) Acc@1 71.191 (71.191) Acc@5 90.723 (90.723) [2022-10-11 08:44:54 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 71.566 Acc@5 90.734 [2022-10-11 08:44:54 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 71.6% [2022-10-11 08:44:54 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 71.57% [2022-10-11 08:44:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [62/300][0/1251] eta 1:12:41 lr 0.000001 time 3.4868 (3.4868) loss 3.9847 (3.9847) grad_norm 0.0000 (0.0000) [2022-10-11 08:45:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [62/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3074 (0.3656) loss 3.8975 (4.0320) grad_norm 0.0000 (0.0000) [2022-10-11 08:46:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[62/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3182 (0.3481) loss 4.3432 (4.0194) grad_norm 0.0000 (0.0000) [2022-10-11 08:46:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [62/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3333 (0.3423) loss 3.8076 (4.0170) grad_norm 0.0000 (0.0000) [2022-10-11 08:47:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [62/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3131 (0.3396) loss 4.5079 (4.0133) grad_norm 0.0000 (0.0000) [2022-10-11 08:47:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [62/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3279 (0.3382) loss 4.1998 (4.0123) grad_norm 0.0000 (0.0000) [2022-10-11 08:48:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [62/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3224 (0.3371) loss 3.9072 (4.0126) grad_norm 0.0000 (0.0000) [2022-10-11 08:48:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [62/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3783 (0.3365) loss 4.0225 (4.0110) grad_norm 0.0000 (0.0000) [2022-10-11 08:49:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [62/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3206 (0.3359) loss 4.2223 (4.0127) grad_norm 0.0000 (0.0000) [2022-10-11 08:49:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [62/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3350 (0.3355) loss 3.8078 (4.0125) grad_norm 0.0000 (0.0000) [2022-10-11 08:50:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [62/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3376 (0.3352) loss 3.9537 (4.0147) grad_norm 0.0000 (0.0000) [2022-10-11 08:51:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [62/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3270 (0.3347) loss 4.3565 (4.0156) grad_norm 0.0000 (0.0000) [2022-10-11 08:51:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [62/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3246 (0.3344) loss 3.8611 (4.0193) grad_norm 0.0000 
(0.0000) [2022-10-11 08:51:52 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 62 training takes 0:06:58 [2022-10-11 08:51:56 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.541 (3.541) Loss 1.3211 (1.3211) Acc@1 70.508 (70.508) Acc@5 88.574 (88.574) [2022-10-11 08:52:07 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 71.414 Acc@5 90.744 [2022-10-11 08:52:07 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 71.4% [2022-10-11 08:52:07 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 71.57% [2022-10-11 08:52:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [63/300][0/1251] eta 1:14:58 lr 0.000001 time 3.5959 (3.5959) loss 4.2165 (4.2165) grad_norm 0.0000 (0.0000) [2022-10-11 08:52:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [63/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3470 (0.3679) loss 4.0420 (4.0005) grad_norm 0.0000 (0.0000) [2022-10-11 08:53:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [63/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3159 (0.3498) loss 3.9009 (4.0111) grad_norm 0.0000 (0.0000) [2022-10-11 08:53:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [63/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3560 (0.3432) loss 4.1280 (4.0082) grad_norm 0.0000 (0.0000) [2022-10-11 08:54:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [63/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3209 (0.3406) loss 4.0319 (4.0114) grad_norm 0.0000 (0.0000) [2022-10-11 08:54:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [63/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3476 (0.3387) loss 3.9624 (4.0108) grad_norm 0.0000 (0.0000) [2022-10-11 08:55:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [63/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3159 (0.3376) loss 3.9486 (4.0103) grad_norm 0.0000 (0.0000) [2022-10-11 08:56:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[63/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3224 (0.3367) loss 4.3089 (4.0149) grad_norm 0.0000 (0.0000) [2022-10-11 08:56:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [63/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3293 (0.3362) loss 4.2431 (4.0122) grad_norm 0.0000 (0.0000) [2022-10-11 08:57:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [63/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3546 (0.3359) loss 3.6365 (4.0105) grad_norm 0.0000 (0.0000) [2022-10-11 08:57:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [63/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3398 (0.3356) loss 4.0459 (4.0093) grad_norm 0.0000 (0.0000) [2022-10-11 08:58:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [63/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3113 (0.3354) loss 4.0604 (4.0102) grad_norm 0.0000 (0.0000) [2022-10-11 08:58:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [63/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3229 (0.3352) loss 3.8228 (4.0132) grad_norm 0.0000 (0.0000) [2022-10-11 08:59:06 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 63 training takes 0:06:59 [2022-10-11 08:59:10 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.375 (3.375) Loss 1.2294 (1.2294) Acc@1 73.047 (73.047) Acc@5 89.844 (89.844) [2022-10-11 08:59:22 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 71.252 Acc@5 90.632 [2022-10-11 08:59:22 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 71.3% [2022-10-11 08:59:22 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 71.57% [2022-10-11 08:59:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [64/300][0/1251] eta 1:16:18 lr 0.000001 time 3.6598 (3.6598) loss 3.9961 (3.9961) grad_norm 0.0000 (0.0000) [2022-10-11 08:59:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [64/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3377 (0.3664) loss 4.0235 (3.9885) 
grad_norm 0.0000 (0.0000) [2022-10-11 09:00:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [64/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3304 (0.3477) loss 4.2630 (3.9943) grad_norm 0.0000 (0.0000) [2022-10-11 09:01:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [64/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3206 (0.3409) loss 4.0826 (3.9990) grad_norm 0.0000 (0.0000) [2022-10-11 09:01:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [64/300][400/1251] eta 0:04:47 lr 0.000001 time 0.3308 (0.3379) loss 4.1234 (3.9993) grad_norm 0.0000 (0.0000) [2022-10-11 09:02:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [64/300][500/1251] eta 0:04:12 lr 0.000001 time 0.3480 (0.3362) loss 4.2302 (4.0000) grad_norm 0.0000 (0.0000) [2022-10-11 09:02:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [64/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3484 (0.3350) loss 4.2712 (4.0056) grad_norm 0.0000 (0.0000) [2022-10-11 09:03:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [64/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3662 (0.3345) loss 3.9152 (4.0066) grad_norm 0.0000 (0.0000) [2022-10-11 09:03:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [64/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3332 (0.3343) loss 4.2805 (4.0048) grad_norm 0.0000 (0.0000) [2022-10-11 09:04:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [64/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3295 (0.3339) loss 3.8132 (4.0092) grad_norm 0.0000 (0.0000) [2022-10-11 09:04:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [64/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3222 (0.3339) loss 3.9967 (4.0117) grad_norm 0.0000 (0.0000) [2022-10-11 09:05:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [64/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3219 (0.3336) loss 4.1956 (4.0124) grad_norm 0.0000 (0.0000) [2022-10-11 09:06:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[64/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3445 (0.3334) loss 4.3790 (4.0105) grad_norm 0.0000 (0.0000) [2022-10-11 09:06:19 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 64 training takes 0:06:56 [2022-10-11 09:06:21 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 2.914 (2.914) Loss 1.2081 (1.2081) Acc@1 70.605 (70.605) Acc@5 90.527 (90.527) [2022-10-11 09:06:34 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 71.726 Acc@5 90.938 [2022-10-11 09:06:34 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 71.7% [2022-10-11 09:06:34 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 71.73% [2022-10-11 09:06:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [65/300][0/1251] eta 1:14:30 lr 0.000001 time 3.5734 (3.5734) loss 4.0076 (4.0076) grad_norm 0.0000 (0.0000) [2022-10-11 09:07:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [65/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3229 (0.3662) loss 4.1857 (3.9963) grad_norm 0.0000 (0.0000) [2022-10-11 09:07:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [65/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3382 (0.3486) loss 3.7715 (3.9909) grad_norm 0.0000 (0.0000) [2022-10-11 09:08:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [65/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3536 (0.3427) loss 4.1170 (4.0026) grad_norm 0.0000 (0.0000) [2022-10-11 09:08:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [65/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3188 (0.3401) loss 4.2133 (4.0025) grad_norm 0.0000 (0.0000) [2022-10-11 09:09:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [65/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3287 (0.3386) loss 3.6553 (4.0038) grad_norm 0.0000 (0.0000) [2022-10-11 09:09:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [65/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3240 (0.3373) loss 4.0076 (4.0028) 
grad_norm 0.0000 (0.0000) [2022-10-11 09:10:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [65/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3143 (0.3366) loss 3.9855 (4.0023) grad_norm 0.0000 (0.0000) [2022-10-11 09:11:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [65/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3171 (0.3360) loss 4.0546 (4.0045) grad_norm 0.0000 (0.0000) [2022-10-11 09:11:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [65/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3572 (0.3356) loss 3.9910 (4.0062) grad_norm 0.0000 (0.0000) [2022-10-11 09:12:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [65/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.2958 (0.3350) loss 4.0420 (4.0068) grad_norm 0.0000 (0.0000) [2022-10-11 09:12:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [65/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3270 (0.3348) loss 3.7278 (4.0073) grad_norm 0.0000 (0.0000) [2022-10-11 09:13:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [65/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3057 (0.3348) loss 3.8962 (4.0060) grad_norm 0.0000 (0.0000) [2022-10-11 09:13:32 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 65 training takes 0:06:58 [2022-10-11 09:13:36 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.290 (3.290) Loss 1.2313 (1.2313) Acc@1 70.996 (70.996) Acc@5 89.941 (89.941) [2022-10-11 09:13:48 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 71.496 Acc@5 90.750 [2022-10-11 09:13:48 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 71.5% [2022-10-11 09:13:48 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 71.73% [2022-10-11 09:13:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [66/300][0/1251] eta 1:13:02 lr 0.000001 time 3.5036 (3.5036) loss 3.6032 (3.6032) grad_norm 0.0000 (0.0000) [2022-10-11 09:14:24 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [66/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3523 (0.3651) loss 3.9553 (4.0084) grad_norm 0.0000 (0.0000) [2022-10-11 09:14:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [66/300][200/1251] eta 0:06:04 lr 0.000001 time 0.3225 (0.3469) loss 4.0734 (4.0117) grad_norm 0.0000 (0.0000) [2022-10-11 09:15:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [66/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3105 (0.3413) loss 4.1032 (4.0164) grad_norm 0.0000 (0.0000) [2022-10-11 09:16:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [66/300][400/1251] eta 0:04:47 lr 0.000001 time 0.3320 (0.3382) loss 3.9430 (4.0108) grad_norm 0.0000 (0.0000) [2022-10-11 09:16:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [66/300][500/1251] eta 0:04:12 lr 0.000001 time 0.3414 (0.3367) loss 4.1781 (4.0050) grad_norm 0.0000 (0.0000) [2022-10-11 09:17:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [66/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3369 (0.3358) loss 4.0718 (4.0074) grad_norm 0.0000 (0.0000) [2022-10-11 09:17:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [66/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3198 (0.3348) loss 4.0493 (4.0061) grad_norm 0.0000 (0.0000) [2022-10-11 09:18:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [66/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3422 (0.3345) loss 3.9702 (4.0062) grad_norm 0.0000 (0.0000) [2022-10-11 09:18:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [66/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3309 (0.3341) loss 3.8995 (4.0045) grad_norm 0.0000 (0.0000) [2022-10-11 09:19:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [66/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3153 (0.3339) loss 3.7866 (4.0058) grad_norm 0.0000 (0.0000) [2022-10-11 09:19:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [66/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3348 (0.3337) loss 4.1797 (4.0036) 
grad_norm 0.0000 (0.0000) [2022-10-11 09:20:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [66/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3444 (0.3335) loss 4.1470 (4.0031) grad_norm 0.0000 (0.0000) [2022-10-11 09:20:44 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 66 training takes 0:06:56 [2022-10-11 09:20:48 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.431 (3.431) Loss 1.1394 (1.1394) Acc@1 72.363 (72.363) Acc@5 92.676 (92.676) [2022-10-11 09:21:00 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 71.776 Acc@5 90.836 [2022-10-11 09:21:00 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 71.8% [2022-10-11 09:21:00 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 71.78% [2022-10-11 09:21:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [67/300][0/1251] eta 1:07:20 lr 0.000001 time 3.2302 (3.2302) loss 3.9910 (3.9910) grad_norm 0.0000 (0.0000) [2022-10-11 09:21:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [67/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3250 (0.3671) loss 3.8735 (3.9890) grad_norm 0.0000 (0.0000) [2022-10-11 09:22:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [67/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3242 (0.3484) loss 3.9656 (3.9911) grad_norm 0.0000 (0.0000) [2022-10-11 09:22:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [67/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3384 (0.3426) loss 3.8248 (3.9959) grad_norm 0.0000 (0.0000) [2022-10-11 09:23:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [67/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3344 (0.3399) loss 3.9582 (3.9982) grad_norm 0.0000 (0.0000) [2022-10-11 09:23:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [67/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3442 (0.3382) loss 3.9704 (3.9991) grad_norm 0.0000 (0.0000) [2022-10-11 09:24:22 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [67/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3262 (0.3368) loss 3.9250 (3.9987) grad_norm 0.0000 (0.0000) [2022-10-11 09:24:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [67/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3175 (0.3358) loss 4.0074 (3.9982) grad_norm 0.0000 (0.0000) [2022-10-11 09:25:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [67/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3518 (0.3351) loss 4.1178 (3.9966) grad_norm 0.0000 (0.0000) [2022-10-11 09:26:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [67/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3240 (0.3349) loss 4.0196 (4.0004) grad_norm 0.0000 (0.0000) [2022-10-11 09:26:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [67/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3097 (0.3346) loss 3.9498 (4.0023) grad_norm 0.0000 (0.0000) [2022-10-11 09:27:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [67/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3453 (0.3344) loss 3.5297 (4.0011) grad_norm 0.0000 (0.0000) [2022-10-11 09:27:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [67/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3851 (0.3342) loss 4.1483 (4.0004) grad_norm 0.0000 (0.0000) [2022-10-11 09:27:57 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 67 training takes 0:06:57 [2022-10-11 09:28:01 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.193 (3.193) Loss 1.3310 (1.3310) Acc@1 69.629 (69.629) Acc@5 88.574 (88.574) [2022-10-11 09:28:13 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 71.620 Acc@5 91.004 [2022-10-11 09:28:13 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 71.6% [2022-10-11 09:28:13 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 71.78% [2022-10-11 09:28:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [68/300][0/1251] eta 1:11:31 lr 0.000001 time 3.4304 (3.4304) loss 
3.9272 (3.9272) grad_norm 0.0000 (0.0000) [2022-10-11 09:28:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [68/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3099 (0.3657) loss 3.7789 (3.9803) grad_norm 0.0000 (0.0000) [2022-10-11 09:29:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [68/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3298 (0.3485) loss 4.1759 (3.9886) grad_norm 0.0000 (0.0000) [2022-10-11 09:29:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [68/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3358 (0.3425) loss 4.0599 (3.9921) grad_norm 0.0000 (0.0000) [2022-10-11 09:30:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [68/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3392 (0.3400) loss 4.0979 (3.9942) grad_norm 0.0000 (0.0000) [2022-10-11 09:31:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [68/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3620 (0.3383) loss 4.1701 (3.9918) grad_norm 0.0000 (0.0000) [2022-10-11 09:31:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [68/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3159 (0.3372) loss 3.9936 (3.9919) grad_norm 0.0000 (0.0000) [2022-10-11 09:32:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [68/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3296 (0.3364) loss 4.1542 (3.9953) grad_norm 0.0000 (0.0000) [2022-10-11 09:32:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [68/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3307 (0.3359) loss 3.9461 (3.9927) grad_norm 0.0000 (0.0000) [2022-10-11 09:33:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [68/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3115 (0.3353) loss 3.9670 (3.9886) grad_norm 0.0000 (0.0000) [2022-10-11 09:33:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [68/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3605 (0.3352) loss 4.2604 (3.9914) grad_norm 0.0000 (0.0000) [2022-10-11 09:34:22 swin_tiny_patch4_window7_224] (main.py 163): 
INFO Train: [68/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3198 (0.3352) loss 3.6149 (3.9915) grad_norm 0.0000 (0.0000) [2022-10-11 09:34:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [68/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3300 (0.3350) loss 3.8149 (3.9941) grad_norm 0.0000 (0.0000) [2022-10-11 09:35:12 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 68 training takes 0:06:58 [2022-10-11 09:35:15 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.431 (3.431) Loss 1.2073 (1.2073) Acc@1 71.289 (71.289) Acc@5 90.527 (90.527) [2022-10-11 09:35:27 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 72.070 Acc@5 91.078 [2022-10-11 09:35:27 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 72.1% [2022-10-11 09:35:27 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 72.07% [2022-10-11 09:35:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [69/300][0/1251] eta 1:14:53 lr 0.000001 time 3.5920 (3.5920) loss 4.0900 (4.0900) grad_norm 0.0000 (0.0000) [2022-10-11 09:36:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [69/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3191 (0.3676) loss 3.9283 (3.9922) grad_norm 0.0000 (0.0000) [2022-10-11 09:36:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [69/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3592 (0.3486) loss 4.1479 (3.9879) grad_norm 0.0000 (0.0000) [2022-10-11 09:37:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [69/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3244 (0.3430) loss 4.1376 (3.9925) grad_norm 0.0000 (0.0000) [2022-10-11 09:37:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [69/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3426 (0.3407) loss 4.2035 (3.9889) grad_norm 0.0000 (0.0000) [2022-10-11 09:38:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [69/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3250 (0.3385) loss 4.1480 
(3.9867) grad_norm 0.0000 (0.0000) [2022-10-11 09:38:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [69/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3040 (0.3372) loss 4.1932 (3.9895) grad_norm 0.0000 (0.0000) [2022-10-11 09:39:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [69/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3181 (0.3363) loss 4.1029 (3.9939) grad_norm 0.0000 (0.0000) [2022-10-11 09:39:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [69/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3270 (0.3360) loss 3.9650 (3.9904) grad_norm 0.0000 (0.0000) [2022-10-11 09:40:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [69/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3236 (0.3358) loss 4.0000 (3.9937) grad_norm 0.0000 (0.0000) [2022-10-11 09:41:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [69/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3222 (0.3355) loss 4.0696 (3.9953) grad_norm 0.0000 (0.0000) [2022-10-11 09:41:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [69/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3130 (0.3352) loss 3.8968 (3.9984) grad_norm 0.0000 (0.0000) [2022-10-11 09:42:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [69/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3267 (0.3349) loss 4.0365 (3.9957) grad_norm 0.0000 (0.0000) [2022-10-11 09:42:26 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 69 training takes 0:06:58 [2022-10-11 09:42:29 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.383 (3.383) Loss 1.2369 (1.2369) Acc@1 71.484 (71.484) Acc@5 89.941 (89.941) [2022-10-11 09:42:41 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 72.304 Acc@5 91.086 [2022-10-11 09:42:41 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 72.3% [2022-10-11 09:42:41 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 72.30% [2022-10-11 09:42:44 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [70/300][0/1251] eta 1:16:19 lr 0.000001 time 3.6607 (3.6607) loss 4.0943 (4.0943) grad_norm 0.0000 (0.0000) [2022-10-11 09:43:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [70/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3293 (0.3661) loss 3.9304 (3.9847) grad_norm 0.0000 (0.0000) [2022-10-11 09:43:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [70/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3313 (0.3484) loss 3.6178 (3.9862) grad_norm 0.0000 (0.0000) [2022-10-11 09:44:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [70/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3433 (0.3422) loss 3.7719 (3.9751) grad_norm 0.0000 (0.0000) [2022-10-11 09:44:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [70/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3100 (0.3394) loss 3.6440 (3.9783) grad_norm 0.0000 (0.0000) [2022-10-11 09:45:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [70/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3304 (0.3374) loss 3.9868 (3.9778) grad_norm 0.0000 (0.0000) [2022-10-11 09:46:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [70/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3583 (0.3362) loss 3.8705 (3.9755) grad_norm 0.0000 (0.0000) [2022-10-11 09:46:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [70/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3244 (0.3355) loss 3.8843 (3.9790) grad_norm 0.0000 (0.0000) [2022-10-11 09:47:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [70/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3123 (0.3349) loss 4.3262 (3.9790) grad_norm 0.0000 (0.0000) [2022-10-11 09:47:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [70/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3397 (0.3346) loss 4.0751 (3.9815) grad_norm 0.0000 (0.0000) [2022-10-11 09:48:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [70/300][1000/1251] eta 0:01:23 lr 0.000001 time 
0.3183 (0.3339) loss 4.0848 (3.9822) grad_norm 0.0000 (0.0000) [2022-10-11 09:48:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [70/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3273 (0.3336) loss 4.3672 (3.9846) grad_norm 0.0000 (0.0000) [2022-10-11 09:49:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [70/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3311 (0.3334) loss 3.9520 (3.9852) grad_norm 0.0000 (0.0000) [2022-10-11 09:49:38 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 70 training takes 0:06:56 [2022-10-11 09:49:38 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_70 saving...... [2022-10-11 09:49:38 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_70 saved !!! [2022-10-11 09:49:41 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 2.738 (2.738) Loss 1.1158 (1.1158) Acc@1 71.387 (71.387) Acc@5 92.773 (92.773) [2022-10-11 09:49:53 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 71.860 Acc@5 91.176 [2022-10-11 09:49:53 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 71.9% [2022-10-11 09:49:53 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 72.30% [2022-10-11 09:49:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [71/300][0/1251] eta 1:13:08 lr 0.000001 time 3.5078 (3.5078) loss 3.6926 (3.6926) grad_norm 0.0000 (0.0000) [2022-10-11 09:50:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [71/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3128 (0.3659) loss 4.0579 (3.9539) grad_norm 0.0000 (0.0000) [2022-10-11 09:51:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [71/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3455 (0.3475) loss 4.0586 (3.9595) grad_norm 0.0000 (0.0000) [2022-10-11 09:51:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [71/300][300/1251] eta 0:05:24 lr 
0.000001 time 0.3416 (0.3416) loss 3.7369 (3.9673) grad_norm 0.0000 (0.0000) [2022-10-11 09:52:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [71/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3528 (0.3392) loss 3.8244 (3.9632) grad_norm 0.0000 (0.0000) [2022-10-11 09:52:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [71/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3456 (0.3375) loss 4.0232 (3.9638) grad_norm 0.0000 (0.0000) [2022-10-11 09:53:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [71/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3161 (0.3364) loss 4.0878 (3.9663) grad_norm 0.0000 (0.0000) [2022-10-11 09:53:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [71/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3356 (0.3355) loss 3.8708 (3.9691) grad_norm 0.0000 (0.0000) [2022-10-11 09:54:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [71/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3444 (0.3351) loss 3.7876 (3.9696) grad_norm 0.0000 (0.0000) [2022-10-11 09:54:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [71/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3533 (0.3348) loss 4.0094 (3.9688) grad_norm 0.0000 (0.0000) [2022-10-11 09:55:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [71/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3085 (0.3341) loss 3.8692 (3.9709) grad_norm 0.0000 (0.0000) [2022-10-11 09:56:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [71/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3270 (0.3338) loss 4.1936 (3.9715) grad_norm 0.0000 (0.0000) [2022-10-11 09:56:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [71/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3368 (0.3338) loss 4.1133 (3.9736) grad_norm 0.0000 (0.0000) [2022-10-11 09:56:50 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 71 training takes 0:06:57 [2022-10-11 09:56:53 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.348 (3.348) Loss 1.1846 
(1.1846) Acc@1 70.898 (70.898) Acc@5 90.625 (90.625) [2022-10-11 09:57:05 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 71.846 Acc@5 91.046 [2022-10-11 09:57:05 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 71.8% [2022-10-11 09:57:05 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 72.30% [2022-10-11 09:57:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [72/300][0/1251] eta 1:09:01 lr 0.000001 time 3.3107 (3.3107) loss 3.9525 (3.9525) grad_norm 0.0000 (0.0000) [2022-10-11 09:57:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [72/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3209 (0.3663) loss 3.8183 (3.9503) grad_norm 0.0000 (0.0000) [2022-10-11 09:58:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [72/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3048 (0.3476) loss 4.1705 (3.9597) grad_norm 0.0000 (0.0000) [2022-10-11 09:58:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [72/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3381 (0.3419) loss 4.0625 (3.9639) grad_norm 0.0000 (0.0000) [2022-10-11 09:59:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [72/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3285 (0.3386) loss 3.8320 (3.9669) grad_norm 0.0000 (0.0000) [2022-10-11 09:59:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [72/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3241 (0.3372) loss 4.0224 (3.9693) grad_norm 0.0000 (0.0000) [2022-10-11 10:00:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [72/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3241 (0.3362) loss 4.2459 (3.9672) grad_norm 0.0000 (0.0000) [2022-10-11 10:01:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [72/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3291 (0.3356) loss 4.1259 (3.9698) grad_norm 0.0000 (0.0000) [2022-10-11 10:01:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [72/300][800/1251] eta 0:02:31 lr 
0.000001 time 0.3362 (0.3351) loss 4.2018 (3.9719) grad_norm 0.0000 (0.0000) [2022-10-11 10:02:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [72/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3450 (0.3346) loss 3.6699 (3.9722) grad_norm 0.0000 (0.0000) [2022-10-11 10:02:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [72/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3356 (0.3343) loss 3.7977 (3.9716) grad_norm 0.0000 (0.0000) [2022-10-11 10:03:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [72/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3137 (0.3340) loss 4.2086 (3.9722) grad_norm 0.0000 (0.0000) [2022-10-11 10:03:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [72/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3021 (0.3337) loss 3.9196 (3.9739) grad_norm 0.0000 (0.0000) [2022-10-11 10:04:02 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 72 training takes 0:06:57 [2022-10-11 10:04:05 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.081 (3.081) Loss 1.2333 (1.2333) Acc@1 71.582 (71.582) Acc@5 90.430 (90.430) [2022-10-11 10:04:17 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 72.372 Acc@5 91.316 [2022-10-11 10:04:17 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 72.4% [2022-10-11 10:04:17 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 72.37% [2022-10-11 10:04:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [73/300][0/1251] eta 1:09:29 lr 0.000001 time 3.3332 (3.3332) loss 3.7152 (3.7152) grad_norm 0.0000 (0.0000) [2022-10-11 10:04:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [73/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3189 (0.3681) loss 3.8745 (3.9440) grad_norm 0.0000 (0.0000) [2022-10-11 10:05:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [73/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3115 (0.3505) loss 3.6145 (3.9495) grad_norm 0.0000 (0.0000) 
[2022-10-11 10:06:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [73/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3157 (0.3443) loss 4.3450 (3.9500) grad_norm 0.0000 (0.0000) [2022-10-11 10:06:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [73/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3364 (0.3411) loss 3.7595 (3.9500) grad_norm 0.0000 (0.0000) [2022-10-11 10:07:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [73/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3185 (0.3391) loss 4.0943 (3.9470) grad_norm 0.0000 (0.0000) [2022-10-11 10:07:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [73/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3445 (0.3382) loss 4.2997 (3.9520) grad_norm 0.0000 (0.0000) [2022-10-11 10:08:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [73/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3244 (0.3373) loss 4.0555 (3.9558) grad_norm 0.0000 (0.0000) [2022-10-11 10:08:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [73/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3319 (0.3368) loss 3.8764 (3.9551) grad_norm 0.0000 (0.0000) [2022-10-11 10:09:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [73/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3077 (0.3364) loss 4.0932 (3.9589) grad_norm 0.0000 (0.0000) [2022-10-11 10:09:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [73/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3485 (0.3360) loss 3.8819 (3.9594) grad_norm 0.0000 (0.0000) [2022-10-11 10:10:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [73/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3259 (0.3357) loss 3.8279 (3.9622) grad_norm 0.0000 (0.0000) [2022-10-11 10:11:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [73/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3473 (0.3354) loss 3.9365 (3.9643) grad_norm 0.0000 (0.0000) [2022-10-11 10:11:17 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 73 training takes 0:06:59 
[2022-10-11 10:11:20 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.460 (3.460) Loss 1.1748 (1.1748) Acc@1 71.777 (71.777) Acc@5 91.016 (91.016) [2022-10-11 10:11:32 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 72.486 Acc@5 91.298 [2022-10-11 10:11:32 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 72.5% [2022-10-11 10:11:32 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 72.49% [2022-10-11 10:11:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [74/300][0/1251] eta 1:10:42 lr 0.000001 time 3.3912 (3.3912) loss 3.7123 (3.7123) grad_norm 0.0000 (0.0000) [2022-10-11 10:12:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [74/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3291 (0.3678) loss 4.1255 (3.9461) grad_norm 0.0000 (0.0000) [2022-10-11 10:12:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [74/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3281 (0.3496) loss 3.6947 (3.9578) grad_norm 0.0000 (0.0000) [2022-10-11 10:13:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [74/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3494 (0.3443) loss 3.9765 (3.9608) grad_norm 0.0000 (0.0000) [2022-10-11 10:13:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [74/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3152 (0.3405) loss 4.0786 (3.9665) grad_norm 0.0000 (0.0000) [2022-10-11 10:14:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [74/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3367 (0.3385) loss 4.2867 (3.9656) grad_norm 0.0000 (0.0000) [2022-10-11 10:14:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [74/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3360 (0.3374) loss 4.1137 (3.9685) grad_norm 0.0000 (0.0000) [2022-10-11 10:15:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [74/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3388 (0.3365) loss 4.2084 (3.9702) grad_norm 0.0000 (0.0000) 
[2022-10-11 10:16:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [74/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3177 (0.3361) loss 4.1098 (3.9682) grad_norm 0.0000 (0.0000) [2022-10-11 10:16:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [74/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3351 (0.3357) loss 4.1680 (3.9690) grad_norm 0.0000 (0.0000) [2022-10-11 10:17:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [74/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3404 (0.3351) loss 4.0829 (3.9693) grad_norm 0.0000 (0.0000) [2022-10-11 10:17:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [74/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3363 (0.3348) loss 3.8593 (3.9679) grad_norm 0.0000 (0.0000) [2022-10-11 10:18:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [74/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3418 (0.3347) loss 3.7918 (3.9691) grad_norm 0.0000 (0.0000) [2022-10-11 10:18:30 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 74 training takes 0:06:58 [2022-10-11 10:18:34 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.133 (3.133) Loss 1.1988 (1.1988) Acc@1 71.387 (71.387) Acc@5 90.820 (90.820) [2022-10-11 10:18:45 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 72.406 Acc@5 91.248 [2022-10-11 10:18:45 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 72.4% [2022-10-11 10:18:45 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 72.49% [2022-10-11 10:18:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [75/300][0/1251] eta 1:16:25 lr 0.000001 time 3.6653 (3.6653) loss 3.9022 (3.9022) grad_norm 0.0000 (0.0000) [2022-10-11 10:19:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [75/300][100/1251] eta 0:06:59 lr 0.000001 time 0.3138 (0.3647) loss 4.2452 (3.9469) grad_norm 0.0000 (0.0000) [2022-10-11 10:19:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[75/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3269 (0.3477) loss 4.0993 (3.9517) grad_norm 0.0000 (0.0000) [2022-10-11 10:20:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [75/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3422 (0.3420) loss 3.7312 (3.9486) grad_norm 0.0000 (0.0000) [2022-10-11 10:21:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [75/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3093 (0.3387) loss 4.0722 (3.9584) grad_norm 0.0000 (0.0000) [2022-10-11 10:21:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [75/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3483 (0.3374) loss 3.8725 (3.9566) grad_norm 0.0000 (0.0000) [2022-10-11 10:22:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [75/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3242 (0.3364) loss 3.8678 (3.9567) grad_norm 0.0000 (0.0000) [2022-10-11 10:22:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [75/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3148 (0.3353) loss 3.9515 (3.9612) grad_norm 0.0000 (0.0000) [2022-10-11 10:23:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [75/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3245 (0.3347) loss 3.4319 (3.9631) grad_norm 0.0000 (0.0000) [2022-10-11 10:23:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [75/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3248 (0.3341) loss 3.9996 (3.9643) grad_norm 0.0000 (0.0000) [2022-10-11 10:24:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [75/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3107 (0.3336) loss 3.7701 (3.9607) grad_norm 0.0000 (0.0000) [2022-10-11 10:24:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [75/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3384 (0.3334) loss 4.1694 (3.9617) grad_norm 0.0000 (0.0000) [2022-10-11 10:25:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [75/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3542 (0.3332) loss 3.8543 (3.9612) grad_norm 0.0000 
(0.0000) [2022-10-11 10:25:42 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 75 training takes 0:06:56 [2022-10-11 10:25:45 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.181 (3.181) Loss 1.2698 (1.2698) Acc@1 70.215 (70.215) Acc@5 89.648 (89.648) [2022-10-11 10:25:57 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 72.238 Acc@5 91.134 [2022-10-11 10:25:57 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 72.2% [2022-10-11 10:25:57 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 72.49% [2022-10-11 10:26:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [76/300][0/1251] eta 1:07:20 lr 0.000001 time 3.2300 (3.2300) loss 3.6987 (3.6987) grad_norm 0.0000 (0.0000) [2022-10-11 10:26:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [76/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3244 (0.3661) loss 4.0798 (3.9426) grad_norm 0.0000 (0.0000) [2022-10-11 10:27:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [76/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3464 (0.3482) loss 3.7509 (3.9425) grad_norm 0.0000 (0.0000) [2022-10-11 10:27:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [76/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3183 (0.3425) loss 4.1322 (3.9477) grad_norm 0.0000 (0.0000) [2022-10-11 10:28:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [76/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3421 (0.3400) loss 3.8414 (3.9474) grad_norm 0.0000 (0.0000) [2022-10-11 10:28:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [76/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3227 (0.3382) loss 3.9691 (3.9523) grad_norm 0.0000 (0.0000) [2022-10-11 10:29:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [76/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3136 (0.3369) loss 3.9072 (3.9529) grad_norm 0.0000 (0.0000) [2022-10-11 10:29:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[76/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3053 (0.3362) loss 4.3727 (3.9544) grad_norm 0.0000 (0.0000) [2022-10-11 10:30:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [76/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3464 (0.3357) loss 4.0203 (3.9524) grad_norm 0.0000 (0.0000) [2022-10-11 10:30:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [76/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3204 (0.3354) loss 4.2219 (3.9528) grad_norm 0.0000 (0.0000) [2022-10-11 10:31:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [76/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3172 (0.3354) loss 3.9745 (3.9543) grad_norm 0.0000 (0.0000) [2022-10-11 10:32:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [76/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3115 (0.3350) loss 4.0092 (3.9532) grad_norm 0.0000 (0.0000) [2022-10-11 10:32:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [76/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3189 (0.3347) loss 4.0622 (3.9527) grad_norm 0.0000 (0.0000) [2022-10-11 10:32:55 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 76 training takes 0:06:58 [2022-10-11 10:32:58 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 2.937 (2.937) Loss 1.0689 (1.0689) Acc@1 74.902 (74.902) Acc@5 90.820 (90.820) [2022-10-11 10:33:10 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 72.816 Acc@5 91.462 [2022-10-11 10:33:10 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 72.8% [2022-10-11 10:33:10 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 72.82% [2022-10-11 10:33:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [77/300][0/1251] eta 1:14:26 lr 0.000001 time 3.5707 (3.5707) loss 3.7093 (3.7093) grad_norm 0.0000 (0.0000) [2022-10-11 10:33:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [77/300][100/1251] eta 0:06:59 lr 0.000001 time 0.3328 (0.3646) loss 3.7396 (3.9746) 
grad_norm 0.0000 (0.0000) [2022-10-11 10:34:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [77/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3385 (0.3477) loss 3.7804 (3.9602) grad_norm 0.0000 (0.0000) [2022-10-11 10:34:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [77/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3128 (0.3416) loss 3.7820 (3.9619) grad_norm 0.0000 (0.0000) [2022-10-11 10:35:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [77/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3604 (0.3387) loss 4.1145 (3.9570) grad_norm 0.0000 (0.0000) [2022-10-11 10:35:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [77/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3456 (0.3370) loss 3.5014 (3.9585) grad_norm 0.0000 (0.0000) [2022-10-11 10:36:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [77/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3147 (0.3354) loss 3.8007 (3.9570) grad_norm 0.0000 (0.0000) [2022-10-11 10:37:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [77/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3182 (0.3346) loss 3.8793 (3.9579) grad_norm 0.0000 (0.0000) [2022-10-11 10:37:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [77/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3317 (0.3341) loss 3.5962 (3.9609) grad_norm 0.0000 (0.0000) [2022-10-11 10:38:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [77/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3159 (0.3337) loss 3.8960 (3.9595) grad_norm 0.0000 (0.0000) [2022-10-11 10:38:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [77/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3145 (0.3332) loss 3.7364 (3.9571) grad_norm 0.0000 (0.0000) [2022-10-11 10:39:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [77/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3491 (0.3333) loss 4.2226 (3.9587) grad_norm 0.0000 (0.0000) [2022-10-11 10:39:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[77/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3309 (0.3333) loss 3.9695 (3.9584) grad_norm 0.0000 (0.0000) [2022-10-11 10:40:06 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 77 training takes 0:06:56 [2022-10-11 10:40:10 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.386 (3.386) Loss 1.2222 (1.2222) Acc@1 71.387 (71.387) Acc@5 91.113 (91.113) [2022-10-11 10:40:22 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 72.594 Acc@5 91.292 [2022-10-11 10:40:22 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 72.6% [2022-10-11 10:40:22 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 72.82% [2022-10-11 10:40:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [78/300][0/1251] eta 1:14:39 lr 0.000001 time 3.5804 (3.5804) loss 3.9245 (3.9245) grad_norm 0.0000 (0.0000) [2022-10-11 10:40:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [78/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3203 (0.3669) loss 4.1113 (3.9280) grad_norm 0.0000 (0.0000) [2022-10-11 10:41:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [78/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3627 (0.3495) loss 3.8729 (3.9297) grad_norm 0.0000 (0.0000) [2022-10-11 10:42:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [78/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3117 (0.3434) loss 3.7364 (3.9289) grad_norm 0.0000 (0.0000) [2022-10-11 10:42:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [78/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3273 (0.3405) loss 4.2473 (3.9306) grad_norm 0.0000 (0.0000) [2022-10-11 10:43:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [78/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3255 (0.3386) loss 4.2630 (3.9309) grad_norm 0.0000 (0.0000) [2022-10-11 10:43:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [78/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3032 (0.3370) loss 3.6833 (3.9310) 
grad_norm 0.0000 (0.0000) [2022-10-11 10:44:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [78/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3575 (0.3361) loss 4.0562 (3.9348) grad_norm 0.0000 (0.0000) [2022-10-11 10:44:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [78/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3386 (0.3354) loss 3.4759 (3.9370) grad_norm 0.0000 (0.0000) [2022-10-11 10:45:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [78/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3289 (0.3348) loss 3.7740 (3.9376) grad_norm 0.0000 (0.0000) [2022-10-11 10:45:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [78/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3238 (0.3344) loss 3.7304 (3.9386) grad_norm 0.0000 (0.0000) [2022-10-11 10:46:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [78/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3603 (0.3342) loss 3.7902 (3.9412) grad_norm 0.0000 (0.0000) [2022-10-11 10:47:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [78/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3303 (0.3338) loss 4.1150 (3.9420) grad_norm 0.0000 (0.0000) [2022-10-11 10:47:19 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 78 training takes 0:06:57 [2022-10-11 10:47:22 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.358 (3.358) Loss 1.1130 (1.1130) Acc@1 74.219 (74.219) Acc@5 92.090 (92.090) [2022-10-11 10:47:34 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 72.726 Acc@5 91.488 [2022-10-11 10:47:34 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 72.7% [2022-10-11 10:47:34 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 72.82% [2022-10-11 10:47:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [79/300][0/1251] eta 1:15:20 lr 0.000001 time 3.6135 (3.6135) loss 3.9187 (3.9187) grad_norm 0.0000 (0.0000) [2022-10-11 10:48:11 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [79/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3252 (0.3656) loss 3.7142 (3.8913) grad_norm 0.0000 (0.0000) [2022-10-11 10:48:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [79/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3377 (0.3487) loss 3.9451 (3.9071) grad_norm 0.0000 (0.0000) [2022-10-11 10:49:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [79/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3266 (0.3428) loss 4.0312 (3.9081) grad_norm 0.0000 (0.0000) [2022-10-11 10:49:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [79/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3279 (0.3395) loss 3.7987 (3.9151) grad_norm 0.0000 (0.0000) [2022-10-11 10:50:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [79/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3086 (0.3373) loss 3.8429 (3.9242) grad_norm 0.0000 (0.0000) [2022-10-11 10:50:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [79/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3188 (0.3361) loss 3.8560 (3.9291) grad_norm 0.0000 (0.0000) [2022-10-11 10:51:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [79/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3384 (0.3351) loss 3.9991 (3.9340) grad_norm 0.0000 (0.0000) [2022-10-11 10:52:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [79/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3189 (0.3346) loss 4.1720 (3.9359) grad_norm 0.0000 (0.0000) [2022-10-11 10:52:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [79/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3264 (0.3342) loss 3.5727 (3.9400) grad_norm 0.0000 (0.0000) [2022-10-11 10:53:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [79/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3266 (0.3340) loss 4.0500 (3.9420) grad_norm 0.0000 (0.0000) [2022-10-11 10:53:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [79/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3309 (0.3338) loss 4.0567 (3.9432) 
grad_norm 0.0000 (0.0000) [2022-10-11 10:54:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [79/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3222 (0.3336) loss 3.9958 (3.9435) grad_norm 0.0000 (0.0000) [2022-10-11 10:54:31 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 79 training takes 0:06:57 [2022-10-11 10:54:34 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.316 (3.316) Loss 1.1002 (1.1002) Acc@1 74.219 (74.219) Acc@5 93.359 (93.359) [2022-10-11 10:54:46 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 72.832 Acc@5 91.504 [2022-10-11 10:54:46 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 72.8% [2022-10-11 10:54:46 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 72.83% [2022-10-11 10:54:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [80/300][0/1251] eta 1:16:45 lr 0.000001 time 3.6814 (3.6814) loss 3.9139 (3.9139) grad_norm 0.0000 (0.0000) [2022-10-11 10:55:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [80/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3108 (0.3651) loss 4.0780 (3.9123) grad_norm 0.0000 (0.0000) [2022-10-11 10:55:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [80/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3064 (0.3478) loss 4.1036 (3.9266) grad_norm 0.0000 (0.0000) [2022-10-11 10:56:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [80/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3374 (0.3423) loss 3.7679 (3.9230) grad_norm 0.0000 (0.0000) [2022-10-11 10:57:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [80/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3307 (0.3389) loss 4.0338 (3.9243) grad_norm 0.0000 (0.0000) [2022-10-11 10:57:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [80/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3180 (0.3369) loss 4.1283 (3.9255) grad_norm 0.0000 (0.0000) [2022-10-11 10:58:08 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [80/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3274 (0.3358) loss 3.9789 (3.9258) grad_norm 0.0000 (0.0000) [2022-10-11 10:58:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [80/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3055 (0.3350) loss 3.7414 (3.9243) grad_norm 0.0000 (0.0000) [2022-10-11 10:59:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [80/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3301 (0.3345) loss 3.8219 (3.9263) grad_norm 0.0000 (0.0000) [2022-10-11 10:59:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [80/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3237 (0.3342) loss 3.8905 (3.9230) grad_norm 0.0000 (0.0000) [2022-10-11 11:00:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [80/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3409 (0.3338) loss 3.9859 (3.9247) grad_norm 0.0000 (0.0000) [2022-10-11 11:00:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [80/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3354 (0.3334) loss 3.9703 (3.9242) grad_norm 0.0000 (0.0000) [2022-10-11 11:01:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [80/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3491 (0.3330) loss 4.1993 (3.9272) grad_norm 0.0000 (0.0000) [2022-10-11 11:01:43 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 80 training takes 0:06:56 [2022-10-11 11:01:43 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_80 saving...... [2022-10-11 11:01:43 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_80 saved !!! 
[2022-10-11 11:01:46 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.029 (3.029) Loss 1.0971 (1.0971) Acc@1 74.707 (74.707) Acc@5 92.285 (92.285) [2022-10-11 11:01:58 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 72.960 Acc@5 91.484 [2022-10-11 11:01:58 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 73.0% [2022-10-11 11:01:58 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 72.96% [2022-10-11 11:02:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [81/300][0/1251] eta 1:09:27 lr 0.000001 time 3.3310 (3.3310) loss 4.0352 (4.0352) grad_norm 0.0000 (0.0000) [2022-10-11 11:02:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [81/300][100/1251] eta 0:06:59 lr 0.000001 time 0.3378 (0.3645) loss 4.0026 (3.9166) grad_norm 0.0000 (0.0000) [2022-10-11 11:03:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [81/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3136 (0.3479) loss 3.7258 (3.9189) grad_norm 0.0000 (0.0000) [2022-10-11 11:03:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [81/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3452 (0.3423) loss 3.9715 (3.9165) grad_norm 0.0000 (0.0000) [2022-10-11 11:04:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [81/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3192 (0.3398) loss 4.1297 (3.9220) grad_norm 0.0000 (0.0000) [2022-10-11 11:04:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [81/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3011 (0.3380) loss 4.0318 (3.9265) grad_norm 0.0000 (0.0000) [2022-10-11 11:05:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [81/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3591 (0.3368) loss 3.9914 (3.9268) grad_norm 0.0000 (0.0000) [2022-10-11 11:05:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [81/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3355 (0.3358) loss 3.8714 (3.9284) grad_norm 0.0000 (0.0000) 
[2022-10-11 11:06:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [81/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3245 (0.3354) loss 3.9863 (3.9291) grad_norm 0.0000 (0.0000) [2022-10-11 11:06:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [81/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3643 (0.3348) loss 4.1647 (3.9290) grad_norm 0.0000 (0.0000) [2022-10-11 11:07:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [81/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3618 (0.3346) loss 3.7193 (3.9289) grad_norm 0.0000 (0.0000) [2022-10-11 11:08:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [81/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3359 (0.3345) loss 4.1292 (3.9284) grad_norm 0.0000 (0.0000) [2022-10-11 11:08:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [81/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3401 (0.3346) loss 3.8758 (3.9298) grad_norm 0.0000 (0.0000) [2022-10-11 11:08:56 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 81 training takes 0:06:58 [2022-10-11 11:08:59 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.029 (3.029) Loss 1.1112 (1.1112) Acc@1 73.633 (73.633) Acc@5 91.992 (91.992) [2022-10-11 11:09:11 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 72.750 Acc@5 91.418 [2022-10-11 11:09:11 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 72.8% [2022-10-11 11:09:11 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 72.96% [2022-10-11 11:09:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [82/300][0/1251] eta 1:09:47 lr 0.000001 time 3.3474 (3.3474) loss 3.6895 (3.6895) grad_norm 0.0000 (0.0000) [2022-10-11 11:09:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [82/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3118 (0.3649) loss 3.9570 (3.9064) grad_norm 0.0000 (0.0000) [2022-10-11 11:10:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[82/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3442 (0.3489) loss 3.6164 (3.9136) grad_norm 0.0000 (0.0000) [2022-10-11 11:10:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [82/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3507 (0.3440) loss 4.0441 (3.9171) grad_norm 0.0000 (0.0000) [2022-10-11 11:11:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [82/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3110 (0.3406) loss 3.6880 (3.9237) grad_norm 0.0000 (0.0000) [2022-10-11 11:12:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [82/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3452 (0.3387) loss 3.7767 (3.9264) grad_norm 0.0000 (0.0000) [2022-10-11 11:12:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [82/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3458 (0.3376) loss 3.8758 (3.9234) grad_norm 0.0000 (0.0000) [2022-10-11 11:13:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [82/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3406 (0.3365) loss 4.0644 (3.9249) grad_norm 0.0000 (0.0000) [2022-10-11 11:13:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [82/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3364 (0.3358) loss 3.5943 (3.9272) grad_norm 0.0000 (0.0000) [2022-10-11 11:14:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [82/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3358 (0.3354) loss 4.1029 (3.9286) grad_norm 0.0000 (0.0000) [2022-10-11 11:14:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [82/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3085 (0.3351) loss 3.6424 (3.9324) grad_norm 0.0000 (0.0000) [2022-10-11 11:15:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [82/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3293 (0.3349) loss 3.7937 (3.9328) grad_norm 0.0000 (0.0000) [2022-10-11 11:15:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [82/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3396 (0.3347) loss 3.9904 (3.9339) grad_norm 0.0000 
(0.0000) [2022-10-11 11:16:09 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 82 training takes 0:06:58 [2022-10-11 11:16:13 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.363 (3.363) Loss 1.1982 (1.1982) Acc@1 73.047 (73.047) Acc@5 91.309 (91.309) [2022-10-11 11:16:24 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 72.894 Acc@5 91.396 [2022-10-11 11:16:24 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 72.9% [2022-10-11 11:16:24 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 72.96% [2022-10-11 11:16:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [83/300][0/1251] eta 1:11:07 lr 0.000001 time 3.4111 (3.4111) loss 3.7996 (3.7996) grad_norm 0.0000 (0.0000) [2022-10-11 11:17:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [83/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3511 (0.3653) loss 3.8454 (3.9153) grad_norm 0.0000 (0.0000) [2022-10-11 11:17:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [83/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3193 (0.3484) loss 4.0880 (3.9172) grad_norm 0.0000 (0.0000) [2022-10-11 11:18:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [83/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3389 (0.3425) loss 4.0431 (3.9190) grad_norm 0.0000 (0.0000) [2022-10-11 11:18:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [83/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3620 (0.3397) loss 3.9068 (3.9176) grad_norm 0.0000 (0.0000) [2022-10-11 11:19:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [83/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3031 (0.3381) loss 3.8901 (3.9162) grad_norm 0.0000 (0.0000) [2022-10-11 11:19:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [83/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3079 (0.3366) loss 3.6200 (3.9106) grad_norm 0.0000 (0.0000) [2022-10-11 11:20:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[83/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3346 (0.3357) loss 3.8422 (3.9169) grad_norm 0.0000 (0.0000) [2022-10-11 11:20:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [83/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3111 (0.3352) loss 3.8131 (3.9168) grad_norm 0.0000 (0.0000) [2022-10-11 11:21:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [83/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3112 (0.3347) loss 3.9562 (3.9200) grad_norm 0.0000 (0.0000) [2022-10-11 11:21:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [83/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3522 (0.3343) loss 3.7323 (3.9201) grad_norm 0.0000 (0.0000) [2022-10-11 11:22:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [83/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3371 (0.3342) loss 4.0734 (3.9214) grad_norm 0.0000 (0.0000) [2022-10-11 11:23:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [83/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3221 (0.3338) loss 3.5385 (3.9234) grad_norm 0.0000 (0.0000) [2022-10-11 11:23:22 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 83 training takes 0:06:57 [2022-10-11 11:23:25 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.282 (3.282) Loss 1.1835 (1.1835) Acc@1 72.266 (72.266) Acc@5 92.188 (92.188) [2022-10-11 11:23:37 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 72.924 Acc@5 91.596 [2022-10-11 11:23:37 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 72.9% [2022-10-11 11:23:37 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 72.96% [2022-10-11 11:23:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [84/300][0/1251] eta 1:09:27 lr 0.000001 time 3.3310 (3.3310) loss 4.0548 (4.0548) grad_norm 0.0000 (0.0000) [2022-10-11 11:24:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [84/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3307 (0.3657) loss 3.7181 (3.9120) 
grad_norm 0.0000 (0.0000) [2022-10-11 11:24:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [84/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3154 (0.3493) loss 4.1755 (3.9090) grad_norm 0.0000 (0.0000) [2022-10-11 11:25:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [84/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3096 (0.3432) loss 3.7170 (3.9127) grad_norm 0.0000 (0.0000) [2022-10-11 11:25:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [84/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3261 (0.3404) loss 4.0774 (3.9162) grad_norm 0.0000 (0.0000) [2022-10-11 11:26:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [84/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3320 (0.3388) loss 3.9802 (3.9218) grad_norm 0.0000 (0.0000) [2022-10-11 11:27:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [84/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3162 (0.3376) loss 4.0135 (3.9222) grad_norm 0.0000 (0.0000) [2022-10-11 11:27:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [84/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3379 (0.3366) loss 3.6711 (3.9234) grad_norm 0.0000 (0.0000) [2022-10-11 11:28:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [84/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3361 (0.3357) loss 3.9701 (3.9213) grad_norm 0.0000 (0.0000) [2022-10-11 11:28:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [84/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3313 (0.3352) loss 4.3066 (3.9264) grad_norm 0.0000 (0.0000) [2022-10-11 11:29:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [84/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3211 (0.3347) loss 3.5525 (3.9261) grad_norm 0.0000 (0.0000) [2022-10-11 11:29:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [84/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3490 (0.3343) loss 3.6674 (3.9223) grad_norm 0.0000 (0.0000) [2022-10-11 11:30:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[84/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3281 (0.3341) loss 4.0634 (3.9220) grad_norm 0.0000 (0.0000) [2022-10-11 11:30:34 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 84 training takes 0:06:57 [2022-10-11 11:30:38 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.100 (3.100) Loss 1.1132 (1.1132) Acc@1 73.633 (73.633) Acc@5 91.797 (91.797) [2022-10-11 11:30:49 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 73.050 Acc@5 91.690 [2022-10-11 11:30:49 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 73.0% [2022-10-11 11:30:49 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 73.05% [2022-10-11 11:30:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [85/300][0/1251] eta 1:17:23 lr 0.000001 time 3.7121 (3.7121) loss 3.7705 (3.7705) grad_norm 0.0000 (0.0000) [2022-10-11 11:31:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [85/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3236 (0.3666) loss 3.8576 (3.9002) grad_norm 0.0000 (0.0000) [2022-10-11 11:32:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [85/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3722 (0.3496) loss 3.8909 (3.9049) grad_norm 0.0000 (0.0000) [2022-10-11 11:32:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [85/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3192 (0.3436) loss 4.0661 (3.9014) grad_norm 0.0000 (0.0000) [2022-10-11 11:33:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [85/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3191 (0.3404) loss 4.0656 (3.9009) grad_norm 0.0000 (0.0000) [2022-10-11 11:33:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [85/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3197 (0.3382) loss 3.6128 (3.9050) grad_norm 0.0000 (0.0000) [2022-10-11 11:34:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [85/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3442 (0.3370) loss 3.9415 (3.9069) 
grad_norm 0.0000 (0.0000) [2022-10-11 11:34:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [85/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3264 (0.3361) loss 4.2252 (3.9085) grad_norm 0.0000 (0.0000) [2022-10-11 11:35:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [85/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3320 (0.3351) loss 3.9432 (3.9097) grad_norm 0.0000 (0.0000) [2022-10-11 11:35:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [85/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3343 (0.3345) loss 3.7068 (3.9140) grad_norm 0.0000 (0.0000) [2022-10-11 11:36:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [85/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3122 (0.3340) loss 4.0254 (3.9114) grad_norm 0.0000 (0.0000) [2022-10-11 11:36:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [85/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3218 (0.3336) loss 3.7492 (3.9135) grad_norm 0.0000 (0.0000) [2022-10-11 11:37:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [85/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3115 (0.3334) loss 4.0473 (3.9146) grad_norm 0.0000 (0.0000) [2022-10-11 11:37:46 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 85 training takes 0:06:56 [2022-10-11 11:37:50 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.164 (3.164) Loss 1.1976 (1.1976) Acc@1 72.070 (72.070) Acc@5 90.332 (90.332) [2022-10-11 11:38:01 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 73.064 Acc@5 91.550 [2022-10-11 11:38:01 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 73.1% [2022-10-11 11:38:01 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 73.06% [2022-10-11 11:38:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [86/300][0/1251] eta 1:09:51 lr 0.000001 time 3.3506 (3.3506) loss 4.0280 (4.0280) grad_norm 0.0000 (0.0000) [2022-10-11 11:38:38 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [86/300][100/1251] eta 0:06:59 lr 0.000001 time 0.3150 (0.3645) loss 3.9642 (3.8996) grad_norm 0.0000 (0.0000) [2022-10-11 11:39:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [86/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3288 (0.3482) loss 3.7172 (3.9039) grad_norm 0.0000 (0.0000) [2022-10-11 11:39:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [86/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3315 (0.3420) loss 4.0174 (3.9148) grad_norm 0.0000 (0.0000) [2022-10-11 11:40:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [86/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3467 (0.3393) loss 3.9033 (3.9088) grad_norm 0.0000 (0.0000) [2022-10-11 11:40:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [86/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3489 (0.3376) loss 3.6041 (3.9079) grad_norm 0.0000 (0.0000) [2022-10-11 11:41:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [86/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3165 (0.3363) loss 4.1991 (3.9112) grad_norm 0.0000 (0.0000) [2022-10-11 11:41:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [86/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3158 (0.3357) loss 3.5620 (3.9112) grad_norm 0.0000 (0.0000) [2022-10-11 11:42:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [86/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3315 (0.3351) loss 3.7186 (3.9135) grad_norm 0.0000 (0.0000) [2022-10-11 11:43:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [86/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3285 (0.3348) loss 3.9687 (3.9125) grad_norm 0.0000 (0.0000) [2022-10-11 11:43:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [86/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3280 (0.3345) loss 3.8675 (3.9116) grad_norm 0.0000 (0.0000) [2022-10-11 11:44:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [86/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3144 (0.3344) loss 3.9396 (3.9108) 
grad_norm 0.0000 (0.0000) [2022-10-11 11:44:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [86/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3304 (0.3342) loss 4.2243 (3.9112) grad_norm 0.0000 (0.0000) [2022-10-11 11:44:59 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 86 training takes 0:06:57 [2022-10-11 11:45:02 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.176 (3.176) Loss 1.0577 (1.0577) Acc@1 76.465 (76.465) Acc@5 92.480 (92.480) [2022-10-11 11:45:14 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 73.084 Acc@5 91.652 [2022-10-11 11:45:14 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 73.1% [2022-10-11 11:45:14 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 73.08% [2022-10-11 11:45:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [87/300][0/1251] eta 1:10:05 lr 0.000001 time 3.3621 (3.3621) loss 3.9065 (3.9065) grad_norm 0.0000 (0.0000) [2022-10-11 11:45:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [87/300][100/1251] eta 0:06:58 lr 0.000001 time 0.3217 (0.3636) loss 4.0435 (3.8827) grad_norm 0.0000 (0.0000) [2022-10-11 11:46:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [87/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3331 (0.3473) loss 3.9584 (3.9024) grad_norm 0.0000 (0.0000) [2022-10-11 11:46:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [87/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3211 (0.3410) loss 3.8134 (3.9033) grad_norm 0.0000 (0.0000) [2022-10-11 11:47:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [87/300][400/1251] eta 0:04:47 lr 0.000001 time 0.3230 (0.3377) loss 3.4469 (3.9002) grad_norm 0.0000 (0.0000) [2022-10-11 11:48:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [87/300][500/1251] eta 0:04:12 lr 0.000001 time 0.3185 (0.3362) loss 4.0855 (3.9048) grad_norm 0.0000 (0.0000) [2022-10-11 11:48:35 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [87/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3432 (0.3357) loss 3.7015 (3.9053) grad_norm 0.0000 (0.0000) [2022-10-11 11:49:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [87/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3283 (0.3347) loss 3.7746 (3.9066) grad_norm 0.0000 (0.0000) [2022-10-11 11:49:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [87/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3286 (0.3341) loss 4.0700 (3.9067) grad_norm 0.0000 (0.0000) [2022-10-11 11:50:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [87/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3723 (0.3337) loss 3.7713 (3.9044) grad_norm 0.0000 (0.0000) [2022-10-11 11:50:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [87/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3310 (0.3334) loss 3.7058 (3.9060) grad_norm 0.0000 (0.0000) [2022-10-11 11:51:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [87/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3223 (0.3332) loss 3.7785 (3.9075) grad_norm 0.0000 (0.0000) [2022-10-11 11:51:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [87/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3171 (0.3330) loss 3.9653 (3.9100) grad_norm 0.0000 (0.0000) [2022-10-11 11:52:10 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 87 training takes 0:06:56 [2022-10-11 11:52:13 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.372 (3.372) Loss 1.1549 (1.1549) Acc@1 73.535 (73.535) Acc@5 90.918 (90.918) [2022-10-11 11:52:25 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 73.078 Acc@5 91.542 [2022-10-11 11:52:25 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 73.1% [2022-10-11 11:52:25 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 73.08% [2022-10-11 11:52:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [88/300][0/1251] eta 1:02:31 lr 0.000001 time 2.9989 (2.9989) loss 
3.8518 (3.8518) grad_norm 0.0000 (0.0000) [2022-10-11 11:53:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [88/300][100/1251] eta 0:06:56 lr 0.000001 time 0.3350 (0.3618) loss 3.9688 (3.8807) grad_norm 0.0000 (0.0000) [2022-10-11 11:53:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [88/300][200/1251] eta 0:06:03 lr 0.000001 time 0.3543 (0.3456) loss 4.0159 (3.8829) grad_norm 0.0000 (0.0000) [2022-10-11 11:54:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [88/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3474 (0.3408) loss 3.9995 (3.8848) grad_norm 0.0000 (0.0000) [2022-10-11 11:54:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [88/300][400/1251] eta 0:04:47 lr 0.000001 time 0.3277 (0.3384) loss 3.7405 (3.8865) grad_norm 0.0000 (0.0000) [2022-10-11 11:55:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [88/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3112 (0.3370) loss 3.9277 (3.8892) grad_norm 0.0000 (0.0000) [2022-10-11 11:55:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [88/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3752 (0.3358) loss 4.0247 (3.8922) grad_norm 0.0000 (0.0000) [2022-10-11 11:56:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [88/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3343 (0.3350) loss 4.0988 (3.8939) grad_norm 0.0000 (0.0000) [2022-10-11 11:56:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [88/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3222 (0.3344) loss 3.7572 (3.8950) grad_norm 0.0000 (0.0000) [2022-10-11 11:57:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [88/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3194 (0.3340) loss 3.7208 (3.8986) grad_norm 0.0000 (0.0000) [2022-10-11 11:57:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [88/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3368 (0.3338) loss 3.9570 (3.9011) grad_norm 0.0000 (0.0000) [2022-10-11 11:58:32 swin_tiny_patch4_window7_224] (main.py 163): 
INFO Train: [88/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3177 (0.3335) loss 3.8004 (3.9025) grad_norm 0.0000 (0.0000) [2022-10-11 11:59:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [88/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3061 (0.3330) loss 3.7349 (3.9036) grad_norm 0.0000 (0.0000) [2022-10-11 11:59:21 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 88 training takes 0:06:56 [2022-10-11 11:59:24 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.261 (3.261) Loss 1.1348 (1.1348) Acc@1 72.656 (72.656) Acc@5 92.383 (92.383) [2022-10-11 11:59:36 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 72.976 Acc@5 91.754 [2022-10-11 11:59:36 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 73.0% [2022-10-11 11:59:36 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 73.08% [2022-10-11 11:59:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [89/300][0/1251] eta 1:11:37 lr 0.000001 time 3.4351 (3.4351) loss 4.0151 (4.0151) grad_norm 0.0000 (0.0000) [2022-10-11 12:00:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [89/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3332 (0.3682) loss 3.8147 (3.8737) grad_norm 0.0000 (0.0000) [2022-10-11 12:00:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [89/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3183 (0.3493) loss 3.7557 (3.8899) grad_norm 0.0000 (0.0000) [2022-10-11 12:01:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [89/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3668 (0.3444) loss 3.8818 (3.8939) grad_norm 0.0000 (0.0000) [2022-10-11 12:01:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [89/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3272 (0.3410) loss 3.9182 (3.9020) grad_norm 0.0000 (0.0000) [2022-10-11 12:02:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [89/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3232 (0.3386) loss 4.2026 
(3.9011) grad_norm 0.0000 (0.0000) [2022-10-11 12:02:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [89/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3330 (0.3373) loss 4.0832 (3.9018) grad_norm 0.0000 (0.0000) [2022-10-11 12:03:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [89/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3099 (0.3363) loss 3.9301 (3.9013) grad_norm 0.0000 (0.0000) [2022-10-11 12:04:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [89/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3615 (0.3358) loss 3.6808 (3.9008) grad_norm 0.0000 (0.0000) [2022-10-11 12:04:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [89/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3218 (0.3354) loss 4.3087 (3.9027) grad_norm 0.0000 (0.0000) [2022-10-11 12:05:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [89/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3517 (0.3349) loss 3.7308 (3.9022) grad_norm 0.0000 (0.0000) [2022-10-11 12:05:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [89/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3198 (0.3346) loss 3.9645 (3.9030) grad_norm 0.0000 (0.0000) [2022-10-11 12:06:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [89/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3331 (0.3344) loss 3.8943 (3.9009) grad_norm 0.0000 (0.0000) [2022-10-11 12:06:34 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 89 training takes 0:06:58 [2022-10-11 12:06:38 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.344 (3.344) Loss 1.0826 (1.0826) Acc@1 75.000 (75.000) Acc@5 92.871 (92.871) [2022-10-11 12:06:49 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 73.126 Acc@5 91.630 [2022-10-11 12:06:49 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 73.1% [2022-10-11 12:06:49 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 73.13% [2022-10-11 12:06:53 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [90/300][0/1251] eta 1:08:38 lr 0.000001 time 3.2918 (3.2918) loss 3.9416 (3.9416) grad_norm 0.0000 (0.0000) [2022-10-11 12:07:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [90/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3446 (0.3659) loss 3.9989 (3.8696) grad_norm 0.0000 (0.0000) [2022-10-11 12:08:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [90/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3311 (0.3497) loss 3.9197 (3.8899) grad_norm 0.0000 (0.0000) [2022-10-11 12:08:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [90/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3463 (0.3437) loss 3.6859 (3.8883) grad_norm 0.0000 (0.0000) [2022-10-11 12:09:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [90/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3230 (0.3407) loss 3.9595 (3.8928) grad_norm 0.0000 (0.0000) [2022-10-11 12:09:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [90/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3483 (0.3391) loss 3.7796 (3.8990) grad_norm 0.0000 (0.0000) [2022-10-11 12:10:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [90/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3310 (0.3375) loss 3.7379 (3.9000) grad_norm 0.0000 (0.0000) [2022-10-11 12:10:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [90/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3241 (0.3365) loss 3.7483 (3.9028) grad_norm 0.0000 (0.0000) [2022-10-11 12:11:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [90/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3162 (0.3361) loss 3.7722 (3.9013) grad_norm 0.0000 (0.0000) [2022-10-11 12:11:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [90/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3390 (0.3356) loss 3.8032 (3.9038) grad_norm 0.0000 (0.0000) [2022-10-11 12:12:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [90/300][1000/1251] eta 0:01:24 lr 0.000001 time 
0.3698 (0.3354) loss 3.9404 (3.9046) grad_norm 0.0000 (0.0000) [2022-10-11 12:12:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [90/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3387 (0.3351) loss 4.1300 (3.9055) grad_norm 0.0000 (0.0000) [2022-10-11 12:13:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [90/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3258 (0.3347) loss 3.6861 (3.9058) grad_norm 0.0000 (0.0000) [2022-10-11 12:13:48 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 90 training takes 0:06:58 [2022-10-11 12:13:48 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_90 saving...... [2022-10-11 12:13:48 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_90 saved !!! [2022-10-11 12:13:51 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 2.979 (2.979) Loss 1.1646 (1.1646) Acc@1 71.973 (71.973) Acc@5 91.602 (91.602) [2022-10-11 12:14:02 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 73.374 Acc@5 91.910 [2022-10-11 12:14:02 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 73.4% [2022-10-11 12:14:02 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 73.37% [2022-10-11 12:14:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [91/300][0/1251] eta 1:07:45 lr 0.000001 time 3.2494 (3.2494) loss 3.9744 (3.9744) grad_norm 0.0000 (0.0000) [2022-10-11 12:14:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [91/300][100/1251] eta 0:06:59 lr 0.000001 time 0.3348 (0.3641) loss 3.8660 (3.8929) grad_norm 0.0000 (0.0000) [2022-10-11 12:15:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [91/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3431 (0.3476) loss 3.9649 (3.8961) grad_norm 0.0000 (0.0000) [2022-10-11 12:15:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [91/300][300/1251] eta 0:05:25 lr 
0.000001 time 0.3114 (0.3423) loss 4.0897 (3.8951) grad_norm 0.0000 (0.0000) [2022-10-11 12:16:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [91/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3332 (0.3393) loss 3.6864 (3.9004) grad_norm 0.0000 (0.0000) [2022-10-11 12:16:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [91/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3607 (0.3380) loss 3.5908 (3.8978) grad_norm 0.0000 (0.0000) [2022-10-11 12:17:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [91/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3571 (0.3367) loss 4.0764 (3.9040) grad_norm 0.0000 (0.0000) [2022-10-11 12:17:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [91/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3165 (0.3358) loss 3.9001 (3.9001) grad_norm 0.0000 (0.0000) [2022-10-11 12:18:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [91/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3295 (0.3352) loss 3.9715 (3.9003) grad_norm 0.0000 (0.0000) [2022-10-11 12:19:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [91/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3411 (0.3346) loss 4.0732 (3.9006) grad_norm 0.0000 (0.0000) [2022-10-11 12:19:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [91/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3264 (0.3341) loss 4.0011 (3.9030) grad_norm 0.0000 (0.0000) [2022-10-11 12:20:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [91/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3592 (0.3338) loss 3.9145 (3.9053) grad_norm 0.0000 (0.0000) [2022-10-11 12:20:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [91/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3209 (0.3334) loss 3.6295 (3.9042) grad_norm 0.0000 (0.0000) [2022-10-11 12:20:59 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 91 training takes 0:06:56 [2022-10-11 12:21:03 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.393 (3.393) Loss 1.1131 
(1.1131) Acc@1 74.609 (74.609) Acc@5 91.602 (91.602) [2022-10-11 12:21:14 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 73.534 Acc@5 91.906 [2022-10-11 12:21:14 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 73.5% [2022-10-11 12:21:14 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 73.53% [2022-10-11 12:21:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [92/300][0/1251] eta 1:10:54 lr 0.000001 time 3.4012 (3.4012) loss 3.8287 (3.8287) grad_norm 0.0000 (0.0000) [2022-10-11 12:21:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [92/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3281 (0.3659) loss 3.9772 (3.8759) grad_norm 0.0000 (0.0000) [2022-10-11 12:22:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [92/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3476 (0.3490) loss 3.9140 (3.8853) grad_norm 0.0000 (0.0000) [2022-10-11 12:22:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [92/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3349 (0.3434) loss 3.9623 (3.8895) grad_norm 0.0000 (0.0000) [2022-10-11 12:23:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [92/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3141 (0.3409) loss 3.7453 (3.8878) grad_norm 0.0000 (0.0000) [2022-10-11 12:24:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [92/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3471 (0.3388) loss 4.1440 (3.8929) grad_norm 0.0000 (0.0000) [2022-10-11 12:24:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [92/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3084 (0.3370) loss 4.0986 (3.8908) grad_norm 0.0000 (0.0000) [2022-10-11 12:25:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [92/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3373 (0.3360) loss 3.8933 (3.8889) grad_norm 0.0000 (0.0000) [2022-10-11 12:25:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [92/300][800/1251] eta 0:02:31 lr 
0.000001 time 0.3548 (0.3356) loss 4.1963 (3.8852) grad_norm 0.0000 (0.0000) [2022-10-11 12:26:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [92/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3136 (0.3351) loss 4.2419 (3.8888) grad_norm 0.0000 (0.0000) [2022-10-11 12:26:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [92/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3430 (0.3346) loss 4.1875 (3.8880) grad_norm 0.0000 (0.0000) [2022-10-11 12:27:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [92/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3501 (0.3344) loss 3.8533 (3.8864) grad_norm 0.0000 (0.0000) [2022-10-11 12:27:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [92/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3215 (0.3340) loss 3.6822 (3.8877) grad_norm 0.0000 (0.0000) [2022-10-11 12:28:12 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 92 training takes 0:06:57 [2022-10-11 12:28:15 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.127 (3.127) Loss 1.1262 (1.1262) Acc@1 73.242 (73.242) Acc@5 91.797 (91.797) [2022-10-11 12:28:27 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 73.674 Acc@5 91.962 [2022-10-11 12:28:27 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 73.7% [2022-10-11 12:28:27 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 73.67% [2022-10-11 12:28:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [93/300][0/1251] eta 1:12:08 lr 0.000001 time 3.4600 (3.4600) loss 3.9831 (3.9831) grad_norm 0.0000 (0.0000) [2022-10-11 12:29:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [93/300][100/1251] eta 0:06:58 lr 0.000001 time 0.3007 (0.3635) loss 4.1386 (3.9023) grad_norm 0.0000 (0.0000) [2022-10-11 12:29:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [93/300][200/1251] eta 0:06:04 lr 0.000001 time 0.3281 (0.3465) loss 3.9735 (3.8923) grad_norm 0.0000 (0.0000) 
[2022-10-11 12:30:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [93/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3170 (0.3412) loss 3.8970 (3.8882) grad_norm 0.0000 (0.0000) [2022-10-11 12:30:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [93/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3406 (0.3384) loss 4.1363 (3.8826) grad_norm 0.0000 (0.0000) [2022-10-11 12:31:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [93/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3057 (0.3369) loss 3.8532 (3.8848) grad_norm 0.0000 (0.0000) [2022-10-11 12:31:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [93/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3549 (0.3360) loss 3.9188 (3.8844) grad_norm 0.0000 (0.0000) [2022-10-11 12:32:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [93/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3251 (0.3351) loss 4.0218 (3.8849) grad_norm 0.0000 (0.0000) [2022-10-11 12:32:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [93/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3372 (0.3345) loss 3.6711 (3.8860) grad_norm 0.0000 (0.0000) [2022-10-11 12:33:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [93/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3543 (0.3339) loss 3.9655 (3.8879) grad_norm 0.0000 (0.0000) [2022-10-11 12:34:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [93/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3351 (0.3334) loss 4.0997 (3.8870) grad_norm 0.0000 (0.0000) [2022-10-11 12:34:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [93/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3184 (0.3331) loss 3.5764 (3.8868) grad_norm 0.0000 (0.0000) [2022-10-11 12:35:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [93/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3199 (0.3329) loss 3.8413 (3.8869) grad_norm 0.0000 (0.0000) [2022-10-11 12:35:23 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 93 training takes 0:06:56 
[2022-10-11 12:35:26 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 2.871 (2.871) Loss 1.0731 (1.0731) Acc@1 74.707 (74.707) Acc@5 92.773 (92.773) [2022-10-11 12:35:38 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 73.566 Acc@5 91.758 [2022-10-11 12:35:38 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 73.6% [2022-10-11 12:35:38 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 73.67% [2022-10-11 12:35:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [94/300][0/1251] eta 1:12:56 lr 0.000001 time 3.4987 (3.4987) loss 3.9644 (3.9644) grad_norm 0.0000 (0.0000) [2022-10-11 12:36:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [94/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3510 (0.3682) loss 3.9556 (3.8421) grad_norm 0.0000 (0.0000) [2022-10-11 12:36:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [94/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3142 (0.3505) loss 3.5405 (3.8539) grad_norm 0.0000 (0.0000) [2022-10-11 12:37:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [94/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3267 (0.3445) loss 3.9196 (3.8562) grad_norm 0.0000 (0.0000) [2022-10-11 12:37:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [94/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3297 (0.3412) loss 3.9602 (3.8705) grad_norm 0.0000 (0.0000) [2022-10-11 12:38:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [94/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3295 (0.3390) loss 4.1717 (3.8655) grad_norm 0.0000 (0.0000) [2022-10-11 12:39:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [94/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3352 (0.3374) loss 3.9758 (3.8713) grad_norm 0.0000 (0.0000) [2022-10-11 12:39:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [94/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3054 (0.3365) loss 3.9664 (3.8741) grad_norm 0.0000 (0.0000) 
[2022-10-11 12:40:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [94/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3420 (0.3359) loss 3.9011 (3.8746) grad_norm 0.0000 (0.0000) [2022-10-11 12:40:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [94/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3375 (0.3353) loss 3.7830 (3.8765) grad_norm 0.0000 (0.0000) [2022-10-11 12:41:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [94/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3308 (0.3348) loss 3.8774 (3.8763) grad_norm 0.0000 (0.0000) [2022-10-11 12:41:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [94/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3590 (0.3345) loss 3.7006 (3.8763) grad_norm 0.0000 (0.0000) [2022-10-11 12:42:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [94/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3473 (0.3340) loss 4.0775 (3.8755) grad_norm 0.0000 (0.0000) [2022-10-11 12:42:35 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 94 training takes 0:06:57 [2022-10-11 12:42:39 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.341 (3.341) Loss 1.0694 (1.0694) Acc@1 75.195 (75.195) Acc@5 92.773 (92.773) [2022-10-11 12:42:50 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 73.538 Acc@5 91.924 [2022-10-11 12:42:50 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 73.5% [2022-10-11 12:42:50 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 73.67% [2022-10-11 12:42:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [95/300][0/1251] eta 1:11:16 lr 0.000001 time 3.4183 (3.4183) loss 4.0443 (4.0443) grad_norm 0.0000 (0.0000) [2022-10-11 12:43:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [95/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3671 (0.3658) loss 3.7134 (3.8684) grad_norm 0.0000 (0.0000) [2022-10-11 12:44:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[95/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3399 (0.3490) loss 3.6258 (3.8788) grad_norm 0.0000 (0.0000) [2022-10-11 12:44:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [95/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3073 (0.3429) loss 3.8521 (3.8767) grad_norm 0.0000 (0.0000) [2022-10-11 12:45:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [95/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3075 (0.3402) loss 4.2619 (3.8840) grad_norm 0.0000 (0.0000) [2022-10-11 12:45:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [95/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3030 (0.3382) loss 3.8989 (3.8878) grad_norm 0.0000 (0.0000) [2022-10-11 12:46:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [95/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3236 (0.3371) loss 3.6221 (3.8871) grad_norm 0.0000 (0.0000) [2022-10-11 12:46:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [95/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3342 (0.3364) loss 3.8836 (3.8855) grad_norm 0.0000 (0.0000) [2022-10-11 12:47:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [95/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3323 (0.3352) loss 4.0048 (3.8844) grad_norm 0.0000 (0.0000) [2022-10-11 12:47:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [95/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3314 (0.3346) loss 3.6864 (3.8817) grad_norm 0.0000 (0.0000) [2022-10-11 12:48:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [95/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3139 (0.3340) loss 3.9018 (3.8815) grad_norm 0.0000 (0.0000) [2022-10-11 12:48:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [95/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3304 (0.3337) loss 4.0534 (3.8814) grad_norm 0.0000 (0.0000) [2022-10-11 12:49:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [95/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3353 (0.3334) loss 3.6729 (3.8805) grad_norm 0.0000 
(0.0000) [2022-10-11 12:49:47 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 95 training takes 0:06:56 [2022-10-11 12:49:51 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.942 (3.942) Loss 1.1604 (1.1604) Acc@1 73.145 (73.145) Acc@5 90.625 (90.625) [2022-10-11 12:50:02 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 74.004 Acc@5 92.080 [2022-10-11 12:50:02 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 74.0% [2022-10-11 12:50:02 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 74.00% [2022-10-11 12:50:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [96/300][0/1251] eta 1:17:31 lr 0.000001 time 3.7181 (3.7181) loss 4.2176 (4.2176) grad_norm 0.0000 (0.0000) [2022-10-11 12:50:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [96/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3309 (0.3680) loss 4.2116 (3.8525) grad_norm 0.0000 (0.0000) [2022-10-11 12:51:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [96/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3017 (0.3496) loss 3.7981 (3.8778) grad_norm 0.0000 (0.0000) [2022-10-11 12:51:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [96/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3127 (0.3439) loss 3.5044 (3.8684) grad_norm 0.0000 (0.0000) [2022-10-11 12:52:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [96/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3358 (0.3409) loss 3.7275 (3.8738) grad_norm 0.0000 (0.0000) [2022-10-11 12:52:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [96/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3461 (0.3389) loss 3.5872 (3.8674) grad_norm 0.0000 (0.0000) [2022-10-11 12:53:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [96/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3667 (0.3380) loss 4.1270 (3.8702) grad_norm 0.0000 (0.0000) [2022-10-11 12:53:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[96/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3518 (0.3371) loss 3.7043 (3.8738) grad_norm 0.0000 (0.0000) [2022-10-11 12:54:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [96/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3110 (0.3363) loss 3.9915 (3.8763) grad_norm 0.0000 (0.0000) [2022-10-11 12:55:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [96/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3200 (0.3356) loss 3.8758 (3.8760) grad_norm 0.0000 (0.0000) [2022-10-11 12:55:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [96/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3316 (0.3351) loss 3.8753 (3.8764) grad_norm 0.0000 (0.0000) [2022-10-11 12:56:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [96/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3386 (0.3347) loss 3.8528 (3.8773) grad_norm 0.0000 (0.0000) [2022-10-11 12:56:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [96/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3273 (0.3343) loss 3.6811 (3.8776) grad_norm 0.0000 (0.0000) [2022-10-11 12:57:00 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 96 training takes 0:06:57 [2022-10-11 12:57:03 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.296 (3.296) Loss 1.1577 (1.1577) Acc@1 73.438 (73.438) Acc@5 90.430 (90.430) [2022-10-11 12:57:15 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 73.680 Acc@5 91.974 [2022-10-11 12:57:15 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 73.7% [2022-10-11 12:57:15 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 74.00% [2022-10-11 12:57:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [97/300][0/1251] eta 1:13:38 lr 0.000001 time 3.5319 (3.5319) loss 3.6595 (3.6595) grad_norm 0.0000 (0.0000) [2022-10-11 12:57:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [97/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3272 (0.3661) loss 3.5993 (3.8495) 
grad_norm 0.0000 (0.0000) [2022-10-11 12:58:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [97/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3108 (0.3484) loss 3.9017 (3.8678) grad_norm 0.0000 (0.0000) [2022-10-11 12:58:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [97/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3326 (0.3424) loss 4.2064 (3.8703) grad_norm 0.0000 (0.0000) [2022-10-11 12:59:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [97/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3429 (0.3392) loss 3.9654 (3.8633) grad_norm 0.0000 (0.0000) [2022-10-11 13:00:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [97/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3477 (0.3371) loss 3.8644 (3.8617) grad_norm 0.0000 (0.0000) [2022-10-11 13:00:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [97/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3349 (0.3359) loss 4.0321 (3.8650) grad_norm 0.0000 (0.0000) [2022-10-11 13:01:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [97/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3350 (0.3351) loss 3.6642 (3.8657) grad_norm 0.0000 (0.0000) [2022-10-11 13:01:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [97/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3565 (0.3347) loss 4.0249 (3.8658) grad_norm 0.0000 (0.0000) [2022-10-11 13:02:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [97/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3529 (0.3343) loss 4.1551 (3.8686) grad_norm 0.0000 (0.0000) [2022-10-11 13:02:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [97/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3657 (0.3340) loss 3.9388 (3.8704) grad_norm 0.0000 (0.0000) [2022-10-11 13:03:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [97/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3327 (0.3337) loss 4.0334 (3.8727) grad_norm 0.0000 (0.0000) [2022-10-11 13:03:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[97/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3404 (0.3335) loss 3.9180 (3.8747) grad_norm 0.0000 (0.0000) [2022-10-11 13:04:12 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 97 training takes 0:06:57 [2022-10-11 13:04:15 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.140 (3.140) Loss 1.1353 (1.1353) Acc@1 73.047 (73.047) Acc@5 92.969 (92.969) [2022-10-11 13:04:27 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 73.594 Acc@5 91.902 [2022-10-11 13:04:27 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 73.6% [2022-10-11 13:04:27 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 74.00% [2022-10-11 13:04:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [98/300][0/1251] eta 1:22:42 lr 0.000001 time 3.9672 (3.9672) loss 3.9748 (3.9748) grad_norm 0.0000 (0.0000) [2022-10-11 13:05:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [98/300][100/1251] eta 0:06:58 lr 0.000001 time 0.3319 (0.3635) loss 4.0111 (3.8582) grad_norm 0.0000 (0.0000) [2022-10-11 13:05:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [98/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3280 (0.3473) loss 3.7905 (3.8653) grad_norm 0.0000 (0.0000) [2022-10-11 13:06:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [98/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3311 (0.3412) loss 3.6956 (3.8633) grad_norm 0.0000 (0.0000) [2022-10-11 13:06:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [98/300][400/1251] eta 0:04:47 lr 0.000001 time 0.3069 (0.3379) loss 3.6677 (3.8638) grad_norm 0.0000 (0.0000) [2022-10-11 13:07:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [98/300][500/1251] eta 0:04:12 lr 0.000001 time 0.3712 (0.3361) loss 3.8816 (3.8633) grad_norm 0.0000 (0.0000) [2022-10-11 13:07:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [98/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3298 (0.3353) loss 3.8639 (3.8689) 
grad_norm 0.0000 (0.0000) [2022-10-11 13:08:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [98/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3362 (0.3345) loss 3.7377 (3.8668) grad_norm 0.0000 (0.0000) [2022-10-11 13:08:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [98/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3019 (0.3338) loss 3.7158 (3.8645) grad_norm 0.0000 (0.0000) [2022-10-11 13:09:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [98/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3141 (0.3334) loss 3.7550 (3.8633) grad_norm 0.0000 (0.0000) [2022-10-11 13:10:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [98/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3200 (0.3330) loss 3.6146 (3.8630) grad_norm 0.0000 (0.0000) [2022-10-11 13:10:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [98/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3253 (0.3327) loss 4.0147 (3.8625) grad_norm 0.0000 (0.0000) [2022-10-11 13:11:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [98/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3395 (0.3325) loss 3.8119 (3.8652) grad_norm 0.0000 (0.0000) [2022-10-11 13:11:22 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 98 training takes 0:06:55 [2022-10-11 13:11:26 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.235 (3.235) Loss 1.1028 (1.1028) Acc@1 74.414 (74.414) Acc@5 91.895 (91.895) [2022-10-11 13:11:37 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 73.784 Acc@5 91.954 [2022-10-11 13:11:37 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 73.8% [2022-10-11 13:11:37 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 74.00% [2022-10-11 13:11:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [99/300][0/1251] eta 1:10:26 lr 0.000001 time 3.3782 (3.3782) loss 3.9484 (3.9484) grad_norm 0.0000 (0.0000) [2022-10-11 13:12:14 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [99/300][100/1251] eta 0:06:59 lr 0.000001 time 0.3104 (0.3647) loss 3.5323 (3.8152) grad_norm 0.0000 (0.0000) [2022-10-11 13:12:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [99/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3579 (0.3492) loss 3.5564 (3.8338) grad_norm 0.0000 (0.0000) [2022-10-11 13:13:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [99/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3220 (0.3426) loss 3.8963 (3.8495) grad_norm 0.0000 (0.0000) [2022-10-11 13:13:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [99/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3287 (0.3403) loss 3.9573 (3.8535) grad_norm 0.0000 (0.0000) [2022-10-11 13:14:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [99/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3228 (0.3380) loss 3.7881 (3.8560) grad_norm 0.0000 (0.0000) [2022-10-11 13:15:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [99/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3548 (0.3368) loss 3.6870 (3.8585) grad_norm 0.0000 (0.0000) [2022-10-11 13:15:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [99/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3296 (0.3360) loss 3.9713 (3.8599) grad_norm 0.0000 (0.0000) [2022-10-11 13:16:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [99/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3208 (0.3353) loss 3.9503 (3.8614) grad_norm 0.0000 (0.0000) [2022-10-11 13:16:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [99/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3262 (0.3348) loss 3.7991 (3.8621) grad_norm 0.0000 (0.0000) [2022-10-11 13:17:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [99/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3412 (0.3343) loss 3.7484 (3.8645) grad_norm 0.0000 (0.0000) [2022-10-11 13:17:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [99/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3360 (0.3340) loss 3.8846 (3.8663) 
grad_norm 0.0000 (0.0000) [2022-10-11 13:18:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [99/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3639 (0.3335) loss 3.8377 (3.8669) grad_norm 0.0000 (0.0000) [2022-10-11 13:18:34 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 99 training takes 0:06:56 [2022-10-11 13:18:38 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.381 (3.381) Loss 1.1746 (1.1746) Acc@1 73.535 (73.535) Acc@5 91.895 (91.895) [2022-10-11 13:18:49 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 73.652 Acc@5 91.946 [2022-10-11 13:18:49 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 73.7% [2022-10-11 13:18:49 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 74.00% [2022-10-11 13:18:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [100/300][0/1251] eta 1:14:23 lr 0.000001 time 3.5677 (3.5677) loss 3.5877 (3.5877) grad_norm 0.0000 (0.0000) [2022-10-11 13:19:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [100/300][100/1251] eta 0:06:58 lr 0.000001 time 0.3260 (0.3633) loss 3.5817 (3.8770) grad_norm 0.0000 (0.0000) [2022-10-11 13:19:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [100/300][200/1251] eta 0:06:03 lr 0.000001 time 0.3367 (0.3461) loss 4.0682 (3.8558) grad_norm 0.0000 (0.0000) [2022-10-11 13:20:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [100/300][300/1251] eta 0:05:23 lr 0.000001 time 0.3217 (0.3403) loss 3.8650 (3.8546) grad_norm 0.0000 (0.0000) [2022-10-11 13:21:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [100/300][400/1251] eta 0:04:47 lr 0.000001 time 0.3314 (0.3378) loss 3.5837 (3.8536) grad_norm 0.0000 (0.0000) [2022-10-11 13:21:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [100/300][500/1251] eta 0:04:12 lr 0.000001 time 0.3278 (0.3365) loss 3.8735 (3.8549) grad_norm 0.0000 (0.0000) [2022-10-11 13:22:11 swin_tiny_patch4_window7_224] 
(main.py 163): INFO Train: [100/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3516 (0.3355) loss 3.8403 (3.8548) grad_norm 0.0000 (0.0000) [2022-10-11 13:22:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [100/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3399 (0.3347) loss 4.0916 (3.8562) grad_norm 0.0000 (0.0000) [2022-10-11 13:23:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [100/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3287 (0.3339) loss 3.7507 (3.8561) grad_norm 0.0000 (0.0000) [2022-10-11 13:23:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [100/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3454 (0.3335) loss 3.9200 (3.8534) grad_norm 0.0000 (0.0000) [2022-10-11 13:24:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [100/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3376 (0.3330) loss 4.0368 (3.8564) grad_norm 0.0000 (0.0000) [2022-10-11 13:24:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [100/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3336 (0.3326) loss 3.8502 (3.8587) grad_norm 0.0000 (0.0000) [2022-10-11 13:25:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [100/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3219 (0.3323) loss 3.4854 (3.8605) grad_norm 0.0000 (0.0000) [2022-10-11 13:25:45 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 100 training takes 0:06:55 [2022-10-11 13:25:45 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_100 saving...... [2022-10-11 13:25:45 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_100 saved !!! 
[2022-10-11 13:25:48 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 2.818 (2.818) Loss 1.1751 (1.1751) Acc@1 71.582 (71.582) Acc@5 91.895 (91.895) [2022-10-11 13:26:00 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 73.766 Acc@5 92.088 [2022-10-11 13:26:00 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 73.8% [2022-10-11 13:26:00 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 74.00% [2022-10-11 13:26:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [101/300][0/1251] eta 1:04:37 lr 0.000001 time 3.0993 (3.0993) loss 3.7419 (3.7419) grad_norm 0.0000 (0.0000) [2022-10-11 13:26:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [101/300][100/1251] eta 0:06:58 lr 0.000001 time 0.3313 (0.3640) loss 3.9114 (3.8798) grad_norm 0.0000 (0.0000) [2022-10-11 13:27:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [101/300][200/1251] eta 0:06:04 lr 0.000001 time 0.3149 (0.3473) loss 3.8566 (3.8724) grad_norm 0.0000 (0.0000) [2022-10-11 13:27:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [101/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3269 (0.3411) loss 3.8241 (3.8551) grad_norm 0.0000 (0.0000) [2022-10-11 13:28:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [101/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3269 (0.3386) loss 3.6869 (3.8577) grad_norm 0.0000 (0.0000) [2022-10-11 13:28:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [101/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3313 (0.3369) loss 3.5345 (3.8599) grad_norm 0.0000 (0.0000) [2022-10-11 13:29:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [101/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3606 (0.3361) loss 3.8380 (3.8579) grad_norm 0.0000 (0.0000) [2022-10-11 13:29:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [101/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3374 (0.3349) loss 3.7075 (3.8585) grad_norm 0.0000 
(0.0000) [2022-10-11 13:30:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [101/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3058 (0.3341) loss 3.9051 (3.8617) grad_norm 0.0000 (0.0000) [2022-10-11 13:31:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [101/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3276 (0.3335) loss 3.7803 (3.8583) grad_norm 0.0000 (0.0000) [2022-10-11 13:31:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [101/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3174 (0.3333) loss 3.9190 (3.8602) grad_norm 0.0000 (0.0000) [2022-10-11 13:32:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [101/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3370 (0.3329) loss 4.0111 (3.8613) grad_norm 0.0000 (0.0000) [2022-10-11 13:32:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [101/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3309 (0.3327) loss 3.9013 (3.8613) grad_norm 0.0000 (0.0000) [2022-10-11 13:32:56 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 101 training takes 0:06:55 [2022-10-11 13:32:59 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.101 (3.101) Loss 1.0420 (1.0420) Acc@1 74.219 (74.219) Acc@5 92.676 (92.676) [2022-10-11 13:33:11 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 74.030 Acc@5 92.012 [2022-10-11 13:33:11 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 74.0% [2022-10-11 13:33:11 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 74.03% [2022-10-11 13:33:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [102/300][0/1251] eta 1:14:19 lr 0.000001 time 3.5649 (3.5649) loss 3.7789 (3.7789) grad_norm 0.0000 (0.0000) [2022-10-11 13:33:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [102/300][100/1251] eta 0:06:59 lr 0.000001 time 0.3601 (0.3649) loss 4.1911 (3.8150) grad_norm 0.0000 (0.0000) [2022-10-11 13:34:21 swin_tiny_patch4_window7_224] (main.py 163): 
INFO Train: [102/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3329 (0.3474) loss 4.1703 (3.8296) grad_norm 0.0000 (0.0000) [2022-10-11 13:34:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [102/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3398 (0.3410) loss 3.8191 (3.8382) grad_norm 0.0000 (0.0000) [2022-10-11 13:35:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [102/300][400/1251] eta 0:04:47 lr 0.000001 time 0.3585 (0.3383) loss 3.8949 (3.8430) grad_norm 0.0000 (0.0000) [2022-10-11 13:35:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [102/300][500/1251] eta 0:04:12 lr 0.000001 time 0.3271 (0.3364) loss 3.6952 (3.8408) grad_norm 0.0000 (0.0000) [2022-10-11 13:36:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [102/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3195 (0.3355) loss 3.8613 (3.8449) grad_norm 0.0000 (0.0000) [2022-10-11 13:37:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [102/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3174 (0.3345) loss 3.7729 (3.8463) grad_norm 0.0000 (0.0000) [2022-10-11 13:37:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [102/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3079 (0.3337) loss 3.6582 (3.8476) grad_norm 0.0000 (0.0000) [2022-10-11 13:38:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [102/300][900/1251] eta 0:01:56 lr 0.000001 time 0.3318 (0.3331) loss 3.8738 (3.8493) grad_norm 0.0000 (0.0000) [2022-10-11 13:38:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [102/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3020 (0.3328) loss 3.8849 (3.8522) grad_norm 0.0000 (0.0000) [2022-10-11 13:39:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [102/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3318 (0.3327) loss 3.8188 (3.8515) grad_norm 0.0000 (0.0000) [2022-10-11 13:39:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [102/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3183 (0.3325) loss 3.6822 
(3.8513) grad_norm 0.0000 (0.0000) [2022-10-11 13:40:06 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 102 training takes 0:06:55 [2022-10-11 13:40:10 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.206 (3.206) Loss 1.0455 (1.0455) Acc@1 76.855 (76.855) Acc@5 92.578 (92.578) [2022-10-11 13:40:21 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 73.724 Acc@5 91.962 [2022-10-11 13:40:21 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 73.7% [2022-10-11 13:40:21 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 74.03% [2022-10-11 13:40:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [103/300][0/1251] eta 1:13:17 lr 0.000001 time 3.5154 (3.5154) loss 3.8504 (3.8504) grad_norm 0.0000 (0.0000) [2022-10-11 13:40:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [103/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3313 (0.3650) loss 3.8875 (3.8073) grad_norm 0.0000 (0.0000) [2022-10-11 13:41:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [103/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3235 (0.3475) loss 4.0375 (3.8164) grad_norm 0.0000 (0.0000) [2022-10-11 13:42:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [103/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3184 (0.3418) loss 3.6001 (3.8256) grad_norm 0.0000 (0.0000) [2022-10-11 13:42:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [103/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3208 (0.3389) loss 3.7166 (3.8303) grad_norm 0.0000 (0.0000) [2022-10-11 13:43:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [103/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3258 (0.3374) loss 3.9016 (3.8352) grad_norm 0.0000 (0.0000) [2022-10-11 13:43:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [103/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3037 (0.3364) loss 3.4882 (3.8424) grad_norm 0.0000 (0.0000) [2022-10-11 13:44:16 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [103/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3222 (0.3354) loss 3.6835 (3.8455) grad_norm 0.0000 (0.0000) [2022-10-11 13:44:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [103/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3377 (0.3345) loss 4.1430 (3.8470) grad_norm 0.0000 (0.0000) [2022-10-11 13:45:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [103/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3262 (0.3340) loss 3.7532 (3.8476) grad_norm 0.0000 (0.0000) [2022-10-11 13:45:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [103/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3155 (0.3335) loss 3.9020 (3.8528) grad_norm 0.0000 (0.0000) [2022-10-11 13:46:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [103/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3121 (0.3331) loss 4.0404 (3.8521) grad_norm 0.0000 (0.0000) [2022-10-11 13:47:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [103/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3216 (0.3329) loss 3.7945 (3.8540) grad_norm 0.0000 (0.0000) [2022-10-11 13:47:18 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 103 training takes 0:06:56 [2022-10-11 13:47:21 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.124 (3.124) Loss 1.1011 (1.1011) Acc@1 72.070 (72.070) Acc@5 92.578 (92.578) [2022-10-11 13:47:33 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 74.106 Acc@5 92.228 [2022-10-11 13:47:33 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 74.1% [2022-10-11 13:47:33 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 74.11% [2022-10-11 13:47:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [104/300][0/1251] eta 1:11:16 lr 0.000001 time 3.4185 (3.4185) loss 3.5921 (3.5921) grad_norm 0.0000 (0.0000) [2022-10-11 13:48:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [104/300][100/1251] 
eta 0:07:01 lr 0.000001 time 0.3312 (0.3662) loss 3.6659 (3.8487) grad_norm 0.0000 (0.0000) [2022-10-11 13:48:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [104/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3256 (0.3481) loss 3.8329 (3.8395) grad_norm 0.0000 (0.0000) [2022-10-11 13:49:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [104/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3039 (0.3423) loss 3.8190 (3.8388) grad_norm 0.0000 (0.0000) [2022-10-11 13:49:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [104/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3250 (0.3392) loss 3.7624 (3.8427) grad_norm 0.0000 (0.0000) [2022-10-11 13:50:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [104/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3288 (0.3370) loss 4.0102 (3.8493) grad_norm 0.0000 (0.0000) [2022-10-11 13:50:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [104/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3374 (0.3357) loss 3.7499 (3.8501) grad_norm 0.0000 (0.0000) [2022-10-11 13:51:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [104/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3141 (0.3348) loss 3.9331 (3.8461) grad_norm 0.0000 (0.0000) [2022-10-11 13:52:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [104/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3415 (0.3343) loss 4.1011 (3.8500) grad_norm 0.0000 (0.0000) [2022-10-11 13:52:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [104/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3080 (0.3337) loss 3.8158 (3.8495) grad_norm 0.0000 (0.0000) [2022-10-11 13:53:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [104/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3197 (0.3334) loss 3.9722 (3.8501) grad_norm 0.0000 (0.0000) [2022-10-11 13:53:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [104/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3535 (0.3331) loss 3.8055 (3.8488) grad_norm 0.0000 (0.0000) 
[2022-10-11 13:54:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [104/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3197 (0.3329) loss 3.9378 (3.8513) grad_norm 0.0000 (0.0000) [2022-10-11 13:54:29 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 104 training takes 0:06:56 [2022-10-11 13:54:32 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.100 (3.100) Loss 1.1126 (1.1126) Acc@1 75.000 (75.000) Acc@5 91.699 (91.699) [2022-10-11 13:54:44 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 74.144 Acc@5 92.204 [2022-10-11 13:54:44 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 74.1% [2022-10-11 13:54:44 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 74.14% [2022-10-11 13:54:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [105/300][0/1251] eta 1:11:30 lr 0.000001 time 3.4296 (3.4296) loss 3.6953 (3.6953) grad_norm 0.0000 (0.0000) [2022-10-11 13:55:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [105/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3446 (0.3655) loss 3.8015 (3.8083) grad_norm 0.0000 (0.0000) [2022-10-11 13:55:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [105/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3294 (0.3482) loss 3.8090 (3.8112) grad_norm 0.0000 (0.0000) [2022-10-11 13:56:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [105/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3151 (0.3425) loss 3.8654 (3.8231) grad_norm 0.0000 (0.0000) [2022-10-11 13:57:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [105/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3287 (0.3397) loss 3.9377 (3.8314) grad_norm 0.0000 (0.0000) [2022-10-11 13:57:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [105/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3193 (0.3381) loss 3.8584 (3.8297) grad_norm 0.0000 (0.0000) [2022-10-11 13:58:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[105/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3369 (0.3369) loss 3.6377 (3.8238) grad_norm 0.0000 (0.0000) [2022-10-11 13:58:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [105/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3416 (0.3354) loss 4.0610 (3.8287) grad_norm 0.0000 (0.0000) [2022-10-11 13:59:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [105/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3181 (0.3347) loss 3.9149 (3.8315) grad_norm 0.0000 (0.0000) [2022-10-11 13:59:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [105/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3239 (0.3340) loss 3.8670 (3.8364) grad_norm 0.0000 (0.0000) [2022-10-11 14:00:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [105/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3232 (0.3337) loss 3.8560 (3.8348) grad_norm 0.0000 (0.0000) [2022-10-11 14:00:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [105/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3334 (0.3333) loss 3.7933 (3.8362) grad_norm 0.0000 (0.0000) [2022-10-11 14:01:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [105/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3207 (0.3330) loss 3.9917 (3.8383) grad_norm 0.0000 (0.0000) [2022-10-11 14:01:40 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 105 training takes 0:06:56 [2022-10-11 14:01:43 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.172 (3.172) Loss 1.0674 (1.0674) Acc@1 74.902 (74.902) Acc@5 92.480 (92.480) [2022-10-11 14:01:55 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 74.236 Acc@5 92.302 [2022-10-11 14:01:55 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 74.2% [2022-10-11 14:01:55 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 74.24% [2022-10-11 14:01:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [106/300][0/1251] eta 1:07:06 lr 0.000001 time 3.2184 (3.2184) loss 3.9188 
(3.9188) grad_norm 0.0000 (0.0000) [2022-10-11 14:02:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [106/300][100/1251] eta 0:06:58 lr 0.000001 time 0.3518 (0.3635) loss 3.7744 (3.8174) grad_norm 0.0000 (0.0000) [2022-10-11 14:03:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [106/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3527 (0.3476) loss 4.0624 (3.8290) grad_norm 0.0000 (0.0000) [2022-10-11 14:03:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [106/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3357 (0.3419) loss 4.0035 (3.8255) grad_norm 0.0000 (0.0000) [2022-10-11 14:04:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [106/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3507 (0.3393) loss 4.0200 (3.8278) grad_norm 0.0000 (0.0000) [2022-10-11 14:04:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [106/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3221 (0.3374) loss 3.9706 (3.8350) grad_norm 0.0000 (0.0000) [2022-10-11 14:05:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [106/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3331 (0.3362) loss 4.0081 (3.8394) grad_norm 0.0000 (0.0000) [2022-10-11 14:05:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [106/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3248 (0.3353) loss 3.5282 (3.8424) grad_norm 0.0000 (0.0000) [2022-10-11 14:06:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [106/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3392 (0.3346) loss 3.8702 (3.8431) grad_norm 0.0000 (0.0000) [2022-10-11 14:06:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [106/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3249 (0.3343) loss 3.8173 (3.8496) grad_norm 0.0000 (0.0000) [2022-10-11 14:07:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [106/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3327 (0.3338) loss 3.8893 (3.8485) grad_norm 0.0000 (0.0000) [2022-10-11 14:08:02 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [106/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3344 (0.3335) loss 4.0285 (3.8469) grad_norm 0.0000 (0.0000) [2022-10-11 14:08:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [106/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3149 (0.3333) loss 3.9582 (3.8476) grad_norm 0.0000 (0.0000) [2022-10-11 14:08:52 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 106 training takes 0:06:56 [2022-10-11 14:08:55 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.460 (3.460) Loss 1.0885 (1.0885) Acc@1 76.074 (76.074) Acc@5 92.188 (92.188) [2022-10-11 14:09:07 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 74.028 Acc@5 92.114 [2022-10-11 14:09:07 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 74.0% [2022-10-11 14:09:07 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 74.24% [2022-10-11 14:09:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [107/300][0/1251] eta 1:12:07 lr 0.000001 time 3.4591 (3.4591) loss 3.8713 (3.8713) grad_norm 0.0000 (0.0000) [2022-10-11 14:09:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [107/300][100/1251] eta 0:06:59 lr 0.000001 time 0.3087 (0.3648) loss 3.9354 (3.8094) grad_norm 0.0000 (0.0000) [2022-10-11 14:10:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [107/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3242 (0.3480) loss 3.5937 (3.8279) grad_norm 0.0000 (0.0000) [2022-10-11 14:10:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [107/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3175 (0.3427) loss 3.8161 (3.8347) grad_norm 0.0000 (0.0000) [2022-10-11 14:11:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [107/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3045 (0.3396) loss 4.0653 (3.8323) grad_norm 0.0000 (0.0000) [2022-10-11 14:11:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [107/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3131 
(0.3378) loss 3.4896 (3.8340) grad_norm 0.0000 (0.0000) [2022-10-11 14:12:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [107/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3170 (0.3365) loss 3.3911 (3.8329) grad_norm 0.0000 (0.0000) [2022-10-11 14:13:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [107/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3860 (0.3354) loss 3.9569 (3.8329) grad_norm 0.0000 (0.0000) [2022-10-11 14:13:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [107/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3288 (0.3345) loss 3.8217 (3.8339) grad_norm 0.0000 (0.0000) [2022-10-11 14:14:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [107/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3261 (0.3340) loss 3.6814 (3.8366) grad_norm 0.0000 (0.0000) [2022-10-11 14:14:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [107/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3342 (0.3334) loss 3.5038 (3.8395) grad_norm 0.0000 (0.0000) [2022-10-11 14:15:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [107/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3322 (0.3329) loss 3.7816 (3.8390) grad_norm 0.0000 (0.0000) [2022-10-11 14:15:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [107/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3419 (0.3326) loss 4.0732 (3.8403) grad_norm 0.0000 (0.0000) [2022-10-11 14:16:03 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 107 training takes 0:06:55 [2022-10-11 14:16:06 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.317 (3.317) Loss 1.1255 (1.1255) Acc@1 73.047 (73.047) Acc@5 92.676 (92.676) [2022-10-11 14:16:17 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 74.436 Acc@5 92.264 [2022-10-11 14:16:17 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 74.4% [2022-10-11 14:16:17 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 74.44% [2022-10-11 14:16:20 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [108/300][0/1251] eta 0:58:56 lr 0.000001 time 2.8268 (2.8268) loss 3.8512 (3.8512) grad_norm 0.0000 (0.0000) [2022-10-11 14:16:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [108/300][100/1251] eta 0:06:57 lr 0.000001 time 0.3494 (0.3624) loss 3.9171 (3.8306) grad_norm 0.0000 (0.0000) [2022-10-11 14:17:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [108/300][200/1251] eta 0:06:04 lr 0.000001 time 0.3356 (0.3465) loss 4.1139 (3.8293) grad_norm 0.0000 (0.0000) [2022-10-11 14:18:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [108/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3133 (0.3413) loss 4.0140 (3.8333) grad_norm 0.0000 (0.0000) [2022-10-11 14:18:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [108/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3082 (0.3389) loss 3.6511 (3.8254) grad_norm 0.0000 (0.0000) [2022-10-11 14:19:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [108/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3372 (0.3373) loss 4.0079 (3.8284) grad_norm 0.0000 (0.0000) [2022-10-11 14:19:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [108/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3121 (0.3363) loss 3.7814 (3.8292) grad_norm 0.0000 (0.0000) [2022-10-11 14:20:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [108/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3213 (0.3352) loss 3.4582 (3.8285) grad_norm 0.0000 (0.0000) [2022-10-11 14:20:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [108/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3583 (0.3344) loss 3.9794 (3.8283) grad_norm 0.0000 (0.0000) [2022-10-11 14:21:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [108/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3172 (0.3337) loss 3.7985 (3.8281) grad_norm 0.0000 (0.0000) [2022-10-11 14:21:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [108/300][1000/1251] eta 0:01:23 lr 0.000001 
time 0.3385 (0.3333) loss 3.6295 (3.8286) grad_norm 0.0000 (0.0000) [2022-10-11 14:22:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [108/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3181 (0.3331) loss 3.7872 (3.8282) grad_norm 0.0000 (0.0000) [2022-10-11 14:22:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [108/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3347 (0.3328) loss 3.5916 (3.8293) grad_norm 0.0000 (0.0000) [2022-10-11 14:23:14 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 108 training takes 0:06:56 [2022-10-11 14:23:17 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.084 (3.084) Loss 1.0755 (1.0755) Acc@1 74.219 (74.219) Acc@5 93.359 (93.359) [2022-10-11 14:23:28 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 74.280 Acc@5 92.362 [2022-10-11 14:23:28 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 74.3% [2022-10-11 14:23:28 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 74.44% [2022-10-11 14:23:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [109/300][0/1251] eta 1:15:25 lr 0.000001 time 3.6179 (3.6179) loss 3.9118 (3.9118) grad_norm 0.0000 (0.0000) [2022-10-11 14:24:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [109/300][100/1251] eta 0:06:57 lr 0.000001 time 0.3064 (0.3628) loss 4.0048 (3.8362) grad_norm 0.0000 (0.0000) [2022-10-11 14:24:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [109/300][200/1251] eta 0:06:04 lr 0.000001 time 0.3169 (0.3464) loss 4.1129 (3.8367) grad_norm 0.0000 (0.0000) [2022-10-11 14:25:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [109/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3448 (0.3408) loss 3.5870 (3.8266) grad_norm 0.0000 (0.0000) [2022-10-11 14:25:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [109/300][400/1251] eta 0:04:47 lr 0.000001 time 0.2967 (0.3381) loss 3.4878 (3.8300) grad_norm 0.0000 (0.0000) [2022-10-11 
14:26:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [109/300][500/1251] eta 0:04:12 lr 0.000001 time 0.3234 (0.3367) loss 3.7743 (3.8290) grad_norm 0.0000 (0.0000) [2022-10-11 14:26:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [109/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3179 (0.3354) loss 3.8598 (3.8272) grad_norm 0.0000 (0.0000) [2022-10-11 14:27:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [109/300][700/1251] eta 0:03:04 lr 0.000001 time 0.4077 (0.3350) loss 3.8618 (3.8271) grad_norm 0.0000 (0.0000) [2022-10-11 14:27:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [109/300][800/1251] eta 0:02:30 lr 0.000001 time 0.4031 (0.3342) loss 3.8825 (3.8272) grad_norm 0.0000 (0.0000) [2022-10-11 14:28:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [109/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3448 (0.3334) loss 3.6203 (3.8276) grad_norm 0.0000 (0.0000) [2022-10-11 14:29:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [109/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3144 (0.3331) loss 3.7133 (3.8292) grad_norm 0.0000 (0.0000) [2022-10-11 14:29:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [109/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3228 (0.3329) loss 3.6747 (3.8285) grad_norm 0.0000 (0.0000) [2022-10-11 14:30:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [109/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3500 (0.3328) loss 3.8483 (3.8285) grad_norm 0.0000 (0.0000) [2022-10-11 14:30:24 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 109 training takes 0:06:56 [2022-10-11 14:30:28 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.395 (3.395) Loss 1.1068 (1.1068) Acc@1 73.535 (73.535) Acc@5 92.480 (92.480) [2022-10-11 14:30:40 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 74.300 Acc@5 92.182 [2022-10-11 14:30:40 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test 
images: 74.3% [2022-10-11 14:30:40 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 74.44% [2022-10-11 14:30:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [110/300][0/1251] eta 1:14:53 lr 0.000001 time 3.5923 (3.5923) loss 3.9916 (3.9916) grad_norm 0.0000 (0.0000) [2022-10-11 14:31:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [110/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3557 (0.3667) loss 3.7931 (3.8176) grad_norm 0.0000 (0.0000) [2022-10-11 14:31:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [110/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3319 (0.3485) loss 3.6933 (3.8087) grad_norm 0.0000 (0.0000) [2022-10-11 14:32:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [110/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3261 (0.3427) loss 3.8050 (3.8163) grad_norm 0.0000 (0.0000) [2022-10-11 14:32:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [110/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3455 (0.3399) loss 3.7999 (3.8198) grad_norm 0.0000 (0.0000) [2022-10-11 14:33:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [110/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3452 (0.3383) loss 3.8645 (3.8181) grad_norm 0.0000 (0.0000) [2022-10-11 14:34:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [110/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3404 (0.3367) loss 3.9134 (3.8186) grad_norm 0.0000 (0.0000) [2022-10-11 14:34:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [110/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3148 (0.3355) loss 4.0707 (3.8199) grad_norm 0.0000 (0.0000) [2022-10-11 14:35:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [110/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3363 (0.3347) loss 3.9618 (3.8209) grad_norm 0.0000 (0.0000) [2022-10-11 14:35:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [110/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3345 (0.3342) loss 3.4434 (3.8208) grad_norm 0.0000 
(0.0000) [2022-10-11 14:36:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [110/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3307 (0.3337) loss 3.9792 (3.8196) grad_norm 0.0000 (0.0000) [2022-10-11 14:36:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [110/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3282 (0.3332) loss 3.7581 (3.8218) grad_norm 0.0000 (0.0000) [2022-10-11 14:37:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [110/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3315 (0.3329) loss 3.7728 (3.8215) grad_norm 0.0000 (0.0000) [2022-10-11 14:37:36 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 110 training takes 0:06:56 [2022-10-11 14:37:36 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_110 saving...... [2022-10-11 14:37:36 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_110 saved !!! [2022-10-11 14:37:39 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.024 (3.024) Loss 1.1468 (1.1468) Acc@1 74.512 (74.512) Acc@5 91.211 (91.211) [2022-10-11 14:37:51 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 74.290 Acc@5 92.274 [2022-10-11 14:37:51 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 74.3% [2022-10-11 14:37:51 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 74.44% [2022-10-11 14:37:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [111/300][0/1251] eta 1:08:27 lr 0.000001 time 3.2831 (3.2831) loss 3.8705 (3.8705) grad_norm 0.0000 (0.0000) [2022-10-11 14:38:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [111/300][100/1251] eta 0:06:59 lr 0.000001 time 0.3135 (0.3641) loss 3.8514 (3.7975) grad_norm 0.0000 (0.0000) [2022-10-11 14:39:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [111/300][200/1251] eta 0:06:04 lr 0.000001 time 0.3253 (0.3465) loss 3.7877 
(3.8000) grad_norm 0.0000 (0.0000) [2022-10-11 14:39:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [111/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3388 (0.3411) loss 3.7579 (3.8063) grad_norm 0.0000 (0.0000) [2022-10-11 14:40:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [111/300][400/1251] eta 0:04:47 lr 0.000001 time 0.3141 (0.3381) loss 3.5988 (3.8097) grad_norm 0.0000 (0.0000) [2022-10-11 14:40:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [111/300][500/1251] eta 0:04:12 lr 0.000001 time 0.3172 (0.3364) loss 3.9274 (3.8128) grad_norm 0.0000 (0.0000) [2022-10-11 14:41:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [111/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3405 (0.3354) loss 3.7738 (3.8146) grad_norm 0.0000 (0.0000) [2022-10-11 14:41:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [111/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3396 (0.3347) loss 3.8246 (3.8161) grad_norm 0.0000 (0.0000) [2022-10-11 14:42:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [111/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3317 (0.3341) loss 3.6481 (3.8153) grad_norm 0.0000 (0.0000) [2022-10-11 14:42:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [111/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3508 (0.3337) loss 3.6619 (3.8173) grad_norm 0.0000 (0.0000) [2022-10-11 14:43:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [111/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3354 (0.3333) loss 3.7490 (3.8175) grad_norm 0.0000 (0.0000) [2022-10-11 14:43:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [111/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3365 (0.3331) loss 3.6945 (3.8185) grad_norm 0.0000 (0.0000) [2022-10-11 14:44:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [111/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3156 (0.3328) loss 3.8751 (3.8186) grad_norm 0.0000 (0.0000) [2022-10-11 14:44:47 swin_tiny_patch4_window7_224] (main.py 
171): INFO EPOCH 111 training takes 0:06:56 [2022-10-11 14:44:50 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.038 (3.038) Loss 1.0986 (1.0986) Acc@1 74.023 (74.023) Acc@5 93.262 (93.262) [2022-10-11 14:45:02 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 74.298 Acc@5 92.162 [2022-10-11 14:45:02 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 74.3% [2022-10-11 14:45:02 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 74.44% [2022-10-11 14:45:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [112/300][0/1251] eta 1:13:29 lr 0.000001 time 3.5250 (3.5250) loss 3.8735 (3.8735) grad_norm 0.0000 (0.0000) [2022-10-11 14:45:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [112/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3708 (0.3658) loss 3.7472 (3.7924) grad_norm 0.0000 (0.0000) [2022-10-11 14:46:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [112/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3360 (0.3484) loss 3.7476 (3.8113) grad_norm 0.0000 (0.0000) [2022-10-11 14:46:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [112/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3085 (0.3422) loss 3.7844 (3.8106) grad_norm 0.0000 (0.0000) [2022-10-11 14:47:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [112/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3067 (0.3392) loss 3.9159 (3.8158) grad_norm 0.0000 (0.0000) [2022-10-11 14:47:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [112/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3347 (0.3376) loss 3.8893 (3.8184) grad_norm 0.0000 (0.0000) [2022-10-11 14:48:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [112/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3335 (0.3360) loss 3.9147 (3.8219) grad_norm 0.0000 (0.0000) [2022-10-11 14:48:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [112/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3165 
(0.3351) loss 3.9920 (3.8221) grad_norm 0.0000 (0.0000) [2022-10-11 14:49:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [112/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3314 (0.3344) loss 4.0282 (3.8216) grad_norm 0.0000 (0.0000) [2022-10-11 14:50:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [112/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3382 (0.3336) loss 3.8788 (3.8191) grad_norm 0.0000 (0.0000) [2022-10-11 14:50:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [112/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3384 (0.3333) loss 3.7258 (3.8209) grad_norm 0.0000 (0.0000) [2022-10-11 14:51:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [112/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3291 (0.3330) loss 4.0849 (3.8209) grad_norm 0.0000 (0.0000) [2022-10-11 14:51:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [112/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3001 (0.3326) loss 3.7684 (3.8199) grad_norm 0.0000 (0.0000) [2022-10-11 14:51:58 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 112 training takes 0:06:55 [2022-10-11 14:52:01 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.018 (3.018) Loss 1.0506 (1.0506) Acc@1 75.488 (75.488) Acc@5 92.090 (92.090) [2022-10-11 14:52:13 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 74.548 Acc@5 92.426 [2022-10-11 14:52:13 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 74.5% [2022-10-11 14:52:13 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 74.55% [2022-10-11 14:52:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [113/300][0/1251] eta 1:12:45 lr 0.000001 time 3.4899 (3.4899) loss 3.8485 (3.8485) grad_norm 0.0000 (0.0000) [2022-10-11 14:52:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [113/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3286 (0.3655) loss 3.6057 (3.7839) grad_norm 0.0000 (0.0000) [2022-10-11 14:53:22 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [113/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3089 (0.3478) loss 3.7444 (3.7997) grad_norm 0.0000 (0.0000) [2022-10-11 14:53:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [113/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3464 (0.3421) loss 3.6125 (3.8118) grad_norm 0.0000 (0.0000) [2022-10-11 14:54:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [113/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3296 (0.3395) loss 3.7724 (3.8140) grad_norm 0.0000 (0.0000) [2022-10-11 14:55:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [113/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3070 (0.3375) loss 3.9163 (3.8150) grad_norm 0.0000 (0.0000) [2022-10-11 14:55:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [113/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3136 (0.3359) loss 3.6703 (3.8134) grad_norm 0.0000 (0.0000) [2022-10-11 14:56:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [113/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3108 (0.3352) loss 3.7501 (3.8135) grad_norm 0.0000 (0.0000) [2022-10-11 14:56:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [113/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3323 (0.3345) loss 3.6012 (3.8142) grad_norm 0.0000 (0.0000) [2022-10-11 14:57:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [113/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3249 (0.3340) loss 3.6648 (3.8138) grad_norm 0.0000 (0.0000) [2022-10-11 14:57:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [113/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3533 (0.3336) loss 3.6870 (3.8134) grad_norm 0.0000 (0.0000) [2022-10-11 14:58:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [113/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3311 (0.3334) loss 4.0326 (3.8135) grad_norm 0.0000 (0.0000) [2022-10-11 14:58:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [113/300][1200/1251] eta 0:00:16 lr 
0.000001 time 0.3294 (0.3330) loss 3.8838 (3.8141) grad_norm 0.0000 (0.0000) [2022-10-11 14:59:09 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 113 training takes 0:06:56 [2022-10-11 14:59:12 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.651 (3.651) Loss 1.1889 (1.1889) Acc@1 72.656 (72.656) Acc@5 92.188 (92.188) [2022-10-11 14:59:24 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 74.156 Acc@5 92.248 [2022-10-11 14:59:24 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 74.2% [2022-10-11 14:59:24 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 74.55% [2022-10-11 14:59:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [114/300][0/1251] eta 1:13:44 lr 0.000001 time 3.5367 (3.5367) loss 3.8842 (3.8842) grad_norm 0.0000 (0.0000) [2022-10-11 15:00:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [114/300][100/1251] eta 0:06:57 lr 0.000001 time 0.3175 (0.3630) loss 3.9570 (3.7772) grad_norm 0.0000 (0.0000) [2022-10-11 15:00:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [114/300][200/1251] eta 0:06:04 lr 0.000001 time 0.3222 (0.3473) loss 3.9518 (3.8042) grad_norm 0.0000 (0.0000) [2022-10-11 15:01:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [114/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3244 (0.3410) loss 3.6622 (3.8094) grad_norm 0.0000 (0.0000) [2022-10-11 15:01:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [114/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3455 (0.3388) loss 3.6514 (3.8109) grad_norm 0.0000 (0.0000) [2022-10-11 15:02:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [114/300][500/1251] eta 0:04:12 lr 0.000001 time 0.3264 (0.3367) loss 3.5317 (3.8082) grad_norm 0.0000 (0.0000) [2022-10-11 15:02:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [114/300][600/1251] eta 0:03:38 lr 0.000001 time 0.2976 (0.3353) loss 3.9284 (3.8089) grad_norm 0.0000 (0.0000) 
[2022-10-11 15:03:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [114/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3218 (0.3343) loss 3.5456 (3.8113) grad_norm 0.0000 (0.0000) [2022-10-11 15:03:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [114/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3316 (0.3337) loss 4.0372 (3.8121) grad_norm 0.0000 (0.0000) [2022-10-11 15:04:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [114/300][900/1251] eta 0:01:56 lr 0.000001 time 0.3386 (0.3333) loss 3.7676 (3.8094) grad_norm 0.0000 (0.0000) [2022-10-11 15:04:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [114/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3313 (0.3331) loss 3.9177 (3.8117) grad_norm 0.0000 (0.0000) [2022-10-11 15:05:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [114/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3436 (0.3329) loss 3.8698 (3.8139) grad_norm 0.0000 (0.0000) [2022-10-11 15:06:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [114/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3414 (0.3327) loss 3.7041 (3.8134) grad_norm 0.0000 (0.0000) [2022-10-11 15:06:20 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 114 training takes 0:06:56 [2022-10-11 15:06:23 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.319 (3.319) Loss 1.1311 (1.1311) Acc@1 73.828 (73.828) Acc@5 91.797 (91.797) [2022-10-11 15:06:35 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 74.630 Acc@5 92.422 [2022-10-11 15:06:35 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 74.6% [2022-10-11 15:06:35 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 74.63% [2022-10-11 15:06:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [115/300][0/1251] eta 1:13:04 lr 0.000001 time 3.5051 (3.5051) loss 3.9127 (3.9127) grad_norm 0.0000 (0.0000) [2022-10-11 15:07:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[115/300][100/1251] eta 0:06:59 lr 0.000001 time 0.3656 (0.3647) loss 3.5815 (3.7818) grad_norm 0.0000 (0.0000) [2022-10-11 15:07:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [115/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3326 (0.3476) loss 3.8167 (3.8017) grad_norm 0.0000 (0.0000) [2022-10-11 15:08:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [115/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3603 (0.3420) loss 3.8480 (3.8067) grad_norm 0.0000 (0.0000) [2022-10-11 15:08:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [115/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3484 (0.3390) loss 3.8323 (3.8034) grad_norm 0.0000 (0.0000) [2022-10-11 15:09:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [115/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3111 (0.3371) loss 3.7297 (3.8042) grad_norm 0.0000 (0.0000) [2022-10-11 15:09:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [115/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3070 (0.3359) loss 3.6433 (3.8001) grad_norm 0.0000 (0.0000) [2022-10-11 15:10:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [115/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3230 (0.3347) loss 3.7266 (3.8015) grad_norm 0.0000 (0.0000) [2022-10-11 15:11:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [115/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3386 (0.3345) loss 3.7517 (3.8011) grad_norm 0.0000 (0.0000) [2022-10-11 15:11:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [115/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3226 (0.3339) loss 3.9142 (3.7993) grad_norm 0.0000 (0.0000) [2022-10-11 15:12:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [115/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3236 (0.3335) loss 3.9381 (3.7995) grad_norm 0.0000 (0.0000) [2022-10-11 15:12:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [115/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3482 (0.3330) loss 3.9169 (3.8030) grad_norm 
0.0000 (0.0000) [2022-10-11 15:13:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [115/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3286 (0.3329) loss 3.7984 (3.8024) grad_norm 0.0000 (0.0000) [2022-10-11 15:13:31 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 115 training takes 0:06:56 [2022-10-11 15:13:34 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.199 (3.199) Loss 1.0762 (1.0762) Acc@1 74.219 (74.219) Acc@5 92.090 (92.090) [2022-10-11 15:13:46 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 74.800 Acc@5 92.498 [2022-10-11 15:13:46 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 74.8% [2022-10-11 15:13:46 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 74.80% [2022-10-11 15:13:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [116/300][0/1251] eta 1:14:00 lr 0.000001 time 3.5495 (3.5495) loss 3.6234 (3.6234) grad_norm 0.0000 (0.0000) [2022-10-11 15:14:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [116/300][100/1251] eta 0:06:58 lr 0.000001 time 0.3473 (0.3640) loss 3.8760 (3.7876) grad_norm 0.0000 (0.0000) [2022-10-11 15:14:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [116/300][200/1251] eta 0:06:03 lr 0.000001 time 0.3138 (0.3458) loss 3.8620 (3.7933) grad_norm 0.0000 (0.0000) [2022-10-11 15:15:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [116/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3681 (0.3408) loss 3.9091 (3.7979) grad_norm 0.0000 (0.0000) [2022-10-11 15:16:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [116/300][400/1251] eta 0:04:47 lr 0.000001 time 0.3308 (0.3383) loss 3.3663 (3.7969) grad_norm 0.0000 (0.0000) [2022-10-11 15:16:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [116/300][500/1251] eta 0:04:12 lr 0.000001 time 0.3379 (0.3364) loss 3.9432 (3.8053) grad_norm 0.0000 (0.0000) [2022-10-11 15:17:07 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [116/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3143 (0.3353) loss 4.0006 (3.8043) grad_norm 0.0000 (0.0000) [2022-10-11 15:17:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [116/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3260 (0.3344) loss 3.7625 (3.8019) grad_norm 0.0000 (0.0000) [2022-10-11 15:18:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [116/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3143 (0.3337) loss 3.9819 (3.8065) grad_norm 0.0000 (0.0000) [2022-10-11 15:18:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [116/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3330 (0.3334) loss 3.9342 (3.8076) grad_norm 0.0000 (0.0000) [2022-10-11 15:19:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [116/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3175 (0.3330) loss 3.7534 (3.8041) grad_norm 0.0000 (0.0000) [2022-10-11 15:19:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [116/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3361 (0.3327) loss 3.8426 (3.8023) grad_norm 0.0000 (0.0000) [2022-10-11 15:20:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [116/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3173 (0.3324) loss 3.9059 (3.8035) grad_norm 0.0000 (0.0000) [2022-10-11 15:20:41 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 116 training takes 0:06:55 [2022-10-11 15:20:45 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.239 (3.239) Loss 1.0454 (1.0454) Acc@1 75.488 (75.488) Acc@5 92.188 (92.188) [2022-10-11 15:20:56 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 74.436 Acc@5 92.346 [2022-10-11 15:20:56 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 74.4% [2022-10-11 15:20:56 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 74.80% [2022-10-11 15:21:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [117/300][0/1251] eta 1:15:12 lr 0.000001 time 3.6069 
(3.6069) loss 3.9521 (3.9521) grad_norm 0.0000 (0.0000) [2022-10-11 15:21:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [117/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3185 (0.3665) loss 3.8379 (3.7665) grad_norm 0.0000 (0.0000) [2022-10-11 15:22:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [117/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3325 (0.3493) loss 3.6641 (3.7932) grad_norm 0.0000 (0.0000) [2022-10-11 15:22:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [117/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3318 (0.3435) loss 3.5938 (3.7856) grad_norm 0.0000 (0.0000) [2022-10-11 15:23:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [117/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3419 (0.3403) loss 3.6535 (3.7929) grad_norm 0.0000 (0.0000) [2022-10-11 15:23:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [117/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3051 (0.3379) loss 3.6528 (3.7934) grad_norm 0.0000 (0.0000) [2022-10-11 15:24:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [117/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3577 (0.3365) loss 3.8760 (3.7930) grad_norm 0.0000 (0.0000) [2022-10-11 15:24:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [117/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3501 (0.3354) loss 3.6944 (3.7931) grad_norm 0.0000 (0.0000) [2022-10-11 15:25:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [117/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3564 (0.3349) loss 3.5393 (3.7940) grad_norm 0.0000 (0.0000) [2022-10-11 15:25:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [117/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3298 (0.3344) loss 3.8120 (3.7971) grad_norm 0.0000 (0.0000) [2022-10-11 15:26:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [117/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3083 (0.3339) loss 3.7223 (3.8005) grad_norm 0.0000 (0.0000) [2022-10-11 15:27:04 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [117/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3306 (0.3336) loss 3.9388 (3.8021) grad_norm 0.0000 (0.0000) [2022-10-11 15:27:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [117/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3193 (0.3333) loss 3.8334 (3.8005) grad_norm 0.0000 (0.0000) [2022-10-11 15:27:53 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 117 training takes 0:06:56 [2022-10-11 15:27:56 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.344 (3.344) Loss 1.0977 (1.0977) Acc@1 75.488 (75.488) Acc@5 92.188 (92.188) [2022-10-11 15:28:08 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 74.928 Acc@5 92.444 [2022-10-11 15:28:08 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 74.9% [2022-10-11 15:28:08 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 74.93% [2022-10-11 15:28:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [118/300][0/1251] eta 1:08:26 lr 0.000001 time 3.2824 (3.2824) loss 3.6692 (3.6692) grad_norm 0.0000 (0.0000) [2022-10-11 15:28:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [118/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3462 (0.3660) loss 3.7598 (3.7822) grad_norm 0.0000 (0.0000) [2022-10-11 15:29:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [118/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3227 (0.3493) loss 3.7731 (3.8058) grad_norm 0.0000 (0.0000) [2022-10-11 15:29:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [118/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3347 (0.3436) loss 3.9207 (3.7950) grad_norm 0.0000 (0.0000) [2022-10-11 15:30:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [118/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3184 (0.3407) loss 3.6824 (3.7945) grad_norm 0.0000 (0.0000) [2022-10-11 15:30:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [118/300][500/1251] 
eta 0:04:14 lr 0.000001 time 0.3606 (0.3390) loss 3.7607 (3.7930) grad_norm 0.0000 (0.0000) [2022-10-11 15:31:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [118/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3197 (0.3376) loss 3.7530 (3.7945) grad_norm 0.0000 (0.0000) [2022-10-11 15:32:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [118/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3324 (0.3365) loss 3.5026 (3.7969) grad_norm 0.0000 (0.0000) [2022-10-11 15:32:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [118/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3443 (0.3358) loss 3.7249 (3.7994) grad_norm 0.0000 (0.0000) [2022-10-11 15:33:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [118/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3333 (0.3352) loss 3.7529 (3.7981) grad_norm 0.0000 (0.0000) [2022-10-11 15:33:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [118/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3382 (0.3349) loss 3.8624 (3.7925) grad_norm 0.0000 (0.0000) [2022-10-11 15:34:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [118/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3199 (0.3345) loss 3.8064 (3.7932) grad_norm 0.0000 (0.0000) [2022-10-11 15:34:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [118/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3300 (0.3343) loss 4.0451 (3.7960) grad_norm 0.0000 (0.0000) [2022-10-11 15:35:06 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 118 training takes 0:06:57 [2022-10-11 15:35:09 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.137 (3.137) Loss 1.0992 (1.0992) Acc@1 75.000 (75.000) Acc@5 92.090 (92.090) [2022-10-11 15:35:21 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 74.616 Acc@5 92.194 [2022-10-11 15:35:21 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 74.6% [2022-10-11 15:35:21 swin_tiny_patch4_window7_224] (main.py 121): INFO Max 
accuracy: 74.93% [2022-10-11 15:35:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [119/300][0/1251] eta 1:08:46 lr 0.000001 time 3.2983 (3.2983) loss 3.8656 (3.8656) grad_norm 0.0000 (0.0000) [2022-10-11 15:35:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [119/300][100/1251] eta 0:06:57 lr 0.000001 time 0.3423 (0.3630) loss 3.8594 (3.8050) grad_norm 0.0000 (0.0000) [2022-10-11 15:36:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [119/300][200/1251] eta 0:06:04 lr 0.000001 time 0.3295 (0.3471) loss 3.8646 (3.7896) grad_norm 0.0000 (0.0000) [2022-10-11 15:37:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [119/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3133 (0.3419) loss 3.9113 (3.7846) grad_norm 0.0000 (0.0000) [2022-10-11 15:37:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [119/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3313 (0.3391) loss 3.6997 (3.7854) grad_norm 0.0000 (0.0000) [2022-10-11 15:38:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [119/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3283 (0.3374) loss 3.6054 (3.7828) grad_norm 0.0000 (0.0000) [2022-10-11 15:38:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [119/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3359 (0.3362) loss 4.0387 (3.7785) grad_norm 0.0000 (0.0000) [2022-10-11 15:39:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [119/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3139 (0.3355) loss 3.8543 (3.7813) grad_norm 0.0000 (0.0000) [2022-10-11 15:39:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [119/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3240 (0.3350) loss 3.8303 (3.7837) grad_norm 0.0000 (0.0000) [2022-10-11 15:40:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [119/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3293 (0.3345) loss 3.6880 (3.7910) grad_norm 0.0000 (0.0000) [2022-10-11 15:40:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[119/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3387 (0.3339) loss 3.7956 (3.7910) grad_norm 0.0000 (0.0000) [2022-10-11 15:41:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [119/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3564 (0.3337) loss 3.8115 (3.7931) grad_norm 0.0000 (0.0000) [2022-10-11 15:42:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [119/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3022 (0.3334) loss 4.0154 (3.7949) grad_norm 0.0000 (0.0000) [2022-10-11 15:42:18 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 119 training takes 0:06:56 [2022-10-11 15:42:21 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.103 (3.103) Loss 1.0782 (1.0782) Acc@1 73.535 (73.535) Acc@5 92.871 (92.871) [2022-10-11 15:42:33 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 74.806 Acc@5 92.464 [2022-10-11 15:42:33 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 74.8% [2022-10-11 15:42:33 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 74.93% [2022-10-11 15:42:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [120/300][0/1251] eta 1:12:34 lr 0.000001 time 3.4810 (3.4810) loss 3.5663 (3.5663) grad_norm 0.0000 (0.0000) [2022-10-11 15:43:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [120/300][100/1251] eta 0:06:58 lr 0.000001 time 0.3326 (0.3634) loss 3.6982 (3.7616) grad_norm 0.0000 (0.0000) [2022-10-11 15:43:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [120/300][200/1251] eta 0:06:03 lr 0.000001 time 0.3174 (0.3462) loss 3.7701 (3.7679) grad_norm 0.0000 (0.0000) [2022-10-11 15:44:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [120/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3239 (0.3409) loss 3.8146 (3.7672) grad_norm 0.0000 (0.0000) [2022-10-11 15:44:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [120/300][400/1251] eta 0:04:47 lr 0.000001 time 0.3216 (0.3383) loss 3.6135 
(3.7741) grad_norm 0.0000 (0.0000) [2022-10-11 15:45:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [120/300][500/1251] eta 0:04:12 lr 0.000001 time 0.3497 (0.3361) loss 3.8490 (3.7752) grad_norm 0.0000 (0.0000) [2022-10-11 15:45:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [120/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3329 (0.3351) loss 3.4373 (3.7788) grad_norm 0.0000 (0.0000) [2022-10-11 15:46:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [120/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3517 (0.3344) loss 3.8830 (3.7768) grad_norm 0.0000 (0.0000) [2022-10-11 15:47:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [120/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3663 (0.3341) loss 4.0319 (3.7776) grad_norm 0.0000 (0.0000) [2022-10-11 15:47:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [120/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3033 (0.3334) loss 3.6320 (3.7789) grad_norm 0.0000 (0.0000) [2022-10-11 15:48:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [120/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3373 (0.3332) loss 3.9074 (3.7777) grad_norm 0.0000 (0.0000) [2022-10-11 15:48:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [120/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.2995 (0.3330) loss 4.0945 (3.7797) grad_norm 0.0000 (0.0000) [2022-10-11 15:49:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [120/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3191 (0.3328) loss 3.9096 (3.7790) grad_norm 0.0000 (0.0000) [2022-10-11 15:49:29 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 120 training takes 0:06:56 [2022-10-11 15:49:29 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_120 saving...... [2022-10-11 15:49:29 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_120 saved !!! 
[2022-10-11 15:49:32 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.027 (3.027) Loss 1.1230 (1.1230) Acc@1 73.633 (73.633) Acc@5 91.699 (91.699) [2022-10-11 15:49:44 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 75.058 Acc@5 92.532 [2022-10-11 15:49:44 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 75.1% [2022-10-11 15:49:44 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 75.06% [2022-10-11 15:49:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [121/300][0/1251] eta 1:04:46 lr 0.000001 time 3.1067 (3.1067) loss 3.7884 (3.7884) grad_norm 0.0000 (0.0000) [2022-10-11 15:50:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [121/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3537 (0.3650) loss 3.7442 (3.7644) grad_norm 0.0000 (0.0000) [2022-10-11 15:50:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [121/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3179 (0.3479) loss 3.6208 (3.7634) grad_norm 0.0000 (0.0000) [2022-10-11 15:51:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [121/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3194 (0.3420) loss 3.9243 (3.7696) grad_norm 0.0000 (0.0000) [2022-10-11 15:52:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [121/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3330 (0.3394) loss 3.6857 (3.7705) grad_norm 0.0000 (0.0000) [2022-10-11 15:52:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [121/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3532 (0.3373) loss 4.1403 (3.7732) grad_norm 0.0000 (0.0000) [2022-10-11 15:53:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [121/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3344 (0.3362) loss 3.7189 (3.7759) grad_norm 0.0000 (0.0000) [2022-10-11 15:53:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [121/300][700/1251] eta 0:03:04 lr 0.000001 time 0.2996 (0.3354) loss 3.8335 (3.7775) grad_norm 0.0000 
(0.0000) [2022-10-11 15:54:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [121/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3035 (0.3348) loss 3.9135 (3.7774) grad_norm 0.0000 (0.0000) [2022-10-11 15:54:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [121/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3332 (0.3344) loss 4.1014 (3.7787) grad_norm 0.0000 (0.0000) [2022-10-11 15:55:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [121/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3117 (0.3340) loss 3.8974 (3.7802) grad_norm 0.0000 (0.0000) [2022-10-11 15:55:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [121/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3183 (0.3338) loss 3.8973 (3.7824) grad_norm 0.0000 (0.0000) [2022-10-11 15:56:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [121/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3527 (0.3336) loss 3.5301 (3.7816) grad_norm 0.0000 (0.0000) [2022-10-11 15:56:41 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 121 training takes 0:06:57 [2022-10-11 15:56:44 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.274 (3.274) Loss 1.0638 (1.0638) Acc@1 75.586 (75.586) Acc@5 92.871 (92.871) [2022-10-11 15:56:56 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 75.176 Acc@5 92.642 [2022-10-11 15:56:56 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 75.2% [2022-10-11 15:56:56 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 75.18% [2022-10-11 15:56:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [122/300][0/1251] eta 1:11:29 lr 0.000001 time 3.4286 (3.4286) loss 3.7736 (3.7736) grad_norm 0.0000 (0.0000) [2022-10-11 15:57:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [122/300][100/1251] eta 0:06:59 lr 0.000001 time 0.3351 (0.3648) loss 3.9996 (3.7757) grad_norm 0.0000 (0.0000) [2022-10-11 15:58:06 swin_tiny_patch4_window7_224] (main.py 163): 
INFO Train: [122/300][200/1251] eta 0:06:04 lr 0.000001 time 0.3084 (0.3467) loss 3.9074 (3.7602) grad_norm 0.0000 (0.0000) [2022-10-11 15:58:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [122/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3404 (0.3419) loss 3.5559 (3.7759) grad_norm 0.0000 (0.0000) [2022-10-11 15:59:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [122/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3254 (0.3394) loss 3.6088 (3.7650) grad_norm 0.0000 (0.0000) [2022-10-11 15:59:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [122/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3459 (0.3374) loss 3.9943 (3.7689) grad_norm 0.0000 (0.0000) [2022-10-11 16:00:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [122/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3292 (0.3360) loss 3.5681 (3.7675) grad_norm 0.0000 (0.0000) [2022-10-11 16:00:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [122/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3213 (0.3352) loss 3.7915 (3.7736) grad_norm 0.0000 (0.0000) [2022-10-11 16:01:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [122/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3184 (0.3344) loss 3.6842 (3.7743) grad_norm 0.0000 (0.0000) [2022-10-11 16:01:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [122/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3391 (0.3337) loss 4.0278 (3.7792) grad_norm 0.0000 (0.0000) [2022-10-11 16:02:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [122/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3706 (0.3336) loss 3.7710 (3.7794) grad_norm 0.0000 (0.0000) [2022-10-11 16:03:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [122/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3155 (0.3332) loss 3.8819 (3.7771) grad_norm 0.0000 (0.0000) [2022-10-11 16:03:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [122/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3257 (0.3328) loss 3.7879 
(3.7784) grad_norm 0.0000 (0.0000) [2022-10-11 16:03:52 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 122 training takes 0:06:56 [2022-10-11 16:03:56 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.315 (3.315) Loss 1.1125 (1.1125) Acc@1 74.316 (74.316) Acc@5 91.699 (91.699) [2022-10-11 16:04:07 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 75.200 Acc@5 92.542 [2022-10-11 16:04:07 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 75.2% [2022-10-11 16:04:07 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 75.20% [2022-10-11 16:04:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [123/300][0/1251] eta 1:13:46 lr 0.000001 time 3.5383 (3.5383) loss 3.7014 (3.7014) grad_norm 0.0000 (0.0000) [2022-10-11 16:04:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [123/300][100/1251] eta 0:06:56 lr 0.000001 time 0.3391 (0.3617) loss 3.7481 (3.7733) grad_norm 0.0000 (0.0000) [2022-10-11 16:05:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [123/300][200/1251] eta 0:06:02 lr 0.000001 time 0.3381 (0.3453) loss 3.7496 (3.7719) grad_norm 0.0000 (0.0000) [2022-10-11 16:05:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [123/300][300/1251] eta 0:05:23 lr 0.000001 time 0.3379 (0.3402) loss 3.8437 (3.7659) grad_norm 0.0000 (0.0000) [2022-10-11 16:06:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [123/300][400/1251] eta 0:04:47 lr 0.000001 time 0.3038 (0.3375) loss 3.7884 (3.7706) grad_norm 0.0000 (0.0000) [2022-10-11 16:06:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [123/300][500/1251] eta 0:04:12 lr 0.000001 time 0.3303 (0.3362) loss 3.8836 (3.7777) grad_norm 0.0000 (0.0000) [2022-10-11 16:07:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [123/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3106 (0.3356) loss 3.9842 (3.7784) grad_norm 0.0000 (0.0000) [2022-10-11 16:08:02 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [123/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3260 (0.3347) loss 3.5653 (3.7802) grad_norm 0.0000 (0.0000) [2022-10-11 16:08:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [123/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3318 (0.3343) loss 3.9781 (3.7813) grad_norm 0.0000 (0.0000) [2022-10-11 16:09:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [123/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3210 (0.3339) loss 3.5506 (3.7813) grad_norm 0.0000 (0.0000) [2022-10-11 16:09:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [123/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3369 (0.3340) loss 3.8752 (3.7827) grad_norm 0.0000 (0.0000) [2022-10-11 16:10:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [123/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3302 (0.3336) loss 3.9292 (3.7846) grad_norm 0.0000 (0.0000) [2022-10-11 16:10:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [123/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3263 (0.3336) loss 3.6844 (3.7857) grad_norm 0.0000 (0.0000) [2022-10-11 16:11:04 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 123 training takes 0:06:57 [2022-10-11 16:11:08 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.279 (3.279) Loss 1.1273 (1.1273) Acc@1 73.926 (73.926) Acc@5 91.992 (91.992) [2022-10-11 16:11:19 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 74.976 Acc@5 92.620 [2022-10-11 16:11:19 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 75.0% [2022-10-11 16:11:19 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 75.20% [2022-10-11 16:11:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [124/300][0/1251] eta 1:15:42 lr 0.000001 time 3.6314 (3.6314) loss 3.7139 (3.7139) grad_norm 0.0000 (0.0000) [2022-10-11 16:11:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [124/300][100/1251] 
eta 0:07:02 lr 0.000001 time 0.3139 (0.3672) loss 3.7988 (3.7417) grad_norm 0.0000 (0.0000) [2022-10-11 16:12:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [124/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3066 (0.3493) loss 4.0906 (3.7604) grad_norm 0.0000 (0.0000) [2022-10-11 16:13:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [124/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3419 (0.3434) loss 3.6846 (3.7625) grad_norm 0.0000 (0.0000) [2022-10-11 16:13:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [124/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3204 (0.3408) loss 3.7194 (3.7687) grad_norm 0.0000 (0.0000) [2022-10-11 16:14:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [124/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3499 (0.3390) loss 3.6513 (3.7720) grad_norm 0.0000 (0.0000) [2022-10-11 16:14:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [124/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3164 (0.3380) loss 3.8331 (3.7686) grad_norm 0.0000 (0.0000) [2022-10-11 16:15:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [124/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3465 (0.3370) loss 3.4549 (3.7648) grad_norm 0.0000 (0.0000) [2022-10-11 16:15:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [124/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3322 (0.3364) loss 3.9317 (3.7638) grad_norm 0.0000 (0.0000) [2022-10-11 16:16:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [124/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3174 (0.3358) loss 4.0164 (3.7652) grad_norm 0.0000 (0.0000) [2022-10-11 16:16:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [124/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3396 (0.3356) loss 3.8346 (3.7685) grad_norm 0.0000 (0.0000) [2022-10-11 16:17:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [124/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3292 (0.3354) loss 3.9995 (3.7699) grad_norm 0.0000 (0.0000) 
[2022-10-11 16:18:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [124/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3264 (0.3352) loss 3.6986 (3.7716) grad_norm 0.0000 (0.0000) [2022-10-11 16:18:19 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 124 training takes 0:06:59 [2022-10-11 16:18:22 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.211 (3.211) Loss 1.0818 (1.0818) Acc@1 74.023 (74.023) Acc@5 92.578 (92.578) [2022-10-11 16:18:34 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 75.054 Acc@5 92.704 [2022-10-11 16:18:34 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 75.1% [2022-10-11 16:18:34 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 75.20% [2022-10-11 16:18:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [125/300][0/1251] eta 1:13:18 lr 0.000001 time 3.5163 (3.5163) loss 3.3900 (3.3900) grad_norm 0.0000 (0.0000) [2022-10-11 16:19:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [125/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3389 (0.3670) loss 3.7013 (3.7468) grad_norm 0.0000 (0.0000) [2022-10-11 16:19:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [125/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3333 (0.3486) loss 3.9140 (3.7399) grad_norm 0.0000 (0.0000) [2022-10-11 16:20:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [125/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3375 (0.3429) loss 3.7270 (3.7523) grad_norm 0.0000 (0.0000) [2022-10-11 16:20:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [125/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3398 (0.3405) loss 3.8466 (3.7578) grad_norm 0.0000 (0.0000) [2022-10-11 16:21:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [125/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3253 (0.3388) loss 3.8074 (3.7501) grad_norm 0.0000 (0.0000) [2022-10-11 16:21:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[125/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3268 (0.3380) loss 3.8054 (3.7547) grad_norm 0.0000 (0.0000) [2022-10-11 16:22:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [125/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3498 (0.3375) loss 3.6276 (3.7548) grad_norm 0.0000 (0.0000) [2022-10-11 16:23:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [125/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3069 (0.3367) loss 3.8959 (3.7540) grad_norm 0.0000 (0.0000) [2022-10-11 16:23:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [125/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3338 (0.3362) loss 4.0154 (3.7559) grad_norm 0.0000 (0.0000) [2022-10-11 16:24:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [125/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3316 (0.3359) loss 3.7166 (3.7575) grad_norm 0.0000 (0.0000) [2022-10-11 16:24:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [125/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3193 (0.3356) loss 3.8280 (3.7602) grad_norm 0.0000 (0.0000) [2022-10-11 16:25:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [125/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3516 (0.3354) loss 3.7456 (3.7621) grad_norm 0.0000 (0.0000) [2022-10-11 16:25:33 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 125 training takes 0:06:59 [2022-10-11 16:25:36 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.468 (3.468) Loss 1.0619 (1.0619) Acc@1 74.414 (74.414) Acc@5 93.066 (93.066) [2022-10-11 16:25:48 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 75.290 Acc@5 92.782 [2022-10-11 16:25:48 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 75.3% [2022-10-11 16:25:48 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 75.29% [2022-10-11 16:25:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [126/300][0/1251] eta 1:08:39 lr 0.000001 time 3.2927 (3.2927) loss 3.9641 
(3.9641) grad_norm 0.0000 (0.0000) [2022-10-11 16:26:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [126/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3448 (0.3661) loss 4.1353 (3.7823) grad_norm 0.0000 (0.0000) [2022-10-11 16:26:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [126/300][200/1251] eta 0:06:04 lr 0.000001 time 0.3387 (0.3467) loss 3.9744 (3.7669) grad_norm 0.0000 (0.0000) [2022-10-11 16:27:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [126/300][300/1251] eta 0:05:24 lr 0.000001 time 0.4067 (0.3410) loss 3.6659 (3.7662) grad_norm 0.0000 (0.0000) [2022-10-11 16:28:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [126/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3753 (0.3387) loss 3.5676 (3.7679) grad_norm 0.0000 (0.0000) [2022-10-11 16:28:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [126/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3481 (0.3373) loss 3.6205 (3.7629) grad_norm 0.0000 (0.0000) [2022-10-11 16:29:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [126/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3217 (0.3362) loss 3.9534 (3.7626) grad_norm 0.0000 (0.0000) [2022-10-11 16:29:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [126/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3397 (0.3358) loss 3.5836 (3.7586) grad_norm 0.0000 (0.0000) [2022-10-11 16:30:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [126/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3157 (0.3351) loss 3.8247 (3.7571) grad_norm 0.0000 (0.0000) [2022-10-11 16:30:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [126/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3113 (0.3348) loss 3.8282 (3.7606) grad_norm 0.0000 (0.0000) [2022-10-11 16:31:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [126/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3223 (0.3346) loss 3.4898 (3.7625) grad_norm 0.0000 (0.0000) [2022-10-11 16:31:57 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [126/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3424 (0.3345) loss 3.7543 (3.7617) grad_norm 0.0000 (0.0000) [2022-10-11 16:32:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [126/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3187 (0.3343) loss 3.7967 (3.7626) grad_norm 0.0000 (0.0000) [2022-10-11 16:32:46 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 126 training takes 0:06:58 [2022-10-11 16:32:49 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.221 (3.221) Loss 0.9809 (0.9809) Acc@1 75.879 (75.879) Acc@5 94.043 (94.043) [2022-10-11 16:33:01 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 75.258 Acc@5 92.844 [2022-10-11 16:33:01 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 75.3% [2022-10-11 16:33:01 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 75.29% [2022-10-11 16:33:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [127/300][0/1251] eta 1:09:19 lr 0.000001 time 3.3252 (3.3252) loss 3.8147 (3.8147) grad_norm 0.0000 (0.0000) [2022-10-11 16:33:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [127/300][100/1251] eta 0:07:01 lr 0.000001 time 0.4299 (0.3663) loss 3.8392 (3.7369) grad_norm 0.0000 (0.0000) [2022-10-11 16:34:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [127/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3202 (0.3486) loss 3.6163 (3.7503) grad_norm 0.0000 (0.0000) [2022-10-11 16:34:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [127/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3047 (0.3433) loss 3.7247 (3.7506) grad_norm 0.0000 (0.0000) [2022-10-11 16:35:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [127/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3394 (0.3408) loss 3.8459 (3.7502) grad_norm 0.0000 (0.0000) [2022-10-11 16:35:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [127/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3184 
(0.3387) loss 3.9661 (3.7557) grad_norm 0.0000 (0.0000) [2022-10-11 16:36:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [127/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3290 (0.3377) loss 3.7207 (3.7571) grad_norm 0.0000 (0.0000) [2022-10-11 16:36:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [127/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3082 (0.3370) loss 3.7411 (3.7554) grad_norm 0.0000 (0.0000) [2022-10-11 16:37:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [127/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3453 (0.3364) loss 3.6014 (3.7553) grad_norm 0.0000 (0.0000) [2022-10-11 16:38:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [127/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3128 (0.3359) loss 3.7811 (3.7545) grad_norm 0.0000 (0.0000) [2022-10-11 16:38:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [127/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3363 (0.3356) loss 3.5621 (3.7535) grad_norm 0.0000 (0.0000) [2022-10-11 16:39:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [127/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3120 (0.3355) loss 3.8341 (3.7533) grad_norm 0.0000 (0.0000) [2022-10-11 16:39:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [127/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3300 (0.3351) loss 3.7650 (3.7546) grad_norm 0.0000 (0.0000) [2022-10-11 16:40:00 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 127 training takes 0:06:58 [2022-10-11 16:40:03 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 2.889 (2.889) Loss 1.0054 (1.0054) Acc@1 75.684 (75.684) Acc@5 93.164 (93.164) [2022-10-11 16:40:15 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 75.238 Acc@5 92.748 [2022-10-11 16:40:15 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 75.2% [2022-10-11 16:40:15 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 75.29% [2022-10-11 16:40:19 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [128/300][0/1251] eta 1:08:16 lr 0.000001 time 3.2749 (3.2749) loss 3.6785 (3.6785) grad_norm 0.0000 (0.0000) [2022-10-11 16:40:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [128/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3183 (0.3684) loss 3.6872 (3.7372) grad_norm 0.0000 (0.0000) [2022-10-11 16:41:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [128/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3145 (0.3495) loss 3.9447 (3.7393) grad_norm 0.0000 (0.0000) [2022-10-11 16:41:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [128/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3117 (0.3435) loss 3.7850 (3.7423) grad_norm 0.0000 (0.0000) [2022-10-11 16:42:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [128/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3423 (0.3410) loss 3.5612 (3.7512) grad_norm 0.0000 (0.0000) [2022-10-11 16:43:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [128/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3341 (0.3394) loss 3.7648 (3.7519) grad_norm 0.0000 (0.0000) [2022-10-11 16:43:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [128/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3297 (0.3384) loss 3.6343 (3.7540) grad_norm 0.0000 (0.0000) [2022-10-11 16:44:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [128/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3284 (0.3375) loss 3.7144 (3.7570) grad_norm 0.0000 (0.0000) [2022-10-11 16:44:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [128/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3167 (0.3369) loss 3.6088 (3.7552) grad_norm 0.0000 (0.0000) [2022-10-11 16:45:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [128/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3202 (0.3365) loss 3.5799 (3.7579) grad_norm 0.0000 (0.0000) [2022-10-11 16:45:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [128/300][1000/1251] eta 0:01:24 lr 0.000001 
time 0.3254 (0.3362) loss 3.9142 (3.7583) grad_norm 0.0000 (0.0000) [2022-10-11 16:46:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [128/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3236 (0.3361) loss 3.6944 (3.7589) grad_norm 0.0000 (0.0000) [2022-10-11 16:46:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [128/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3829 (0.3359) loss 3.3447 (3.7617) grad_norm 0.0000 (0.0000) [2022-10-11 16:47:15 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 128 training takes 0:06:59 [2022-10-11 16:47:19 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.339 (3.339) Loss 0.9765 (0.9765) Acc@1 77.051 (77.051) Acc@5 93.848 (93.848) [2022-10-11 16:47:31 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 75.182 Acc@5 92.608 [2022-10-11 16:47:31 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 75.2% [2022-10-11 16:47:31 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 75.29% [2022-10-11 16:47:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [129/300][0/1251] eta 1:15:03 lr 0.000001 time 3.6000 (3.6000) loss 3.6873 (3.6873) grad_norm 0.0000 (0.0000) [2022-10-11 16:48:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [129/300][100/1251] eta 0:07:05 lr 0.000001 time 0.3422 (0.3694) loss 3.8650 (3.7140) grad_norm 0.0000 (0.0000) [2022-10-11 16:48:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [129/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3204 (0.3510) loss 3.9428 (3.7355) grad_norm 0.0000 (0.0000) [2022-10-11 16:49:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [129/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3347 (0.3449) loss 3.6725 (3.7447) grad_norm 0.0000 (0.0000) [2022-10-11 16:49:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [129/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3350 (0.3417) loss 3.9838 (3.7470) grad_norm 0.0000 (0.0000) [2022-10-11 
16:50:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [129/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3117 (0.3399) loss 3.8180 (3.7486) grad_norm 0.0000 (0.0000) [2022-10-11 16:50:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [129/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3302 (0.3388) loss 4.0107 (3.7516) grad_norm 0.0000 (0.0000) [2022-10-11 16:51:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [129/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3337 (0.3380) loss 3.9090 (3.7507) grad_norm 0.0000 (0.0000) [2022-10-11 16:52:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [129/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3353 (0.3373) loss 3.7545 (3.7492) grad_norm 0.0000 (0.0000) [2022-10-11 16:52:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [129/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3402 (0.3367) loss 3.7395 (3.7486) grad_norm 0.0000 (0.0000) [2022-10-11 16:53:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [129/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3336 (0.3362) loss 3.7819 (3.7473) grad_norm 0.0000 (0.0000) [2022-10-11 16:53:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [129/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3322 (0.3357) loss 4.0115 (3.7515) grad_norm 0.0000 (0.0000) [2022-10-11 16:54:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [129/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3209 (0.3355) loss 3.7394 (3.7517) grad_norm 0.0000 (0.0000) [2022-10-11 16:54:30 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 129 training takes 0:06:59 [2022-10-11 16:54:34 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.460 (3.460) Loss 1.0792 (1.0792) Acc@1 74.316 (74.316) Acc@5 92.090 (92.090) [2022-10-11 16:54:45 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 75.386 Acc@5 92.834 [2022-10-11 16:54:45 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test 
images: 75.4% [2022-10-11 16:54:45 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 75.39% [2022-10-11 16:54:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [130/300][0/1251] eta 1:15:54 lr 0.000001 time 3.6410 (3.6410) loss 3.7818 (3.7818) grad_norm 0.0000 (0.0000) [2022-10-11 16:55:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [130/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3263 (0.3679) loss 3.8450 (3.7391) grad_norm 0.0000 (0.0000) [2022-10-11 16:55:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [130/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3518 (0.3506) loss 3.8783 (3.7395) grad_norm 0.0000 (0.0000) [2022-10-11 16:56:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [130/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3026 (0.3452) loss 3.8134 (3.7372) grad_norm 0.0000 (0.0000) [2022-10-11 16:57:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [130/300][400/1251] eta 0:04:51 lr 0.000001 time 0.3259 (0.3426) loss 3.6960 (3.7386) grad_norm 0.0000 (0.0000) [2022-10-11 16:57:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [130/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3114 (0.3404) loss 3.9597 (3.7453) grad_norm 0.0000 (0.0000) [2022-10-11 16:58:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [130/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3356 (0.3391) loss 3.8047 (3.7498) grad_norm 0.0000 (0.0000) [2022-10-11 16:58:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [130/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3214 (0.3382) loss 3.7992 (3.7502) grad_norm 0.0000 (0.0000) [2022-10-11 16:59:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [130/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3133 (0.3373) loss 3.8843 (3.7514) grad_norm 0.0000 (0.0000) [2022-10-11 16:59:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [130/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3252 (0.3366) loss 3.6875 (3.7485) grad_norm 0.0000 
(0.0000) [2022-10-11 17:00:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [130/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3341 (0.3364) loss 3.2960 (3.7505) grad_norm 0.0000 (0.0000) [2022-10-11 17:00:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [130/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3256 (0.3360) loss 3.7809 (3.7522) grad_norm 0.0000 (0.0000) [2022-10-11 17:01:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [130/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3377 (0.3358) loss 3.6063 (3.7540) grad_norm 0.0000 (0.0000) [2022-10-11 17:01:45 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 130 training takes 0:06:59 [2022-10-11 17:01:45 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_130 saving...... [2022-10-11 17:01:46 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_130 saved !!! [2022-10-11 17:01:49 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.189 (3.189) Loss 0.9692 (0.9692) Acc@1 77.441 (77.441) Acc@5 94.336 (94.336) [2022-10-11 17:02:01 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 75.234 Acc@5 92.802 [2022-10-11 17:02:01 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 75.2% [2022-10-11 17:02:01 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 75.39% [2022-10-11 17:02:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [131/300][0/1251] eta 1:07:39 lr 0.000001 time 3.2450 (3.2450) loss 3.7031 (3.7031) grad_norm 0.0000 (0.0000) [2022-10-11 17:02:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [131/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3589 (0.3658) loss 3.5367 (3.7378) grad_norm 0.0000 (0.0000) [2022-10-11 17:03:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [131/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3274 (0.3490) loss 3.4815 
(3.7320) grad_norm 0.0000 (0.0000) [2022-10-11 17:03:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [131/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3305 (0.3432) loss 3.9521 (3.7411) grad_norm 0.0000 (0.0000) [2022-10-11 17:04:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [131/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3590 (0.3398) loss 3.9507 (3.7397) grad_norm 0.0000 (0.0000) [2022-10-11 17:04:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [131/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3225 (0.3380) loss 3.5795 (3.7398) grad_norm 0.0000 (0.0000) [2022-10-11 17:05:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [131/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3205 (0.3370) loss 3.8767 (3.7417) grad_norm 0.0000 (0.0000) [2022-10-11 17:05:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [131/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3288 (0.3365) loss 3.7533 (3.7428) grad_norm 0.0000 (0.0000) [2022-10-11 17:06:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [131/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3175 (0.3359) loss 3.8235 (3.7412) grad_norm 0.0000 (0.0000) [2022-10-11 17:07:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [131/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3266 (0.3353) loss 3.5541 (3.7415) grad_norm 0.0000 (0.0000) [2022-10-11 17:07:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [131/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3343 (0.3350) loss 3.8282 (3.7412) grad_norm 0.0000 (0.0000) [2022-10-11 17:08:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [131/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3309 (0.3347) loss 3.7859 (3.7422) grad_norm 0.0000 (0.0000) [2022-10-11 17:08:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [131/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3110 (0.3345) loss 3.2847 (3.7427) grad_norm 0.0000 (0.0000) [2022-10-11 17:08:59 swin_tiny_patch4_window7_224] (main.py 
171): INFO EPOCH 131 training takes 0:06:58 [2022-10-11 17:09:03 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.550 (3.550) Loss 1.0094 (1.0094) Acc@1 76.367 (76.367) Acc@5 93.262 (93.262) [2022-10-11 17:09:14 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 75.640 Acc@5 92.900 [2022-10-11 17:09:14 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 75.6% [2022-10-11 17:09:14 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 75.64% [2022-10-11 17:09:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [132/300][0/1251] eta 1:14:50 lr 0.000001 time 3.5896 (3.5896) loss 3.7774 (3.7774) grad_norm 0.0000 (0.0000) [2022-10-11 17:09:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [132/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3475 (0.3679) loss 3.8325 (3.7318) grad_norm 0.0000 (0.0000) [2022-10-11 17:10:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [132/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3386 (0.3495) loss 3.6742 (3.7474) grad_norm 0.0000 (0.0000) [2022-10-11 17:10:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [132/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3350 (0.3440) loss 4.0285 (3.7429) grad_norm 0.0000 (0.0000) [2022-10-11 17:11:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [132/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3734 (0.3405) loss 3.8668 (3.7438) grad_norm 0.0000 (0.0000) [2022-10-11 17:12:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [132/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3244 (0.3391) loss 3.6063 (3.7405) grad_norm 0.0000 (0.0000) [2022-10-11 17:12:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [132/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3444 (0.3381) loss 3.7503 (3.7432) grad_norm 0.0000 (0.0000) [2022-10-11 17:13:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [132/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3234 
(0.3373) loss 3.6463 (3.7417) grad_norm 0.0000 (0.0000) [2022-10-11 17:13:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [132/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3051 (0.3366) loss 3.5574 (3.7399) grad_norm 0.0000 (0.0000) [2022-10-11 17:14:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [132/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3245 (0.3363) loss 3.7262 (3.7423) grad_norm 0.0000 (0.0000) [2022-10-11 17:14:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [132/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3288 (0.3359) loss 3.6817 (3.7424) grad_norm 0.0000 (0.0000) [2022-10-11 17:15:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [132/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3146 (0.3357) loss 3.7912 (3.7426) grad_norm 0.0000 (0.0000) [2022-10-11 17:15:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [132/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3395 (0.3356) loss 3.7519 (3.7460) grad_norm 0.0000 (0.0000) [2022-10-11 17:16:14 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 132 training takes 0:06:59 [2022-10-11 17:16:17 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.400 (3.400) Loss 1.0437 (1.0437) Acc@1 75.098 (75.098) Acc@5 92.480 (92.480) [2022-10-11 17:16:29 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 75.600 Acc@5 92.766 [2022-10-11 17:16:29 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 75.6% [2022-10-11 17:16:29 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 75.64% [2022-10-11 17:16:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [133/300][0/1251] eta 1:13:02 lr 0.000001 time 3.5029 (3.5029) loss 3.8520 (3.8520) grad_norm 0.0000 (0.0000) [2022-10-11 17:17:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [133/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3076 (0.3691) loss 3.9284 (3.7275) grad_norm 0.0000 (0.0000) [2022-10-11 17:17:40 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [133/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3523 (0.3501) loss 3.5760 (3.7394) grad_norm 0.0000 (0.0000) [2022-10-11 17:18:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [133/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3148 (0.3437) loss 3.6517 (3.7314) grad_norm 0.0000 (0.0000) [2022-10-11 17:18:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [133/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3143 (0.3413) loss 3.7768 (3.7311) grad_norm 0.0000 (0.0000) [2022-10-11 17:19:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [133/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3410 (0.3392) loss 3.6228 (3.7282) grad_norm 0.0000 (0.0000) [2022-10-11 17:19:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [133/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3283 (0.3381) loss 3.8735 (3.7304) grad_norm 0.0000 (0.0000) [2022-10-11 17:20:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [133/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3602 (0.3374) loss 3.7605 (3.7263) grad_norm 0.0000 (0.0000) [2022-10-11 17:20:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [133/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3354 (0.3366) loss 3.8438 (3.7313) grad_norm 0.0000 (0.0000) [2022-10-11 17:21:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [133/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3277 (0.3361) loss 3.6400 (3.7334) grad_norm 0.0000 (0.0000) [2022-10-11 17:22:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [133/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.2985 (0.3358) loss 3.9214 (3.7349) grad_norm 0.0000 (0.0000) [2022-10-11 17:22:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [133/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3232 (0.3355) loss 4.0714 (3.7365) grad_norm 0.0000 (0.0000) [2022-10-11 17:23:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [133/300][1200/1251] eta 0:00:17 lr 
0.000001 time 0.3335 (0.3352) loss 3.3534 (3.7373) grad_norm 0.0000 (0.0000) [2022-10-11 17:23:29 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 133 training takes 0:06:59 [2022-10-11 17:23:32 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.238 (3.238) Loss 1.0308 (1.0308) Acc@1 74.707 (74.707) Acc@5 92.480 (92.480) [2022-10-11 17:23:44 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 75.668 Acc@5 92.940 [2022-10-11 17:23:44 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 75.7% [2022-10-11 17:23:44 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 75.67% [2022-10-11 17:23:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [134/300][0/1251] eta 1:10:55 lr 0.000001 time 3.4014 (3.4014) loss 3.8099 (3.8099) grad_norm 0.0000 (0.0000) [2022-10-11 17:24:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [134/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3111 (0.3666) loss 3.7133 (3.7314) grad_norm 0.0000 (0.0000) [2022-10-11 17:24:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [134/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3874 (0.3491) loss 3.8009 (3.7274) grad_norm 0.0000 (0.0000) [2022-10-11 17:25:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [134/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3248 (0.3430) loss 4.0101 (3.7274) grad_norm 0.0000 (0.0000) [2022-10-11 17:26:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [134/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3356 (0.3401) loss 3.8283 (3.7297) grad_norm 0.0000 (0.0000) [2022-10-11 17:26:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [134/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3401 (0.3381) loss 3.6812 (3.7225) grad_norm 0.0000 (0.0000) [2022-10-11 17:27:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [134/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3436 (0.3366) loss 3.6108 (3.7233) grad_norm 0.0000 (0.0000) 
[2022-10-11 17:27:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [134/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3148 (0.3358) loss 3.5806 (3.7269) grad_norm 0.0000 (0.0000) [2022-10-11 17:28:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [134/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3306 (0.3353) loss 3.7127 (3.7313) grad_norm 0.0000 (0.0000) [2022-10-11 17:28:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [134/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3401 (0.3350) loss 3.7715 (3.7326) grad_norm 0.0000 (0.0000) [2022-10-11 17:29:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [134/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3305 (0.3348) loss 3.8886 (3.7311) grad_norm 0.0000 (0.0000) [2022-10-11 17:29:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [134/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3295 (0.3345) loss 3.9087 (3.7353) grad_norm 0.0000 (0.0000) [2022-10-11 17:30:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [134/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3328 (0.3341) loss 3.5770 (3.7350) grad_norm 0.0000 (0.0000) [2022-10-11 17:30:41 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 134 training takes 0:06:57 [2022-10-11 17:30:45 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.422 (3.422) Loss 1.1413 (1.1413) Acc@1 72.363 (72.363) Acc@5 91.992 (91.992) [2022-10-11 17:30:57 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 75.444 Acc@5 92.862 [2022-10-11 17:30:57 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 75.4% [2022-10-11 17:30:57 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 75.67% [2022-10-11 17:31:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [135/300][0/1251] eta 1:13:55 lr 0.000001 time 3.5452 (3.5452) loss 3.7767 (3.7767) grad_norm 0.0000 (0.0000) [2022-10-11 17:31:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[135/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3314 (0.3657) loss 3.4969 (3.7249) grad_norm 0.0000 (0.0000) [2022-10-11 17:32:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [135/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3111 (0.3483) loss 3.7594 (3.7285) grad_norm 0.0000 (0.0000) [2022-10-11 17:32:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [135/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3153 (0.3422) loss 3.8441 (3.7220) grad_norm 0.0000 (0.0000) [2022-10-11 17:33:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [135/300][400/1251] eta 0:04:49 lr 0.000001 time 0.2937 (0.3402) loss 3.5957 (3.7203) grad_norm 0.0000 (0.0000) [2022-10-11 17:33:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [135/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3440 (0.3388) loss 3.4941 (3.7234) grad_norm 0.0000 (0.0000) [2022-10-11 17:34:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [135/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3322 (0.3380) loss 3.8079 (3.7203) grad_norm 0.0000 (0.0000) [2022-10-11 17:34:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [135/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3398 (0.3372) loss 3.7217 (3.7214) grad_norm 0.0000 (0.0000) [2022-10-11 17:35:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [135/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3212 (0.3366) loss 3.6207 (3.7199) grad_norm 0.0000 (0.0000) [2022-10-11 17:36:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [135/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3098 (0.3362) loss 3.7189 (3.7204) grad_norm 0.0000 (0.0000) [2022-10-11 17:36:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [135/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3082 (0.3358) loss 3.4977 (3.7211) grad_norm 0.0000 (0.0000) [2022-10-11 17:37:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [135/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3221 (0.3355) loss 4.0615 (3.7259) grad_norm 
0.0000 (0.0000) [2022-10-11 17:37:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [135/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3097 (0.3351) loss 3.6126 (3.7256) grad_norm 0.0000 (0.0000) [2022-10-11 17:37:56 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 135 training takes 0:06:58 [2022-10-11 17:37:59 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.157 (3.157) Loss 0.9696 (0.9696) Acc@1 75.391 (75.391) Acc@5 93.652 (93.652) [2022-10-11 17:38:11 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 75.438 Acc@5 92.788 [2022-10-11 17:38:11 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 75.4% [2022-10-11 17:38:11 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 75.67% [2022-10-11 17:38:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [136/300][0/1251] eta 1:16:38 lr 0.000001 time 3.6755 (3.6755) loss 3.8251 (3.8251) grad_norm 0.0000 (0.0000) [2022-10-11 17:38:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [136/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3287 (0.3684) loss 3.6707 (3.7200) grad_norm 0.0000 (0.0000) [2022-10-11 17:39:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [136/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3322 (0.3503) loss 3.8698 (3.7163) grad_norm 0.0000 (0.0000) [2022-10-11 17:39:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [136/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3341 (0.3442) loss 3.9725 (3.7208) grad_norm 0.0000 (0.0000) [2022-10-11 17:40:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [136/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3219 (0.3407) loss 3.7872 (3.7221) grad_norm 0.0000 (0.0000) [2022-10-11 17:41:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [136/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3347 (0.3387) loss 3.9010 (3.7295) grad_norm 0.0000 (0.0000) [2022-10-11 17:41:34 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [136/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3269 (0.3377) loss 3.7757 (3.7266) grad_norm 0.0000 (0.0000) [2022-10-11 17:42:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [136/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3212 (0.3367) loss 3.8900 (3.7244) grad_norm 0.0000 (0.0000) [2022-10-11 17:42:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [136/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3199 (0.3359) loss 3.6668 (3.7253) grad_norm 0.0000 (0.0000) [2022-10-11 17:43:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [136/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3251 (0.3354) loss 3.8696 (3.7254) grad_norm 0.0000 (0.0000) [2022-10-11 17:43:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [136/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3361 (0.3349) loss 3.6669 (3.7245) grad_norm 0.0000 (0.0000) [2022-10-11 17:44:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [136/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3258 (0.3345) loss 3.8852 (3.7233) grad_norm 0.0000 (0.0000) [2022-10-11 17:44:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [136/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3296 (0.3342) loss 3.7143 (3.7244) grad_norm 0.0000 (0.0000) [2022-10-11 17:45:09 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 136 training takes 0:06:57 [2022-10-11 17:45:12 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.310 (3.310) Loss 1.0474 (1.0474) Acc@1 75.684 (75.684) Acc@5 91.602 (91.602) [2022-10-11 17:45:24 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 75.784 Acc@5 92.878 [2022-10-11 17:45:24 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 75.8% [2022-10-11 17:45:24 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 75.78% [2022-10-11 17:45:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [137/300][0/1251] eta 1:14:49 lr 0.000001 time 3.5885 
(3.5885) loss 3.6170 (3.6170) grad_norm 0.0000 (0.0000) [2022-10-11 17:46:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [137/300][100/1251] eta 0:07:05 lr 0.000001 time 0.3463 (0.3698) loss 3.8595 (3.7299) grad_norm 0.0000 (0.0000) [2022-10-11 17:46:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [137/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3072 (0.3509) loss 3.7333 (3.7287) grad_norm 0.0000 (0.0000) [2022-10-11 17:47:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [137/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3171 (0.3443) loss 3.6863 (3.7302) grad_norm 0.0000 (0.0000) [2022-10-11 17:47:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [137/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3330 (0.3416) loss 3.7643 (3.7303) grad_norm 0.0000 (0.0000) [2022-10-11 17:48:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [137/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3425 (0.3399) loss 3.8687 (3.7279) grad_norm 0.0000 (0.0000) [2022-10-11 17:48:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [137/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3268 (0.3387) loss 3.9447 (3.7279) grad_norm 0.0000 (0.0000) [2022-10-11 17:49:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [137/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3187 (0.3378) loss 3.4770 (3.7271) grad_norm 0.0000 (0.0000) [2022-10-11 17:49:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [137/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3348 (0.3371) loss 3.5745 (3.7285) grad_norm 0.0000 (0.0000) [2022-10-11 17:50:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [137/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3557 (0.3366) loss 3.9090 (3.7243) grad_norm 0.0000 (0.0000) [2022-10-11 17:51:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [137/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3416 (0.3360) loss 3.5725 (3.7270) grad_norm 0.0000 (0.0000) [2022-10-11 17:51:33 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [137/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3451 (0.3356) loss 3.5612 (3.7267) grad_norm 0.0000 (0.0000) [2022-10-11 17:52:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [137/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3137 (0.3352) loss 3.3752 (3.7263) grad_norm 0.0000 (0.0000) [2022-10-11 17:52:23 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 137 training takes 0:06:59 [2022-10-11 17:52:26 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.519 (3.519) Loss 1.0622 (1.0622) Acc@1 75.293 (75.293) Acc@5 92.871 (92.871) [2022-10-11 17:52:38 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 75.570 Acc@5 92.834 [2022-10-11 17:52:38 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 75.6% [2022-10-11 17:52:38 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 75.78% [2022-10-11 17:52:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [138/300][0/1251] eta 1:15:10 lr 0.000001 time 3.6054 (3.6054) loss 3.9097 (3.9097) grad_norm 0.0000 (0.0000) [2022-10-11 17:53:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [138/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3347 (0.3665) loss 3.7674 (3.6630) grad_norm 0.0000 (0.0000) [2022-10-11 17:53:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [138/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3174 (0.3499) loss 3.7916 (3.6950) grad_norm 0.0000 (0.0000) [2022-10-11 17:54:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [138/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3264 (0.3434) loss 3.4035 (3.7036) grad_norm 0.0000 (0.0000) [2022-10-11 17:54:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [138/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3603 (0.3405) loss 3.4187 (3.7071) grad_norm 0.0000 (0.0000) [2022-10-11 17:55:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [138/300][500/1251] 
eta 0:04:14 lr 0.000001 time 0.3756 (0.3388) loss 3.8664 (3.7090) grad_norm 0.0000 (0.0000) [2022-10-11 17:56:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [138/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3158 (0.3375) loss 3.8380 (3.7103) grad_norm 0.0000 (0.0000) [2022-10-11 17:56:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [138/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3633 (0.3369) loss 3.5805 (3.7117) grad_norm 0.0000 (0.0000) [2022-10-11 17:57:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [138/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3715 (0.3363) loss 3.6253 (3.7140) grad_norm 0.0000 (0.0000) [2022-10-11 17:57:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [138/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3113 (0.3358) loss 3.3135 (3.7151) grad_norm 0.0000 (0.0000) [2022-10-11 17:58:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [138/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3087 (0.3355) loss 3.8416 (3.7180) grad_norm 0.0000 (0.0000) [2022-10-11 17:58:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [138/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3144 (0.3351) loss 3.7763 (3.7202) grad_norm 0.0000 (0.0000) [2022-10-11 17:59:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [138/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3310 (0.3348) loss 3.7851 (3.7222) grad_norm 0.0000 (0.0000) [2022-10-11 17:59:37 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 138 training takes 0:06:58 [2022-10-11 17:59:40 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.124 (3.124) Loss 0.9849 (0.9849) Acc@1 75.684 (75.684) Acc@5 93.848 (93.848) [2022-10-11 17:59:52 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 75.894 Acc@5 93.008 [2022-10-11 17:59:52 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 75.9% [2022-10-11 17:59:52 swin_tiny_patch4_window7_224] (main.py 121): INFO Max 
accuracy: 75.89% [2022-10-11 17:59:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [139/300][0/1251] eta 1:12:27 lr 0.000001 time 3.4753 (3.4753) loss 3.3674 (3.3674) grad_norm 0.0000 (0.0000) [2022-10-11 18:00:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [139/300][100/1251] eta 0:07:02 lr 0.000001 time 0.4032 (0.3672) loss 3.8459 (3.6915) grad_norm 0.0000 (0.0000) [2022-10-11 18:01:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [139/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3201 (0.3493) loss 3.7309 (3.6890) grad_norm 0.0000 (0.0000) [2022-10-11 18:01:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [139/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3337 (0.3433) loss 3.8550 (3.7102) grad_norm 0.0000 (0.0000) [2022-10-11 18:02:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [139/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3323 (0.3403) loss 3.6606 (3.7077) grad_norm 0.0000 (0.0000) [2022-10-11 18:02:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [139/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3199 (0.3388) loss 3.8040 (3.7066) grad_norm 0.0000 (0.0000) [2022-10-11 18:03:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [139/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3151 (0.3377) loss 3.7652 (3.7089) grad_norm 0.0000 (0.0000) [2022-10-11 18:03:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [139/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3446 (0.3370) loss 3.6748 (3.7088) grad_norm 0.0000 (0.0000) [2022-10-11 18:04:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [139/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3416 (0.3363) loss 3.5795 (3.7100) grad_norm 0.0000 (0.0000) [2022-10-11 18:04:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [139/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3259 (0.3358) loss 4.0337 (3.7101) grad_norm 0.0000 (0.0000) [2022-10-11 18:05:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[139/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3216 (0.3355) loss 3.7039 (3.7127) grad_norm 0.0000 (0.0000) [2022-10-11 18:06:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [139/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3625 (0.3353) loss 3.7635 (3.7153) grad_norm 0.0000 (0.0000) [2022-10-11 18:06:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [139/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3023 (0.3351) loss 3.6940 (3.7146) grad_norm 0.0000 (0.0000) [2022-10-11 18:06:51 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 139 training takes 0:06:59 [2022-10-11 18:06:54 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 2.814 (2.814) Loss 1.0468 (1.0468) Acc@1 74.414 (74.414) Acc@5 92.090 (92.090) [2022-10-11 18:07:06 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 75.942 Acc@5 93.004 [2022-10-11 18:07:06 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 75.9% [2022-10-11 18:07:06 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 75.94% [2022-10-11 18:07:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [140/300][0/1251] eta 1:06:44 lr 0.000001 time 3.2014 (3.2014) loss 3.6606 (3.6606) grad_norm 0.0000 (0.0000) [2022-10-11 18:07:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [140/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3371 (0.3683) loss 3.4573 (3.6973) grad_norm 0.0000 (0.0000) [2022-10-11 18:08:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [140/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3362 (0.3509) loss 3.5899 (3.7053) grad_norm 0.0000 (0.0000) [2022-10-11 18:08:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [140/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3146 (0.3444) loss 3.7192 (3.7032) grad_norm 0.0000 (0.0000) [2022-10-11 18:09:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [140/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3321 (0.3408) loss 3.7825 
(3.7076) grad_norm 0.0000 (0.0000) [2022-10-11 18:09:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [140/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3182 (0.3388) loss 3.6062 (3.7068) grad_norm 0.0000 (0.0000) [2022-10-11 18:10:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [140/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3138 (0.3378) loss 3.6057 (3.7088) grad_norm 0.0000 (0.0000) [2022-10-11 18:11:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [140/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3641 (0.3369) loss 3.7783 (3.7132) grad_norm 0.0000 (0.0000) [2022-10-11 18:11:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [140/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3164 (0.3362) loss 3.9963 (3.7118) grad_norm 0.0000 (0.0000) [2022-10-11 18:12:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [140/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3138 (0.3359) loss 3.6464 (3.7123) grad_norm 0.0000 (0.0000) [2022-10-11 18:12:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [140/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3478 (0.3357) loss 3.8005 (3.7135) grad_norm 0.0000 (0.0000) [2022-10-11 18:13:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [140/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3203 (0.3353) loss 3.9089 (3.7139) grad_norm 0.0000 (0.0000) [2022-10-11 18:13:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [140/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3252 (0.3350) loss 3.5374 (3.7130) grad_norm 0.0000 (0.0000) [2022-10-11 18:14:05 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 140 training takes 0:06:58 [2022-10-11 18:14:05 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_140 saving...... [2022-10-11 18:14:05 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_140 saved !!! 
[2022-10-11 18:14:08 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.231 (3.231) Loss 1.0591 (1.0591) Acc@1 76.270 (76.270) Acc@5 92.383 (92.383) [2022-10-11 18:14:20 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 75.986 Acc@5 92.982 [2022-10-11 18:14:20 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 76.0% [2022-10-11 18:14:20 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 75.99% [2022-10-11 18:14:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [141/300][0/1251] eta 1:13:49 lr 0.000001 time 3.5411 (3.5411) loss 3.6594 (3.6594) grad_norm 0.0000 (0.0000) [2022-10-11 18:14:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [141/300][100/1251] eta 0:07:05 lr 0.000001 time 0.3359 (0.3699) loss 3.5613 (3.6736) grad_norm 0.0000 (0.0000) [2022-10-11 18:15:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [141/300][200/1251] eta 0:06:09 lr 0.000001 time 0.3143 (0.3519) loss 3.6996 (3.6739) grad_norm 0.0000 (0.0000) [2022-10-11 18:16:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [141/300][300/1251] eta 0:05:29 lr 0.000001 time 0.3097 (0.3460) loss 3.6201 (3.6820) grad_norm 0.0000 (0.0000) [2022-10-11 18:16:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [141/300][400/1251] eta 0:04:51 lr 0.000001 time 0.3364 (0.3427) loss 3.6800 (3.6923) grad_norm 0.0000 (0.0000) [2022-10-11 18:17:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [141/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3324 (0.3407) loss 3.3775 (3.6983) grad_norm 0.0000 (0.0000) [2022-10-11 18:17:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [141/300][600/1251] eta 0:03:41 lr 0.000001 time 0.3270 (0.3395) loss 3.6235 (3.7030) grad_norm 0.0000 (0.0000) [2022-10-11 18:18:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [141/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3173 (0.3383) loss 4.0457 (3.7089) grad_norm 0.0000 
(0.0000) [2022-10-11 18:18:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [141/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3045 (0.3375) loss 3.9603 (3.7116) grad_norm 0.0000 (0.0000) [2022-10-11 18:19:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [141/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3349 (0.3369) loss 3.3102 (3.7106) grad_norm 0.0000 (0.0000) [2022-10-11 18:19:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [141/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3360 (0.3365) loss 3.6059 (3.7134) grad_norm 0.0000 (0.0000) [2022-10-11 18:20:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [141/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3191 (0.3361) loss 3.8210 (3.7143) grad_norm 0.0000 (0.0000) [2022-10-11 18:21:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [141/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3091 (0.3356) loss 3.8155 (3.7157) grad_norm 0.0000 (0.0000) [2022-10-11 18:21:20 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 141 training takes 0:06:59 [2022-10-11 18:21:23 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.186 (3.186) Loss 0.9390 (0.9390) Acc@1 77.832 (77.832) Acc@5 94.141 (94.141) [2022-10-11 18:21:35 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 75.848 Acc@5 92.944 [2022-10-11 18:21:35 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 75.8% [2022-10-11 18:21:35 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 75.99% [2022-10-11 18:21:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [142/300][0/1251] eta 1:01:51 lr 0.000001 time 2.9671 (2.9671) loss 3.4557 (3.4557) grad_norm 0.0000 (0.0000) [2022-10-11 18:22:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [142/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3410 (0.3663) loss 4.0131 (3.6947) grad_norm 0.0000 (0.0000) [2022-10-11 18:22:45 swin_tiny_patch4_window7_224] (main.py 163): 
INFO Train: [142/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3154 (0.3486) loss 3.5809 (3.6958) grad_norm 0.0000 (0.0000) [2022-10-11 18:23:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [142/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3160 (0.3421) loss 3.7718 (3.7043) grad_norm 0.0000 (0.0000) [2022-10-11 18:23:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [142/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3029 (0.3387) loss 3.5735 (3.6977) grad_norm 0.0000 (0.0000) [2022-10-11 18:24:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [142/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3430 (0.3372) loss 3.8966 (3.6980) grad_norm 0.0000 (0.0000) [2022-10-11 18:24:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [142/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3281 (0.3362) loss 3.9735 (3.6994) grad_norm 0.0000 (0.0000) [2022-10-11 18:25:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [142/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3423 (0.3352) loss 3.6706 (3.7004) grad_norm 0.0000 (0.0000) [2022-10-11 18:26:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [142/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3421 (0.3344) loss 3.9007 (3.7041) grad_norm 0.0000 (0.0000) [2022-10-11 18:26:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [142/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3345 (0.3337) loss 3.6175 (3.7034) grad_norm 0.0000 (0.0000) [2022-10-11 18:27:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [142/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3169 (0.3334) loss 3.8568 (3.7036) grad_norm 0.0000 (0.0000) [2022-10-11 18:27:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [142/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3340 (0.3331) loss 3.8928 (3.7068) grad_norm 0.0000 (0.0000) [2022-10-11 18:28:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [142/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3247 (0.3330) loss 3.7174 
(3.7082) grad_norm 0.0000 (0.0000) [2022-10-11 18:28:32 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 142 training takes 0:06:56 [2022-10-11 18:28:35 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.337 (3.337) Loss 0.9741 (0.9741) Acc@1 76.367 (76.367) Acc@5 94.238 (94.238) [2022-10-11 18:28:47 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 76.102 Acc@5 93.062 [2022-10-11 18:28:47 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 76.1% [2022-10-11 18:28:47 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 76.10% [2022-10-11 18:28:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [143/300][0/1251] eta 1:12:48 lr 0.000001 time 3.4918 (3.4918) loss 3.7280 (3.7280) grad_norm 0.0000 (0.0000) [2022-10-11 18:29:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [143/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3519 (0.3669) loss 3.6606 (3.6919) grad_norm 0.0000 (0.0000) [2022-10-11 18:29:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [143/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3390 (0.3491) loss 3.4786 (3.6891) grad_norm 0.0000 (0.0000) [2022-10-11 18:30:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [143/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3471 (0.3424) loss 3.4597 (3.6908) grad_norm 0.0000 (0.0000) [2022-10-11 18:31:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [143/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3504 (0.3400) loss 3.8772 (3.6910) grad_norm 0.0000 (0.0000) [2022-10-11 18:31:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [143/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3422 (0.3384) loss 3.6644 (3.6921) grad_norm 0.0000 (0.0000) [2022-10-11 18:32:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [143/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3314 (0.3369) loss 3.6977 (3.6949) grad_norm 0.0000 (0.0000) [2022-10-11 18:32:42 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [143/300][700/1251] eta 0:03:05 lr 0.000001 time 0.2991 (0.3360) loss 3.7647 (3.6944) grad_norm 0.0000 (0.0000) [2022-10-11 18:33:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [143/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3378 (0.3354) loss 3.9909 (3.6943) grad_norm 0.0000 (0.0000) [2022-10-11 18:33:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [143/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3273 (0.3348) loss 3.7154 (3.6936) grad_norm 0.0000 (0.0000) [2022-10-11 18:34:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [143/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3083 (0.3345) loss 3.4964 (3.6956) grad_norm 0.0000 (0.0000) [2022-10-11 18:34:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [143/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3391 (0.3343) loss 3.8098 (3.6972) grad_norm 0.0000 (0.0000) [2022-10-11 18:35:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [143/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.2971 (0.3340) loss 4.0387 (3.6991) grad_norm 0.0000 (0.0000) [2022-10-11 18:35:45 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 143 training takes 0:06:57 [2022-10-11 18:35:48 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.278 (3.278) Loss 1.0440 (1.0440) Acc@1 74.512 (74.512) Acc@5 92.480 (92.480) [2022-10-11 18:36:00 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 75.844 Acc@5 92.948 [2022-10-11 18:36:00 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 75.8% [2022-10-11 18:36:00 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 76.10% [2022-10-11 18:36:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [144/300][0/1251] eta 1:14:15 lr 0.000001 time 3.5619 (3.5619) loss 3.6915 (3.6915) grad_norm 0.0000 (0.0000) [2022-10-11 18:36:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [144/300][100/1251] 
eta 0:07:06 lr 0.000001 time 0.3291 (0.3704) loss 3.6272 (3.6937) grad_norm 0.0000 (0.0000) [2022-10-11 18:37:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [144/300][200/1251] eta 0:06:10 lr 0.000001 time 0.3635 (0.3526) loss 3.6899 (3.6806) grad_norm 0.0000 (0.0000) [2022-10-11 18:37:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [144/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3439 (0.3457) loss 3.5316 (3.6795) grad_norm 0.0000 (0.0000) [2022-10-11 18:38:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [144/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3452 (0.3419) loss 3.7567 (3.6781) grad_norm 0.0000 (0.0000) [2022-10-11 18:38:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [144/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3356 (0.3399) loss 3.6139 (3.6777) grad_norm 0.0000 (0.0000) [2022-10-11 18:39:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [144/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3424 (0.3381) loss 3.6126 (3.6816) grad_norm 0.0000 (0.0000) [2022-10-11 18:39:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [144/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3378 (0.3372) loss 3.5246 (3.6858) grad_norm 0.0000 (0.0000) [2022-10-11 18:40:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [144/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3496 (0.3365) loss 3.7013 (3.6851) grad_norm 0.0000 (0.0000) [2022-10-11 18:41:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [144/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3093 (0.3358) loss 3.8660 (3.6868) grad_norm 0.0000 (0.0000) [2022-10-11 18:41:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [144/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3008 (0.3354) loss 3.8410 (3.6890) grad_norm 0.0000 (0.0000) [2022-10-11 18:42:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [144/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3087 (0.3352) loss 3.6976 (3.6874) grad_norm 0.0000 (0.0000) 
[2022-10-11 18:42:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [144/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3583 (0.3352) loss 3.4078 (3.6863) grad_norm 0.0000 (0.0000) [2022-10-11 18:42:59 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 144 training takes 0:06:59 [2022-10-11 18:43:02 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.401 (3.401) Loss 1.1185 (1.1185) Acc@1 73.828 (73.828) Acc@5 91.699 (91.699) [2022-10-11 18:43:14 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 75.938 Acc@5 93.084 [2022-10-11 18:43:14 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 75.9% [2022-10-11 18:43:14 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 76.10% [2022-10-11 18:43:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [145/300][0/1251] eta 1:10:01 lr 0.000001 time 3.3583 (3.3583) loss 3.7707 (3.7707) grad_norm 0.0000 (0.0000) [2022-10-11 18:43:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [145/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3185 (0.3692) loss 3.5675 (3.6740) grad_norm 0.0000 (0.0000) [2022-10-11 18:44:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [145/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3358 (0.3505) loss 3.5848 (3.7018) grad_norm 0.0000 (0.0000) [2022-10-11 18:44:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [145/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3090 (0.3444) loss 3.6048 (3.6937) grad_norm 0.0000 (0.0000) [2022-10-11 18:45:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [145/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3348 (0.3413) loss 3.6837 (3.6869) grad_norm 0.0000 (0.0000) [2022-10-11 18:46:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [145/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3314 (0.3394) loss 3.7356 (3.6797) grad_norm 0.0000 (0.0000) [2022-10-11 18:46:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[145/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3584 (0.3381) loss 4.1568 (3.6814) grad_norm 0.0000 (0.0000) [2022-10-11 18:47:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [145/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3383 (0.3373) loss 3.7482 (3.6807) grad_norm 0.0000 (0.0000) [2022-10-11 18:47:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [145/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3016 (0.3365) loss 3.6654 (3.6821) grad_norm 0.0000 (0.0000) [2022-10-11 18:48:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [145/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3154 (0.3360) loss 3.8792 (3.6828) grad_norm 0.0000 (0.0000) [2022-10-11 18:48:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [145/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3603 (0.3354) loss 3.6310 (3.6848) grad_norm 0.0000 (0.0000) [2022-10-11 18:49:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [145/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3394 (0.3349) loss 4.0114 (3.6866) grad_norm 0.0000 (0.0000) [2022-10-11 18:49:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [145/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3145 (0.3348) loss 3.5299 (3.6872) grad_norm 0.0000 (0.0000) [2022-10-11 18:50:12 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 145 training takes 0:06:58 [2022-10-11 18:50:16 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.396 (3.396) Loss 1.0305 (1.0305) Acc@1 75.195 (75.195) Acc@5 92.578 (92.578) [2022-10-11 18:50:28 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 76.138 Acc@5 93.098 [2022-10-11 18:50:28 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 76.1% [2022-10-11 18:50:28 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 76.14% [2022-10-11 18:50:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [146/300][0/1251] eta 1:13:47 lr 0.000001 time 3.5395 (3.5395) loss 4.0906 
(4.0906) grad_norm 0.0000 (0.0000) [2022-10-11 18:51:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [146/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3221 (0.3657) loss 3.7907 (3.6786) grad_norm 0.0000 (0.0000) [2022-10-11 18:51:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [146/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3689 (0.3488) loss 3.8207 (3.6730) grad_norm 0.0000 (0.0000) [2022-10-11 18:52:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [146/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3368 (0.3425) loss 3.8416 (3.6720) grad_norm 0.0000 (0.0000) [2022-10-11 18:52:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [146/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3186 (0.3398) loss 3.4829 (3.6746) grad_norm 0.0000 (0.0000) [2022-10-11 18:53:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [146/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3249 (0.3379) loss 3.5303 (3.6772) grad_norm 0.0000 (0.0000) [2022-10-11 18:53:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [146/300][600/1251] eta 0:03:39 lr 0.000001 time 0.2928 (0.3366) loss 3.8973 (3.6834) grad_norm 0.0000 (0.0000) [2022-10-11 18:54:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [146/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3320 (0.3358) loss 3.6758 (3.6826) grad_norm 0.0000 (0.0000) [2022-10-11 18:54:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [146/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3809 (0.3352) loss 3.8609 (3.6825) grad_norm 0.0000 (0.0000) [2022-10-11 18:55:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [146/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3184 (0.3347) loss 3.4885 (3.6821) grad_norm 0.0000 (0.0000) [2022-10-11 18:56:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [146/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3286 (0.3345) loss 3.9802 (3.6864) grad_norm 0.0000 (0.0000) [2022-10-11 18:56:36 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [146/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3437 (0.3344) loss 3.7694 (3.6859) grad_norm 0.0000 (0.0000) [2022-10-11 18:57:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [146/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3352 (0.3342) loss 3.7121 (3.6877) grad_norm 0.0000 (0.0000) [2022-10-11 18:57:26 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 146 training takes 0:06:57 [2022-10-11 18:57:29 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.328 (3.328) Loss 1.0153 (1.0153) Acc@1 76.367 (76.367) Acc@5 93.750 (93.750) [2022-10-11 18:57:41 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 75.846 Acc@5 93.090 [2022-10-11 18:57:41 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 75.8% [2022-10-11 18:57:41 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 76.14% [2022-10-11 18:57:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [147/300][0/1251] eta 1:02:45 lr 0.000001 time 3.0096 (3.0096) loss 3.7586 (3.7586) grad_norm 0.0000 (0.0000) [2022-10-11 18:58:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [147/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3481 (0.3675) loss 3.7679 (3.6689) grad_norm 0.0000 (0.0000) [2022-10-11 18:58:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [147/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3347 (0.3504) loss 3.7345 (3.6692) grad_norm 0.0000 (0.0000) [2022-10-11 18:59:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [147/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3280 (0.3441) loss 3.5514 (3.6725) grad_norm 0.0000 (0.0000) [2022-10-11 18:59:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [147/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3236 (0.3409) loss 3.7744 (3.6747) grad_norm 0.0000 (0.0000) [2022-10-11 19:00:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [147/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3117 
(0.3390) loss 3.7448 (3.6743) grad_norm 0.0000 (0.0000) [2022-10-11 19:01:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [147/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3535 (0.3376) loss 3.6133 (3.6761) grad_norm 0.0000 (0.0000) [2022-10-11 19:01:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [147/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3327 (0.3366) loss 3.9291 (3.6804) grad_norm 0.0000 (0.0000) [2022-10-11 19:02:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [147/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3476 (0.3360) loss 3.7725 (3.6811) grad_norm 0.0000 (0.0000) [2022-10-11 19:02:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [147/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3551 (0.3356) loss 3.8219 (3.6824) grad_norm 0.0000 (0.0000) [2022-10-11 19:03:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [147/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3193 (0.3351) loss 3.5494 (3.6807) grad_norm 0.0000 (0.0000) [2022-10-11 19:03:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [147/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3264 (0.3348) loss 3.5207 (3.6800) grad_norm 0.0000 (0.0000) [2022-10-11 19:04:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [147/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3267 (0.3345) loss 3.8656 (3.6820) grad_norm 0.0000 (0.0000) [2022-10-11 19:04:39 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 147 training takes 0:06:58 [2022-10-11 19:04:42 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.379 (3.379) Loss 1.1282 (1.1282) Acc@1 73.926 (73.926) Acc@5 92.188 (92.188) [2022-10-11 19:04:54 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 76.152 Acc@5 93.116 [2022-10-11 19:04:54 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 76.2% [2022-10-11 19:04:54 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 76.15% [2022-10-11 19:04:57 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [148/300][0/1251] eta 1:10:11 lr 0.000001 time 3.3667 (3.3667) loss 3.5875 (3.5875) grad_norm 0.0000 (0.0000) [2022-10-11 19:05:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [148/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3558 (0.3672) loss 3.4169 (3.6566) grad_norm 0.0000 (0.0000) [2022-10-11 19:06:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [148/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3291 (0.3487) loss 3.6912 (3.6696) grad_norm 0.0000 (0.0000) [2022-10-11 19:06:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [148/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3989 (0.3427) loss 3.4122 (3.6734) grad_norm 0.0000 (0.0000) [2022-10-11 19:07:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [148/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3401 (0.3396) loss 3.6211 (3.6738) grad_norm 0.0000 (0.0000) [2022-10-11 19:07:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [148/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3269 (0.3376) loss 3.9392 (3.6791) grad_norm 0.0000 (0.0000) [2022-10-11 19:08:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [148/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3333 (0.3361) loss 3.5701 (3.6821) grad_norm 0.0000 (0.0000) [2022-10-11 19:08:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [148/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3189 (0.3349) loss 3.9718 (3.6817) grad_norm 0.0000 (0.0000) [2022-10-11 19:09:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [148/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3268 (0.3343) loss 3.6819 (3.6792) grad_norm 0.0000 (0.0000) [2022-10-11 19:09:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [148/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3486 (0.3341) loss 3.4529 (3.6811) grad_norm 0.0000 (0.0000) [2022-10-11 19:10:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [148/300][1000/1251] eta 0:01:23 lr 0.000001 
time 0.3391 (0.3339) loss 3.7202 (3.6853) grad_norm 0.0000 (0.0000) [2022-10-11 19:11:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [148/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3219 (0.3338) loss 3.9990 (3.6878) grad_norm 0.0000 (0.0000) [2022-10-11 19:11:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [148/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3266 (0.3335) loss 3.6241 (3.6860) grad_norm 0.0000 (0.0000) [2022-10-11 19:11:51 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 148 training takes 0:06:56 [2022-10-11 19:11:54 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.478 (3.478) Loss 1.0854 (1.0854) Acc@1 75.781 (75.781) Acc@5 91.895 (91.895) [2022-10-11 19:12:06 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 76.640 Acc@5 93.140 [2022-10-11 19:12:06 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 76.6% [2022-10-11 19:12:06 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 76.64% [2022-10-11 19:12:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [149/300][0/1251] eta 1:16:48 lr 0.000001 time 3.6838 (3.6838) loss 3.9747 (3.9747) grad_norm 0.0000 (0.0000) [2022-10-11 19:12:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [149/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3259 (0.3687) loss 3.8788 (3.6734) grad_norm 0.0000 (0.0000) [2022-10-11 19:13:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [149/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3484 (0.3504) loss 3.5494 (3.6734) grad_norm 0.0000 (0.0000) [2022-10-11 19:13:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [149/300][300/1251] eta 0:05:27 lr 0.000001 time 0.2972 (0.3439) loss 3.5420 (3.6870) grad_norm 0.0000 (0.0000) [2022-10-11 19:14:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [149/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3227 (0.3406) loss 3.6994 (3.6788) grad_norm 0.0000 (0.0000) [2022-10-11 
19:14:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [149/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3109 (0.3385) loss 3.8771 (3.6793) grad_norm 0.0000 (0.0000) [2022-10-11 19:15:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [149/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3239 (0.3372) loss 3.7552 (3.6805) grad_norm 0.0000 (0.0000) [2022-10-11 19:16:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [149/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3231 (0.3364) loss 3.7239 (3.6803) grad_norm 0.0000 (0.0000) [2022-10-11 19:16:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [149/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3608 (0.3358) loss 3.7135 (3.6780) grad_norm 0.0000 (0.0000) [2022-10-11 19:17:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [149/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3317 (0.3354) loss 3.8395 (3.6781) grad_norm 0.0000 (0.0000) [2022-10-11 19:17:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [149/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3393 (0.3351) loss 3.5118 (3.6789) grad_norm 0.0000 (0.0000) [2022-10-11 19:18:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [149/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3325 (0.3350) loss 3.7225 (3.6775) grad_norm 0.0000 (0.0000) [2022-10-11 19:18:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [149/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3290 (0.3347) loss 3.9440 (3.6774) grad_norm 0.0000 (0.0000) [2022-10-11 19:19:05 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 149 training takes 0:06:58 [2022-10-11 19:19:08 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.463 (3.463) Loss 1.0217 (1.0217) Acc@1 75.684 (75.684) Acc@5 94.043 (94.043) [2022-10-11 19:19:20 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 76.442 Acc@5 93.282 [2022-10-11 19:19:20 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test 
images: 76.4% [2022-10-11 19:19:20 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 76.64% [2022-10-11 19:19:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [150/300][0/1251] eta 1:04:21 lr 0.000001 time 3.0867 (3.0867) loss 3.7981 (3.7981) grad_norm 0.0000 (0.0000) [2022-10-11 19:19:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [150/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3343 (0.3658) loss 3.5377 (3.6733) grad_norm 0.0000 (0.0000) [2022-10-11 19:20:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [150/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3263 (0.3488) loss 3.4822 (3.6815) grad_norm 0.0000 (0.0000) [2022-10-11 19:21:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [150/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3060 (0.3427) loss 3.8933 (3.6843) grad_norm 0.0000 (0.0000) [2022-10-11 19:21:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [150/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3282 (0.3397) loss 3.5181 (3.6803) grad_norm 0.0000 (0.0000) [2022-10-11 19:22:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [150/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3506 (0.3382) loss 3.6596 (3.6723) grad_norm 0.0000 (0.0000) [2022-10-11 19:22:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [150/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3240 (0.3368) loss 3.4985 (3.6741) grad_norm 0.0000 (0.0000) [2022-10-11 19:23:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [150/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3131 (0.3363) loss 3.8660 (3.6772) grad_norm 0.0000 (0.0000) [2022-10-11 19:23:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [150/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3280 (0.3360) loss 3.7174 (3.6773) grad_norm 0.0000 (0.0000) [2022-10-11 19:24:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [150/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3515 (0.3359) loss 3.5664 (3.6757) grad_norm 0.0000 
(0.0000) [2022-10-11 19:24:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [150/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3355 (0.3356) loss 3.9101 (3.6740) grad_norm 0.0000 (0.0000) [2022-10-11 19:25:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [150/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3374 (0.3354) loss 3.8539 (3.6752) grad_norm 0.0000 (0.0000) [2022-10-11 19:26:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [150/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3221 (0.3351) loss 3.7043 (3.6746) grad_norm 0.0000 (0.0000) [2022-10-11 19:26:19 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 150 training takes 0:06:58 [2022-10-11 19:26:19 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_150 saving...... [2022-10-11 19:26:19 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_150 saved !!! [2022-10-11 19:26:22 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.177 (3.177) Loss 1.0082 (1.0082) Acc@1 76.465 (76.465) Acc@5 92.676 (92.676) [2022-10-11 19:26:34 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 76.198 Acc@5 93.194 [2022-10-11 19:26:34 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 76.2% [2022-10-11 19:26:34 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 76.64% [2022-10-11 19:26:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [151/300][0/1251] eta 1:08:09 lr 0.000001 time 3.2691 (3.2691) loss 3.6758 (3.6758) grad_norm 0.0000 (0.0000) [2022-10-11 19:27:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [151/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3128 (0.3652) loss 3.5675 (3.6707) grad_norm 0.0000 (0.0000) [2022-10-11 19:27:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [151/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3300 (0.3492) loss 3.7484 
(3.6702) grad_norm 0.0000 (0.0000) [2022-10-11 19:28:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [151/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3376 (0.3439) loss 3.7596 (3.6747) grad_norm 0.0000 (0.0000) [2022-10-11 19:28:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [151/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3329 (0.3408) loss 3.5867 (3.6741) grad_norm 0.0000 (0.0000) [2022-10-11 19:29:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [151/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3000 (0.3385) loss 3.4045 (3.6689) grad_norm 0.0000 (0.0000) [2022-10-11 19:29:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [151/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3248 (0.3376) loss 3.7122 (3.6703) grad_norm 0.0000 (0.0000) [2022-10-11 19:30:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [151/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3315 (0.3365) loss 3.3626 (3.6680) grad_norm 0.0000 (0.0000) [2022-10-11 19:31:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [151/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3154 (0.3360) loss 3.8325 (3.6702) grad_norm 0.0000 (0.0000) [2022-10-11 19:31:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [151/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3288 (0.3356) loss 3.8204 (3.6707) grad_norm 0.0000 (0.0000) [2022-10-11 19:32:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [151/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3406 (0.3353) loss 3.6260 (3.6705) grad_norm 0.0000 (0.0000) [2022-10-11 19:32:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [151/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3686 (0.3349) loss 3.6879 (3.6722) grad_norm 0.0000 (0.0000) [2022-10-11 19:33:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [151/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3723 (0.3347) loss 3.4473 (3.6709) grad_norm 0.0000 (0.0000) [2022-10-11 19:33:32 swin_tiny_patch4_window7_224] (main.py 
171): INFO EPOCH 151 training takes 0:06:58 [2022-10-11 19:33:35 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.335 (3.335) Loss 1.0571 (1.0571) Acc@1 75.098 (75.098) Acc@5 92.383 (92.383) [2022-10-11 19:33:47 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 76.452 Acc@5 93.312 [2022-10-11 19:33:47 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 76.5% [2022-10-11 19:33:47 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 76.64% [2022-10-11 19:33:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [152/300][0/1251] eta 1:16:25 lr 0.000001 time 3.6654 (3.6654) loss 3.8845 (3.8845) grad_norm 0.0000 (0.0000) [2022-10-11 19:34:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [152/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3261 (0.3656) loss 3.8700 (3.6411) grad_norm 0.0000 (0.0000) [2022-10-11 19:34:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [152/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3384 (0.3487) loss 3.4767 (3.6644) grad_norm 0.0000 (0.0000) [2022-10-11 19:35:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [152/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3406 (0.3426) loss 3.7864 (3.6639) grad_norm 0.0000 (0.0000) [2022-10-11 19:36:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [152/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3171 (0.3393) loss 3.7537 (3.6589) grad_norm 0.0000 (0.0000) [2022-10-11 19:36:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [152/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3200 (0.3377) loss 3.7520 (3.6533) grad_norm 0.0000 (0.0000) [2022-10-11 19:37:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [152/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3320 (0.3367) loss 3.7781 (3.6539) grad_norm 0.0000 (0.0000) [2022-10-11 19:37:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [152/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3214 
(0.3356) loss 3.7905 (3.6588) grad_norm 0.0000 (0.0000) [2022-10-11 19:38:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [152/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3413 (0.3350) loss 3.7706 (3.6599) grad_norm 0.0000 (0.0000) [2022-10-11 19:38:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [152/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3254 (0.3345) loss 3.6858 (3.6657) grad_norm 0.0000 (0.0000) [2022-10-11 19:39:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [152/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3258 (0.3342) loss 3.9665 (3.6684) grad_norm 0.0000 (0.0000) [2022-10-11 19:39:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [152/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3041 (0.3337) loss 3.5815 (3.6694) grad_norm 0.0000 (0.0000) [2022-10-11 19:40:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [152/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3214 (0.3334) loss 3.8091 (3.6685) grad_norm 0.0000 (0.0000) [2022-10-11 19:40:44 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 152 training takes 0:06:56 [2022-10-11 19:40:47 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.406 (3.406) Loss 0.9547 (0.9547) Acc@1 77.637 (77.637) Acc@5 94.434 (94.434) [2022-10-11 19:40:59 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 76.550 Acc@5 93.424 [2022-10-11 19:40:59 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 76.5% [2022-10-11 19:40:59 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 76.64% [2022-10-11 19:41:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [153/300][0/1251] eta 1:09:01 lr 0.000001 time 3.3102 (3.3102) loss 3.7258 (3.7258) grad_norm 0.0000 (0.0000) [2022-10-11 19:41:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [153/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3339 (0.3653) loss 3.7266 (3.6485) grad_norm 0.0000 (0.0000) [2022-10-11 19:42:09 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [153/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3574 (0.3479) loss 3.3904 (3.6284) grad_norm 0.0000 (0.0000) [2022-10-11 19:42:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [153/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3460 (0.3429) loss 3.7597 (3.6398) grad_norm 0.0000 (0.0000) [2022-10-11 19:43:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [153/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3304 (0.3395) loss 3.8137 (3.6491) grad_norm 0.0000 (0.0000) [2022-10-11 19:43:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [153/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3244 (0.3378) loss 3.4945 (3.6548) grad_norm 0.0000 (0.0000) [2022-10-11 19:44:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [153/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3487 (0.3367) loss 3.5131 (3.6509) grad_norm 0.0000 (0.0000) [2022-10-11 19:44:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [153/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3118 (0.3356) loss 3.8997 (3.6532) grad_norm 0.0000 (0.0000) [2022-10-11 19:45:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [153/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3075 (0.3350) loss 3.5671 (3.6558) grad_norm 0.0000 (0.0000) [2022-10-11 19:46:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [153/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3328 (0.3345) loss 3.6503 (3.6552) grad_norm 0.0000 (0.0000) [2022-10-11 19:46:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [153/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3317 (0.3341) loss 3.4839 (3.6567) grad_norm 0.0000 (0.0000) [2022-10-11 19:47:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [153/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3283 (0.3337) loss 3.8996 (3.6551) grad_norm 0.0000 (0.0000) [2022-10-11 19:47:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [153/300][1200/1251] eta 0:00:17 lr 
0.000001 time 0.3155 (0.3334) loss 3.6652 (3.6547) grad_norm 0.0000 (0.0000) [2022-10-11 19:47:56 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 153 training takes 0:06:56 [2022-10-11 19:47:59 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.466 (3.466) Loss 0.9781 (0.9781) Acc@1 78.223 (78.223) Acc@5 93.359 (93.359) [2022-10-11 19:48:11 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 76.330 Acc@5 93.298 [2022-10-11 19:48:11 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 76.3% [2022-10-11 19:48:11 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 76.64% [2022-10-11 19:48:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [154/300][0/1251] eta 1:17:36 lr 0.000001 time 3.7224 (3.7224) loss 3.7748 (3.7748) grad_norm 0.0000 (0.0000) [2022-10-11 19:48:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [154/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3196 (0.3660) loss 3.7821 (3.6356) grad_norm 0.0000 (0.0000) [2022-10-11 19:49:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [154/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3598 (0.3478) loss 3.7281 (3.6338) grad_norm 0.0000 (0.0000) [2022-10-11 19:49:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [154/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3334 (0.3416) loss 3.6935 (3.6367) grad_norm 0.0000 (0.0000) [2022-10-11 19:50:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [154/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3385 (0.3389) loss 3.7520 (3.6380) grad_norm 0.0000 (0.0000) [2022-10-11 19:51:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [154/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3266 (0.3376) loss 3.6146 (3.6413) grad_norm 0.0000 (0.0000) [2022-10-11 19:51:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [154/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3309 (0.3364) loss 3.5271 (3.6406) grad_norm 0.0000 (0.0000) 
[2022-10-11 19:52:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [154/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3477 (0.3356) loss 3.7223 (3.6450) grad_norm 0.0000 (0.0000) [2022-10-11 19:52:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [154/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3356 (0.3352) loss 3.8295 (3.6480) grad_norm 0.0000 (0.0000) [2022-10-11 19:53:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [154/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3250 (0.3346) loss 3.3930 (3.6448) grad_norm 0.0000 (0.0000) [2022-10-11 19:53:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [154/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3345 (0.3343) loss 3.7078 (3.6475) grad_norm 0.0000 (0.0000) [2022-10-11 19:54:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [154/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3618 (0.3340) loss 3.8512 (3.6481) grad_norm 0.0000 (0.0000) [2022-10-11 19:54:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [154/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3248 (0.3338) loss 3.9172 (3.6514) grad_norm 0.0000 (0.0000) [2022-10-11 19:55:09 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 154 training takes 0:06:57 [2022-10-11 19:55:12 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.572 (3.572) Loss 1.0984 (1.0984) Acc@1 74.512 (74.512) Acc@5 91.504 (91.504) [2022-10-11 19:55:24 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 76.388 Acc@5 93.214 [2022-10-11 19:55:24 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 76.4% [2022-10-11 19:55:24 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 76.64% [2022-10-11 19:55:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [155/300][0/1251] eta 1:11:06 lr 0.000001 time 3.4108 (3.4108) loss 3.6903 (3.6903) grad_norm 0.0000 (0.0000) [2022-10-11 19:56:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[155/300][100/1251] eta 0:06:59 lr 0.000001 time 0.3457 (0.3648) loss 3.6090 (3.6344) grad_norm 0.0000 (0.0000) [2022-10-11 19:56:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [155/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3119 (0.3473) loss 3.6045 (3.6233) grad_norm 0.0000 (0.0000) [2022-10-11 19:57:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [155/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3280 (0.3419) loss 3.4377 (3.6363) grad_norm 0.0000 (0.0000) [2022-10-11 19:57:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [155/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3256 (0.3391) loss 3.6177 (3.6398) grad_norm 0.0000 (0.0000) [2022-10-11 19:58:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [155/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3333 (0.3375) loss 3.4981 (3.6411) grad_norm 0.0000 (0.0000) [2022-10-11 19:58:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [155/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3583 (0.3364) loss 3.9330 (3.6454) grad_norm 0.0000 (0.0000) [2022-10-11 19:59:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [155/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3270 (0.3355) loss 3.3551 (3.6453) grad_norm 0.0000 (0.0000) [2022-10-11 19:59:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [155/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3383 (0.3346) loss 3.7032 (3.6458) grad_norm 0.0000 (0.0000) [2022-10-11 20:00:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [155/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3263 (0.3340) loss 3.4472 (3.6449) grad_norm 0.0000 (0.0000) [2022-10-11 20:00:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [155/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3209 (0.3336) loss 3.7259 (3.6458) grad_norm 0.0000 (0.0000) [2022-10-11 20:01:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [155/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3224 (0.3334) loss 3.5941 (3.6456) grad_norm 
0.0000 (0.0000) [2022-10-11 20:02:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [155/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3148 (0.3331) loss 3.6206 (3.6466) grad_norm 0.0000 (0.0000) [2022-10-11 20:02:20 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 155 training takes 0:06:56 [2022-10-11 20:02:23 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.028 (3.028) Loss 0.9495 (0.9495) Acc@1 78.125 (78.125) Acc@5 94.824 (94.824) [2022-10-11 20:02:35 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 76.768 Acc@5 93.400 [2022-10-11 20:02:35 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 76.8% [2022-10-11 20:02:35 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 76.77% [2022-10-11 20:02:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [156/300][0/1251] eta 1:09:26 lr 0.000001 time 3.3302 (3.3302) loss 3.8848 (3.8848) grad_norm 0.0000 (0.0000) [2022-10-11 20:03:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [156/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3288 (0.3661) loss 3.8554 (3.6486) grad_norm 0.0000 (0.0000) [2022-10-11 20:03:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [156/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3256 (0.3492) loss 3.4927 (3.6456) grad_norm 0.0000 (0.0000) [2022-10-11 20:04:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [156/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3329 (0.3443) loss 3.5708 (3.6334) grad_norm 0.0000 (0.0000) [2022-10-11 20:04:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [156/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3317 (0.3414) loss 3.6867 (3.6381) grad_norm 0.0000 (0.0000) [2022-10-11 20:05:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [156/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3086 (0.3397) loss 3.5308 (3.6402) grad_norm 0.0000 (0.0000) [2022-10-11 20:05:58 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [156/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3365 (0.3383) loss 3.8558 (3.6378) grad_norm 0.0000 (0.0000) [2022-10-11 20:06:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [156/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3278 (0.3372) loss 3.4336 (3.6398) grad_norm 0.0000 (0.0000) [2022-10-11 20:07:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [156/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3490 (0.3366) loss 3.6007 (3.6387) grad_norm 0.0000 (0.0000) [2022-10-11 20:07:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [156/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3083 (0.3360) loss 3.8010 (3.6420) grad_norm 0.0000 (0.0000) [2022-10-11 20:08:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [156/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3272 (0.3356) loss 3.8795 (3.6407) grad_norm 0.0000 (0.0000) [2022-10-11 20:08:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [156/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3312 (0.3353) loss 3.8018 (3.6430) grad_norm 0.0000 (0.0000) [2022-10-11 20:09:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [156/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3080 (0.3350) loss 3.4109 (3.6468) grad_norm 0.0000 (0.0000) [2022-10-11 20:09:34 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 156 training takes 0:06:58 [2022-10-11 20:09:37 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.332 (3.332) Loss 1.0415 (1.0415) Acc@1 76.367 (76.367) Acc@5 93.262 (93.262) [2022-10-11 20:09:49 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 76.536 Acc@5 93.488 [2022-10-11 20:09:49 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 76.5% [2022-10-11 20:09:49 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 76.77% [2022-10-11 20:09:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [157/300][0/1251] eta 1:18:04 lr 0.000001 time 3.7445 
(3.7445) loss 3.5054 (3.5054) grad_norm 0.0000 (0.0000) [2022-10-11 20:10:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [157/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3222 (0.3685) loss 3.5951 (3.6335) grad_norm 0.0000 (0.0000) [2022-10-11 20:10:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [157/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3334 (0.3504) loss 3.5524 (3.6277) grad_norm 0.0000 (0.0000) [2022-10-11 20:11:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [157/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3225 (0.3443) loss 3.2872 (3.6298) grad_norm 0.0000 (0.0000) [2022-10-11 20:12:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [157/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3514 (0.3409) loss 3.8240 (3.6274) grad_norm 0.0000 (0.0000) [2022-10-11 20:12:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [157/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3131 (0.3387) loss 3.7819 (3.6344) grad_norm 0.0000 (0.0000) [2022-10-11 20:13:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [157/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3104 (0.3372) loss 3.5375 (3.6359) grad_norm 0.0000 (0.0000) [2022-10-11 20:13:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [157/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3373 (0.3364) loss 3.6995 (3.6371) grad_norm 0.0000 (0.0000) [2022-10-11 20:14:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [157/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3168 (0.3355) loss 3.3604 (3.6357) grad_norm 0.0000 (0.0000) [2022-10-11 20:14:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [157/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3184 (0.3350) loss 3.8567 (3.6387) grad_norm 0.0000 (0.0000) [2022-10-11 20:15:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [157/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3600 (0.3348) loss 3.7283 (3.6404) grad_norm 0.0000 (0.0000) [2022-10-11 20:15:57 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [157/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3442 (0.3347) loss 3.7069 (3.6426) grad_norm 0.0000 (0.0000) [2022-10-11 20:16:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [157/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3386 (0.3345) loss 3.6318 (3.6420) grad_norm 0.0000 (0.0000) [2022-10-11 20:16:47 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 157 training takes 0:06:58 [2022-10-11 20:16:50 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.138 (3.138) Loss 0.9303 (0.9303) Acc@1 78.711 (78.711) Acc@5 94.238 (94.238) [2022-10-11 20:17:02 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 76.372 Acc@5 93.414 [2022-10-11 20:17:02 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 76.4% [2022-10-11 20:17:02 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 76.77% [2022-10-11 20:17:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [158/300][0/1251] eta 1:13:19 lr 0.000001 time 3.5165 (3.5165) loss 3.6741 (3.6741) grad_norm 0.0000 (0.0000) [2022-10-11 20:17:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [158/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3454 (0.3655) loss 3.8200 (3.6415) grad_norm 0.0000 (0.0000) [2022-10-11 20:18:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [158/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3539 (0.3485) loss 3.6167 (3.6442) grad_norm 0.0000 (0.0000) [2022-10-11 20:18:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [158/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3107 (0.3428) loss 3.5618 (3.6483) grad_norm 0.0000 (0.0000) [2022-10-11 20:19:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [158/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3191 (0.3401) loss 3.4571 (3.6436) grad_norm 0.0000 (0.0000) [2022-10-11 20:19:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [158/300][500/1251] 
eta 0:04:14 lr 0.000001 time 0.3518 (0.3383) loss 3.7518 (3.6423) grad_norm 0.0000 (0.0000) [2022-10-11 20:20:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [158/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3629 (0.3372) loss 3.7831 (3.6427) grad_norm 0.0000 (0.0000) [2022-10-11 20:20:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [158/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3317 (0.3365) loss 3.7406 (3.6411) grad_norm 0.0000 (0.0000) [2022-10-11 20:21:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [158/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3416 (0.3359) loss 3.5933 (3.6419) grad_norm 0.0000 (0.0000) [2022-10-11 20:22:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [158/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3226 (0.3354) loss 3.6232 (3.6429) grad_norm 0.0000 (0.0000) [2022-10-11 20:22:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [158/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3409 (0.3351) loss 3.7660 (3.6441) grad_norm 0.0000 (0.0000) [2022-10-11 20:23:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [158/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3471 (0.3349) loss 3.7542 (3.6421) grad_norm 0.0000 (0.0000) [2022-10-11 20:23:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [158/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3389 (0.3346) loss 3.5243 (3.6419) grad_norm 0.0000 (0.0000) [2022-10-11 20:24:00 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 158 training takes 0:06:58 [2022-10-11 20:24:04 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.455 (3.455) Loss 0.9407 (0.9407) Acc@1 78.027 (78.027) Acc@5 94.531 (94.531) [2022-10-11 20:24:16 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 76.686 Acc@5 93.478 [2022-10-11 20:24:16 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 76.7% [2022-10-11 20:24:16 swin_tiny_patch4_window7_224] (main.py 121): INFO Max 
accuracy: 76.77% [2022-10-11 20:24:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [159/300][0/1251] eta 1:15:53 lr 0.000001 time 3.6399 (3.6399) loss 3.6421 (3.6421) grad_norm 0.0000 (0.0000) [2022-10-11 20:24:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [159/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3255 (0.3688) loss 3.7377 (3.6246) grad_norm 0.0000 (0.0000) [2022-10-11 20:25:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [159/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3643 (0.3508) loss 3.5465 (3.6271) grad_norm 0.0000 (0.0000) [2022-10-11 20:25:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [159/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3384 (0.3442) loss 3.5380 (3.6214) grad_norm 0.0000 (0.0000) [2022-10-11 20:26:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [159/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3103 (0.3406) loss 3.7612 (3.6182) grad_norm 0.0000 (0.0000) [2022-10-11 20:27:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [159/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3467 (0.3388) loss 3.5963 (3.6177) grad_norm 0.0000 (0.0000) [2022-10-11 20:27:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [159/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3306 (0.3375) loss 3.6703 (3.6227) grad_norm 0.0000 (0.0000) [2022-10-11 20:28:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [159/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3105 (0.3367) loss 3.6958 (3.6279) grad_norm 0.0000 (0.0000) [2022-10-11 20:28:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [159/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3584 (0.3363) loss 3.9038 (3.6309) grad_norm 0.0000 (0.0000) [2022-10-11 20:29:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [159/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3303 (0.3360) loss 3.8760 (3.6316) grad_norm 0.0000 (0.0000) [2022-10-11 20:29:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[159/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3291 (0.3355) loss 3.5333 (3.6345) grad_norm 0.0000 (0.0000) [2022-10-11 20:30:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [159/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3412 (0.3352) loss 3.4085 (3.6371) grad_norm 0.0000 (0.0000) [2022-10-11 20:30:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [159/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3287 (0.3352) loss 3.5341 (3.6365) grad_norm 0.0000 (0.0000) [2022-10-11 20:31:14 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 159 training takes 0:06:58 [2022-10-11 20:31:18 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.428 (3.428) Loss 1.0042 (1.0042) Acc@1 75.879 (75.879) Acc@5 94.141 (94.141) [2022-10-11 20:31:30 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 76.534 Acc@5 93.514 [2022-10-11 20:31:30 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 76.5% [2022-10-11 20:31:30 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 76.77% [2022-10-11 20:31:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [160/300][0/1251] eta 1:12:43 lr 0.000001 time 3.4884 (3.4884) loss 3.6515 (3.6515) grad_norm 0.0000 (0.0000) [2022-10-11 20:32:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [160/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3218 (0.3666) loss 3.7624 (3.6187) grad_norm 0.0000 (0.0000) [2022-10-11 20:32:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [160/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3360 (0.3490) loss 3.8485 (3.6223) grad_norm 0.0000 (0.0000) [2022-10-11 20:33:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [160/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3477 (0.3431) loss 3.4807 (3.6310) grad_norm 0.0000 (0.0000) [2022-10-11 20:33:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [160/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3213 (0.3398) loss 3.5375 
(3.6312) grad_norm 0.0000 (0.0000) [2022-10-11 20:34:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [160/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3274 (0.3377) loss 3.6575 (3.6308) grad_norm 0.0000 (0.0000) [2022-10-11 20:34:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [160/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3132 (0.3365) loss 3.7912 (3.6288) grad_norm 0.0000 (0.0000) [2022-10-11 20:35:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [160/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3152 (0.3357) loss 3.4875 (3.6293) grad_norm 0.0000 (0.0000) [2022-10-11 20:35:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [160/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3488 (0.3350) loss 3.8332 (3.6304) grad_norm 0.0000 (0.0000) [2022-10-11 20:36:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [160/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3166 (0.3343) loss 3.4985 (3.6317) grad_norm 0.0000 (0.0000) [2022-10-11 20:37:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [160/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3311 (0.3338) loss 3.5316 (3.6323) grad_norm 0.0000 (0.0000) [2022-10-11 20:37:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [160/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3264 (0.3336) loss 3.7100 (3.6266) grad_norm 0.0000 (0.0000) [2022-10-11 20:38:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [160/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3068 (0.3334) loss 3.8978 (3.6284) grad_norm 0.0000 (0.0000) [2022-10-11 20:38:26 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 160 training takes 0:06:56 [2022-10-11 20:38:26 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_160 saving...... [2022-10-11 20:38:27 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_160 saved !!! 
[2022-10-11 20:38:30 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.225 (3.225) Loss 1.0204 (1.0204) Acc@1 76.855 (76.855) Acc@5 93.359 (93.359) [2022-10-11 20:38:41 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 76.798 Acc@5 93.478 [2022-10-11 20:38:41 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 76.8% [2022-10-11 20:38:41 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 76.80% [2022-10-11 20:38:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [161/300][0/1251] eta 1:16:06 lr 0.000001 time 3.6504 (3.6504) loss 3.2684 (3.2684) grad_norm 0.0000 (0.0000) [2022-10-11 20:39:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [161/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3412 (0.3667) loss 3.7110 (3.6221) grad_norm 0.0000 (0.0000) [2022-10-11 20:39:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [161/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3329 (0.3489) loss 3.5152 (3.6156) grad_norm 0.0000 (0.0000) [2022-10-11 20:40:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [161/300][300/1251] eta 0:05:25 lr 0.000001 time 0.2946 (0.3419) loss 3.4778 (3.6067) grad_norm 0.0000 (0.0000) [2022-10-11 20:40:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [161/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3028 (0.3386) loss 3.2827 (3.6072) grad_norm 0.0000 (0.0000) [2022-10-11 20:41:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [161/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3467 (0.3369) loss 3.5263 (3.6110) grad_norm 0.0000 (0.0000) [2022-10-11 20:42:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [161/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3139 (0.3357) loss 3.5352 (3.6159) grad_norm 0.0000 (0.0000) [2022-10-11 20:42:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [161/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3383 (0.3350) loss 3.3192 (3.6171) grad_norm 0.0000 
(0.0000) [2022-10-11 20:43:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [161/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3380 (0.3346) loss 3.6958 (3.6222) grad_norm 0.0000 (0.0000) [2022-10-11 20:43:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [161/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3114 (0.3343) loss 3.8406 (3.6234) grad_norm 0.0000 (0.0000) [2022-10-11 20:44:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [161/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3229 (0.3338) loss 3.5257 (3.6268) grad_norm 0.0000 (0.0000) [2022-10-11 20:44:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [161/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3398 (0.3335) loss 3.8779 (3.6260) grad_norm 0.0000 (0.0000) [2022-10-11 20:45:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [161/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3293 (0.3332) loss 3.6283 (3.6291) grad_norm 0.0000 (0.0000) [2022-10-11 20:45:38 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 161 training takes 0:06:56 [2022-10-11 20:45:41 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.140 (3.140) Loss 0.9961 (0.9961) Acc@1 76.855 (76.855) Acc@5 93.848 (93.848) [2022-10-11 20:45:53 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 76.884 Acc@5 93.442 [2022-10-11 20:45:53 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 76.9% [2022-10-11 20:45:53 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 76.88% [2022-10-11 20:45:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [162/300][0/1251] eta 1:13:29 lr 0.000001 time 3.5250 (3.5250) loss 3.8445 (3.8445) grad_norm 0.0000 (0.0000) [2022-10-11 20:46:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [162/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3308 (0.3678) loss 3.3730 (3.6061) grad_norm 0.0000 (0.0000) [2022-10-11 20:47:03 swin_tiny_patch4_window7_224] (main.py 163): 
INFO Train: [162/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3324 (0.3492) loss 3.5392 (3.5962) grad_norm 0.0000 (0.0000) [2022-10-11 20:47:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [162/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3262 (0.3427) loss 3.7484 (3.5915) grad_norm 0.0000 (0.0000) [2022-10-11 20:48:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [162/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3183 (0.3396) loss 3.4073 (3.5992) grad_norm 0.0000 (0.0000) [2022-10-11 20:48:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [162/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3118 (0.3378) loss 3.7648 (3.6038) grad_norm 0.0000 (0.0000) [2022-10-11 20:49:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [162/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3198 (0.3366) loss 3.5512 (3.6030) grad_norm 0.0000 (0.0000) [2022-10-11 20:49:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [162/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3295 (0.3356) loss 3.4896 (3.6086) grad_norm 0.0000 (0.0000) [2022-10-11 20:50:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [162/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3459 (0.3350) loss 3.6937 (3.6095) grad_norm 0.0000 (0.0000) [2022-10-11 20:50:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [162/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3277 (0.3346) loss 3.7995 (3.6109) grad_norm 0.0000 (0.0000) [2022-10-11 20:51:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [162/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3437 (0.3341) loss 3.4460 (3.6126) grad_norm 0.0000 (0.0000) [2022-10-11 20:52:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [162/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3247 (0.3338) loss 3.6884 (3.6125) grad_norm 0.0000 (0.0000) [2022-10-11 20:52:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [162/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3323 (0.3337) loss 3.9221 
(3.6123) grad_norm 0.0000 (0.0000) [2022-10-11 20:52:50 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 162 training takes 0:06:57 [2022-10-11 20:52:53 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.201 (3.201) Loss 0.9492 (0.9492) Acc@1 77.637 (77.637) Acc@5 93.555 (93.555) [2022-10-11 20:53:05 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 76.896 Acc@5 93.582 [2022-10-11 20:53:05 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 76.9% [2022-10-11 20:53:05 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 76.90% [2022-10-11 20:53:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [163/300][0/1251] eta 1:15:35 lr 0.000001 time 3.6254 (3.6254) loss 3.7503 (3.7503) grad_norm 0.0000 (0.0000) [2022-10-11 20:53:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [163/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3319 (0.3658) loss 3.0543 (3.6026) grad_norm 0.0000 (0.0000) [2022-10-11 20:54:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [163/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3286 (0.3481) loss 3.6803 (3.6012) grad_norm 0.0000 (0.0000) [2022-10-11 20:54:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [163/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3072 (0.3422) loss 3.5615 (3.6127) grad_norm 0.0000 (0.0000) [2022-10-11 20:55:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [163/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3362 (0.3389) loss 3.7136 (3.6138) grad_norm 0.0000 (0.0000) [2022-10-11 20:55:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [163/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3480 (0.3372) loss 3.5579 (3.6104) grad_norm 0.0000 (0.0000) [2022-10-11 20:56:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [163/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3475 (0.3363) loss 3.6042 (3.6160) grad_norm 0.0000 (0.0000) [2022-10-11 20:57:00 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [163/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3750 (0.3357) loss 3.9384 (3.6189) grad_norm 0.0000 (0.0000) [2022-10-11 20:57:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [163/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3371 (0.3352) loss 4.0311 (3.6198) grad_norm 0.0000 (0.0000) [2022-10-11 20:58:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [163/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3567 (0.3347) loss 3.7862 (3.6213) grad_norm 0.0000 (0.0000) [2022-10-11 20:58:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [163/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3381 (0.3343) loss 3.5761 (3.6217) grad_norm 0.0000 (0.0000) [2022-10-11 20:59:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [163/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3427 (0.3339) loss 3.5816 (3.6212) grad_norm 0.0000 (0.0000) [2022-10-11 20:59:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [163/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3344 (0.3336) loss 3.3426 (3.6215) grad_norm 0.0000 (0.0000) [2022-10-11 21:00:02 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 163 training takes 0:06:57 [2022-10-11 21:00:05 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.439 (3.439) Loss 0.9317 (0.9317) Acc@1 78.516 (78.516) Acc@5 93.945 (93.945) [2022-10-11 21:00:17 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 76.942 Acc@5 93.450 [2022-10-11 21:00:17 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 76.9% [2022-10-11 21:00:17 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 76.94% [2022-10-11 21:00:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [164/300][0/1251] eta 1:16:21 lr 0.000001 time 3.6624 (3.6624) loss 3.3284 (3.3284) grad_norm 0.0000 (0.0000) [2022-10-11 21:00:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [164/300][100/1251] 
eta 0:07:00 lr 0.000001 time 0.3347 (0.3656) loss 3.5613 (3.6091) grad_norm 0.0000 (0.0000) [2022-10-11 21:01:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [164/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3380 (0.3497) loss 3.5147 (3.6158) grad_norm 0.0000 (0.0000) [2022-10-11 21:02:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [164/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3661 (0.3430) loss 3.2779 (3.6115) grad_norm 0.0000 (0.0000) [2022-10-11 21:02:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [164/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3166 (0.3391) loss 3.5063 (3.6110) grad_norm 0.0000 (0.0000) [2022-10-11 21:03:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [164/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3020 (0.3369) loss 3.4770 (3.6121) grad_norm 0.0000 (0.0000) [2022-10-11 21:03:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [164/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3466 (0.3359) loss 3.7485 (3.6132) grad_norm 0.0000 (0.0000) [2022-10-11 21:04:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [164/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3470 (0.3353) loss 3.6721 (3.6131) grad_norm 0.0000 (0.0000) [2022-10-11 21:04:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [164/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3127 (0.3345) loss 3.5851 (3.6149) grad_norm 0.0000 (0.0000) [2022-10-11 21:05:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [164/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3456 (0.3344) loss 3.5274 (3.6171) grad_norm 0.0000 (0.0000) [2022-10-11 21:05:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [164/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3401 (0.3340) loss 3.4945 (3.6189) grad_norm 0.0000 (0.0000) [2022-10-11 21:06:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [164/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3308 (0.3336) loss 3.8243 (3.6188) grad_norm 0.0000 (0.0000) 
[2022-10-11 21:06:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [164/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3247 (0.3334) loss 3.8501 (3.6190) grad_norm 0.0000 (0.0000) [2022-10-11 21:07:14 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 164 training takes 0:06:56 [2022-10-11 21:07:17 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.082 (3.082) Loss 1.0046 (1.0046) Acc@1 76.562 (76.562) Acc@5 93.652 (93.652) [2022-10-11 21:07:28 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 76.750 Acc@5 93.452 [2022-10-11 21:07:28 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 76.7% [2022-10-11 21:07:28 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 76.94% [2022-10-11 21:07:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [165/300][0/1251] eta 1:17:58 lr 0.000001 time 3.7399 (3.7399) loss 3.6066 (3.6066) grad_norm 0.0000 (0.0000) [2022-10-11 21:08:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [165/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3063 (0.3657) loss 3.8360 (3.6344) grad_norm 0.0000 (0.0000) [2022-10-11 21:08:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [165/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3327 (0.3478) loss 3.5825 (3.6210) grad_norm 0.0000 (0.0000) [2022-10-11 21:09:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [165/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3168 (0.3420) loss 3.3641 (3.6152) grad_norm 0.0000 (0.0000) [2022-10-11 21:09:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [165/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3307 (0.3390) loss 3.5097 (3.6154) grad_norm 0.0000 (0.0000) [2022-10-11 21:10:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [165/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3121 (0.3374) loss 3.9306 (3.6158) grad_norm 0.0000 (0.0000) [2022-10-11 21:10:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[165/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3455 (0.3362) loss 3.8101 (3.6157) grad_norm 0.0000 (0.0000) [2022-10-11 21:11:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [165/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3295 (0.3355) loss 3.5977 (3.6135) grad_norm 0.0000 (0.0000) [2022-10-11 21:11:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [165/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3633 (0.3351) loss 3.5510 (3.6121) grad_norm 0.0000 (0.0000) [2022-10-11 21:12:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [165/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3353 (0.3345) loss 3.6650 (3.6119) grad_norm 0.0000 (0.0000) [2022-10-11 21:13:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [165/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3157 (0.3340) loss 3.7437 (3.6109) grad_norm 0.0000 (0.0000) [2022-10-11 21:13:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [165/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3295 (0.3337) loss 3.6339 (3.6097) grad_norm 0.0000 (0.0000) [2022-10-11 21:14:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [165/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3114 (0.3335) loss 3.6170 (3.6113) grad_norm 0.0000 (0.0000) [2022-10-11 21:14:25 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 165 training takes 0:06:56 [2022-10-11 21:14:29 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.316 (3.316) Loss 1.0006 (1.0006) Acc@1 76.855 (76.855) Acc@5 93.652 (93.652) [2022-10-11 21:14:40 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 77.050 Acc@5 93.614 [2022-10-11 21:14:40 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 77.0% [2022-10-11 21:14:40 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 77.05% [2022-10-11 21:14:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [166/300][0/1251] eta 1:08:38 lr 0.000001 time 3.2921 (3.2921) loss 3.9552 
(3.9552) grad_norm 0.0000 (0.0000) [2022-10-11 21:15:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [166/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3433 (0.3672) loss 3.8286 (3.6094) grad_norm 0.0000 (0.0000) [2022-10-11 21:15:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [166/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3579 (0.3499) loss 3.3859 (3.5980) grad_norm 0.0000 (0.0000) [2022-10-11 21:16:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [166/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3105 (0.3435) loss 3.7905 (3.6095) grad_norm 0.0000 (0.0000) [2022-10-11 21:16:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [166/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3462 (0.3405) loss 3.6919 (3.6114) grad_norm 0.0000 (0.0000) [2022-10-11 21:17:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [166/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3292 (0.3384) loss 3.6761 (3.6096) grad_norm 0.0000 (0.0000) [2022-10-11 21:18:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [166/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3610 (0.3372) loss 3.7667 (3.6140) grad_norm 0.0000 (0.0000) [2022-10-11 21:18:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [166/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3262 (0.3361) loss 3.8809 (3.6120) grad_norm 0.0000 (0.0000) [2022-10-11 21:19:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [166/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3250 (0.3353) loss 3.5489 (3.6079) grad_norm 0.0000 (0.0000) [2022-10-11 21:19:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [166/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3114 (0.3345) loss 3.8985 (3.6095) grad_norm 0.0000 (0.0000) [2022-10-11 21:20:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [166/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3350 (0.3342) loss 3.6803 (3.6105) grad_norm 0.0000 (0.0000) [2022-10-11 21:20:48 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [166/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3199 (0.3338) loss 3.8401 (3.6110) grad_norm 0.0000 (0.0000) [2022-10-11 21:21:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [166/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3655 (0.3336) loss 3.5718 (3.6129) grad_norm 0.0000 (0.0000) [2022-10-11 21:21:38 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 166 training takes 0:06:57 [2022-10-11 21:21:41 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.144 (3.144) Loss 1.0183 (1.0183) Acc@1 75.684 (75.684) Acc@5 93.164 (93.164) [2022-10-11 21:21:53 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 77.002 Acc@5 93.524 [2022-10-11 21:21:53 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 77.0% [2022-10-11 21:21:53 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 77.05% [2022-10-11 21:21:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [167/300][0/1251] eta 1:14:39 lr 0.000001 time 3.5808 (3.5808) loss 3.6407 (3.6407) grad_norm 0.0000 (0.0000) [2022-10-11 21:22:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [167/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3061 (0.3680) loss 3.4633 (3.5945) grad_norm 0.0000 (0.0000) [2022-10-11 21:23:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [167/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3186 (0.3497) loss 3.5802 (3.5970) grad_norm 0.0000 (0.0000) [2022-10-11 21:23:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [167/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3164 (0.3436) loss 3.8325 (3.6015) grad_norm 0.0000 (0.0000) [2022-10-11 21:24:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [167/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3265 (0.3407) loss 3.9113 (3.6009) grad_norm 0.0000 (0.0000) [2022-10-11 21:24:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [167/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3712 
(0.3387) loss 3.7446 (3.6037) grad_norm 0.0000 (0.0000) [2022-10-11 21:25:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [167/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3138 (0.3374) loss 3.7146 (3.6002) grad_norm 0.0000 (0.0000) [2022-10-11 21:25:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [167/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3209 (0.3365) loss 3.6348 (3.6019) grad_norm 0.0000 (0.0000) [2022-10-11 21:26:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [167/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3268 (0.3357) loss 3.4534 (3.5999) grad_norm 0.0000 (0.0000) [2022-10-11 21:26:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [167/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3353 (0.3352) loss 3.4760 (3.5993) grad_norm 0.0000 (0.0000) [2022-10-11 21:27:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [167/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3302 (0.3348) loss 3.6578 (3.6002) grad_norm 0.0000 (0.0000) [2022-10-11 21:28:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [167/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3234 (0.3344) loss 3.6866 (3.5990) grad_norm 0.0000 (0.0000) [2022-10-11 21:28:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [167/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3320 (0.3343) loss 3.9954 (3.5993) grad_norm 0.0000 (0.0000) [2022-10-11 21:28:51 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 167 training takes 0:06:57 [2022-10-11 21:28:54 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.098 (3.098) Loss 0.9989 (0.9989) Acc@1 77.930 (77.930) Acc@5 93.555 (93.555) [2022-10-11 21:29:06 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 77.068 Acc@5 93.658 [2022-10-11 21:29:06 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 77.1% [2022-10-11 21:29:06 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 77.07% [2022-10-11 21:29:09 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [168/300][0/1251] eta 1:12:19 lr 0.000001 time 3.4689 (3.4689) loss 3.6527 (3.6527) grad_norm 0.0000 (0.0000) [2022-10-11 21:29:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [168/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3078 (0.3678) loss 3.4719 (3.5924) grad_norm 0.0000 (0.0000) [2022-10-11 21:30:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [168/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3241 (0.3499) loss 3.4972 (3.5910) grad_norm 0.0000 (0.0000) [2022-10-11 21:30:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [168/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3108 (0.3440) loss 3.6297 (3.5860) grad_norm 0.0000 (0.0000) [2022-10-11 21:31:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [168/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3469 (0.3404) loss 3.6126 (3.5857) grad_norm 0.0000 (0.0000) [2022-10-11 21:31:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [168/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3230 (0.3387) loss 3.6459 (3.5877) grad_norm 0.0000 (0.0000) [2022-10-11 21:32:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [168/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3257 (0.3375) loss 3.7765 (3.5892) grad_norm 0.0000 (0.0000) [2022-10-11 21:33:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [168/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3226 (0.3367) loss 3.7252 (3.5883) grad_norm 0.0000 (0.0000) [2022-10-11 21:33:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [168/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3428 (0.3361) loss 3.4408 (3.5879) grad_norm 0.0000 (0.0000) [2022-10-11 21:34:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [168/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3115 (0.3357) loss 3.5160 (3.5933) grad_norm 0.0000 (0.0000) [2022-10-11 21:34:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [168/300][1000/1251] eta 0:01:24 lr 0.000001 
time 0.3104 (0.3351) loss 3.4395 (3.5937) grad_norm 0.0000 (0.0000) [2022-10-11 21:35:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [168/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3144 (0.3349) loss 3.7723 (3.5932) grad_norm 0.0000 (0.0000) [2022-10-11 21:35:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [168/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3424 (0.3346) loss 3.4360 (3.5913) grad_norm 0.0000 (0.0000) [2022-10-11 21:36:04 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 168 training takes 0:06:58 [2022-10-11 21:36:07 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.055 (3.055) Loss 0.9267 (0.9267) Acc@1 78.613 (78.613) Acc@5 94.531 (94.531) [2022-10-11 21:36:19 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 76.832 Acc@5 93.568 [2022-10-11 21:36:19 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 76.8% [2022-10-11 21:36:19 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 77.07% [2022-10-11 21:36:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [169/300][0/1251] eta 1:12:12 lr 0.000001 time 3.4628 (3.4628) loss 3.7724 (3.7724) grad_norm 0.0000 (0.0000) [2022-10-11 21:36:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [169/300][100/1251] eta 0:06:59 lr 0.000001 time 0.3413 (0.3647) loss 3.8508 (3.5744) grad_norm 0.0000 (0.0000) [2022-10-11 21:37:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [169/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3180 (0.3481) loss 3.6904 (3.5783) grad_norm 0.0000 (0.0000) [2022-10-11 21:38:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [169/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3171 (0.3431) loss 3.7291 (3.5774) grad_norm 0.0000 (0.0000) [2022-10-11 21:38:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [169/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3155 (0.3399) loss 3.6369 (3.5809) grad_norm 0.0000 (0.0000) [2022-10-11 
21:39:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [169/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3130 (0.3381) loss 3.7840 (3.5846) grad_norm 0.0000 (0.0000) [2022-10-11 21:39:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [169/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3128 (0.3370) loss 3.5067 (3.5864) grad_norm 0.0000 (0.0000) [2022-10-11 21:40:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [169/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3561 (0.3359) loss 3.2838 (3.5842) grad_norm 0.0000 (0.0000) [2022-10-11 21:40:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [169/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3255 (0.3353) loss 3.3501 (3.5869) grad_norm 0.0000 (0.0000) [2022-10-11 21:41:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [169/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3082 (0.3347) loss 3.4135 (3.5866) grad_norm 0.0000 (0.0000) [2022-10-11 21:41:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [169/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3280 (0.3342) loss 3.7604 (3.5919) grad_norm 0.0000 (0.0000) [2022-10-11 21:42:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [169/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3362 (0.3338) loss 3.8204 (3.5939) grad_norm 0.0000 (0.0000) [2022-10-11 21:42:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [169/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3250 (0.3335) loss 3.6283 (3.5946) grad_norm 0.0000 (0.0000) [2022-10-11 21:43:16 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 169 training takes 0:06:56 [2022-10-11 21:43:19 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.250 (3.250) Loss 1.0368 (1.0368) Acc@1 74.121 (74.121) Acc@5 92.480 (92.480) [2022-10-11 21:43:31 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 76.906 Acc@5 93.532 [2022-10-11 21:43:31 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test 
images: 76.9% [2022-10-11 21:43:31 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 77.07% [2022-10-11 21:43:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [170/300][0/1251] eta 1:12:49 lr 0.000001 time 3.4926 (3.4926) loss 3.6372 (3.6372) grad_norm 0.0000 (0.0000) [2022-10-11 21:44:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [170/300][100/1251] eta 0:06:59 lr 0.000001 time 0.3238 (0.3649) loss 3.7309 (3.5666) grad_norm 0.0000 (0.0000) [2022-10-11 21:44:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [170/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3248 (0.3480) loss 3.3362 (3.5680) grad_norm 0.0000 (0.0000) [2022-10-11 21:45:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [170/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3433 (0.3421) loss 3.6845 (3.5827) grad_norm 0.0000 (0.0000) [2022-10-11 21:45:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [170/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3269 (0.3396) loss 3.8038 (3.5856) grad_norm 0.0000 (0.0000) [2022-10-11 21:46:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [170/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3308 (0.3374) loss 3.6941 (3.5860) grad_norm 0.0000 (0.0000) [2022-10-11 21:46:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [170/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3384 (0.3364) loss 3.5910 (3.5807) grad_norm 0.0000 (0.0000) [2022-10-11 21:47:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [170/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3202 (0.3354) loss 3.4420 (3.5849) grad_norm 0.0000 (0.0000) [2022-10-11 21:47:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [170/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3314 (0.3348) loss 3.7777 (3.5852) grad_norm 0.0000 (0.0000) [2022-10-11 21:48:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [170/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3312 (0.3342) loss 3.3320 (3.5894) grad_norm 0.0000 
(0.0000) [2022-10-11 21:49:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [170/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3406 (0.3339) loss 3.6157 (3.5897) grad_norm 0.0000 (0.0000) [2022-10-11 21:49:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [170/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3347 (0.3336) loss 3.6591 (3.5896) grad_norm 0.0000 (0.0000) [2022-10-11 21:50:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [170/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3274 (0.3334) loss 3.5461 (3.5899) grad_norm 0.0000 (0.0000) [2022-10-11 21:50:27 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 170 training takes 0:06:56 [2022-10-11 21:50:27 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_170 saving...... [2022-10-11 21:50:27 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_170 saved !!! [2022-10-11 21:50:31 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.351 (3.351) Loss 0.9168 (0.9168) Acc@1 78.516 (78.516) Acc@5 94.922 (94.922) [2022-10-11 21:50:42 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 76.874 Acc@5 93.528 [2022-10-11 21:50:42 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 76.9% [2022-10-11 21:50:42 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 77.07% [2022-10-11 21:50:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [171/300][0/1251] eta 1:15:55 lr 0.000001 time 3.6415 (3.6415) loss 3.7928 (3.7928) grad_norm 0.0000 (0.0000) [2022-10-11 21:51:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [171/300][100/1251] eta 0:06:59 lr 0.000001 time 0.3277 (0.3641) loss 3.5593 (3.5634) grad_norm 0.0000 (0.0000) [2022-10-11 21:51:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [171/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3355 (0.3482) loss 3.3415 
(3.5767) grad_norm 0.0000 (0.0000) [2022-10-11 21:52:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [171/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3115 (0.3430) loss 3.7640 (3.5781) grad_norm 0.0000 (0.0000) [2022-10-11 21:52:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [171/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3457 (0.3402) loss 3.4996 (3.5816) grad_norm 0.0000 (0.0000) [2022-10-11 21:53:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [171/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3325 (0.3385) loss 3.9286 (3.5801) grad_norm 0.0000 (0.0000) [2022-10-11 21:54:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [171/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3587 (0.3375) loss 3.4526 (3.5824) grad_norm 0.0000 (0.0000) [2022-10-11 21:54:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [171/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3131 (0.3369) loss 3.4264 (3.5848) grad_norm 0.0000 (0.0000) [2022-10-11 21:55:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [171/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3560 (0.3361) loss 3.5333 (3.5865) grad_norm 0.0000 (0.0000) [2022-10-11 21:55:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [171/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3729 (0.3356) loss 3.4945 (3.5868) grad_norm 0.0000 (0.0000) [2022-10-11 21:56:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [171/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3813 (0.3351) loss 3.3532 (3.5856) grad_norm 0.0000 (0.0000) [2022-10-11 21:56:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [171/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3537 (0.3348) loss 3.6813 (3.5879) grad_norm 0.0000 (0.0000) [2022-10-11 21:57:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [171/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3057 (0.3344) loss 3.6975 (3.5915) grad_norm 0.0000 (0.0000) [2022-10-11 21:57:40 swin_tiny_patch4_window7_224] (main.py 
171): INFO EPOCH 171 training takes 0:06:57 [2022-10-11 21:57:44 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.367 (3.367) Loss 0.9532 (0.9532) Acc@1 77.930 (77.930) Acc@5 94.043 (94.043) [2022-10-11 21:57:55 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 76.954 Acc@5 93.650 [2022-10-11 21:57:55 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 77.0% [2022-10-11 21:57:55 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 77.07% [2022-10-11 21:57:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [172/300][0/1251] eta 1:11:38 lr 0.000001 time 3.4361 (3.4361) loss 3.6766 (3.6766) grad_norm 0.0000 (0.0000) [2022-10-11 21:58:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [172/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3674 (0.3650) loss 3.5502 (3.5552) grad_norm 0.0000 (0.0000) [2022-10-11 21:59:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [172/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3163 (0.3489) loss 3.6511 (3.5458) grad_norm 0.0000 (0.0000) [2022-10-11 21:59:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [172/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3690 (0.3436) loss 3.6462 (3.5533) grad_norm 0.0000 (0.0000) [2022-10-11 22:00:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [172/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3459 (0.3412) loss 3.6312 (3.5546) grad_norm 0.0000 (0.0000) [2022-10-11 22:00:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [172/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3162 (0.3393) loss 3.4737 (3.5575) grad_norm 0.0000 (0.0000) [2022-10-11 22:01:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [172/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3463 (0.3382) loss 3.7813 (3.5609) grad_norm 0.0000 (0.0000) [2022-10-11 22:01:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [172/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3162 
(0.3373) loss 3.7825 (3.5644) grad_norm 0.0000 (0.0000) [2022-10-11 22:02:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [172/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3217 (0.3366) loss 3.6920 (3.5681) grad_norm 0.0000 (0.0000) [2022-10-11 22:02:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [172/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3000 (0.3358) loss 3.6689 (3.5668) grad_norm 0.0000 (0.0000) [2022-10-11 22:03:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [172/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3196 (0.3353) loss 3.4547 (3.5683) grad_norm 0.0000 (0.0000) [2022-10-11 22:04:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [172/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3296 (0.3348) loss 3.1386 (3.5705) grad_norm 0.0000 (0.0000) [2022-10-11 22:04:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [172/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3242 (0.3345) loss 3.6523 (3.5725) grad_norm 0.0000 (0.0000) [2022-10-11 22:04:53 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 172 training takes 0:06:58 [2022-10-11 22:04:57 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.559 (3.559) Loss 1.0497 (1.0497) Acc@1 75.000 (75.000) Acc@5 93.555 (93.555) [2022-10-11 22:05:08 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 77.096 Acc@5 93.650 [2022-10-11 22:05:08 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 77.1% [2022-10-11 22:05:08 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 77.10% [2022-10-11 22:05:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [173/300][0/1251] eta 1:01:57 lr 0.000001 time 2.9719 (2.9719) loss 3.6389 (3.6389) grad_norm 0.0000 (0.0000) [2022-10-11 22:05:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [173/300][100/1251] eta 0:06:57 lr 0.000001 time 0.3274 (0.3626) loss 3.5515 (3.5622) grad_norm 0.0000 (0.0000) [2022-10-11 22:06:18 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [173/300][200/1251] eta 0:06:04 lr 0.000001 time 0.4080 (0.3471) loss 3.3870 (3.5747) grad_norm 0.0000 (0.0000) [2022-10-11 22:06:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [173/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3099 (0.3420) loss 3.4145 (3.5662) grad_norm 0.0000 (0.0000) [2022-10-11 22:07:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [173/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3129 (0.3393) loss 3.9412 (3.5691) grad_norm 0.0000 (0.0000) [2022-10-11 22:07:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [173/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3259 (0.3380) loss 3.2051 (3.5729) grad_norm 0.0000 (0.0000) [2022-10-11 22:08:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [173/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3390 (0.3366) loss 3.4862 (3.5721) grad_norm 0.0000 (0.0000) [2022-10-11 22:09:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [173/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3317 (0.3361) loss 3.7464 (3.5754) grad_norm 0.0000 (0.0000) [2022-10-11 22:09:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [173/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3334 (0.3356) loss 3.6712 (3.5752) grad_norm 0.0000 (0.0000) [2022-10-11 22:10:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [173/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3154 (0.3349) loss 3.3563 (3.5775) grad_norm 0.0000 (0.0000) [2022-10-11 22:10:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [173/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3038 (0.3343) loss 3.6645 (3.5783) grad_norm 0.0000 (0.0000) [2022-10-11 22:11:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [173/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3139 (0.3342) loss 3.6921 (3.5780) grad_norm 0.0000 (0.0000) [2022-10-11 22:11:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [173/300][1200/1251] eta 0:00:17 lr 
0.000001 time 0.3416 (0.3339) loss 3.5379 (3.5784) grad_norm 0.0000 (0.0000) [2022-10-11 22:12:06 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 173 training takes 0:06:57 [2022-10-11 22:12:09 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.445 (3.445) Loss 1.0381 (1.0381) Acc@1 76.367 (76.367) Acc@5 92.480 (92.480) [2022-10-11 22:12:21 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 77.038 Acc@5 93.676 [2022-10-11 22:12:21 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 77.0% [2022-10-11 22:12:21 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 77.10% [2022-10-11 22:12:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [174/300][0/1251] eta 1:18:07 lr 0.000001 time 3.7473 (3.7473) loss 3.3319 (3.3319) grad_norm 0.0000 (0.0000) [2022-10-11 22:12:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [174/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3240 (0.3651) loss 3.5459 (3.5592) grad_norm 0.0000 (0.0000) [2022-10-11 22:13:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [174/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3267 (0.3477) loss 3.4977 (3.5628) grad_norm 0.0000 (0.0000) [2022-10-11 22:14:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [174/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3384 (0.3416) loss 3.3229 (3.5642) grad_norm 0.0000 (0.0000) [2022-10-11 22:14:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [174/300][400/1251] eta 0:04:47 lr 0.000001 time 0.3476 (0.3384) loss 3.6051 (3.5607) grad_norm 0.0000 (0.0000) [2022-10-11 22:15:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [174/300][500/1251] eta 0:04:12 lr 0.000001 time 0.3605 (0.3367) loss 3.3974 (3.5641) grad_norm 0.0000 (0.0000) [2022-10-11 22:15:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [174/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3289 (0.3356) loss 3.4683 (3.5645) grad_norm 0.0000 (0.0000) 
[2022-10-11 22:16:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [174/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3305 (0.3349) loss 3.3959 (3.5618) grad_norm 0.0000 (0.0000) [2022-10-11 22:16:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [174/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3048 (0.3343) loss 3.1398 (3.5594) grad_norm 0.0000 (0.0000) [2022-10-11 22:17:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [174/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3338 (0.3340) loss 3.5190 (3.5620) grad_norm 0.0000 (0.0000) [2022-10-11 22:17:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [174/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3202 (0.3336) loss 3.7938 (3.5619) grad_norm 0.0000 (0.0000) [2022-10-11 22:18:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [174/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3662 (0.3333) loss 3.5176 (3.5664) grad_norm 0.0000 (0.0000) [2022-10-11 22:19:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [174/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3418 (0.3330) loss 3.3190 (3.5677) grad_norm 0.0000 (0.0000) [2022-10-11 22:19:17 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 174 training takes 0:06:56 [2022-10-11 22:19:20 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 2.776 (2.776) Loss 0.9902 (0.9902) Acc@1 76.855 (76.855) Acc@5 93.945 (93.945) [2022-10-11 22:19:32 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 77.326 Acc@5 93.642 [2022-10-11 22:19:32 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 77.3% [2022-10-11 22:19:32 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 77.33% [2022-10-11 22:19:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [175/300][0/1251] eta 1:13:13 lr 0.000001 time 3.5118 (3.5118) loss 3.7082 (3.7082) grad_norm 0.0000 (0.0000) [2022-10-11 22:20:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[175/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3374 (0.3683) loss 3.4698 (3.5744) grad_norm 0.0000 (0.0000) [2022-10-11 22:20:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [175/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3268 (0.3500) loss 3.4163 (3.5575) grad_norm 0.0000 (0.0000) [2022-10-11 22:21:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [175/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3137 (0.3440) loss 3.4122 (3.5570) grad_norm 0.0000 (0.0000) [2022-10-11 22:21:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [175/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3322 (0.3409) loss 3.6478 (3.5598) grad_norm 0.0000 (0.0000) [2022-10-11 22:22:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [175/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3264 (0.3392) loss 3.5352 (3.5584) grad_norm 0.0000 (0.0000) [2022-10-11 22:22:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [175/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3292 (0.3375) loss 3.7152 (3.5599) grad_norm 0.0000 (0.0000) [2022-10-11 22:23:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [175/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3170 (0.3365) loss 3.6388 (3.5603) grad_norm 0.0000 (0.0000) [2022-10-11 22:24:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [175/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3552 (0.3359) loss 3.6521 (3.5645) grad_norm 0.0000 (0.0000) [2022-10-11 22:24:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [175/300][900/1251] eta 0:01:57 lr 0.000001 time 0.2993 (0.3354) loss 3.4150 (3.5655) grad_norm 0.0000 (0.0000) [2022-10-11 22:25:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [175/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3367 (0.3351) loss 3.7477 (3.5656) grad_norm 0.0000 (0.0000) [2022-10-11 22:25:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [175/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3127 (0.3347) loss 3.5655 (3.5676) grad_norm 
0.0000 (0.0000) [2022-10-11 22:26:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [175/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3250 (0.3342) loss 3.5765 (3.5690) grad_norm 0.0000 (0.0000) [2022-10-11 22:26:30 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 175 training takes 0:06:57 [2022-10-11 22:26:33 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.187 (3.187) Loss 0.9716 (0.9716) Acc@1 76.465 (76.465) Acc@5 93.945 (93.945) [2022-10-11 22:26:45 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 76.942 Acc@5 93.622 [2022-10-11 22:26:45 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 76.9% [2022-10-11 22:26:45 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 77.33% [2022-10-11 22:26:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [176/300][0/1251] eta 1:12:19 lr 0.000001 time 3.4688 (3.4688) loss 3.6027 (3.6027) grad_norm 0.0000 (0.0000) [2022-10-11 22:27:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [176/300][100/1251] eta 0:06:59 lr 0.000001 time 0.3396 (0.3647) loss 3.5612 (3.5596) grad_norm 0.0000 (0.0000) [2022-10-11 22:27:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [176/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3644 (0.3479) loss 3.5176 (3.5576) grad_norm 0.0000 (0.0000) [2022-10-11 22:28:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [176/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3368 (0.3432) loss 3.5261 (3.5585) grad_norm 0.0000 (0.0000) [2022-10-11 22:29:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [176/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3167 (0.3400) loss 3.3449 (3.5652) grad_norm 0.0000 (0.0000) [2022-10-11 22:29:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [176/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3184 (0.3381) loss 3.5405 (3.5666) grad_norm 0.0000 (0.0000) [2022-10-11 22:30:07 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [176/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3064 (0.3367) loss 3.6507 (3.5669) grad_norm 0.0000 (0.0000) [2022-10-11 22:30:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [176/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3454 (0.3360) loss 3.6932 (3.5644) grad_norm 0.0000 (0.0000) [2022-10-11 22:31:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [176/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3252 (0.3354) loss 3.5409 (3.5648) grad_norm 0.0000 (0.0000) [2022-10-11 22:31:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [176/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3429 (0.3347) loss 3.4771 (3.5632) grad_norm 0.0000 (0.0000) [2022-10-11 22:32:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [176/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3138 (0.3343) loss 3.7499 (3.5645) grad_norm 0.0000 (0.0000) [2022-10-11 22:32:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [176/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3452 (0.3342) loss 3.5419 (3.5643) grad_norm 0.0000 (0.0000) [2022-10-11 22:33:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [176/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3186 (0.3338) loss 3.3849 (3.5666) grad_norm 0.0000 (0.0000) [2022-10-11 22:33:42 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 176 training takes 0:06:57 [2022-10-11 22:33:45 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.388 (3.388) Loss 0.9938 (0.9938) Acc@1 76.855 (76.855) Acc@5 94.043 (94.043) [2022-10-11 22:33:57 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 77.288 Acc@5 93.696 [2022-10-11 22:33:57 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 77.3% [2022-10-11 22:33:57 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 77.33% [2022-10-11 22:34:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [177/300][0/1251] eta 1:04:41 lr 0.000001 time 3.1027 
(3.1027) loss 3.9276 (3.9276) grad_norm 0.0000 (0.0000) [2022-10-11 22:34:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [177/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3287 (0.3654) loss 3.6719 (3.5721) grad_norm 0.0000 (0.0000) [2022-10-11 22:35:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [177/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3517 (0.3500) loss 3.4062 (3.5547) grad_norm 0.0000 (0.0000) [2022-10-11 22:35:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [177/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3370 (0.3443) loss 3.4480 (3.5516) grad_norm 0.0000 (0.0000) [2022-10-11 22:36:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [177/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3258 (0.3410) loss 3.6961 (3.5442) grad_norm 0.0000 (0.0000) [2022-10-11 22:36:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [177/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3149 (0.3391) loss 3.4568 (3.5414) grad_norm 0.0000 (0.0000) [2022-10-11 22:37:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [177/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3598 (0.3380) loss 3.6788 (3.5479) grad_norm 0.0000 (0.0000) [2022-10-11 22:37:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [177/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3297 (0.3373) loss 3.3895 (3.5490) grad_norm 0.0000 (0.0000) [2022-10-11 22:38:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [177/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3423 (0.3368) loss 3.3212 (3.5528) grad_norm 0.0000 (0.0000) [2022-10-11 22:39:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [177/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3303 (0.3362) loss 3.4689 (3.5524) grad_norm 0.0000 (0.0000) [2022-10-11 22:39:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [177/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3385 (0.3357) loss 3.8300 (3.5559) grad_norm 0.0000 (0.0000) [2022-10-11 22:40:06 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [177/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3148 (0.3354) loss 3.2647 (3.5573) grad_norm 0.0000 (0.0000) [2022-10-11 22:40:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [177/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3380 (0.3351) loss 3.5941 (3.5594) grad_norm 0.0000 (0.0000) [2022-10-11 22:40:56 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 177 training takes 0:06:58 [2022-10-11 22:40:59 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.292 (3.292) Loss 0.9040 (0.9040) Acc@1 78.418 (78.418) Acc@5 95.117 (95.117) [2022-10-11 22:41:11 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 77.582 Acc@5 93.854 [2022-10-11 22:41:11 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 77.6% [2022-10-11 22:41:11 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 77.58% [2022-10-11 22:41:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [178/300][0/1251] eta 1:12:34 lr 0.000001 time 3.4808 (3.4808) loss 3.6423 (3.6423) grad_norm 0.0000 (0.0000) [2022-10-11 22:41:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [178/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3350 (0.3666) loss 3.5635 (3.5139) grad_norm 0.0000 (0.0000) [2022-10-11 22:42:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [178/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3304 (0.3505) loss 3.4115 (3.5464) grad_norm 0.0000 (0.0000) [2022-10-11 22:42:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [178/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3403 (0.3443) loss 3.6614 (3.5449) grad_norm 0.0000 (0.0000) [2022-10-11 22:43:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [178/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3154 (0.3411) loss 3.3906 (3.5501) grad_norm 0.0000 (0.0000) [2022-10-11 22:44:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [178/300][500/1251] 
eta 0:04:14 lr 0.000001 time 0.3588 (0.3392) loss 3.3057 (3.5505) grad_norm 0.0000 (0.0000) [2022-10-11 22:44:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [178/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3191 (0.3380) loss 3.3906 (3.5492) grad_norm 0.0000 (0.0000) [2022-10-11 22:45:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [178/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3281 (0.3370) loss 3.4253 (3.5532) grad_norm 0.0000 (0.0000) [2022-10-11 22:45:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [178/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3484 (0.3363) loss 3.7335 (3.5562) grad_norm 0.0000 (0.0000) [2022-10-11 22:46:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [178/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3376 (0.3358) loss 3.4807 (3.5562) grad_norm 0.0000 (0.0000) [2022-10-11 22:46:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [178/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3198 (0.3357) loss 3.2812 (3.5545) grad_norm 0.0000 (0.0000) [2022-10-11 22:47:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [178/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3230 (0.3352) loss 3.9676 (3.5558) grad_norm 0.0000 (0.0000) [2022-10-11 22:47:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [178/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3375 (0.3350) loss 3.7063 (3.5575) grad_norm 0.0000 (0.0000) [2022-10-11 22:48:09 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 178 training takes 0:06:58 [2022-10-11 22:48:13 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.428 (3.428) Loss 1.0082 (1.0082) Acc@1 76.758 (76.758) Acc@5 93.164 (93.164) [2022-10-11 22:48:25 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 77.286 Acc@5 93.602 [2022-10-11 22:48:25 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 77.3% [2022-10-11 22:48:25 swin_tiny_patch4_window7_224] (main.py 121): INFO Max 
accuracy: 77.58% [2022-10-11 22:48:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [179/300][0/1251] eta 1:10:44 lr 0.000001 time 3.3933 (3.3933) loss 3.4918 (3.4918) grad_norm 0.0000 (0.0000) [2022-10-11 22:49:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [179/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3323 (0.3666) loss 3.4990 (3.5411) grad_norm 0.0000 (0.0000) [2022-10-11 22:49:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [179/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3353 (0.3499) loss 3.5177 (3.5519) grad_norm 0.0000 (0.0000) [2022-10-11 22:50:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [179/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3270 (0.3445) loss 3.7029 (3.5455) grad_norm 0.0000 (0.0000) [2022-10-11 22:50:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [179/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3476 (0.3417) loss 3.4680 (3.5472) grad_norm 0.0000 (0.0000) [2022-10-11 22:51:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [179/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3254 (0.3398) loss 3.7979 (3.5471) grad_norm 0.0000 (0.0000) [2022-10-11 22:51:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [179/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3243 (0.3389) loss 3.6564 (3.5457) grad_norm 0.0000 (0.0000) [2022-10-11 22:52:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [179/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3193 (0.3380) loss 3.6798 (3.5474) grad_norm 0.0000 (0.0000) [2022-10-11 22:52:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [179/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3221 (0.3373) loss 3.7325 (3.5463) grad_norm 0.0000 (0.0000) [2022-10-11 22:53:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [179/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3213 (0.3365) loss 3.8480 (3.5497) grad_norm 0.0000 (0.0000) [2022-10-11 22:54:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[179/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3317 (0.3358) loss 3.6753 (3.5500) grad_norm 0.0000 (0.0000) [2022-10-11 22:54:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [179/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3573 (0.3355) loss 3.4297 (3.5497) grad_norm 0.0000 (0.0000) [2022-10-11 22:55:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [179/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3488 (0.3351) loss 3.3945 (3.5488) grad_norm 0.0000 (0.0000) [2022-10-11 22:55:23 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 179 training takes 0:06:58 [2022-10-11 22:55:27 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.501 (3.501) Loss 1.0428 (1.0428) Acc@1 75.684 (75.684) Acc@5 92.773 (92.773) [2022-10-11 22:55:38 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 77.394 Acc@5 93.828 [2022-10-11 22:55:38 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 77.4% [2022-10-11 22:55:38 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 77.58% [2022-10-11 22:55:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [180/300][0/1251] eta 1:16:02 lr 0.000001 time 3.6474 (3.6474) loss 3.2977 (3.2977) grad_norm 0.0000 (0.0000) [2022-10-11 22:56:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [180/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3107 (0.3683) loss 3.5531 (3.5440) grad_norm 0.0000 (0.0000) [2022-10-11 22:56:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [180/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3455 (0.3500) loss 3.4609 (3.5225) grad_norm 0.0000 (0.0000) [2022-10-11 22:57:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [180/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3978 (0.3438) loss 3.5376 (3.5233) grad_norm 0.0000 (0.0000) [2022-10-11 22:57:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [180/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3096 (0.3408) loss 3.5506 
(3.5272) grad_norm 0.0000 (0.0000) [2022-10-11 22:58:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [180/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3493 (0.3388) loss 3.3080 (3.5315) grad_norm 0.0000 (0.0000) [2022-10-11 22:59:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [180/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3259 (0.3379) loss 3.4392 (3.5356) grad_norm 0.0000 (0.0000) [2022-10-11 22:59:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [180/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3513 (0.3370) loss 3.8428 (3.5346) grad_norm 0.0000 (0.0000) [2022-10-11 23:00:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [180/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3416 (0.3362) loss 3.4902 (3.5387) grad_norm 0.0000 (0.0000) [2022-10-11 23:00:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [180/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3288 (0.3356) loss 3.5415 (3.5399) grad_norm 0.0000 (0.0000) [2022-10-11 23:01:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [180/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3290 (0.3351) loss 3.6062 (3.5388) grad_norm 0.0000 (0.0000) [2022-10-11 23:01:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [180/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3366 (0.3347) loss 3.7762 (3.5409) grad_norm 0.0000 (0.0000) [2022-10-11 23:02:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [180/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3232 (0.3345) loss 3.5450 (3.5411) grad_norm 0.0000 (0.0000) [2022-10-11 23:02:37 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 180 training takes 0:06:58 [2022-10-11 23:02:37 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_180 saving...... [2022-10-11 23:02:37 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_180 saved !!! 
[2022-10-11 23:02:40 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.325 (3.325) Loss 1.0540 (1.0540) Acc@1 76.758 (76.758) Acc@5 92.773 (92.773) [2022-10-11 23:02:52 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 77.544 Acc@5 93.746 [2022-10-11 23:02:52 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 77.5% [2022-10-11 23:02:52 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 77.58% [2022-10-11 23:02:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [181/300][0/1251] eta 1:10:11 lr 0.000001 time 3.3668 (3.3668) loss 3.3339 (3.3339) grad_norm 0.0000 (0.0000) [2022-10-11 23:03:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [181/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3483 (0.3659) loss 3.6549 (3.5166) grad_norm 0.0000 (0.0000) [2022-10-11 23:04:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [181/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3361 (0.3480) loss 3.6937 (3.5347) grad_norm 0.0000 (0.0000) [2022-10-11 23:04:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [181/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3329 (0.3421) loss 3.4783 (3.5365) grad_norm 0.0000 (0.0000) [2022-10-11 23:05:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [181/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3490 (0.3395) loss 3.3709 (3.5310) grad_norm 0.0000 (0.0000) [2022-10-11 23:05:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [181/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3216 (0.3379) loss 3.3355 (3.5318) grad_norm 0.0000 (0.0000) [2022-10-11 23:06:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [181/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3151 (0.3370) loss 3.4776 (3.5344) grad_norm 0.0000 (0.0000) [2022-10-11 23:06:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [181/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3167 (0.3364) loss 3.6312 (3.5368) grad_norm 0.0000 
(0.0000) [2022-10-11 23:07:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [181/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3090 (0.3358) loss 4.0079 (3.5388) grad_norm 0.0000 (0.0000) [2022-10-11 23:07:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [181/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3162 (0.3350) loss 3.5355 (3.5382) grad_norm 0.0000 (0.0000) [2022-10-11 23:08:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [181/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3383 (0.3344) loss 3.4651 (3.5403) grad_norm 0.0000 (0.0000) [2022-10-11 23:08:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [181/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3093 (0.3340) loss 3.5076 (3.5423) grad_norm 0.0000 (0.0000) [2022-10-11 23:09:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [181/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3199 (0.3335) loss 3.7069 (3.5411) grad_norm 0.0000 (0.0000) [2022-10-11 23:09:48 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 181 training takes 0:06:56 [2022-10-11 23:09:52 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.490 (3.490) Loss 0.9045 (0.9045) Acc@1 77.539 (77.539) Acc@5 95.020 (95.020) [2022-10-11 23:10:04 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 77.360 Acc@5 93.784 [2022-10-11 23:10:04 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 77.4% [2022-10-11 23:10:04 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 77.58% [2022-10-11 23:10:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [182/300][0/1251] eta 1:04:16 lr 0.000001 time 3.0827 (3.0827) loss 3.6239 (3.6239) grad_norm 0.0000 (0.0000) [2022-10-11 23:10:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [182/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3204 (0.3659) loss 3.4353 (3.5308) grad_norm 0.0000 (0.0000) [2022-10-11 23:11:14 swin_tiny_patch4_window7_224] (main.py 163): 
INFO Train: [182/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3307 (0.3490) loss 3.5287 (3.5272) grad_norm 0.0000 (0.0000) [2022-10-11 23:11:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [182/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3249 (0.3430) loss 3.6980 (3.5322) grad_norm 0.0000 (0.0000) [2022-10-11 23:12:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [182/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3240 (0.3400) loss 3.6628 (3.5402) grad_norm 0.0000 (0.0000) [2022-10-11 23:12:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [182/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3283 (0.3382) loss 3.6215 (3.5407) grad_norm 0.0000 (0.0000) [2022-10-11 23:13:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [182/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3334 (0.3371) loss 3.5802 (3.5399) grad_norm 0.0000 (0.0000) [2022-10-11 23:13:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [182/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3361 (0.3364) loss 3.5989 (3.5399) grad_norm 0.0000 (0.0000) [2022-10-11 23:14:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [182/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3206 (0.3355) loss 3.3668 (3.5377) grad_norm 0.0000 (0.0000) [2022-10-11 23:15:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [182/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3210 (0.3351) loss 3.2903 (3.5373) grad_norm 0.0000 (0.0000) [2022-10-11 23:15:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [182/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3750 (0.3348) loss 3.4491 (3.5401) grad_norm 0.0000 (0.0000) [2022-10-11 23:16:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [182/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3133 (0.3344) loss 3.8101 (3.5423) grad_norm 0.0000 (0.0000) [2022-10-11 23:16:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [182/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3411 (0.3342) loss 3.7889 
(3.5428) grad_norm 0.0000 (0.0000) [2022-10-11 23:17:01 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 182 training takes 0:06:57 [2022-10-11 23:17:04 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.089 (3.089) Loss 0.9042 (0.9042) Acc@1 78.906 (78.906) Acc@5 94.531 (94.531) [2022-10-11 23:17:16 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 77.552 Acc@5 93.948 [2022-10-11 23:17:16 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 77.6% [2022-10-11 23:17:16 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 77.58% [2022-10-11 23:17:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [183/300][0/1251] eta 1:11:24 lr 0.000001 time 3.4248 (3.4248) loss 3.6133 (3.6133) grad_norm 0.0000 (0.0000) [2022-10-11 23:17:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [183/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3491 (0.3684) loss 3.2288 (3.5324) grad_norm 0.0000 (0.0000) [2022-10-11 23:18:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [183/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3559 (0.3500) loss 3.3650 (3.5299) grad_norm 0.0000 (0.0000) [2022-10-11 23:19:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [183/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3467 (0.3436) loss 3.6749 (3.5249) grad_norm 0.0000 (0.0000) [2022-10-11 23:19:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [183/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3082 (0.3405) loss 3.7506 (3.5265) grad_norm 0.0000 (0.0000) [2022-10-11 23:20:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [183/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3234 (0.3386) loss 3.5845 (3.5266) grad_norm 0.0000 (0.0000) [2022-10-11 23:20:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [183/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3309 (0.3371) loss 3.5053 (3.5271) grad_norm 0.0000 (0.0000) [2022-10-11 23:21:12 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [183/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3134 (0.3362) loss 3.6746 (3.5274) grad_norm 0.0000 (0.0000) [2022-10-11 23:21:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [183/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3431 (0.3356) loss 3.7552 (3.5291) grad_norm 0.0000 (0.0000) [2022-10-11 23:22:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [183/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3214 (0.3351) loss 3.5659 (3.5281) grad_norm 0.0000 (0.0000) [2022-10-11 23:22:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [183/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3191 (0.3348) loss 3.3243 (3.5277) grad_norm 0.0000 (0.0000) [2022-10-11 23:23:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [183/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3437 (0.3343) loss 3.2270 (3.5293) grad_norm 0.0000 (0.0000) [2022-10-11 23:23:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [183/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3812 (0.3339) loss 3.8546 (3.5273) grad_norm 0.0000 (0.0000) [2022-10-11 23:24:14 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 183 training takes 0:06:57 [2022-10-11 23:24:17 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.346 (3.346) Loss 0.8698 (0.8698) Acc@1 79.102 (79.102) Acc@5 95.117 (95.117) [2022-10-11 23:24:29 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 77.940 Acc@5 93.972 [2022-10-11 23:24:29 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 77.9% [2022-10-11 23:24:29 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 77.94% [2022-10-11 23:24:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [184/300][0/1251] eta 1:08:09 lr 0.000001 time 3.2691 (3.2691) loss 3.0471 (3.0471) grad_norm 0.0000 (0.0000) [2022-10-11 23:25:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [184/300][100/1251] 
eta 0:07:01 lr 0.000001 time 0.3247 (0.3662) loss 3.1574 (3.4998) grad_norm 0.0000 (0.0000) [2022-10-11 23:25:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [184/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3238 (0.3483) loss 3.2030 (3.5123) grad_norm 0.0000 (0.0000) [2022-10-11 23:26:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [184/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3541 (0.3425) loss 3.3826 (3.5171) grad_norm 0.0000 (0.0000) [2022-10-11 23:26:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [184/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3196 (0.3400) loss 3.3324 (3.5153) grad_norm 0.0000 (0.0000) [2022-10-11 23:27:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [184/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3492 (0.3382) loss 3.6529 (3.5220) grad_norm 0.0000 (0.0000) [2022-10-11 23:27:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [184/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3267 (0.3372) loss 3.3257 (3.5247) grad_norm 0.0000 (0.0000) [2022-10-11 23:28:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [184/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3198 (0.3360) loss 3.2443 (3.5244) grad_norm 0.0000 (0.0000) [2022-10-11 23:28:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [184/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3311 (0.3353) loss 3.5392 (3.5263) grad_norm 0.0000 (0.0000) [2022-10-11 23:29:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [184/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3380 (0.3351) loss 3.6256 (3.5250) grad_norm 0.0000 (0.0000) [2022-10-11 23:30:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [184/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3246 (0.3346) loss 3.3403 (3.5252) grad_norm 0.0000 (0.0000) [2022-10-11 23:30:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [184/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3317 (0.3343) loss 3.5352 (3.5259) grad_norm 0.0000 (0.0000) 
[2022-10-11 23:31:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [184/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3458 (0.3338) loss 3.4759 (3.5265) grad_norm 0.0000 (0.0000) [2022-10-11 23:31:26 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 184 training takes 0:06:57 [2022-10-11 23:31:29 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.398 (3.398) Loss 0.9259 (0.9259) Acc@1 78.125 (78.125) Acc@5 94.434 (94.434) [2022-10-11 23:31:41 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 77.712 Acc@5 93.804 [2022-10-11 23:31:41 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 77.7% [2022-10-11 23:31:41 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 77.94% [2022-10-11 23:31:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [185/300][0/1251] eta 1:04:27 lr 0.000001 time 3.0919 (3.0919) loss 3.5757 (3.5757) grad_norm 0.0000 (0.0000) [2022-10-11 23:32:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [185/300][100/1251] eta 0:06:57 lr 0.000001 time 0.3301 (0.3631) loss 3.2060 (3.5097) grad_norm 0.0000 (0.0000) [2022-10-11 23:32:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [185/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3237 (0.3479) loss 3.4542 (3.5059) grad_norm 0.0000 (0.0000) [2022-10-11 23:33:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [185/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3191 (0.3431) loss 3.4864 (3.5016) grad_norm 0.0000 (0.0000) [2022-10-11 23:33:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [185/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3527 (0.3402) loss 3.4731 (3.5083) grad_norm 0.0000 (0.0000) [2022-10-11 23:34:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [185/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3403 (0.3382) loss 3.5582 (3.5112) grad_norm 0.0000 (0.0000) [2022-10-11 23:35:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[185/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3629 (0.3369) loss 3.4072 (3.5097) grad_norm 0.0000 (0.0000) [2022-10-11 23:35:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [185/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3275 (0.3365) loss 3.4149 (3.5145) grad_norm 0.0000 (0.0000) [2022-10-11 23:36:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [185/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3173 (0.3363) loss 3.4753 (3.5162) grad_norm 0.0000 (0.0000) [2022-10-11 23:36:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [185/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3349 (0.3357) loss 3.2636 (3.5154) grad_norm 0.0000 (0.0000) [2022-10-11 23:37:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [185/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3701 (0.3350) loss 3.8454 (3.5155) grad_norm 0.0000 (0.0000) [2022-10-11 23:37:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [185/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3538 (0.3349) loss 3.3539 (3.5166) grad_norm 0.0000 (0.0000) [2022-10-11 23:38:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [185/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3608 (0.3346) loss 3.6016 (3.5177) grad_norm 0.0000 (0.0000) [2022-10-11 23:38:39 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 185 training takes 0:06:58 [2022-10-11 23:38:43 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.327 (3.327) Loss 0.9641 (0.9641) Acc@1 77.832 (77.832) Acc@5 93.652 (93.652) [2022-10-11 23:38:54 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 77.746 Acc@5 93.834 [2022-10-11 23:38:54 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 77.7% [2022-10-11 23:38:54 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 77.94% [2022-10-11 23:38:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [186/300][0/1251] eta 1:10:21 lr 0.000001 time 3.3743 (3.3743) loss 3.3679 
(3.3679) grad_norm 0.0000 (0.0000) [2022-10-11 23:39:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [186/300][100/1251] eta 0:07:05 lr 0.000001 time 0.3579 (0.3696) loss 3.5568 (3.4830) grad_norm 0.0000 (0.0000) [2022-10-11 23:40:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [186/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3484 (0.3510) loss 3.5175 (3.5014) grad_norm 0.0000 (0.0000) [2022-10-11 23:40:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [186/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3548 (0.3450) loss 3.3697 (3.5062) grad_norm 0.0000 (0.0000) [2022-10-11 23:41:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [186/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3545 (0.3419) loss 3.5324 (3.5104) grad_norm 0.0000 (0.0000) [2022-10-11 23:41:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [186/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3428 (0.3403) loss 3.4410 (3.5095) grad_norm 0.0000 (0.0000) [2022-10-11 23:42:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [186/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3430 (0.3388) loss 3.6988 (3.5116) grad_norm 0.0000 (0.0000) [2022-10-11 23:42:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [186/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3194 (0.3377) loss 3.2774 (3.5165) grad_norm 0.0000 (0.0000) [2022-10-11 23:43:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [186/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3369 (0.3369) loss 3.5526 (3.5144) grad_norm 0.0000 (0.0000) [2022-10-11 23:43:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [186/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3257 (0.3365) loss 3.4634 (3.5152) grad_norm 0.0000 (0.0000) [2022-10-11 23:44:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [186/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3380 (0.3362) loss 3.2860 (3.5161) grad_norm 0.0000 (0.0000) [2022-10-11 23:45:04 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [186/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3271 (0.3359) loss 3.7724 (3.5146) grad_norm 0.0000 (0.0000) [2022-10-11 23:45:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [186/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3614 (0.3357) loss 3.7014 (3.5164) grad_norm 0.0000 (0.0000) [2022-10-11 23:45:54 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 186 training takes 0:06:59 [2022-10-11 23:45:58 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.631 (3.631) Loss 0.9971 (0.9971) Acc@1 77.539 (77.539) Acc@5 93.066 (93.066) [2022-10-11 23:46:09 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 77.684 Acc@5 93.908 [2022-10-11 23:46:09 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 77.7% [2022-10-11 23:46:09 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 77.94% [2022-10-11 23:46:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [187/300][0/1251] eta 1:09:58 lr 0.000001 time 3.3560 (3.3560) loss 3.3461 (3.3461) grad_norm 0.0000 (0.0000) [2022-10-11 23:46:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [187/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3343 (0.3654) loss 3.3049 (3.4848) grad_norm 0.0000 (0.0000) [2022-10-11 23:47:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [187/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3686 (0.3482) loss 3.5955 (3.5108) grad_norm 0.0000 (0.0000) [2022-10-11 23:47:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [187/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3559 (0.3432) loss 3.7153 (3.5221) grad_norm 0.0000 (0.0000) [2022-10-11 23:48:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [187/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3400 (0.3408) loss 3.8176 (3.5225) grad_norm 0.0000 (0.0000) [2022-10-11 23:48:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [187/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3193 
(0.3390) loss 3.3323 (3.5175) grad_norm 0.0000 (0.0000) [2022-10-11 23:49:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [187/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3063 (0.3379) loss 3.7163 (3.5213) grad_norm 0.0000 (0.0000) [2022-10-11 23:50:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [187/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3342 (0.3370) loss 3.3014 (3.5185) grad_norm 0.0000 (0.0000) [2022-10-11 23:50:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [187/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3249 (0.3364) loss 3.5662 (3.5151) grad_norm 0.0000 (0.0000) [2022-10-11 23:51:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [187/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3322 (0.3362) loss 3.5175 (3.5136) grad_norm 0.0000 (0.0000) [2022-10-11 23:51:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [187/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3252 (0.3358) loss 3.4640 (3.5154) grad_norm 0.0000 (0.0000) [2022-10-11 23:52:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [187/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3355 (0.3355) loss 3.7298 (3.5161) grad_norm 0.0000 (0.0000) [2022-10-11 23:52:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [187/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3304 (0.3352) loss 3.6528 (3.5186) grad_norm 0.0000 (0.0000) [2022-10-11 23:53:09 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 187 training takes 0:06:59 [2022-10-11 23:53:12 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.107 (3.107) Loss 0.9031 (0.9031) Acc@1 78.516 (78.516) Acc@5 94.141 (94.141) [2022-10-11 23:53:24 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 77.746 Acc@5 94.064 [2022-10-11 23:53:24 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 77.7% [2022-10-11 23:53:24 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 77.94% [2022-10-11 23:53:28 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [188/300][0/1251] eta 1:12:27 lr 0.000001 time 3.4756 (3.4756) loss 3.2997 (3.2997) grad_norm 0.0000 (0.0000) [2022-10-11 23:54:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [188/300][100/1251] eta 0:07:05 lr 0.000001 time 0.3402 (0.3697) loss 3.6669 (3.4982) grad_norm 0.0000 (0.0000) [2022-10-11 23:54:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [188/300][200/1251] eta 0:06:09 lr 0.000001 time 0.3061 (0.3520) loss 3.5357 (3.5114) grad_norm 0.0000 (0.0000) [2022-10-11 23:55:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [188/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3352 (0.3454) loss 3.7280 (3.5128) grad_norm 0.0000 (0.0000) [2022-10-11 23:55:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [188/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3218 (0.3419) loss 3.4295 (3.5150) grad_norm 0.0000 (0.0000) [2022-10-11 23:56:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [188/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3437 (0.3401) loss 3.4058 (3.5131) grad_norm 0.0000 (0.0000) [2022-10-11 23:56:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [188/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3024 (0.3385) loss 3.0155 (3.5066) grad_norm 0.0000 (0.0000) [2022-10-11 23:57:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [188/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3233 (0.3375) loss 3.6316 (3.5086) grad_norm 0.0000 (0.0000) [2022-10-11 23:57:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [188/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3432 (0.3371) loss 3.6930 (3.5057) grad_norm 0.0000 (0.0000) [2022-10-11 23:58:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [188/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3334 (0.3364) loss 3.2983 (3.5077) grad_norm 0.0000 (0.0000) [2022-10-11 23:59:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [188/300][1000/1251] eta 0:01:24 lr 0.000001 
time 0.3492 (0.3360) loss 3.2042 (3.5059) grad_norm 0.0000 (0.0000) [2022-10-11 23:59:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [188/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3332 (0.3354) loss 3.6834 (3.5057) grad_norm 0.0000 (0.0000) [2022-10-12 00:00:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [188/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3464 (0.3353) loss 3.6773 (3.5065) grad_norm 0.0000 (0.0000) [2022-10-12 00:00:23 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 188 training takes 0:06:59 [2022-10-12 00:00:27 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.346 (3.346) Loss 0.9548 (0.9548) Acc@1 78.320 (78.320) Acc@5 94.043 (94.043) [2022-10-12 00:00:39 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 77.936 Acc@5 94.036 [2022-10-12 00:00:39 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 77.9% [2022-10-12 00:00:39 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 77.94% [2022-10-12 00:00:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [189/300][0/1251] eta 1:15:05 lr 0.000001 time 3.6013 (3.6013) loss 3.3553 (3.3553) grad_norm 0.0000 (0.0000) [2022-10-12 00:01:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [189/300][100/1251] eta 0:07:02 lr 0.000001 time 0.2987 (0.3670) loss 3.2412 (3.4871) grad_norm 0.0000 (0.0000) [2022-10-12 00:01:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [189/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3438 (0.3497) loss 3.3436 (3.4908) grad_norm 0.0000 (0.0000) [2022-10-12 00:02:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [189/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3366 (0.3442) loss 3.6537 (3.5033) grad_norm 0.0000 (0.0000) [2022-10-12 00:02:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [189/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3178 (0.3410) loss 3.6761 (3.4953) grad_norm 0.0000 (0.0000) [2022-10-12 
00:03:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [189/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3259 (0.3392) loss 3.7566 (3.4943) grad_norm 0.0000 (0.0000) [2022-10-12 00:04:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [189/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3186 (0.3379) loss 3.6522 (3.4951) grad_norm 0.0000 (0.0000) [2022-10-12 00:04:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [189/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3354 (0.3372) loss 3.3520 (3.4932) grad_norm 0.0000 (0.0000) [2022-10-12 00:05:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [189/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3449 (0.3367) loss 3.3524 (3.4944) grad_norm 0.0000 (0.0000) [2022-10-12 00:05:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [189/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3421 (0.3363) loss 3.2908 (3.4941) grad_norm 0.0000 (0.0000) [2022-10-12 00:06:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [189/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3242 (0.3358) loss 3.5902 (3.4950) grad_norm 0.0000 (0.0000) [2022-10-12 00:06:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [189/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3380 (0.3355) loss 3.5899 (3.4955) grad_norm 0.0000 (0.0000) [2022-10-12 00:07:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [189/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3389 (0.3353) loss 3.4969 (3.4974) grad_norm 0.0000 (0.0000) [2022-10-12 00:07:38 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 189 training takes 0:06:59 [2022-10-12 00:07:41 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 2.925 (2.925) Loss 0.9380 (0.9380) Acc@1 80.078 (80.078) Acc@5 93.457 (93.457) [2022-10-12 00:07:53 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 78.148 Acc@5 93.996 [2022-10-12 00:07:53 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test 
images: 78.1% [2022-10-12 00:07:53 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 78.15% [2022-10-12 00:07:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [190/300][0/1251] eta 1:13:27 lr 0.000001 time 3.5232 (3.5232) loss 3.4716 (3.4716) grad_norm 0.0000 (0.0000) [2022-10-12 00:08:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [190/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3313 (0.3663) loss 3.5473 (3.5007) grad_norm 0.0000 (0.0000) [2022-10-12 00:09:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [190/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3603 (0.3489) loss 3.5229 (3.5052) grad_norm 0.0000 (0.0000) [2022-10-12 00:09:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [190/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3237 (0.3429) loss 3.5132 (3.5004) grad_norm 0.0000 (0.0000) [2022-10-12 00:10:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [190/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3475 (0.3395) loss 3.3659 (3.5025) grad_norm 0.0000 (0.0000) [2022-10-12 00:10:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [190/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3260 (0.3377) loss 3.3990 (3.5060) grad_norm 0.0000 (0.0000) [2022-10-12 00:11:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [190/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3384 (0.3368) loss 3.6224 (3.5030) grad_norm 0.0000 (0.0000) [2022-10-12 00:11:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [190/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3178 (0.3359) loss 3.5257 (3.5012) grad_norm 0.0000 (0.0000) [2022-10-12 00:12:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [190/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3485 (0.3353) loss 3.6897 (3.5006) grad_norm 0.0000 (0.0000) [2022-10-12 00:12:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [190/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3283 (0.3347) loss 3.2031 (3.5000) grad_norm 0.0000 
(0.0000) [2022-10-12 00:13:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [190/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3281 (0.3343) loss 3.4305 (3.5017) grad_norm 0.0000 (0.0000) [2022-10-12 00:14:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [190/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3440 (0.3342) loss 3.7241 (3.5020) grad_norm 0.0000 (0.0000) [2022-10-12 00:14:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [190/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3386 (0.3340) loss 3.5042 (3.5024) grad_norm 0.0000 (0.0000) [2022-10-12 00:14:51 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 190 training takes 0:06:57 [2022-10-12 00:14:51 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_190 saving...... [2022-10-12 00:14:51 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_190 saved !!! [2022-10-12 00:14:54 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.167 (3.167) Loss 0.9218 (0.9218) Acc@1 77.441 (77.441) Acc@5 94.043 (94.043) [2022-10-12 00:15:06 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 77.964 Acc@5 93.980 [2022-10-12 00:15:06 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 78.0% [2022-10-12 00:15:06 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 78.15% [2022-10-12 00:15:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [191/300][0/1251] eta 1:02:53 lr 0.000001 time 3.0164 (3.0164) loss 3.5671 (3.5671) grad_norm 0.0000 (0.0000) [2022-10-12 00:15:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [191/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3417 (0.3671) loss 3.7672 (3.4481) grad_norm 0.0000 (0.0000) [2022-10-12 00:16:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [191/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3484 (0.3507) loss 3.1310 
(3.4572) grad_norm 0.0000 (0.0000) [2022-10-12 00:16:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [191/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3430 (0.3447) loss 3.4691 (3.4703) grad_norm 0.0000 (0.0000) [2022-10-12 00:17:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [191/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3291 (0.3418) loss 3.5679 (3.4798) grad_norm 0.0000 (0.0000) [2022-10-12 00:17:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [191/300][500/1251] eta 0:04:15 lr 0.000001 time 0.2952 (0.3397) loss 3.4457 (3.4803) grad_norm 0.0000 (0.0000) [2022-10-12 00:18:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [191/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3257 (0.3383) loss 3.6379 (3.4810) grad_norm 0.0000 (0.0000) [2022-10-12 00:19:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [191/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3271 (0.3375) loss 3.4449 (3.4846) grad_norm 0.0000 (0.0000) [2022-10-12 00:19:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [191/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3321 (0.3370) loss 3.8436 (3.4887) grad_norm 0.0000 (0.0000) [2022-10-12 00:20:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [191/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3311 (0.3365) loss 3.5140 (3.4921) grad_norm 0.0000 (0.0000) [2022-10-12 00:20:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [191/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3126 (0.3360) loss 3.8525 (3.4933) grad_norm 0.0000 (0.0000) [2022-10-12 00:21:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [191/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3374 (0.3356) loss 3.4996 (3.4939) grad_norm 0.0000 (0.0000) [2022-10-12 00:21:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [191/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3528 (0.3353) loss 3.3060 (3.4950) grad_norm 0.0000 (0.0000) [2022-10-12 00:22:05 swin_tiny_patch4_window7_224] (main.py 
171): INFO EPOCH 191 training takes 0:06:59 [2022-10-12 00:22:09 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.334 (3.334) Loss 0.9668 (0.9668) Acc@1 78.516 (78.516) Acc@5 94.141 (94.141) [2022-10-12 00:22:21 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 77.954 Acc@5 93.928 [2022-10-12 00:22:21 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 78.0% [2022-10-12 00:22:21 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 78.15% [2022-10-12 00:22:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [192/300][0/1251] eta 1:15:36 lr 0.000001 time 3.6265 (3.6265) loss 3.5555 (3.5555) grad_norm 0.0000 (0.0000) [2022-10-12 00:22:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [192/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3205 (0.3679) loss 3.4761 (3.4593) grad_norm 0.0000 (0.0000) [2022-10-12 00:23:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [192/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3183 (0.3482) loss 3.5752 (3.4791) grad_norm 0.0000 (0.0000) [2022-10-12 00:24:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [192/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3338 (0.3423) loss 3.3557 (3.4882) grad_norm 0.0000 (0.0000) [2022-10-12 00:24:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [192/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3265 (0.3396) loss 3.5772 (3.4813) grad_norm 0.0000 (0.0000) [2022-10-12 00:25:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [192/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3357 (0.3378) loss 3.6645 (3.4813) grad_norm 0.0000 (0.0000) [2022-10-12 00:25:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [192/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3597 (0.3372) loss 3.7459 (3.4847) grad_norm 0.0000 (0.0000) [2022-10-12 00:26:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [192/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3160 
(0.3363) loss 3.2794 (3.4867) grad_norm 0.0000 (0.0000) [2022-10-12 00:26:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [192/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3043 (0.3359) loss 3.3601 (3.4865) grad_norm 0.0000 (0.0000) [2022-10-12 00:27:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [192/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3492 (0.3358) loss 3.4190 (3.4877) grad_norm 0.0000 (0.0000) [2022-10-12 00:27:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [192/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3095 (0.3356) loss 3.3555 (3.4874) grad_norm 0.0000 (0.0000) [2022-10-12 00:28:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [192/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3444 (0.3354) loss 3.2996 (3.4899) grad_norm 0.0000 (0.0000) [2022-10-12 00:29:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [192/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3352 (0.3351) loss 3.1972 (3.4923) grad_norm 0.0000 (0.0000) [2022-10-12 00:29:20 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 192 training takes 0:06:59 [2022-10-12 00:29:23 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.070 (3.070) Loss 0.8727 (0.8727) Acc@1 79.688 (79.688) Acc@5 95.410 (95.410) [2022-10-12 00:29:35 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 78.122 Acc@5 94.034 [2022-10-12 00:29:35 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 78.1% [2022-10-12 00:29:35 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 78.15% [2022-10-12 00:29:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [193/300][0/1251] eta 1:13:20 lr 0.000001 time 3.5180 (3.5180) loss 3.5867 (3.5867) grad_norm 0.0000 (0.0000) [2022-10-12 00:30:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [193/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3294 (0.3672) loss 3.6521 (3.4864) grad_norm 0.0000 (0.0000) [2022-10-12 00:30:45 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [193/300][200/1251] eta 0:06:07 lr 0.000001 time 0.2998 (0.3498) loss 3.4630 (3.4781) grad_norm 0.0000 (0.0000) [2022-10-12 00:31:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [193/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3294 (0.3437) loss 3.2299 (3.4872) grad_norm 0.0000 (0.0000) [2022-10-12 00:31:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [193/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3048 (0.3407) loss 3.4808 (3.4851) grad_norm 0.0000 (0.0000) [2022-10-12 00:32:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [193/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3444 (0.3385) loss 3.5928 (3.4834) grad_norm 0.0000 (0.0000) [2022-10-12 00:32:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [193/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3324 (0.3374) loss 3.5499 (3.4863) grad_norm 0.0000 (0.0000) [2022-10-12 00:33:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [193/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3270 (0.3365) loss 3.5160 (3.4834) grad_norm 0.0000 (0.0000) [2022-10-12 00:34:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [193/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3140 (0.3357) loss 3.4677 (3.4847) grad_norm 0.0000 (0.0000) [2022-10-12 00:34:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [193/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3298 (0.3352) loss 3.8435 (3.4875) grad_norm 0.0000 (0.0000) [2022-10-12 00:35:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [193/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3259 (0.3347) loss 3.2924 (3.4881) grad_norm 0.0000 (0.0000) [2022-10-12 00:35:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [193/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3199 (0.3344) loss 3.5757 (3.4900) grad_norm 0.0000 (0.0000) [2022-10-12 00:36:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [193/300][1200/1251] eta 0:00:17 lr 
0.000001 time 0.3332 (0.3344) loss 3.2568 (3.4891) grad_norm 0.0000 (0.0000) [2022-10-12 00:36:33 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 193 training takes 0:06:58 [2022-10-12 00:36:36 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.166 (3.166) Loss 0.9527 (0.9527) Acc@1 79.004 (79.004) Acc@5 94.824 (94.824) [2022-10-12 00:36:48 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 78.178 Acc@5 94.062 [2022-10-12 00:36:48 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 78.2% [2022-10-12 00:36:48 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 78.18% [2022-10-12 00:36:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [194/300][0/1251] eta 1:11:30 lr 0.000001 time 3.4295 (3.4295) loss 3.5276 (3.5276) grad_norm 0.0000 (0.0000) [2022-10-12 00:37:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [194/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3260 (0.3691) loss 3.4046 (3.4828) grad_norm 0.0000 (0.0000) [2022-10-12 00:37:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [194/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3613 (0.3510) loss 3.6158 (3.4877) grad_norm 0.0000 (0.0000) [2022-10-12 00:38:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [194/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3395 (0.3443) loss 3.5612 (3.4787) grad_norm 0.0000 (0.0000) [2022-10-12 00:39:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [194/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3399 (0.3409) loss 3.6357 (3.4743) grad_norm 0.0000 (0.0000) [2022-10-12 00:39:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [194/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3278 (0.3394) loss 3.7078 (3.4778) grad_norm 0.0000 (0.0000) [2022-10-12 00:40:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [194/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3602 (0.3381) loss 3.4622 (3.4773) grad_norm 0.0000 (0.0000) 
[2022-10-12 00:40:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [194/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3320 (0.3373) loss 3.5976 (3.4731) grad_norm 0.0000 (0.0000) [2022-10-12 00:41:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [194/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3298 (0.3368) loss 3.1804 (3.4732) grad_norm 0.0000 (0.0000) [2022-10-12 00:41:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [194/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3440 (0.3362) loss 3.6944 (3.4711) grad_norm 0.0000 (0.0000) [2022-10-12 00:42:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [194/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3607 (0.3357) loss 3.4147 (3.4733) grad_norm 0.0000 (0.0000) [2022-10-12 00:42:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [194/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3202 (0.3356) loss 3.1717 (3.4720) grad_norm 0.0000 (0.0000) [2022-10-12 00:43:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [194/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3361 (0.3354) loss 3.3167 (3.4730) grad_norm 0.0000 (0.0000) [2022-10-12 00:43:48 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 194 training takes 0:06:59 [2022-10-12 00:43:51 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.330 (3.330) Loss 0.9151 (0.9151) Acc@1 79.980 (79.980) Acc@5 95.117 (95.117) [2022-10-12 00:44:03 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 78.092 Acc@5 94.040 [2022-10-12 00:44:03 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 78.1% [2022-10-12 00:44:03 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 78.18% [2022-10-12 00:44:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [195/300][0/1251] eta 1:15:59 lr 0.000001 time 3.6447 (3.6447) loss 3.5374 (3.5374) grad_norm 0.0000 (0.0000) [2022-10-12 00:44:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[195/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3242 (0.3680) loss 3.7126 (3.4697) grad_norm 0.0000 (0.0000) [2022-10-12 00:45:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [195/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3332 (0.3497) loss 3.4583 (3.4683) grad_norm 0.0000 (0.0000) [2022-10-12 00:45:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [195/300][300/1251] eta 0:05:26 lr 0.000001 time 0.2922 (0.3433) loss 3.4821 (3.4744) grad_norm 0.0000 (0.0000) [2022-10-12 00:46:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [195/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3166 (0.3402) loss 3.4249 (3.4788) grad_norm 0.0000 (0.0000) [2022-10-12 00:46:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [195/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3150 (0.3384) loss 3.5991 (3.4784) grad_norm 0.0000 (0.0000) [2022-10-12 00:47:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [195/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3147 (0.3372) loss 3.6053 (3.4759) grad_norm 0.0000 (0.0000) [2022-10-12 00:47:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [195/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3277 (0.3364) loss 3.3995 (3.4790) grad_norm 0.0000 (0.0000) [2022-10-12 00:48:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [195/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3288 (0.3360) loss 3.3489 (3.4799) grad_norm 0.0000 (0.0000) [2022-10-12 00:49:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [195/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3303 (0.3354) loss 3.2734 (3.4784) grad_norm 0.0000 (0.0000) [2022-10-12 00:49:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [195/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3192 (0.3353) loss 3.6988 (3.4788) grad_norm 0.0000 (0.0000) [2022-10-12 00:50:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [195/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3368 (0.3350) loss 3.6144 (3.4774) grad_norm 
0.0000 (0.0000) [2022-10-12 00:50:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [195/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3178 (0.3348) loss 3.2237 (3.4768) grad_norm 0.0000 (0.0000) [2022-10-12 00:51:02 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 195 training takes 0:06:58 [2022-10-12 00:51:05 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.124 (3.124) Loss 0.8405 (0.8405) Acc@1 80.664 (80.664) Acc@5 95.020 (95.020) [2022-10-12 00:51:17 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 78.110 Acc@5 94.106 [2022-10-12 00:51:17 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 78.1% [2022-10-12 00:51:17 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 78.18% [2022-10-12 00:51:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [196/300][0/1251] eta 1:15:33 lr 0.000001 time 3.6236 (3.6236) loss 3.4985 (3.4985) grad_norm 0.0000 (0.0000) [2022-10-12 00:51:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [196/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3149 (0.3669) loss 3.4063 (3.4576) grad_norm 0.0000 (0.0000) [2022-10-12 00:52:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [196/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3218 (0.3490) loss 3.1368 (3.4569) grad_norm 0.0000 (0.0000) [2022-10-12 00:53:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [196/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3106 (0.3425) loss 3.5370 (3.4563) grad_norm 0.0000 (0.0000) [2022-10-12 00:53:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [196/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3346 (0.3396) loss 3.6362 (3.4593) grad_norm 0.0000 (0.0000) [2022-10-12 00:54:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [196/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3558 (0.3377) loss 3.8107 (3.4631) grad_norm 0.0000 (0.0000) [2022-10-12 00:54:39 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [196/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3530 (0.3368) loss 3.5781 (3.4656) grad_norm 0.0000 (0.0000) [2022-10-12 00:55:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [196/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3069 (0.3360) loss 3.3430 (3.4703) grad_norm 0.0000 (0.0000) [2022-10-12 00:55:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [196/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3120 (0.3356) loss 3.4810 (3.4694) grad_norm 0.0000 (0.0000) [2022-10-12 00:56:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [196/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3317 (0.3349) loss 3.5473 (3.4671) grad_norm 0.0000 (0.0000) [2022-10-12 00:56:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [196/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3635 (0.3347) loss 3.1800 (3.4698) grad_norm 0.0000 (0.0000) [2022-10-12 00:57:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [196/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3182 (0.3347) loss 3.4306 (3.4697) grad_norm 0.0000 (0.0000) [2022-10-12 00:57:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [196/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3307 (0.3345) loss 3.6660 (3.4723) grad_norm 0.0000 (0.0000) [2022-10-12 00:58:15 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 196 training takes 0:06:58 [2022-10-12 00:58:19 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.449 (3.449) Loss 0.9482 (0.9482) Acc@1 79.785 (79.785) Acc@5 94.336 (94.336) [2022-10-12 00:58:31 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 78.428 Acc@5 94.130 [2022-10-12 00:58:31 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 78.4% [2022-10-12 00:58:31 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 78.43% [2022-10-12 00:58:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [197/300][0/1251] eta 1:14:41 lr 0.000001 time 3.5821 
(3.5821) loss 3.5616 (3.5616) grad_norm 0.0000 (0.0000) [2022-10-12 00:59:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [197/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3516 (0.3685) loss 3.5279 (3.4459) grad_norm 0.0000 (0.0000) [2022-10-12 00:59:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [197/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3532 (0.3503) loss 3.4408 (3.4552) grad_norm 0.0000 (0.0000) [2022-10-12 01:00:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [197/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3322 (0.3439) loss 3.4144 (3.4531) grad_norm 0.0000 (0.0000) [2022-10-12 01:00:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [197/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3372 (0.3408) loss 3.3592 (3.4572) grad_norm 0.0000 (0.0000) [2022-10-12 01:01:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [197/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3141 (0.3384) loss 3.3611 (3.4612) grad_norm 0.0000 (0.0000) [2022-10-12 01:01:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [197/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3264 (0.3373) loss 3.4718 (3.4560) grad_norm 0.0000 (0.0000) [2022-10-12 01:02:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [197/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3435 (0.3364) loss 3.5357 (3.4570) grad_norm 0.0000 (0.0000) [2022-10-12 01:03:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [197/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3158 (0.3357) loss 3.2958 (3.4575) grad_norm 0.0000 (0.0000) [2022-10-12 01:03:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [197/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3373 (0.3353) loss 3.7237 (3.4584) grad_norm 0.0000 (0.0000) [2022-10-12 01:04:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [197/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3560 (0.3349) loss 3.3347 (3.4597) grad_norm 0.0000 (0.0000) [2022-10-12 01:04:39 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [197/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3403 (0.3348) loss 3.2809 (3.4611) grad_norm 0.0000 (0.0000) [2022-10-12 01:05:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [197/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3193 (0.3347) loss 3.2715 (3.4627) grad_norm 0.0000 (0.0000) [2022-10-12 01:05:29 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 197 training takes 0:06:58 [2022-10-12 01:05:33 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.302 (3.302) Loss 0.9878 (0.9878) Acc@1 75.781 (75.781) Acc@5 94.043 (94.043) [2022-10-12 01:05:44 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 78.304 Acc@5 94.320 [2022-10-12 01:05:44 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 78.3% [2022-10-12 01:05:44 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 78.43% [2022-10-12 01:05:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [198/300][0/1251] eta 1:10:48 lr 0.000001 time 3.3962 (3.3962) loss 3.2927 (3.2927) grad_norm 0.0000 (0.0000) [2022-10-12 01:06:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [198/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3558 (0.3663) loss 3.3023 (3.4422) grad_norm 0.0000 (0.0000) [2022-10-12 01:06:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [198/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3495 (0.3490) loss 3.6435 (3.4528) grad_norm 0.0000 (0.0000) [2022-10-12 01:07:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [198/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3262 (0.3431) loss 3.5212 (3.4540) grad_norm 0.0000 (0.0000) [2022-10-12 01:08:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [198/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3357 (0.3401) loss 3.5433 (3.4555) grad_norm 0.0000 (0.0000) [2022-10-12 01:08:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [198/300][500/1251] 
eta 0:04:13 lr 0.000001 time 0.3570 (0.3381) loss 3.4476 (3.4590) grad_norm 0.0000 (0.0000) [2022-10-12 01:09:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [198/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3500 (0.3370) loss 3.5511 (3.4615) grad_norm 0.0000 (0.0000) [2022-10-12 01:09:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [198/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3290 (0.3365) loss 3.5150 (3.4621) grad_norm 0.0000 (0.0000) [2022-10-12 01:10:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [198/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3574 (0.3358) loss 3.4107 (3.4633) grad_norm 0.0000 (0.0000) [2022-10-12 01:10:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [198/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3212 (0.3354) loss 3.4595 (3.4646) grad_norm 0.0000 (0.0000) [2022-10-12 01:11:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [198/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3491 (0.3351) loss 3.3581 (3.4625) grad_norm 0.0000 (0.0000) [2022-10-12 01:11:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [198/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3250 (0.3347) loss 3.5438 (3.4620) grad_norm 0.0000 (0.0000) [2022-10-12 01:12:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [198/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3350 (0.3345) loss 3.4399 (3.4632) grad_norm 0.0000 (0.0000) [2022-10-12 01:12:43 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 198 training takes 0:06:58 [2022-10-12 01:12:46 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.316 (3.316) Loss 0.9372 (0.9372) Acc@1 78.613 (78.613) Acc@5 94.531 (94.531) [2022-10-12 01:12:58 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 78.248 Acc@5 94.022 [2022-10-12 01:12:58 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 78.2% [2022-10-12 01:12:58 swin_tiny_patch4_window7_224] (main.py 121): INFO Max 
accuracy: 78.43% [2022-10-12 01:13:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [199/300][0/1251] eta 1:12:19 lr 0.000001 time 3.4692 (3.4692) loss 3.0060 (3.0060) grad_norm 0.0000 (0.0000) [2022-10-12 01:13:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [199/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3255 (0.3676) loss 3.3571 (3.4223) grad_norm 0.0000 (0.0000) [2022-10-12 01:14:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [199/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3223 (0.3499) loss 3.1313 (3.4454) grad_norm 0.0000 (0.0000) [2022-10-12 01:14:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [199/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3399 (0.3435) loss 3.4886 (3.4440) grad_norm 0.0000 (0.0000) [2022-10-12 01:15:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [199/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3501 (0.3404) loss 3.3822 (3.4475) grad_norm 0.0000 (0.0000) [2022-10-12 01:15:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [199/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3290 (0.3384) loss 3.5368 (3.4462) grad_norm 0.0000 (0.0000) [2022-10-12 01:16:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [199/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3268 (0.3369) loss 3.5589 (3.4483) grad_norm 0.0000 (0.0000) [2022-10-12 01:16:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [199/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3308 (0.3361) loss 3.6189 (3.4536) grad_norm 0.0000 (0.0000) [2022-10-12 01:17:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [199/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3248 (0.3355) loss 3.6971 (3.4534) grad_norm 0.0000 (0.0000) [2022-10-12 01:18:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [199/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3297 (0.3352) loss 3.1274 (3.4540) grad_norm 0.0000 (0.0000) [2022-10-12 01:18:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[199/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3497 (0.3349) loss 3.4965 (3.4534) grad_norm 0.0000 (0.0000) [2022-10-12 01:19:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [199/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3399 (0.3345) loss 3.3561 (3.4546) grad_norm 0.0000 (0.0000) [2022-10-12 01:19:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [199/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3070 (0.3342) loss 3.4052 (3.4525) grad_norm 0.0000 (0.0000) [2022-10-12 01:19:56 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 199 training takes 0:06:57 [2022-10-12 01:19:59 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.164 (3.164) Loss 1.0465 (1.0465) Acc@1 75.000 (75.000) Acc@5 93.848 (93.848) [2022-10-12 01:20:12 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 78.336 Acc@5 94.094 [2022-10-12 01:20:12 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 78.3% [2022-10-12 01:20:12 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 78.43% [2022-10-12 01:20:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [200/300][0/1251] eta 1:14:10 lr 0.000001 time 3.5579 (3.5579) loss 3.3123 (3.3123) grad_norm 0.0000 (0.0000) [2022-10-12 01:20:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [200/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3170 (0.3688) loss 3.2613 (3.4482) grad_norm 0.0000 (0.0000) [2022-10-12 01:21:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [200/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3579 (0.3507) loss 3.2683 (3.4430) grad_norm 0.0000 (0.0000) [2022-10-12 01:21:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [200/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3418 (0.3444) loss 3.4697 (3.4437) grad_norm 0.0000 (0.0000) [2022-10-12 01:22:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [200/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3445 (0.3412) loss 3.1822 
(3.4444) grad_norm 0.0000 (0.0000) [2022-10-12 01:23:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [200/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3231 (0.3389) loss 3.2507 (3.4462) grad_norm 0.0000 (0.0000) [2022-10-12 01:23:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [200/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3231 (0.3374) loss 3.6434 (3.4449) grad_norm 0.0000 (0.0000) [2022-10-12 01:24:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [200/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3446 (0.3368) loss 3.4379 (3.4475) grad_norm 0.0000 (0.0000) [2022-10-12 01:24:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [200/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3212 (0.3361) loss 3.5624 (3.4489) grad_norm 0.0000 (0.0000) [2022-10-12 01:25:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [200/300][900/1251] eta 0:01:57 lr 0.000001 time 0.4063 (0.3355) loss 3.5831 (3.4472) grad_norm 0.0000 (0.0000) [2022-10-12 01:25:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [200/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3425 (0.3352) loss 3.4428 (3.4472) grad_norm 0.0000 (0.0000) [2022-10-12 01:26:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [200/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3357 (0.3350) loss 3.4045 (3.4486) grad_norm 0.0000 (0.0000) [2022-10-12 01:26:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [200/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3428 (0.3348) loss 3.4483 (3.4493) grad_norm 0.0000 (0.0000) [2022-10-12 01:27:10 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 200 training takes 0:06:58 [2022-10-12 01:27:10 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_200 saving...... [2022-10-12 01:27:10 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_200 saved !!! 
[2022-10-12 01:27:13 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 2.872 (2.872) Loss 0.9482 (0.9482) Acc@1 77.441 (77.441) Acc@5 94.043 (94.043) [2022-10-12 01:27:25 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 78.416 Acc@5 94.188 [2022-10-12 01:27:25 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 78.4% [2022-10-12 01:27:25 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 78.43% [2022-10-12 01:27:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [201/300][0/1251] eta 0:59:53 lr 0.000001 time 2.8722 (2.8722) loss 3.3673 (3.3673) grad_norm 0.0000 (0.0000) [2022-10-12 01:28:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [201/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3431 (0.3664) loss 3.5109 (3.4380) grad_norm 0.0000 (0.0000) [2022-10-12 01:28:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [201/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3299 (0.3501) loss 3.6522 (3.4495) grad_norm 0.0000 (0.0000) [2022-10-12 01:29:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [201/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3137 (0.3442) loss 3.4436 (3.4442) grad_norm 0.0000 (0.0000) [2022-10-12 01:29:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [201/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3539 (0.3413) loss 3.8145 (3.4447) grad_norm 0.0000 (0.0000) [2022-10-12 01:30:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [201/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3133 (0.3389) loss 3.4697 (3.4499) grad_norm 0.0000 (0.0000) [2022-10-12 01:30:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [201/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3476 (0.3378) loss 3.4591 (3.4504) grad_norm 0.0000 (0.0000) [2022-10-12 01:31:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [201/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3367 (0.3366) loss 3.1911 (3.4482) grad_norm 0.0000 
(0.0000) [2022-10-12 01:31:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [201/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3350 (0.3360) loss 3.6606 (3.4474) grad_norm 0.0000 (0.0000) [2022-10-12 01:32:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [201/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3161 (0.3354) loss 3.5278 (3.4478) grad_norm 0.0000 (0.0000) [2022-10-12 01:33:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [201/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3321 (0.3348) loss 3.5474 (3.4493) grad_norm 0.0000 (0.0000) [2022-10-12 01:33:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [201/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3369 (0.3344) loss 3.6521 (3.4462) grad_norm 0.0000 (0.0000) [2022-10-12 01:34:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [201/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3294 (0.3341) loss 3.5897 (3.4468) grad_norm 0.0000 (0.0000) [2022-10-12 01:34:23 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 201 training takes 0:06:57 [2022-10-12 01:34:26 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.194 (3.194) Loss 0.9711 (0.9711) Acc@1 78.027 (78.027) Acc@5 93.652 (93.652) [2022-10-12 01:34:38 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 78.266 Acc@5 94.244 [2022-10-12 01:34:38 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 78.3% [2022-10-12 01:34:38 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 78.43% [2022-10-12 01:34:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [202/300][0/1251] eta 1:08:35 lr 0.000001 time 3.2897 (3.2897) loss 3.6195 (3.6195) grad_norm 0.0000 (0.0000) [2022-10-12 01:35:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [202/300][100/1251] eta 0:07:07 lr 0.000001 time 0.3109 (0.3713) loss 3.5564 (3.4602) grad_norm 0.0000 (0.0000) [2022-10-12 01:35:49 swin_tiny_patch4_window7_224] (main.py 163): 
INFO Train: [202/300][200/1251] eta 0:06:10 lr 0.000001 time 0.3839 (0.3525) loss 3.1584 (3.4443) grad_norm 0.0000 (0.0000) [2022-10-12 01:36:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [202/300][300/1251] eta 0:05:29 lr 0.000001 time 0.3255 (0.3460) loss 3.4657 (3.4346) grad_norm 0.0000 (0.0000) [2022-10-12 01:36:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [202/300][400/1251] eta 0:04:51 lr 0.000001 time 0.3174 (0.3429) loss 3.5675 (3.4297) grad_norm 0.0000 (0.0000) [2022-10-12 01:37:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [202/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3321 (0.3408) loss 3.7815 (3.4309) grad_norm 0.0000 (0.0000) [2022-10-12 01:38:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [202/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3152 (0.3394) loss 3.4736 (3.4324) grad_norm 0.0000 (0.0000) [2022-10-12 01:38:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [202/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3258 (0.3381) loss 3.3546 (3.4323) grad_norm 0.0000 (0.0000) [2022-10-12 01:39:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [202/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3422 (0.3372) loss 3.2634 (3.4314) grad_norm 0.0000 (0.0000) [2022-10-12 01:39:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [202/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3130 (0.3365) loss 3.5570 (3.4357) grad_norm 0.0000 (0.0000) [2022-10-12 01:40:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [202/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3528 (0.3361) loss 3.5285 (3.4378) grad_norm 0.0000 (0.0000) [2022-10-12 01:40:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [202/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3418 (0.3358) loss 3.4877 (3.4406) grad_norm 0.0000 (0.0000) [2022-10-12 01:41:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [202/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3160 (0.3354) loss 3.4702 
(3.4424) grad_norm 0.0000 (0.0000) [2022-10-12 01:41:38 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 202 training takes 0:06:59 [2022-10-12 01:41:41 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.283 (3.283) Loss 0.9912 (0.9912) Acc@1 77.246 (77.246) Acc@5 93.555 (93.555) [2022-10-12 01:41:53 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 78.504 Acc@5 94.272 [2022-10-12 01:41:53 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 78.5% [2022-10-12 01:41:53 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 78.50% [2022-10-12 01:41:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [203/300][0/1251] eta 1:12:42 lr 0.000001 time 3.4871 (3.4871) loss 3.2410 (3.2410) grad_norm 0.0000 (0.0000) [2022-10-12 01:42:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [203/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3303 (0.3659) loss 3.4414 (3.4000) grad_norm 0.0000 (0.0000) [2022-10-12 01:43:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [203/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3523 (0.3493) loss 3.4108 (3.4124) grad_norm 0.0000 (0.0000) [2022-10-12 01:43:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [203/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3211 (0.3439) loss 3.4194 (3.4182) grad_norm 0.0000 (0.0000) [2022-10-12 01:44:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [203/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3368 (0.3408) loss 3.2402 (3.4222) grad_norm 0.0000 (0.0000) [2022-10-12 01:44:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [203/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3162 (0.3390) loss 3.3852 (3.4245) grad_norm 0.0000 (0.0000) [2022-10-12 01:45:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [203/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3477 (0.3382) loss 3.5156 (3.4279) grad_norm 0.0000 (0.0000) [2022-10-12 01:45:49 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [203/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3205 (0.3372) loss 3.5244 (3.4306) grad_norm 0.0000 (0.0000) [2022-10-12 01:46:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [203/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3506 (0.3367) loss 3.2198 (3.4312) grad_norm 0.0000 (0.0000) [2022-10-12 01:46:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [203/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3177 (0.3362) loss 3.1678 (3.4311) grad_norm 0.0000 (0.0000) [2022-10-12 01:47:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [203/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3371 (0.3356) loss 3.2746 (3.4319) grad_norm 0.0000 (0.0000) [2022-10-12 01:48:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [203/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3052 (0.3352) loss 3.1569 (3.4341) grad_norm 0.0000 (0.0000) [2022-10-12 01:48:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [203/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3299 (0.3348) loss 3.4485 (3.4348) grad_norm 0.0000 (0.0000) [2022-10-12 01:48:51 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 203 training takes 0:06:58 [2022-10-12 01:48:55 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.738 (3.738) Loss 0.9165 (0.9165) Acc@1 79.297 (79.297) Acc@5 94.727 (94.727) [2022-10-12 01:49:07 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 78.570 Acc@5 94.290 [2022-10-12 01:49:07 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 78.6% [2022-10-12 01:49:07 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 78.57% [2022-10-12 01:49:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [204/300][0/1251] eta 1:13:07 lr 0.000001 time 3.5071 (3.5071) loss 3.5670 (3.5670) grad_norm 0.0000 (0.0000) [2022-10-12 01:49:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [204/300][100/1251] 
eta 0:07:02 lr 0.000001 time 0.3194 (0.3673) loss 3.6176 (3.4299) grad_norm 0.0000 (0.0000) [2022-10-12 01:50:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [204/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3341 (0.3494) loss 3.7061 (3.4251) grad_norm 0.0000 (0.0000) [2022-10-12 01:50:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [204/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3509 (0.3437) loss 2.6884 (3.4106) grad_norm 0.0000 (0.0000) [2022-10-12 01:51:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [204/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3434 (0.3401) loss 3.4218 (3.4203) grad_norm 0.0000 (0.0000) [2022-10-12 01:51:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [204/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3353 (0.3387) loss 3.6880 (3.4223) grad_norm 0.0000 (0.0000) [2022-10-12 01:52:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [204/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3389 (0.3376) loss 3.5192 (3.4264) grad_norm 0.0000 (0.0000) [2022-10-12 01:53:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [204/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3141 (0.3369) loss 3.3211 (3.4298) grad_norm 0.0000 (0.0000) [2022-10-12 01:53:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [204/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3181 (0.3361) loss 3.2562 (3.4333) grad_norm 0.0000 (0.0000) [2022-10-12 01:54:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [204/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3271 (0.3356) loss 3.3025 (3.4332) grad_norm 0.0000 (0.0000) [2022-10-12 01:54:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [204/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3511 (0.3352) loss 3.3777 (3.4336) grad_norm 0.0000 (0.0000) [2022-10-12 01:55:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [204/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3234 (0.3347) loss 3.3248 (3.4356) grad_norm 0.0000 (0.0000) 
[2022-10-12 01:55:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [204/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3266 (0.3343) loss 3.2729 (3.4363) grad_norm 0.0000 (0.0000) [2022-10-12 01:56:05 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 204 training takes 0:06:58 [2022-10-12 01:56:08 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.313 (3.313) Loss 0.9514 (0.9514) Acc@1 79.688 (79.688) Acc@5 93.262 (93.262) [2022-10-12 01:56:20 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 78.604 Acc@5 94.150 [2022-10-12 01:56:20 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 78.6% [2022-10-12 01:56:20 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 78.60% [2022-10-12 01:56:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [205/300][0/1251] eta 1:07:32 lr 0.000001 time 3.2394 (3.2394) loss 3.4793 (3.4793) grad_norm 0.0000 (0.0000) [2022-10-12 01:56:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [205/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3172 (0.3683) loss 3.2865 (3.4396) grad_norm 0.0000 (0.0000) [2022-10-12 01:57:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [205/300][200/1251] eta 0:06:09 lr 0.000001 time 0.3290 (0.3516) loss 3.5122 (3.4362) grad_norm 0.0000 (0.0000) [2022-10-12 01:58:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [205/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3423 (0.3456) loss 3.5052 (3.4216) grad_norm 0.0000 (0.0000) [2022-10-12 01:58:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [205/300][400/1251] eta 0:04:51 lr 0.000001 time 0.3270 (0.3424) loss 3.5434 (3.4184) grad_norm 0.0000 (0.0000) [2022-10-12 01:59:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [205/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3195 (0.3403) loss 3.4780 (3.4246) grad_norm 0.0000 (0.0000) [2022-10-12 01:59:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[205/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3280 (0.3389) loss 3.4428 (3.4246) grad_norm 0.0000 (0.0000) [2022-10-12 02:00:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [205/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3307 (0.3378) loss 3.7149 (3.4243) grad_norm 0.0000 (0.0000) [2022-10-12 02:00:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [205/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3195 (0.3370) loss 3.4638 (3.4241) grad_norm 0.0000 (0.0000) [2022-10-12 02:01:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [205/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3540 (0.3363) loss 3.4487 (3.4239) grad_norm 0.0000 (0.0000) [2022-10-12 02:01:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [205/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3362 (0.3359) loss 3.1091 (3.4252) grad_norm 0.0000 (0.0000) [2022-10-12 02:02:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [205/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3064 (0.3354) loss 3.4422 (3.4266) grad_norm 0.0000 (0.0000) [2022-10-12 02:03:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [205/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3258 (0.3351) loss 3.3486 (3.4267) grad_norm 0.0000 (0.0000) [2022-10-12 02:03:19 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 205 training takes 0:06:58 [2022-10-12 02:03:22 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.252 (3.252) Loss 0.8841 (0.8841) Acc@1 80.469 (80.469) Acc@5 95.117 (95.117) [2022-10-12 02:03:34 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 78.628 Acc@5 94.254 [2022-10-12 02:03:34 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 78.6% [2022-10-12 02:03:34 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 78.63% [2022-10-12 02:03:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [206/300][0/1251] eta 1:15:58 lr 0.000001 time 3.6437 (3.6437) loss 3.3799 
(3.3799) grad_norm 0.0000 (0.0000) [2022-10-12 02:04:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [206/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3210 (0.3687) loss 3.2585 (3.4021) grad_norm 0.0000 (0.0000) [2022-10-12 02:04:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [206/300][200/1251] eta 0:06:10 lr 0.000001 time 0.3203 (0.3521) loss 3.3422 (3.4176) grad_norm 0.0000 (0.0000) [2022-10-12 02:05:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [206/300][300/1251] eta 0:05:29 lr 0.000001 time 0.3332 (0.3462) loss 3.3095 (3.4163) grad_norm 0.0000 (0.0000) [2022-10-12 02:05:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [206/300][400/1251] eta 0:04:51 lr 0.000001 time 0.3612 (0.3430) loss 3.6943 (3.4141) grad_norm 0.0000 (0.0000) [2022-10-12 02:06:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [206/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3084 (0.3406) loss 3.2655 (3.4089) grad_norm 0.0000 (0.0000) [2022-10-12 02:06:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [206/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3162 (0.3391) loss 3.4442 (3.4187) grad_norm 0.0000 (0.0000) [2022-10-12 02:07:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [206/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3340 (0.3381) loss 3.3246 (3.4203) grad_norm 0.0000 (0.0000) [2022-10-12 02:08:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [206/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3203 (0.3373) loss 3.3791 (3.4182) grad_norm 0.0000 (0.0000) [2022-10-12 02:08:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [206/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3118 (0.3367) loss 3.7380 (3.4214) grad_norm 0.0000 (0.0000) [2022-10-12 02:09:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [206/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3400 (0.3362) loss 3.3510 (3.4220) grad_norm 0.0000 (0.0000) [2022-10-12 02:09:44 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [206/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3427 (0.3360) loss 3.3915 (3.4236) grad_norm 0.0000 (0.0000) [2022-10-12 02:10:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [206/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3431 (0.3356) loss 3.3065 (3.4264) grad_norm 0.0000 (0.0000) [2022-10-12 02:10:33 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 206 training takes 0:06:59 [2022-10-12 02:10:37 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.363 (3.363) Loss 0.8987 (0.8987) Acc@1 79.688 (79.688) Acc@5 94.434 (94.434) [2022-10-12 02:10:49 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 78.722 Acc@5 94.392 [2022-10-12 02:10:49 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 78.7% [2022-10-12 02:10:49 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 78.72% [2022-10-12 02:10:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [207/300][0/1251] eta 1:12:12 lr 0.000001 time 3.4636 (3.4636) loss 3.4454 (3.4454) grad_norm 0.0000 (0.0000) [2022-10-12 02:11:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [207/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3379 (0.3691) loss 3.5264 (3.4044) grad_norm 0.0000 (0.0000) [2022-10-12 02:11:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [207/300][200/1251] eta 0:06:09 lr 0.000001 time 0.3136 (0.3511) loss 3.3531 (3.4111) grad_norm 0.0000 (0.0000) [2022-10-12 02:12:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [207/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3654 (0.3450) loss 3.6094 (3.4162) grad_norm 0.0000 (0.0000) [2022-10-12 02:13:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [207/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3214 (0.3418) loss 3.5735 (3.4201) grad_norm 0.0000 (0.0000) [2022-10-12 02:13:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [207/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3369 
(0.3399) loss 3.3242 (3.4156) grad_norm 0.0000 (0.0000) [2022-10-12 02:14:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [207/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3417 (0.3383) loss 3.2556 (3.4150) grad_norm 0.0000 (0.0000) [2022-10-12 02:14:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [207/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3182 (0.3371) loss 3.7898 (3.4152) grad_norm 0.0000 (0.0000) [2022-10-12 02:15:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [207/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3321 (0.3361) loss 3.2108 (3.4160) grad_norm 0.0000 (0.0000) [2022-10-12 02:15:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [207/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3503 (0.3355) loss 3.7383 (3.4178) grad_norm 0.0000 (0.0000) [2022-10-12 02:16:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [207/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3313 (0.3351) loss 3.3276 (3.4186) grad_norm 0.0000 (0.0000) [2022-10-12 02:16:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [207/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3189 (0.3346) loss 3.7260 (3.4189) grad_norm 0.0000 (0.0000) [2022-10-12 02:17:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [207/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3527 (0.3345) loss 3.1988 (3.4178) grad_norm 0.0000 (0.0000) [2022-10-12 02:17:47 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 207 training takes 0:06:58 [2022-10-12 02:17:50 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.358 (3.358) Loss 0.8461 (0.8461) Acc@1 81.543 (81.543) Acc@5 94.922 (94.922) [2022-10-12 02:18:02 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 78.716 Acc@5 94.316 [2022-10-12 02:18:02 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 78.7% [2022-10-12 02:18:02 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 78.72% [2022-10-12 02:18:06 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [208/300][0/1251] eta 1:09:50 lr 0.000001 time 3.3495 (3.3495) loss 3.6417 (3.6417) grad_norm 0.0000 (0.0000) [2022-10-12 02:18:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [208/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3300 (0.3676) loss 3.3507 (3.4055) grad_norm 0.0000 (0.0000) [2022-10-12 02:19:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [208/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3376 (0.3505) loss 3.2641 (3.4006) grad_norm 0.0000 (0.0000) [2022-10-12 02:19:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [208/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3499 (0.3449) loss 3.0640 (3.4050) grad_norm 0.0000 (0.0000) [2022-10-12 02:20:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [208/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3183 (0.3417) loss 3.2782 (3.4069) grad_norm 0.0000 (0.0000) [2022-10-12 02:20:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [208/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3081 (0.3397) loss 3.8377 (3.4060) grad_norm 0.0000 (0.0000) [2022-10-12 02:21:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [208/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3341 (0.3384) loss 3.2921 (3.4086) grad_norm 0.0000 (0.0000) [2022-10-12 02:21:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [208/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3338 (0.3375) loss 3.6024 (3.4109) grad_norm 0.0000 (0.0000) [2022-10-12 02:22:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [208/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3273 (0.3368) loss 3.4704 (3.4097) grad_norm 0.0000 (0.0000) [2022-10-12 02:23:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [208/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3495 (0.3363) loss 3.4734 (3.4084) grad_norm 0.0000 (0.0000) [2022-10-12 02:23:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [208/300][1000/1251] eta 0:01:24 lr 0.000001 
time 0.4164 (0.3359) loss 3.6051 (3.4069) grad_norm 0.0000 (0.0000) [2022-10-12 02:24:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [208/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3591 (0.3356) loss 3.0865 (3.4064) grad_norm 0.0000 (0.0000) [2022-10-12 02:24:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [208/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3421 (0.3354) loss 3.6108 (3.4085) grad_norm 0.0000 (0.0000) [2022-10-12 02:25:02 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 208 training takes 0:06:59 [2022-10-12 02:25:05 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.023 (3.023) Loss 0.9204 (0.9204) Acc@1 79.004 (79.004) Acc@5 94.922 (94.922) [2022-10-12 02:25:17 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 78.548 Acc@5 94.354 [2022-10-12 02:25:17 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 78.5% [2022-10-12 02:25:17 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 78.72% [2022-10-12 02:25:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [209/300][0/1251] eta 1:14:02 lr 0.000001 time 3.5509 (3.5509) loss 3.4034 (3.4034) grad_norm 0.0000 (0.0000) [2022-10-12 02:25:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [209/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3407 (0.3686) loss 3.2721 (3.4003) grad_norm 0.0000 (0.0000) [2022-10-12 02:26:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [209/300][200/1251] eta 0:06:09 lr 0.000001 time 0.3457 (0.3514) loss 3.1576 (3.3832) grad_norm 0.0000 (0.0000) [2022-10-12 02:27:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [209/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3275 (0.3447) loss 3.4581 (3.3819) grad_norm 0.0000 (0.0000) [2022-10-12 02:27:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [209/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3481 (0.3418) loss 3.4162 (3.3818) grad_norm 0.0000 (0.0000) [2022-10-12 
02:28:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [209/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3329 (0.3405) loss 3.5115 (3.3893) grad_norm 0.0000 (0.0000) [2022-10-12 02:28:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [209/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3304 (0.3394) loss 3.3486 (3.3983) grad_norm 0.0000 (0.0000) [2022-10-12 02:29:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [209/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3088 (0.3381) loss 3.2402 (3.4022) grad_norm 0.0000 (0.0000) [2022-10-12 02:29:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [209/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3428 (0.3377) loss 3.5580 (3.4012) grad_norm 0.0000 (0.0000) [2022-10-12 02:30:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [209/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3248 (0.3371) loss 3.3950 (3.4005) grad_norm 0.0000 (0.0000) [2022-10-12 02:30:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [209/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.5682 (0.3367) loss 3.2467 (3.4033) grad_norm 0.0000 (0.0000) [2022-10-12 02:31:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [209/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3199 (0.3364) loss 3.6902 (3.4056) grad_norm 0.0000 (0.0000) [2022-10-12 02:32:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [209/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3376 (0.3361) loss 3.5246 (3.4065) grad_norm 0.0000 (0.0000) [2022-10-12 02:32:17 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 209 training takes 0:07:00 [2022-10-12 02:32:20 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.325 (3.325) Loss 0.9342 (0.9342) Acc@1 78.906 (78.906) Acc@5 94.434 (94.434) [2022-10-12 02:32:32 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 78.904 Acc@5 94.448 [2022-10-12 02:32:32 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test 
images: 78.9% [2022-10-12 02:32:32 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 78.90% [2022-10-12 02:32:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [210/300][0/1251] eta 1:12:55 lr 0.000001 time 3.4973 (3.4973) loss 3.3985 (3.3985) grad_norm 0.0000 (0.0000) [2022-10-12 02:33:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [210/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3238 (0.3678) loss 3.4337 (3.3949) grad_norm 0.0000 (0.0000) [2022-10-12 02:33:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [210/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3330 (0.3507) loss 3.5996 (3.4111) grad_norm 0.0000 (0.0000) [2022-10-12 02:34:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [210/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3311 (0.3437) loss 3.2981 (3.4067) grad_norm 0.0000 (0.0000) [2022-10-12 02:34:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [210/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3228 (0.3404) loss 3.4979 (3.4011) grad_norm 0.0000 (0.0000) [2022-10-12 02:35:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [210/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3023 (0.3377) loss 3.6758 (3.4012) grad_norm 0.0000 (0.0000) [2022-10-12 02:35:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [210/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3403 (0.3366) loss 3.4416 (3.4037) grad_norm 0.0000 (0.0000) [2022-10-12 02:36:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [210/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3268 (0.3357) loss 3.7053 (3.4050) grad_norm 0.0000 (0.0000) [2022-10-12 02:37:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [210/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3383 (0.3350) loss 3.3475 (3.4057) grad_norm 0.0000 (0.0000) [2022-10-12 02:37:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [210/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3551 (0.3345) loss 3.3946 (3.4079) grad_norm 0.0000 
(0.0000) [2022-10-12 02:38:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [210/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3543 (0.3340) loss 3.5549 (3.4072) grad_norm 0.0000 (0.0000) [2022-10-12 02:38:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [210/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3105 (0.3336) loss 3.4416 (3.4079) grad_norm 0.0000 (0.0000) [2022-10-12 02:39:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [210/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3227 (0.3335) loss 3.4417 (3.4093) grad_norm 0.0000 (0.0000) [2022-10-12 02:39:29 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 210 training takes 0:06:57 [2022-10-12 02:39:29 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_210 saving...... [2022-10-12 02:39:29 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_210 saved !!! [2022-10-12 02:39:33 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.266 (3.266) Loss 0.9758 (0.9758) Acc@1 77.734 (77.734) Acc@5 94.141 (94.141) [2022-10-12 02:39:45 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.010 Acc@5 94.384 [2022-10-12 02:39:45 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.0% [2022-10-12 02:39:45 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.01% [2022-10-12 02:39:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [211/300][0/1251] eta 1:10:23 lr 0.000001 time 3.3764 (3.3764) loss 3.5446 (3.5446) grad_norm 0.0000 (0.0000) [2022-10-12 02:40:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [211/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3338 (0.3680) loss 3.4256 (3.3920) grad_norm 0.0000 (0.0000) [2022-10-12 02:40:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [211/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3393 (0.3510) loss 3.3580 
(3.3728) grad_norm 0.0000 (0.0000) [2022-10-12 02:41:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [211/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3354 (0.3445) loss 3.5068 (3.3801) grad_norm 0.0000 (0.0000) [2022-10-12 02:42:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [211/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3272 (0.3409) loss 3.2139 (3.3866) grad_norm 0.0000 (0.0000) [2022-10-12 02:42:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [211/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3346 (0.3387) loss 3.5791 (3.3903) grad_norm 0.0000 (0.0000) [2022-10-12 02:43:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [211/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3420 (0.3376) loss 3.1728 (3.3907) grad_norm 0.0000 (0.0000) [2022-10-12 02:43:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [211/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3127 (0.3364) loss 3.2602 (3.3952) grad_norm 0.0000 (0.0000) [2022-10-12 02:44:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [211/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3270 (0.3359) loss 3.3397 (3.3971) grad_norm 0.0000 (0.0000) [2022-10-12 02:44:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [211/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3165 (0.3354) loss 3.1568 (3.3961) grad_norm 0.0000 (0.0000) [2022-10-12 02:45:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [211/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3302 (0.3352) loss 3.5011 (3.3979) grad_norm 0.0000 (0.0000) [2022-10-12 02:45:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [211/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3124 (0.3348) loss 3.3268 (3.3955) grad_norm 0.0000 (0.0000) [2022-10-12 02:46:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [211/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3317 (0.3346) loss 3.2458 (3.3987) grad_norm 0.0000 (0.0000) [2022-10-12 02:46:43 swin_tiny_patch4_window7_224] (main.py 
171): INFO EPOCH 211 training takes 0:06:58 [2022-10-12 02:46:46 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.278 (3.278) Loss 0.9928 (0.9928) Acc@1 76.465 (76.465) Acc@5 94.043 (94.043) [2022-10-12 02:46:59 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 78.968 Acc@5 94.396 [2022-10-12 02:46:59 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.0% [2022-10-12 02:46:59 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.01% [2022-10-12 02:47:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [212/300][0/1251] eta 1:20:52 lr 0.000001 time 3.8787 (3.8787) loss 3.2201 (3.2201) grad_norm 0.0000 (0.0000) [2022-10-12 02:47:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [212/300][100/1251] eta 0:07:06 lr 0.000001 time 0.3200 (0.3708) loss 3.3283 (3.3743) grad_norm 0.0000 (0.0000) [2022-10-12 02:48:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [212/300][200/1251] eta 0:06:09 lr 0.000001 time 0.3270 (0.3512) loss 3.3110 (3.3767) grad_norm 0.0000 (0.0000) [2022-10-12 02:48:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [212/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3218 (0.3445) loss 3.4271 (3.3797) grad_norm 0.0000 (0.0000) [2022-10-12 02:49:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [212/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3280 (0.3412) loss 3.7402 (3.3808) grad_norm 0.0000 (0.0000) [2022-10-12 02:49:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [212/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3311 (0.3393) loss 3.3874 (3.3813) grad_norm 0.0000 (0.0000) [2022-10-12 02:50:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [212/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3272 (0.3379) loss 3.4826 (3.3833) grad_norm 0.0000 (0.0000) [2022-10-12 02:50:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [212/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3157 
(0.3370) loss 3.3944 (3.3870) grad_norm 0.0000 (0.0000) [2022-10-12 02:51:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [212/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3311 (0.3364) loss 3.5189 (3.3880) grad_norm 0.0000 (0.0000) [2022-10-12 02:52:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [212/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3457 (0.3360) loss 3.3430 (3.3853) grad_norm 0.0000 (0.0000) [2022-10-12 02:52:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [212/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3406 (0.3357) loss 3.5935 (3.3869) grad_norm 0.0000 (0.0000) [2022-10-12 02:53:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [212/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3272 (0.3356) loss 3.1675 (3.3879) grad_norm 0.0000 (0.0000) [2022-10-12 02:53:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [212/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3499 (0.3355) loss 3.4491 (3.3895) grad_norm 0.0000 (0.0000) [2022-10-12 02:53:58 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 212 training takes 0:06:59 [2022-10-12 02:54:02 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.514 (3.514) Loss 0.8677 (0.8677) Acc@1 80.371 (80.371) Acc@5 95.410 (95.410) [2022-10-12 02:54:13 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 78.690 Acc@5 94.360 [2022-10-12 02:54:13 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 78.7% [2022-10-12 02:54:13 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.01% [2022-10-12 02:54:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [213/300][0/1251] eta 1:12:45 lr 0.000001 time 3.4897 (3.4897) loss 3.3981 (3.3981) grad_norm 0.0000 (0.0000) [2022-10-12 02:54:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [213/300][100/1251] eta 0:07:05 lr 0.000001 time 0.3318 (0.3697) loss 3.7032 (3.3808) grad_norm 0.0000 (0.0000) [2022-10-12 02:55:24 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [213/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3118 (0.3495) loss 3.4777 (3.3791) grad_norm 0.0000 (0.0000) [2022-10-12 02:55:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [213/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3354 (0.3436) loss 3.6641 (3.3791) grad_norm 0.0000 (0.0000) [2022-10-12 02:56:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [213/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3318 (0.3406) loss 3.3098 (3.3868) grad_norm 0.0000 (0.0000) [2022-10-12 02:57:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [213/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3324 (0.3390) loss 3.2663 (3.3866) grad_norm 0.0000 (0.0000) [2022-10-12 02:57:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [213/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3093 (0.3375) loss 3.6138 (3.3890) grad_norm 0.0000 (0.0000) [2022-10-12 02:58:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [213/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3234 (0.3367) loss 3.3993 (3.3906) grad_norm 0.0000 (0.0000) [2022-10-12 02:58:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [213/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3589 (0.3363) loss 3.2900 (3.3881) grad_norm 0.0000 (0.0000) [2022-10-12 02:59:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [213/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3355 (0.3359) loss 3.2313 (3.3892) grad_norm 0.0000 (0.0000) [2022-10-12 02:59:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [213/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3191 (0.3355) loss 3.5084 (3.3900) grad_norm 0.0000 (0.0000) [2022-10-12 03:00:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [213/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3027 (0.3351) loss 3.4124 (3.3915) grad_norm 0.0000 (0.0000) [2022-10-12 03:00:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [213/300][1200/1251] eta 0:00:17 lr 
0.000001 time 0.3481 (0.3349) loss 3.4619 (3.3921) grad_norm 0.0000 (0.0000) [2022-10-12 03:01:12 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 213 training takes 0:06:58 [2022-10-12 03:01:16 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.269 (3.269) Loss 0.9454 (0.9454) Acc@1 80.273 (80.273) Acc@5 94.043 (94.043) [2022-10-12 03:01:28 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 78.716 Acc@5 94.350 [2022-10-12 03:01:28 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 78.7% [2022-10-12 03:01:28 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.01% [2022-10-12 03:01:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [214/300][0/1251] eta 1:12:24 lr 0.000001 time 3.4730 (3.4730) loss 3.4683 (3.4683) grad_norm 0.0000 (0.0000) [2022-10-12 03:02:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [214/300][100/1251] eta 0:07:05 lr 0.000001 time 0.3132 (0.3694) loss 3.3841 (3.3748) grad_norm 0.0000 (0.0000) [2022-10-12 03:02:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [214/300][200/1251] eta 0:06:09 lr 0.000001 time 0.3469 (0.3516) loss 3.4975 (3.3740) grad_norm 0.0000 (0.0000) [2022-10-12 03:03:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [214/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3260 (0.3453) loss 3.3739 (3.3828) grad_norm 0.0000 (0.0000) [2022-10-12 03:03:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [214/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3078 (0.3419) loss 3.6708 (3.3830) grad_norm 0.0000 (0.0000) [2022-10-12 03:04:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [214/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3085 (0.3400) loss 3.4185 (3.3845) grad_norm 0.0000 (0.0000) [2022-10-12 03:04:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [214/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3685 (0.3385) loss 3.3531 (3.3849) grad_norm 0.0000 (0.0000) 
[2022-10-12 03:05:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [214/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3311 (0.3373) loss 3.5819 (3.3856) grad_norm 0.0000 (0.0000) [2022-10-12 03:05:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [214/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3375 (0.3365) loss 3.4127 (3.3831) grad_norm 0.0000 (0.0000) [2022-10-12 03:06:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [214/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3226 (0.3359) loss 3.3958 (3.3839) grad_norm 0.0000 (0.0000) [2022-10-12 03:07:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [214/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3302 (0.3355) loss 3.0917 (3.3812) grad_norm 0.0000 (0.0000) [2022-10-12 03:07:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [214/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3275 (0.3352) loss 3.4036 (3.3816) grad_norm 0.0000 (0.0000) [2022-10-12 03:08:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [214/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3480 (0.3349) loss 3.2661 (3.3835) grad_norm 0.0000 (0.0000) [2022-10-12 03:08:27 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 214 training takes 0:06:58 [2022-10-12 03:08:30 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.287 (3.287) Loss 0.9315 (0.9315) Acc@1 78.223 (78.223) Acc@5 94.824 (94.824) [2022-10-12 03:08:42 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.000 Acc@5 94.458 [2022-10-12 03:08:42 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.0% [2022-10-12 03:08:42 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.01% [2022-10-12 03:08:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [215/300][0/1251] eta 1:13:41 lr 0.000001 time 3.5347 (3.5347) loss 3.3613 (3.3613) grad_norm 0.0000 (0.0000) [2022-10-12 03:09:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[215/300][100/1251] eta 0:07:05 lr 0.000001 time 0.3545 (0.3697) loss 3.6807 (3.3824) grad_norm 0.0000 (0.0000) [2022-10-12 03:09:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [215/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3454 (0.3509) loss 3.6640 (3.3830) grad_norm 0.0000 (0.0000) [2022-10-12 03:10:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [215/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3143 (0.3448) loss 3.7491 (3.3833) grad_norm 0.0000 (0.0000) [2022-10-12 03:10:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [215/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3220 (0.3412) loss 3.3979 (3.3800) grad_norm 0.0000 (0.0000) [2022-10-12 03:11:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [215/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3286 (0.3393) loss 3.2889 (3.3761) grad_norm 0.0000 (0.0000) [2022-10-12 03:12:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [215/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3484 (0.3381) loss 3.2047 (3.3832) grad_norm 0.0000 (0.0000) [2022-10-12 03:12:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [215/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3462 (0.3372) loss 3.5235 (3.3837) grad_norm 0.0000 (0.0000) [2022-10-12 03:13:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [215/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3192 (0.3365) loss 3.4071 (3.3839) grad_norm 0.0000 (0.0000) [2022-10-12 03:13:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [215/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3305 (0.3361) loss 3.1189 (3.3869) grad_norm 0.0000 (0.0000) [2022-10-12 03:14:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [215/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3394 (0.3358) loss 3.5495 (3.3862) grad_norm 0.0000 (0.0000) [2022-10-12 03:14:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [215/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3230 (0.3356) loss 3.3107 (3.3851) grad_norm 
0.0000 (0.0000) [2022-10-12 03:15:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [215/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3047 (0.3354) loss 3.3709 (3.3838) grad_norm 0.0000 (0.0000) [2022-10-12 03:15:41 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 215 training takes 0:06:59 [2022-10-12 03:15:45 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.515 (3.515) Loss 0.9712 (0.9712) Acc@1 76.758 (76.758) Acc@5 94.434 (94.434) [2022-10-12 03:15:57 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 78.826 Acc@5 94.294 [2022-10-12 03:15:57 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 78.8% [2022-10-12 03:15:57 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.01% [2022-10-12 03:16:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [216/300][0/1251] eta 1:15:45 lr 0.000001 time 3.6332 (3.6332) loss 3.1327 (3.1327) grad_norm 0.0000 (0.0000) [2022-10-12 03:16:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [216/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3412 (0.3684) loss 3.3671 (3.3632) grad_norm 0.0000 (0.0000) [2022-10-12 03:17:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [216/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3257 (0.3501) loss 3.6229 (3.3692) grad_norm 0.0000 (0.0000) [2022-10-12 03:17:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [216/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3164 (0.3438) loss 3.3186 (3.3637) grad_norm 0.0000 (0.0000) [2022-10-12 03:18:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [216/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3248 (0.3408) loss 3.5797 (3.3678) grad_norm 0.0000 (0.0000) [2022-10-12 03:18:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [216/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3189 (0.3394) loss 3.5362 (3.3710) grad_norm 0.0000 (0.0000) [2022-10-12 03:19:20 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [216/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3571 (0.3382) loss 3.3439 (3.3706) grad_norm 0.0000 (0.0000) [2022-10-12 03:19:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [216/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3495 (0.3375) loss 3.3548 (3.3689) grad_norm 0.0000 (0.0000) [2022-10-12 03:20:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [216/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3508 (0.3367) loss 3.2709 (3.3710) grad_norm 0.0000 (0.0000) [2022-10-12 03:20:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [216/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3352 (0.3361) loss 3.5629 (3.3732) grad_norm 0.0000 (0.0000) [2022-10-12 03:21:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [216/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3410 (0.3356) loss 3.2733 (3.3728) grad_norm 0.0000 (0.0000) [2022-10-12 03:22:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [216/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3486 (0.3353) loss 3.1860 (3.3740) grad_norm 0.0000 (0.0000) [2022-10-12 03:22:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [216/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3310 (0.3350) loss 3.4589 (3.3758) grad_norm 0.0000 (0.0000) [2022-10-12 03:22:55 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 216 training takes 0:06:58 [2022-10-12 03:22:59 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.360 (3.360) Loss 0.9166 (0.9166) Acc@1 79.785 (79.785) Acc@5 94.727 (94.727) [2022-10-12 03:23:11 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.042 Acc@5 94.412 [2022-10-12 03:23:11 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.0% [2022-10-12 03:23:11 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.04% [2022-10-12 03:23:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [217/300][0/1251] eta 1:16:39 lr 0.000001 time 3.6768 
(3.6768) loss 3.1665 (3.1665) grad_norm 0.0000 (0.0000) [2022-10-12 03:23:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [217/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3379 (0.3680) loss 3.3355 (3.3439) grad_norm 0.0000 (0.0000) [2022-10-12 03:24:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [217/300][200/1251] eta 0:06:07 lr 0.000001 time 0.2997 (0.3501) loss 3.6283 (3.3653) grad_norm 0.0000 (0.0000) [2022-10-12 03:24:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [217/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3308 (0.3440) loss 3.4114 (3.3749) grad_norm 0.0000 (0.0000) [2022-10-12 03:25:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [217/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3500 (0.3411) loss 3.4538 (3.3706) grad_norm 0.0000 (0.0000) [2022-10-12 03:26:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [217/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3242 (0.3388) loss 3.1984 (3.3707) grad_norm 0.0000 (0.0000) [2022-10-12 03:26:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [217/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3140 (0.3379) loss 3.1375 (3.3736) grad_norm 0.0000 (0.0000) [2022-10-12 03:27:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [217/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3636 (0.3370) loss 3.2648 (3.3730) grad_norm 0.0000 (0.0000) [2022-10-12 03:27:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [217/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3594 (0.3363) loss 3.2067 (3.3714) grad_norm 0.0000 (0.0000) [2022-10-12 03:28:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [217/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3243 (0.3359) loss 3.1210 (3.3702) grad_norm 0.0000 (0.0000) [2022-10-12 03:28:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [217/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3117 (0.3356) loss 3.4526 (3.3701) grad_norm 0.0000 (0.0000) [2022-10-12 03:29:20 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [217/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3362 (0.3353) loss 3.2241 (3.3718) grad_norm 0.0000 (0.0000) [2022-10-12 03:29:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [217/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3122 (0.3351) loss 3.1980 (3.3708) grad_norm 0.0000 (0.0000) [2022-10-12 03:30:10 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 217 training takes 0:06:59 [2022-10-12 03:30:13 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.320 (3.320) Loss 0.8971 (0.8971) Acc@1 79.492 (79.492) Acc@5 94.922 (94.922) [2022-10-12 03:30:25 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.098 Acc@5 94.402 [2022-10-12 03:30:25 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.1% [2022-10-12 03:30:25 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.10% [2022-10-12 03:30:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [218/300][0/1251] eta 1:12:45 lr 0.000001 time 3.4897 (3.4897) loss 3.3475 (3.3475) grad_norm 0.0000 (0.0000) [2022-10-12 03:31:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [218/300][100/1251] eta 0:06:59 lr 0.000001 time 0.3267 (0.3647) loss 3.4047 (3.3662) grad_norm 0.0000 (0.0000) [2022-10-12 03:31:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [218/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3342 (0.3477) loss 3.4575 (3.3707) grad_norm 0.0000 (0.0000) [2022-10-12 03:32:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [218/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3335 (0.3421) loss 3.7105 (3.3620) grad_norm 0.0000 (0.0000) [2022-10-12 03:32:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [218/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3198 (0.3394) loss 3.5009 (3.3698) grad_norm 0.0000 (0.0000) [2022-10-12 03:33:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [218/300][500/1251] 
eta 0:04:13 lr 0.000001 time 0.3041 (0.3378) loss 3.5349 (3.3672) grad_norm 0.0000 (0.0000) [2022-10-12 03:33:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [218/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3339 (0.3362) loss 3.4346 (3.3654) grad_norm 0.0000 (0.0000) [2022-10-12 03:34:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [218/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3379 (0.3356) loss 3.5888 (3.3654) grad_norm 0.0000 (0.0000) [2022-10-12 03:34:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [218/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3434 (0.3353) loss 3.2225 (3.3661) grad_norm 0.0000 (0.0000) [2022-10-12 03:35:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [218/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3179 (0.3348) loss 3.4881 (3.3649) grad_norm 0.0000 (0.0000) [2022-10-12 03:36:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [218/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3375 (0.3347) loss 3.1038 (3.3635) grad_norm 0.0000 (0.0000) [2022-10-12 03:36:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [218/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3376 (0.3345) loss 3.1095 (3.3637) grad_norm 0.0000 (0.0000) [2022-10-12 03:37:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [218/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3230 (0.3343) loss 3.5304 (3.3660) grad_norm 0.0000 (0.0000) [2022-10-12 03:37:23 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 218 training takes 0:06:58 [2022-10-12 03:37:26 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.491 (3.491) Loss 0.9050 (0.9050) Acc@1 78.711 (78.711) Acc@5 95.020 (95.020) [2022-10-12 03:37:38 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.428 Acc@5 94.502 [2022-10-12 03:37:38 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.4% [2022-10-12 03:37:38 swin_tiny_patch4_window7_224] (main.py 121): INFO Max 
accuracy: 79.43% [2022-10-12 03:37:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [219/300][0/1251] eta 1:16:06 lr 0.000001 time 3.6504 (3.6504) loss 3.5204 (3.5204) grad_norm 0.0000 (0.0000) [2022-10-12 03:38:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [219/300][100/1251] eta 0:07:05 lr 0.000001 time 0.3341 (0.3694) loss 3.1950 (3.3428) grad_norm 0.0000 (0.0000) [2022-10-12 03:38:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [219/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3323 (0.3505) loss 3.5534 (3.3445) grad_norm 0.0000 (0.0000) [2022-10-12 03:39:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [219/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3173 (0.3436) loss 3.3372 (3.3478) grad_norm 0.0000 (0.0000) [2022-10-12 03:39:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [219/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3510 (0.3405) loss 3.3747 (3.3514) grad_norm 0.0000 (0.0000) [2022-10-12 03:40:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [219/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3258 (0.3390) loss 3.3481 (3.3511) grad_norm 0.0000 (0.0000) [2022-10-12 03:41:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [219/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3167 (0.3377) loss 3.2124 (3.3534) grad_norm 0.0000 (0.0000) [2022-10-12 03:41:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [219/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3335 (0.3368) loss 3.5614 (3.3594) grad_norm 0.0000 (0.0000) [2022-10-12 03:42:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [219/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3201 (0.3360) loss 3.4263 (3.3603) grad_norm 0.0000 (0.0000) [2022-10-12 03:42:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [219/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3458 (0.3353) loss 3.5183 (3.3597) grad_norm 0.0000 (0.0000) [2022-10-12 03:43:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[219/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3499 (0.3350) loss 3.3492 (3.3621) grad_norm 0.0000 (0.0000) [2022-10-12 03:43:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [219/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3230 (0.3347) loss 3.3605 (3.3626) grad_norm 0.0000 (0.0000) [2022-10-12 03:44:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [219/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3366 (0.3343) loss 3.2891 (3.3616) grad_norm 0.0000 (0.0000) [2022-10-12 03:44:36 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 219 training takes 0:06:57 [2022-10-12 03:44:39 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.061 (3.061) Loss 0.9377 (0.9377) Acc@1 78.516 (78.516) Acc@5 94.727 (94.727) [2022-10-12 03:44:51 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.098 Acc@5 94.526 [2022-10-12 03:44:51 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.1% [2022-10-12 03:44:51 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.43% [2022-10-12 03:44:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [220/300][0/1251] eta 1:13:00 lr 0.000001 time 3.5016 (3.5016) loss 3.7340 (3.7340) grad_norm 0.0000 (0.0000) [2022-10-12 03:45:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [220/300][100/1251] eta 0:07:05 lr 0.000001 time 0.3532 (0.3698) loss 3.5331 (3.3634) grad_norm 0.0000 (0.0000) [2022-10-12 03:46:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [220/300][200/1251] eta 0:06:09 lr 0.000001 time 0.3409 (0.3513) loss 3.1059 (3.3562) grad_norm 0.0000 (0.0000) [2022-10-12 03:46:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [220/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3346 (0.3449) loss 3.1579 (3.3503) grad_norm 0.0000 (0.0000) [2022-10-12 03:47:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [220/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3499 (0.3415) loss 3.0388 
(3.3443) grad_norm 0.0000 (0.0000) [2022-10-12 03:47:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [220/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3202 (0.3398) loss 3.5488 (3.3473) grad_norm 0.0000 (0.0000) [2022-10-12 03:48:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [220/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3386 (0.3384) loss 3.1175 (3.3464) grad_norm 0.0000 (0.0000) [2022-10-12 03:48:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [220/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3475 (0.3378) loss 3.3543 (3.3518) grad_norm 0.0000 (0.0000) [2022-10-12 03:49:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [220/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3105 (0.3372) loss 3.4106 (3.3536) grad_norm 0.0000 (0.0000) [2022-10-12 03:49:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [220/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3446 (0.3369) loss 3.4354 (3.3569) grad_norm 0.0000 (0.0000) [2022-10-12 03:50:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [220/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3420 (0.3365) loss 3.2291 (3.3559) grad_norm 0.0000 (0.0000) [2022-10-12 03:51:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [220/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3165 (0.3362) loss 3.2656 (3.3549) grad_norm 0.0000 (0.0000) [2022-10-12 03:51:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [220/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3294 (0.3359) loss 3.5809 (3.3550) grad_norm 0.0000 (0.0000) [2022-10-12 03:51:51 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 220 training takes 0:07:00 [2022-10-12 03:51:51 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_220 saving...... [2022-10-12 03:51:51 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_220 saved !!! 
[2022-10-12 03:51:55 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.335 (3.335) Loss 1.0164 (1.0164) Acc@1 77.148 (77.148) Acc@5 93.848 (93.848) [2022-10-12 03:52:06 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 78.920 Acc@5 94.416 [2022-10-12 03:52:06 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 78.9% [2022-10-12 03:52:06 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.43% [2022-10-12 03:52:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [221/300][0/1251] eta 1:11:12 lr 0.000001 time 3.4150 (3.4150) loss 3.1981 (3.1981) grad_norm 0.0000 (0.0000) [2022-10-12 03:52:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [221/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3317 (0.3665) loss 3.2947 (3.3523) grad_norm 0.0000 (0.0000) [2022-10-12 03:53:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [221/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3620 (0.3485) loss 3.0703 (3.3557) grad_norm 0.0000 (0.0000) [2022-10-12 03:53:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [221/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3226 (0.3430) loss 3.1909 (3.3413) grad_norm 0.0000 (0.0000) [2022-10-12 03:54:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [221/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3409 (0.3397) loss 3.4535 (3.3416) grad_norm 0.0000 (0.0000) [2022-10-12 03:54:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [221/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3140 (0.3381) loss 3.1653 (3.3415) grad_norm 0.0000 (0.0000) [2022-10-12 03:55:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [221/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3549 (0.3372) loss 3.5771 (3.3435) grad_norm 0.0000 (0.0000) [2022-10-12 03:56:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [221/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3326 (0.3361) loss 3.0881 (3.3441) grad_norm 0.0000 
(0.0000) [2022-10-12 03:56:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [221/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3453 (0.3354) loss 3.7024 (3.3457) grad_norm 0.0000 (0.0000) [2022-10-12 03:57:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [221/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3386 (0.3350) loss 3.1773 (3.3441) grad_norm 0.0000 (0.0000) [2022-10-12 03:57:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [221/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3455 (0.3346) loss 3.5244 (3.3455) grad_norm 0.0000 (0.0000) [2022-10-12 03:58:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [221/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3179 (0.3342) loss 3.2532 (3.3453) grad_norm 0.0000 (0.0000) [2022-10-12 03:58:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [221/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3301 (0.3340) loss 3.2304 (3.3444) grad_norm 0.0000 (0.0000) [2022-10-12 03:59:04 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 221 training takes 0:06:57 [2022-10-12 03:59:07 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.127 (3.127) Loss 0.9187 (0.9187) Acc@1 79.590 (79.590) Acc@5 94.727 (94.727) [2022-10-12 03:59:19 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.106 Acc@5 94.498 [2022-10-12 03:59:19 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.1% [2022-10-12 03:59:19 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.43% [2022-10-12 03:59:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [222/300][0/1251] eta 1:16:28 lr 0.000001 time 3.6680 (3.6680) loss 3.3288 (3.3288) grad_norm 0.0000 (0.0000) [2022-10-12 03:59:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [222/300][100/1251] eta 0:07:06 lr 0.000001 time 0.3227 (0.3702) loss 3.4975 (3.3166) grad_norm 0.0000 (0.0000) [2022-10-12 04:00:30 swin_tiny_patch4_window7_224] (main.py 163): 
INFO Train: [222/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3330 (0.3510) loss 3.3430 (3.3385) grad_norm 0.0000 (0.0000) [2022-10-12 04:01:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [222/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3049 (0.3449) loss 3.2874 (3.3474) grad_norm 0.0000 (0.0000) [2022-10-12 04:01:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [222/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3461 (0.3413) loss 3.3917 (3.3434) grad_norm 0.0000 (0.0000) [2022-10-12 04:02:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [222/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3256 (0.3398) loss 3.2821 (3.3441) grad_norm 0.0000 (0.0000) [2022-10-12 04:02:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [222/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3011 (0.3387) loss 3.6072 (3.3393) grad_norm 0.0000 (0.0000) [2022-10-12 04:03:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [222/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3239 (0.3376) loss 3.3615 (3.3406) grad_norm 0.0000 (0.0000) [2022-10-12 04:03:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [222/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3262 (0.3368) loss 3.3979 (3.3440) grad_norm 0.0000 (0.0000) [2022-10-12 04:04:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [222/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3324 (0.3363) loss 3.3105 (3.3454) grad_norm 0.0000 (0.0000) [2022-10-12 04:04:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [222/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3301 (0.3356) loss 3.1356 (3.3442) grad_norm 0.0000 (0.0000) [2022-10-12 04:05:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [222/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3289 (0.3354) loss 3.5611 (3.3426) grad_norm 0.0000 (0.0000) [2022-10-12 04:06:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [222/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3429 (0.3352) loss 3.1711 
(3.3440) grad_norm 0.0000 (0.0000) [2022-10-12 04:06:19 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 222 training takes 0:06:59 [2022-10-12 04:06:22 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.226 (3.226) Loss 1.0211 (1.0211) Acc@1 77.051 (77.051) Acc@5 92.969 (92.969) [2022-10-12 04:06:34 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 78.954 Acc@5 94.516 [2022-10-12 04:06:34 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.0% [2022-10-12 04:06:34 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.43% [2022-10-12 04:06:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [223/300][0/1251] eta 1:13:58 lr 0.000001 time 3.5482 (3.5482) loss 3.0955 (3.0955) grad_norm 0.0000 (0.0000) [2022-10-12 04:07:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [223/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3306 (0.3683) loss 3.4204 (3.3198) grad_norm 0.0000 (0.0000) [2022-10-12 04:07:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [223/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3644 (0.3509) loss 3.3771 (3.3247) grad_norm 0.0000 (0.0000) [2022-10-12 04:08:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [223/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3301 (0.3448) loss 3.5224 (3.3299) grad_norm 0.0000 (0.0000) [2022-10-12 04:08:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [223/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3297 (0.3415) loss 3.1014 (3.3346) grad_norm 0.0000 (0.0000) [2022-10-12 04:09:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [223/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3053 (0.3398) loss 3.5998 (3.3364) grad_norm 0.0000 (0.0000) [2022-10-12 04:09:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [223/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3593 (0.3384) loss 3.2973 (3.3351) grad_norm 0.0000 (0.0000) [2022-10-12 04:10:31 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [223/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3423 (0.3377) loss 3.3173 (3.3384) grad_norm 0.0000 (0.0000) [2022-10-12 04:11:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [223/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3804 (0.3370) loss 3.5651 (3.3381) grad_norm 0.0000 (0.0000) [2022-10-12 04:11:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [223/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3502 (0.3366) loss 3.6186 (3.3410) grad_norm 0.0000 (0.0000) [2022-10-12 04:12:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [223/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3306 (0.3361) loss 3.3408 (3.3405) grad_norm 0.0000 (0.0000) [2022-10-12 04:12:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [223/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3511 (0.3358) loss 3.1690 (3.3390) grad_norm 0.0000 (0.0000) [2022-10-12 04:13:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [223/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3454 (0.3354) loss 3.1770 (3.3386) grad_norm 0.0000 (0.0000) [2022-10-12 04:13:33 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 223 training takes 0:06:59 [2022-10-12 04:13:37 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.220 (3.220) Loss 0.9533 (0.9533) Acc@1 78.711 (78.711) Acc@5 94.727 (94.727) [2022-10-12 04:13:49 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.288 Acc@5 94.592 [2022-10-12 04:13:49 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.3% [2022-10-12 04:13:49 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.43% [2022-10-12 04:13:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [224/300][0/1251] eta 1:09:39 lr 0.000001 time 3.3409 (3.3409) loss 3.1350 (3.1350) grad_norm 0.0000 (0.0000) [2022-10-12 04:14:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [224/300][100/1251] 
eta 0:07:01 lr 0.000001 time 0.3195 (0.3660) loss 3.3548 (3.3245) grad_norm 0.0000 (0.0000) [2022-10-12 04:14:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [224/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3483 (0.3488) loss 3.3830 (3.3184) grad_norm 0.0000 (0.0000) [2022-10-12 04:15:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [224/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3090 (0.3424) loss 3.4202 (3.3238) grad_norm 0.0000 (0.0000) [2022-10-12 04:16:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [224/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3431 (0.3392) loss 3.5862 (3.3246) grad_norm 0.0000 (0.0000) [2022-10-12 04:16:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [224/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3223 (0.3378) loss 3.3745 (3.3233) grad_norm 0.0000 (0.0000) [2022-10-12 04:17:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [224/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3516 (0.3371) loss 3.5674 (3.3262) grad_norm 0.0000 (0.0000) [2022-10-12 04:17:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [224/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3343 (0.3360) loss 3.6255 (3.3296) grad_norm 0.0000 (0.0000) [2022-10-12 04:18:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [224/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3209 (0.3356) loss 3.0129 (3.3287) grad_norm 0.0000 (0.0000) [2022-10-12 04:18:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [224/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3185 (0.3351) loss 3.1546 (3.3299) grad_norm 0.0000 (0.0000) [2022-10-12 04:19:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [224/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.2998 (0.3349) loss 3.3977 (3.3315) grad_norm 0.0000 (0.0000) [2022-10-12 04:19:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [224/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3186 (0.3349) loss 3.4807 (3.3308) grad_norm 0.0000 (0.0000) 
[2022-10-12 04:20:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [224/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3181 (0.3347) loss 3.2675 (3.3322) grad_norm 0.0000 (0.0000) [2022-10-12 04:20:47 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 224 training takes 0:06:58 [2022-10-12 04:20:51 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.382 (3.382) Loss 0.8606 (0.8606) Acc@1 81.152 (81.152) Acc@5 94.531 (94.531) [2022-10-12 04:21:03 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.390 Acc@5 94.542 [2022-10-12 04:21:03 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.4% [2022-10-12 04:21:03 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.43% [2022-10-12 04:21:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [225/300][0/1251] eta 1:12:35 lr 0.000001 time 3.4817 (3.4817) loss 3.1449 (3.1449) grad_norm 0.0000 (0.0000) [2022-10-12 04:21:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [225/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3648 (0.3687) loss 3.1775 (3.3119) grad_norm 0.0000 (0.0000) [2022-10-12 04:22:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [225/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3579 (0.3503) loss 3.4864 (3.3045) grad_norm 0.0000 (0.0000) [2022-10-12 04:22:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [225/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3276 (0.3445) loss 3.0609 (3.3101) grad_norm 0.0000 (0.0000) [2022-10-12 04:23:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [225/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3447 (0.3419) loss 3.2413 (3.3057) grad_norm 0.0000 (0.0000) [2022-10-12 04:23:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [225/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3525 (0.3402) loss 3.4393 (3.3093) grad_norm 0.0000 (0.0000) [2022-10-12 04:24:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[225/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3213 (0.3391) loss 3.3378 (3.3106) grad_norm 0.0000 (0.0000) [2022-10-12 04:25:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [225/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3360 (0.3381) loss 3.0722 (3.3139) grad_norm 0.0000 (0.0000) [2022-10-12 04:25:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [225/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3266 (0.3371) loss 3.4070 (3.3186) grad_norm 0.0000 (0.0000) [2022-10-12 04:26:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [225/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3364 (0.3365) loss 3.3943 (3.3209) grad_norm 0.0000 (0.0000) [2022-10-12 04:26:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [225/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3279 (0.3360) loss 3.5782 (3.3234) grad_norm 0.0000 (0.0000) [2022-10-12 04:27:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [225/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3240 (0.3357) loss 3.3122 (3.3275) grad_norm 0.0000 (0.0000) [2022-10-12 04:27:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [225/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3324 (0.3353) loss 3.3707 (3.3277) grad_norm 0.0000 (0.0000) [2022-10-12 04:28:02 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 225 training takes 0:06:59 [2022-10-12 04:28:05 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.344 (3.344) Loss 0.9816 (0.9816) Acc@1 78.125 (78.125) Acc@5 93.945 (93.945) [2022-10-12 04:28:17 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.288 Acc@5 94.570 [2022-10-12 04:28:17 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.3% [2022-10-12 04:28:17 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.43% [2022-10-12 04:28:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [226/300][0/1251] eta 1:11:17 lr 0.000001 time 3.4193 (3.4193) loss 3.2629 
(3.2629) grad_norm 0.0000 (0.0000) [2022-10-12 04:28:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [226/300][100/1251] eta 0:07:05 lr 0.000001 time 0.3474 (0.3694) loss 3.2974 (3.2942) grad_norm 0.0000 (0.0000) [2022-10-12 04:29:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [226/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3272 (0.3508) loss 3.6573 (3.3093) grad_norm 0.0000 (0.0000) [2022-10-12 04:30:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [226/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3259 (0.3440) loss 3.2989 (3.3137) grad_norm 0.0000 (0.0000) [2022-10-12 04:30:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [226/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3125 (0.3413) loss 3.5029 (3.3192) grad_norm 0.0000 (0.0000) [2022-10-12 04:31:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [226/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3238 (0.3391) loss 3.3713 (3.3213) grad_norm 0.0000 (0.0000) [2022-10-12 04:31:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [226/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3024 (0.3377) loss 3.3654 (3.3232) grad_norm 0.0000 (0.0000) [2022-10-12 04:32:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [226/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3338 (0.3367) loss 3.3230 (3.3305) grad_norm 0.0000 (0.0000) [2022-10-12 04:32:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [226/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3420 (0.3360) loss 3.3081 (3.3307) grad_norm 0.0000 (0.0000) [2022-10-12 04:33:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [226/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3136 (0.3356) loss 3.3726 (3.3329) grad_norm 0.0000 (0.0000) [2022-10-12 04:33:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [226/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3115 (0.3351) loss 3.3531 (3.3306) grad_norm 0.0000 (0.0000) [2022-10-12 04:34:26 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [226/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3416 (0.3349) loss 3.5253 (3.3314) grad_norm 0.0000 (0.0000) [2022-10-12 04:34:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [226/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3430 (0.3346) loss 3.2357 (3.3320) grad_norm 0.0000 (0.0000) [2022-10-12 04:35:16 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 226 training takes 0:06:58 [2022-10-12 04:35:19 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.235 (3.235) Loss 0.8718 (0.8718) Acc@1 80.176 (80.176) Acc@5 95.605 (95.605) [2022-10-12 04:35:31 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.290 Acc@5 94.564 [2022-10-12 04:35:31 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.3% [2022-10-12 04:35:31 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.43% [2022-10-12 04:35:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [227/300][0/1251] eta 1:19:00 lr 0.000001 time 3.7893 (3.7893) loss 3.4058 (3.4058) grad_norm 0.0000 (0.0000) [2022-10-12 04:36:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [227/300][100/1251] eta 0:07:05 lr 0.000001 time 0.3403 (0.3698) loss 3.4052 (3.3157) grad_norm 0.0000 (0.0000) [2022-10-12 04:36:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [227/300][200/1251] eta 0:06:09 lr 0.000001 time 0.3252 (0.3514) loss 3.2865 (3.3195) grad_norm 0.0000 (0.0000) [2022-10-12 04:37:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [227/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3460 (0.3446) loss 3.5057 (3.3164) grad_norm 0.0000 (0.0000) [2022-10-12 04:37:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [227/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3147 (0.3417) loss 3.4657 (3.3090) grad_norm 0.0000 (0.0000) [2022-10-12 04:38:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [227/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3135 
(0.3402) loss 3.4217 (3.3159) grad_norm 0.0000 (0.0000) [2022-10-12 04:38:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [227/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3505 (0.3389) loss 3.3516 (3.3153) grad_norm 0.0000 (0.0000) [2022-10-12 04:39:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [227/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3169 (0.3378) loss 3.2539 (3.3181) grad_norm 0.0000 (0.0000) [2022-10-12 04:40:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [227/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3153 (0.3371) loss 3.2060 (3.3195) grad_norm 0.0000 (0.0000) [2022-10-12 04:40:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [227/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3451 (0.3365) loss 3.3201 (3.3245) grad_norm 0.0000 (0.0000) [2022-10-12 04:41:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [227/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3282 (0.3362) loss 3.4785 (3.3224) grad_norm 0.0000 (0.0000) [2022-10-12 04:41:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [227/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3554 (0.3359) loss 3.1077 (3.3228) grad_norm 0.0000 (0.0000) [2022-10-12 04:42:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [227/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3446 (0.3355) loss 3.2725 (3.3228) grad_norm 0.0000 (0.0000) [2022-10-12 04:42:30 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 227 training takes 0:06:59 [2022-10-12 04:42:34 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.141 (3.141) Loss 0.8773 (0.8773) Acc@1 81.055 (81.055) Acc@5 94.824 (94.824) [2022-10-12 04:42:46 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.414 Acc@5 94.570 [2022-10-12 04:42:46 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.4% [2022-10-12 04:42:46 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.43% [2022-10-12 04:42:49 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [228/300][0/1251] eta 1:12:59 lr 0.000001 time 3.5011 (3.5011) loss 3.4574 (3.4574) grad_norm 0.0000 (0.0000) [2022-10-12 04:43:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [228/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3606 (0.3684) loss 3.3684 (3.3025) grad_norm 0.0000 (0.0000) [2022-10-12 04:43:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [228/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3148 (0.3498) loss 3.1938 (3.2948) grad_norm 0.0000 (0.0000) [2022-10-12 04:44:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [228/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3318 (0.3441) loss 3.3428 (3.3182) grad_norm 0.0000 (0.0000) [2022-10-12 04:45:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [228/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3069 (0.3414) loss 3.3144 (3.3166) grad_norm 0.0000 (0.0000) [2022-10-12 04:45:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [228/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3494 (0.3398) loss 3.3721 (3.3129) grad_norm 0.0000 (0.0000) [2022-10-12 04:46:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [228/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3235 (0.3384) loss 3.1911 (3.3136) grad_norm 0.0000 (0.0000) [2022-10-12 04:46:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [228/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3279 (0.3371) loss 3.2586 (3.3159) grad_norm 0.0000 (0.0000) [2022-10-12 04:47:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [228/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3421 (0.3365) loss 3.2288 (3.3132) grad_norm 0.0000 (0.0000) [2022-10-12 04:47:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [228/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3107 (0.3359) loss 3.2182 (3.3101) grad_norm 0.0000 (0.0000) [2022-10-12 04:48:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [228/300][1000/1251] eta 0:01:24 lr 0.000001 
time 0.3315 (0.3354) loss 3.0497 (3.3099) grad_norm 0.0000 (0.0000) [2022-10-12 04:48:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [228/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3891 (0.3352) loss 3.2570 (3.3114) grad_norm 0.0000 (0.0000) [2022-10-12 04:49:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [228/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3511 (0.3349) loss 3.3956 (3.3140) grad_norm 0.0000 (0.0000) [2022-10-12 04:49:44 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 228 training takes 0:06:58 [2022-10-12 04:49:48 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.230 (3.230) Loss 0.9434 (0.9434) Acc@1 80.176 (80.176) Acc@5 93.457 (93.457) [2022-10-12 04:49:59 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.444 Acc@5 94.650 [2022-10-12 04:49:59 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.4% [2022-10-12 04:49:59 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.44% [2022-10-12 04:50:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [229/300][0/1251] eta 1:14:32 lr 0.000001 time 3.5752 (3.5752) loss 3.1202 (3.1202) grad_norm 0.0000 (0.0000) [2022-10-12 04:50:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [229/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3233 (0.3671) loss 3.5098 (3.3295) grad_norm 0.0000 (0.0000) [2022-10-12 04:51:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [229/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3257 (0.3500) loss 3.0108 (3.3020) grad_norm 0.0000 (0.0000) [2022-10-12 04:51:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [229/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3129 (0.3441) loss 3.3235 (3.3023) grad_norm 0.0000 (0.0000) [2022-10-12 04:52:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [229/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3223 (0.3411) loss 3.5708 (3.2986) grad_norm 0.0000 (0.0000) [2022-10-12 
04:52:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [229/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3352 (0.3390) loss 3.4458 (3.3043) grad_norm 0.0000 (0.0000) [2022-10-12 04:53:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [229/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3786 (0.3378) loss 3.0693 (3.3072) grad_norm 0.0000 (0.0000) [2022-10-12 04:53:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [229/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3531 (0.3371) loss 3.2857 (3.3084) grad_norm 0.0000 (0.0000) [2022-10-12 04:54:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [229/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3511 (0.3365) loss 3.5006 (3.3080) grad_norm 0.0000 (0.0000) [2022-10-12 04:55:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [229/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3214 (0.3362) loss 3.5516 (3.3081) grad_norm 0.0000 (0.0000) [2022-10-12 04:55:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [229/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3666 (0.3358) loss 3.4130 (3.3081) grad_norm 0.0000 (0.0000) [2022-10-12 04:56:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [229/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3440 (0.3355) loss 3.4027 (3.3090) grad_norm 0.0000 (0.0000) [2022-10-12 04:56:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [229/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3635 (0.3354) loss 3.4215 (3.3114) grad_norm 0.0000 (0.0000) [2022-10-12 04:56:59 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 229 training takes 0:06:59 [2022-10-12 04:57:02 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.251 (3.251) Loss 0.9080 (0.9080) Acc@1 79.590 (79.590) Acc@5 94.824 (94.824) [2022-10-12 04:57:14 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.350 Acc@5 94.492 [2022-10-12 04:57:14 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test 
images: 79.3% [2022-10-12 04:57:14 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.44% [2022-10-12 04:57:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [230/300][0/1251] eta 1:14:02 lr 0.000001 time 3.5508 (3.5508) loss 3.2599 (3.2599) grad_norm 0.0000 (0.0000) [2022-10-12 04:57:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [230/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3453 (0.3666) loss 3.2577 (3.2887) grad_norm 0.0000 (0.0000) [2022-10-12 04:58:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [230/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3150 (0.3502) loss 3.5185 (3.2932) grad_norm 0.0000 (0.0000) [2022-10-12 04:58:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [230/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3438 (0.3439) loss 3.1386 (3.2948) grad_norm 0.0000 (0.0000) [2022-10-12 04:59:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [230/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3261 (0.3408) loss 3.2235 (3.2942) grad_norm 0.0000 (0.0000) [2022-10-12 05:00:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [230/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3211 (0.3387) loss 3.5364 (3.2924) grad_norm 0.0000 (0.0000) [2022-10-12 05:00:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [230/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3584 (0.3376) loss 3.4328 (3.2964) grad_norm 0.0000 (0.0000) [2022-10-12 05:01:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [230/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3160 (0.3368) loss 3.4291 (3.2947) grad_norm 0.0000 (0.0000) [2022-10-12 05:01:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [230/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3400 (0.3361) loss 3.2970 (3.2939) grad_norm 0.0000 (0.0000) [2022-10-12 05:02:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [230/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3469 (0.3355) loss 3.4114 (3.2935) grad_norm 0.0000 
(0.0000) [2022-10-12 05:02:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [230/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3376 (0.3352) loss 3.5271 (3.2940) grad_norm 0.0000 (0.0000) [2022-10-12 05:03:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [230/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3431 (0.3348) loss 3.1530 (3.2927) grad_norm 0.0000 (0.0000) [2022-10-12 05:03:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [230/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3170 (0.3345) loss 3.6056 (3.2952) grad_norm 0.0000 (0.0000) [2022-10-12 05:04:12 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 230 training takes 0:06:58 [2022-10-12 05:04:12 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_230 saving...... [2022-10-12 05:04:12 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_230 saved !!! [2022-10-12 05:04:16 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.323 (3.323) Loss 0.9594 (0.9594) Acc@1 79.102 (79.102) Acc@5 94.336 (94.336) [2022-10-12 05:04:27 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.420 Acc@5 94.582 [2022-10-12 05:04:27 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.4% [2022-10-12 05:04:27 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.44% [2022-10-12 05:04:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [231/300][0/1251] eta 1:09:45 lr 0.000001 time 3.3457 (3.3457) loss 3.3700 (3.3700) grad_norm 0.0000 (0.0000) [2022-10-12 05:05:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [231/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3311 (0.3652) loss 3.4466 (3.3064) grad_norm 0.0000 (0.0000) [2022-10-12 05:05:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [231/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3490 (0.3481) loss 3.2711 
(3.3037) grad_norm 0.0000 (0.0000) [2022-10-12 05:06:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [231/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3771 (0.3426) loss 3.0883 (3.3025) grad_norm 0.0000 (0.0000) [2022-10-12 05:06:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [231/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3068 (0.3393) loss 3.0387 (3.3052) grad_norm 0.0000 (0.0000) [2022-10-12 05:07:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [231/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3049 (0.3373) loss 3.4416 (3.2992) grad_norm 0.0000 (0.0000) [2022-10-12 05:07:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [231/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3213 (0.3361) loss 3.2642 (3.3014) grad_norm 0.0000 (0.0000) [2022-10-12 05:08:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [231/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3384 (0.3352) loss 3.2244 (3.3010) grad_norm 0.0000 (0.0000) [2022-10-12 05:08:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [231/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3290 (0.3346) loss 3.5055 (3.3009) grad_norm 0.0000 (0.0000) [2022-10-12 05:09:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [231/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3304 (0.3342) loss 3.1606 (3.3007) grad_norm 0.0000 (0.0000) [2022-10-12 05:10:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [231/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3505 (0.3338) loss 3.2139 (3.2994) grad_norm 0.0000 (0.0000) [2022-10-12 05:10:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [231/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3230 (0.3334) loss 3.0883 (3.2983) grad_norm 0.0000 (0.0000) [2022-10-12 05:11:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [231/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3237 (0.3332) loss 3.2124 (3.2991) grad_norm 0.0000 (0.0000) [2022-10-12 05:11:24 swin_tiny_patch4_window7_224] (main.py 
171): INFO EPOCH 231 training takes 0:06:56 [2022-10-12 05:11:27 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.459 (3.459) Loss 0.8976 (0.8976) Acc@1 79.297 (79.297) Acc@5 94.824 (94.824) [2022-10-12 05:11:39 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.546 Acc@5 94.706 [2022-10-12 05:11:39 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.5% [2022-10-12 05:11:39 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.55% [2022-10-12 05:11:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [232/300][0/1251] eta 1:13:16 lr 0.000001 time 3.5142 (3.5142) loss 3.1346 (3.1346) grad_norm 0.0000 (0.0000) [2022-10-12 05:12:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [232/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3582 (0.3677) loss 3.1698 (3.2720) grad_norm 0.0000 (0.0000) [2022-10-12 05:12:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [232/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3187 (0.3500) loss 3.4519 (3.2753) grad_norm 0.0000 (0.0000) [2022-10-12 05:13:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [232/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3412 (0.3447) loss 3.0557 (3.2789) grad_norm 0.0000 (0.0000) [2022-10-12 05:13:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [232/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3561 (0.3419) loss 3.1312 (3.2816) grad_norm 0.0000 (0.0000) [2022-10-12 05:14:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [232/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3335 (0.3401) loss 3.1185 (3.2825) grad_norm 0.0000 (0.0000) [2022-10-12 05:15:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [232/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3309 (0.3388) loss 3.3487 (3.2870) grad_norm 0.0000 (0.0000) [2022-10-12 05:15:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [232/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3696 
(0.3378) loss 3.3540 (3.2897) grad_norm 0.0000 (0.0000) [2022-10-12 05:16:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [232/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3152 (0.3372) loss 3.0102 (3.2893) grad_norm 0.0000 (0.0000) [2022-10-12 05:16:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [232/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3621 (0.3368) loss 3.3328 (3.2911) grad_norm 0.0000 (0.0000) [2022-10-12 05:17:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [232/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3309 (0.3364) loss 3.3760 (3.2923) grad_norm 0.0000 (0.0000) [2022-10-12 05:17:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [232/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3308 (0.3360) loss 3.1594 (3.2928) grad_norm 0.0000 (0.0000) [2022-10-12 05:18:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [232/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3437 (0.3355) loss 3.4153 (3.2925) grad_norm 0.0000 (0.0000) [2022-10-12 05:18:39 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 232 training takes 0:06:59 [2022-10-12 05:18:42 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.416 (3.416) Loss 0.9341 (0.9341) Acc@1 79.785 (79.785) Acc@5 94.238 (94.238) [2022-10-12 05:18:54 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.786 Acc@5 94.746 [2022-10-12 05:18:54 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.8% [2022-10-12 05:18:54 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.79% [2022-10-12 05:18:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [233/300][0/1251] eta 1:11:17 lr 0.000001 time 3.4193 (3.4193) loss 3.3214 (3.3214) grad_norm 0.0000 (0.0000) [2022-10-12 05:19:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [233/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3622 (0.3677) loss 3.5427 (3.2806) grad_norm 0.0000 (0.0000) [2022-10-12 05:20:04 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [233/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3224 (0.3494) loss 3.2747 (3.2885) grad_norm 0.0000 (0.0000) [2022-10-12 05:20:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [233/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3022 (0.3434) loss 3.2500 (3.2974) grad_norm 0.0000 (0.0000) [2022-10-12 05:21:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [233/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3410 (0.3409) loss 3.5046 (3.2985) grad_norm 0.0000 (0.0000) [2022-10-12 05:21:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [233/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3150 (0.3390) loss 3.3009 (3.2963) grad_norm 0.0000 (0.0000) [2022-10-12 05:22:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [233/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3467 (0.3379) loss 3.5591 (3.2916) grad_norm 0.0000 (0.0000) [2022-10-12 05:22:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [233/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3127 (0.3367) loss 3.3389 (3.2926) grad_norm 0.0000 (0.0000) [2022-10-12 05:23:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [233/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3383 (0.3358) loss 3.3953 (3.2888) grad_norm 0.0000 (0.0000) [2022-10-12 05:23:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [233/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3415 (0.3354) loss 3.1619 (3.2890) grad_norm 0.0000 (0.0000) [2022-10-12 05:24:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [233/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3198 (0.3350) loss 3.0881 (3.2917) grad_norm 0.0000 (0.0000) [2022-10-12 05:25:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [233/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3190 (0.3346) loss 3.3665 (3.2915) grad_norm 0.0000 (0.0000) [2022-10-12 05:25:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [233/300][1200/1251] eta 0:00:17 lr 
0.000001 time 0.3328 (0.3342) loss 3.4010 (3.2949) grad_norm 0.0000 (0.0000) [2022-10-12 05:25:52 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 233 training takes 0:06:57 [2022-10-12 05:25:55 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.294 (3.294) Loss 0.9776 (0.9776) Acc@1 79.883 (79.883) Acc@5 94.531 (94.531) [2022-10-12 05:26:07 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.634 Acc@5 94.672 [2022-10-12 05:26:07 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.6% [2022-10-12 05:26:07 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.79% [2022-10-12 05:26:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [234/300][0/1251] eta 1:09:49 lr 0.000001 time 3.3490 (3.3490) loss 3.2485 (3.2485) grad_norm 0.0000 (0.0000) [2022-10-12 05:26:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [234/300][100/1251] eta 0:06:59 lr 0.000001 time 0.3335 (0.3643) loss 3.1745 (3.2826) grad_norm 0.0000 (0.0000) [2022-10-12 05:27:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [234/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3566 (0.3487) loss 3.0115 (3.2897) grad_norm 0.0000 (0.0000) [2022-10-12 05:27:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [234/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3124 (0.3430) loss 3.3185 (3.2872) grad_norm 0.0000 (0.0000) [2022-10-12 05:28:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [234/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3143 (0.3404) loss 3.2667 (3.2895) grad_norm 0.0000 (0.0000) [2022-10-12 05:28:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [234/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3353 (0.3390) loss 3.5018 (3.2849) grad_norm 0.0000 (0.0000) [2022-10-12 05:29:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [234/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3309 (0.3378) loss 3.3005 (3.2863) grad_norm 0.0000 (0.0000) 
[2022-10-12 05:30:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [234/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3218 (0.3368) loss 3.0940 (3.2905) grad_norm 0.0000 (0.0000) [2022-10-12 05:30:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [234/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3222 (0.3363) loss 3.2282 (3.2931) grad_norm 0.0000 (0.0000) [2022-10-12 05:31:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [234/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3341 (0.3358) loss 3.0107 (3.2916) grad_norm 0.0000 (0.0000) [2022-10-12 05:31:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [234/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3071 (0.3351) loss 3.3943 (3.2882) grad_norm 0.0000 (0.0000) [2022-10-12 05:32:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [234/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3263 (0.3348) loss 3.3070 (3.2887) grad_norm 0.0000 (0.0000) [2022-10-12 05:32:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [234/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3520 (0.3347) loss 3.2594 (3.2882) grad_norm 0.0000 (0.0000) [2022-10-12 05:33:06 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 234 training takes 0:06:58 [2022-10-12 05:33:09 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.363 (3.363) Loss 0.9380 (0.9380) Acc@1 79.590 (79.590) Acc@5 93.848 (93.848) [2022-10-12 05:33:21 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.738 Acc@5 94.718 [2022-10-12 05:33:21 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.7% [2022-10-12 05:33:21 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.79% [2022-10-12 05:33:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [235/300][0/1251] eta 1:09:35 lr 0.000001 time 3.3378 (3.3378) loss 3.5152 (3.5152) grad_norm 0.0000 (0.0000) [2022-10-12 05:33:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[235/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3216 (0.3656) loss 3.2227 (3.2702) grad_norm 0.0000 (0.0000) [2022-10-12 05:34:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [235/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3338 (0.3491) loss 3.3158 (3.2743) grad_norm 0.0000 (0.0000) [2022-10-12 05:35:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [235/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3327 (0.3434) loss 3.4747 (3.2785) grad_norm 0.0000 (0.0000) [2022-10-12 05:35:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [235/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3562 (0.3405) loss 3.4032 (3.2764) grad_norm 0.0000 (0.0000) [2022-10-12 05:36:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [235/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3290 (0.3388) loss 3.0496 (3.2707) grad_norm 0.0000 (0.0000) [2022-10-12 05:36:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [235/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3604 (0.3378) loss 3.2893 (3.2694) grad_norm 0.0000 (0.0000) [2022-10-12 05:37:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [235/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3246 (0.3367) loss 3.2675 (3.2685) grad_norm 0.0000 (0.0000) [2022-10-12 05:37:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [235/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3468 (0.3356) loss 3.1271 (3.2669) grad_norm 0.0000 (0.0000) [2022-10-12 05:38:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [235/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3645 (0.3350) loss 3.4314 (3.2692) grad_norm 0.0000 (0.0000) [2022-10-12 05:38:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [235/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3341 (0.3344) loss 3.3842 (3.2711) grad_norm 0.0000 (0.0000) [2022-10-12 05:39:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [235/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3172 (0.3340) loss 3.1281 (3.2736) grad_norm 
0.0000 (0.0000) [2022-10-12 05:40:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [235/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3062 (0.3337) loss 3.3507 (3.2747) grad_norm 0.0000 (0.0000) [2022-10-12 05:40:18 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 235 training takes 0:06:57 [2022-10-12 05:40:22 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.566 (3.566) Loss 0.9288 (0.9288) Acc@1 79.102 (79.102) Acc@5 93.945 (93.945) [2022-10-12 05:40:33 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.594 Acc@5 94.618 [2022-10-12 05:40:33 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.6% [2022-10-12 05:40:33 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.79% [2022-10-12 05:40:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [236/300][0/1251] eta 1:12:30 lr 0.000001 time 3.4780 (3.4780) loss 3.4792 (3.4792) grad_norm 0.0000 (0.0000) [2022-10-12 05:41:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [236/300][100/1251] eta 0:07:05 lr 0.000001 time 0.3261 (0.3693) loss 3.2634 (3.2493) grad_norm 0.0000 (0.0000) [2022-10-12 05:41:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [236/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3357 (0.3505) loss 3.4338 (3.2541) grad_norm 0.0000 (0.0000) [2022-10-12 05:42:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [236/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3500 (0.3442) loss 3.4851 (3.2644) grad_norm 0.0000 (0.0000) [2022-10-12 05:42:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [236/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3533 (0.3412) loss 3.5295 (3.2734) grad_norm 0.0000 (0.0000) [2022-10-12 05:43:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [236/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3131 (0.3390) loss 3.0882 (3.2764) grad_norm 0.0000 (0.0000) [2022-10-12 05:43:57 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [236/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3290 (0.3382) loss 3.5844 (3.2726) grad_norm 0.0000 (0.0000) [2022-10-12 05:44:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [236/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3254 (0.3373) loss 2.9997 (3.2708) grad_norm 0.0000 (0.0000) [2022-10-12 05:45:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [236/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3312 (0.3366) loss 3.1482 (3.2699) grad_norm 0.0000 (0.0000) [2022-10-12 05:45:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [236/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3249 (0.3361) loss 3.2045 (3.2690) grad_norm 0.0000 (0.0000) [2022-10-12 05:46:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [236/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3302 (0.3358) loss 3.5013 (3.2713) grad_norm 0.0000 (0.0000) [2022-10-12 05:46:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [236/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3300 (0.3354) loss 3.3522 (3.2726) grad_norm 0.0000 (0.0000) [2022-10-12 05:47:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [236/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3403 (0.3353) loss 3.2795 (3.2727) grad_norm 0.0000 (0.0000) [2022-10-12 05:47:33 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 236 training takes 0:06:59 [2022-10-12 05:47:36 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.300 (3.300) Loss 0.9787 (0.9787) Acc@1 78.418 (78.418) Acc@5 93.359 (93.359) [2022-10-12 05:47:48 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.572 Acc@5 94.632 [2022-10-12 05:47:48 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.6% [2022-10-12 05:47:48 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.79% [2022-10-12 05:47:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [237/300][0/1251] eta 1:16:40 lr 0.000001 time 3.6773 
(3.6773) loss 3.4574 (3.4574) grad_norm 0.0000 (0.0000) [2022-10-12 05:48:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [237/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3215 (0.3691) loss 3.0166 (3.2515) grad_norm 0.0000 (0.0000) [2022-10-12 05:48:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [237/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3039 (0.3508) loss 2.9377 (3.2500) grad_norm 0.0000 (0.0000) [2022-10-12 05:49:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [237/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3411 (0.3447) loss 3.3454 (3.2552) grad_norm 0.0000 (0.0000) [2022-10-12 05:50:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [237/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3464 (0.3419) loss 3.1872 (3.2542) grad_norm 0.0000 (0.0000) [2022-10-12 05:50:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [237/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3314 (0.3397) loss 3.3628 (3.2603) grad_norm 0.0000 (0.0000) [2022-10-12 05:51:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [237/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3106 (0.3383) loss 2.9301 (3.2653) grad_norm 0.0000 (0.0000) [2022-10-12 05:51:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [237/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3283 (0.3374) loss 3.4096 (3.2659) grad_norm 0.0000 (0.0000) [2022-10-12 05:52:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [237/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3524 (0.3366) loss 3.4159 (3.2682) grad_norm 0.0000 (0.0000) [2022-10-12 05:52:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [237/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3396 (0.3359) loss 3.1522 (3.2691) grad_norm 0.0000 (0.0000) [2022-10-12 05:53:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [237/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3114 (0.3355) loss 3.2681 (3.2681) grad_norm 0.0000 (0.0000) [2022-10-12 05:53:57 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [237/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3066 (0.3353) loss 3.2327 (3.2680) grad_norm 0.0000 (0.0000) [2022-10-12 05:54:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [237/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3707 (0.3352) loss 3.2521 (3.2691) grad_norm 0.0000 (0.0000) [2022-10-12 05:54:47 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 237 training takes 0:06:59 [2022-10-12 05:54:50 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.490 (3.490) Loss 1.0016 (1.0016) Acc@1 76.758 (76.758) Acc@5 93.359 (93.359) [2022-10-12 05:55:02 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.634 Acc@5 94.666 [2022-10-12 05:55:02 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.6% [2022-10-12 05:55:02 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.79% [2022-10-12 05:55:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [238/300][0/1251] eta 1:12:37 lr 0.000001 time 3.4832 (3.4832) loss 3.2546 (3.2546) grad_norm 0.0000 (0.0000) [2022-10-12 05:55:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [238/300][100/1251] eta 0:07:05 lr 0.000001 time 0.3249 (0.3694) loss 3.2946 (3.2653) grad_norm 0.0000 (0.0000) [2022-10-12 05:56:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [238/300][200/1251] eta 0:06:09 lr 0.000001 time 0.3237 (0.3512) loss 3.0286 (3.2597) grad_norm 0.0000 (0.0000) [2022-10-12 05:56:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [238/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3693 (0.3454) loss 3.1835 (3.2613) grad_norm 0.0000 (0.0000) [2022-10-12 05:57:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [238/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3536 (0.3418) loss 3.1642 (3.2649) grad_norm 0.0000 (0.0000) [2022-10-12 05:57:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [238/300][500/1251] 
eta 0:04:15 lr 0.000001 time 0.3268 (0.3398) loss 3.3286 (3.2600) grad_norm 0.0000 (0.0000) [2022-10-12 05:58:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [238/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3254 (0.3383) loss 3.2500 (3.2638) grad_norm 0.0000 (0.0000) [2022-10-12 05:58:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [238/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3476 (0.3373) loss 3.1776 (3.2653) grad_norm 0.0000 (0.0000) [2022-10-12 05:59:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [238/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3177 (0.3364) loss 3.3057 (3.2659) grad_norm 0.0000 (0.0000) [2022-10-12 06:00:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [238/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3271 (0.3357) loss 3.1325 (3.2643) grad_norm 0.0000 (0.0000) [2022-10-12 06:00:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [238/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3361 (0.3353) loss 3.0805 (3.2632) grad_norm 0.0000 (0.0000) [2022-10-12 06:01:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [238/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3626 (0.3350) loss 3.4373 (3.2653) grad_norm 0.0000 (0.0000) [2022-10-12 06:01:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [238/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3315 (0.3346) loss 3.4267 (3.2690) grad_norm 0.0000 (0.0000) [2022-10-12 06:02:00 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 238 training takes 0:06:58 [2022-10-12 06:02:04 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.139 (3.139) Loss 0.9421 (0.9421) Acc@1 80.469 (80.469) Acc@5 94.238 (94.238) [2022-10-12 06:02:16 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.602 Acc@5 94.692 [2022-10-12 06:02:16 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.6% [2022-10-12 06:02:16 swin_tiny_patch4_window7_224] (main.py 121): INFO Max 
accuracy: 79.79% [2022-10-12 06:02:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [239/300][0/1251] eta 1:13:09 lr 0.000001 time 3.5087 (3.5087) loss 3.2627 (3.2627) grad_norm 0.0000 (0.0000) [2022-10-12 06:02:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [239/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3166 (0.3664) loss 3.1692 (3.2405) grad_norm 0.0000 (0.0000) [2022-10-12 06:03:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [239/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3145 (0.3493) loss 3.1924 (3.2662) grad_norm 0.0000 (0.0000) [2022-10-12 06:03:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [239/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3181 (0.3432) loss 3.2943 (3.2623) grad_norm 0.0000 (0.0000) [2022-10-12 06:04:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [239/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3198 (0.3404) loss 3.3459 (3.2620) grad_norm 0.0000 (0.0000) [2022-10-12 06:05:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [239/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3168 (0.3388) loss 2.8522 (3.2617) grad_norm 0.0000 (0.0000) [2022-10-12 06:05:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [239/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3350 (0.3377) loss 3.4439 (3.2660) grad_norm 0.0000 (0.0000) [2022-10-12 06:06:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [239/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3326 (0.3368) loss 3.5850 (3.2646) grad_norm 0.0000 (0.0000) [2022-10-12 06:06:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [239/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3447 (0.3357) loss 3.5052 (3.2639) grad_norm 0.0000 (0.0000) [2022-10-12 06:07:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [239/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3287 (0.3351) loss 3.1179 (3.2634) grad_norm 0.0000 (0.0000) [2022-10-12 06:07:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[239/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3538 (0.3347) loss 3.1806 (3.2605) grad_norm 0.0000 (0.0000) [2022-10-12 06:08:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [239/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3258 (0.3344) loss 3.3029 (3.2615) grad_norm 0.0000 (0.0000) [2022-10-12 06:08:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [239/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3271 (0.3343) loss 3.2686 (3.2631) grad_norm 0.0000 (0.0000) [2022-10-12 06:09:14 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 239 training takes 0:06:58 [2022-10-12 06:09:17 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.529 (3.529) Loss 0.9559 (0.9559) Acc@1 76.660 (76.660) Acc@5 94.531 (94.531) [2022-10-12 06:09:29 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.796 Acc@5 94.616 [2022-10-12 06:09:29 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.8% [2022-10-12 06:09:29 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.80% [2022-10-12 06:09:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [240/300][0/1251] eta 1:14:29 lr 0.000001 time 3.5726 (3.5726) loss 3.1351 (3.1351) grad_norm 0.0000 (0.0000) [2022-10-12 06:10:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [240/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3174 (0.3664) loss 3.3081 (3.2563) grad_norm 0.0000 (0.0000) [2022-10-12 06:10:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [240/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3587 (0.3496) loss 3.1034 (3.2456) grad_norm 0.0000 (0.0000) [2022-10-12 06:11:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [240/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3100 (0.3440) loss 3.3874 (3.2561) grad_norm 0.0000 (0.0000) [2022-10-12 06:11:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [240/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3225 (0.3412) loss 3.4765 
(3.2595) grad_norm 0.0000 (0.0000) [2022-10-12 06:12:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [240/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3220 (0.3391) loss 3.2562 (3.2582) grad_norm 0.0000 (0.0000) [2022-10-12 06:12:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [240/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3282 (0.3377) loss 3.4809 (3.2581) grad_norm 0.0000 (0.0000) [2022-10-12 06:13:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [240/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3308 (0.3368) loss 3.2463 (3.2566) grad_norm 0.0000 (0.0000) [2022-10-12 06:13:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [240/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3190 (0.3362) loss 3.0108 (3.2573) grad_norm 0.0000 (0.0000) [2022-10-12 06:14:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [240/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3135 (0.3356) loss 3.4412 (3.2622) grad_norm 0.0000 (0.0000) [2022-10-12 06:15:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [240/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3083 (0.3352) loss 3.1758 (3.2601) grad_norm 0.0000 (0.0000) [2022-10-12 06:15:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [240/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3532 (0.3350) loss 3.4082 (3.2605) grad_norm 0.0000 (0.0000) [2022-10-12 06:16:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [240/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3472 (0.3349) loss 3.2026 (3.2600) grad_norm 0.0000 (0.0000) [2022-10-12 06:16:28 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 240 training takes 0:06:58 [2022-10-12 06:16:28 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_240 saving...... [2022-10-12 06:16:28 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_240 saved !!! 
[2022-10-12 06:16:31 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.210 (3.210) Loss 0.8511 (0.8511) Acc@1 81.543 (81.543) Acc@5 95.215 (95.215) [2022-10-12 06:16:43 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.726 Acc@5 94.560 [2022-10-12 06:16:43 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.7% [2022-10-12 06:16:43 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.80% [2022-10-12 06:16:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [241/300][0/1251] eta 1:08:38 lr 0.000001 time 3.2922 (3.2922) loss 3.0860 (3.0860) grad_norm 0.0000 (0.0000) [2022-10-12 06:17:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [241/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3200 (0.3653) loss 2.8789 (3.2343) grad_norm 0.0000 (0.0000) [2022-10-12 06:17:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [241/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3169 (0.3473) loss 3.6697 (3.2412) grad_norm 0.0000 (0.0000) [2022-10-12 06:18:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [241/300][300/1251] eta 0:05:24 lr 0.000001 time 0.3439 (0.3415) loss 3.2321 (3.2407) grad_norm 0.0000 (0.0000) [2022-10-12 06:18:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [241/300][400/1251] eta 0:04:48 lr 0.000001 time 0.3243 (0.3388) loss 3.0268 (3.2443) grad_norm 0.0000 (0.0000) [2022-10-12 06:19:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [241/300][500/1251] eta 0:04:13 lr 0.000001 time 0.3202 (0.3370) loss 3.1961 (3.2469) grad_norm 0.0000 (0.0000) [2022-10-12 06:20:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [241/300][600/1251] eta 0:03:38 lr 0.000001 time 0.3279 (0.3358) loss 3.3966 (3.2446) grad_norm 0.0000 (0.0000) [2022-10-12 06:20:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [241/300][700/1251] eta 0:03:04 lr 0.000001 time 0.3145 (0.3350) loss 3.0978 (3.2471) grad_norm 0.0000 
(0.0000) [2022-10-12 06:21:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [241/300][800/1251] eta 0:02:30 lr 0.000001 time 0.3291 (0.3345) loss 3.3905 (3.2471) grad_norm 0.0000 (0.0000) [2022-10-12 06:21:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [241/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3417 (0.3342) loss 3.3899 (3.2471) grad_norm 0.0000 (0.0000) [2022-10-12 06:22:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [241/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3300 (0.3338) loss 3.1740 (3.2468) grad_norm 0.0000 (0.0000) [2022-10-12 06:22:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [241/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3287 (0.3334) loss 3.4350 (3.2461) grad_norm 0.0000 (0.0000) [2022-10-12 06:23:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [241/300][1200/1251] eta 0:00:16 lr 0.000001 time 0.3397 (0.3332) loss 3.1920 (3.2472) grad_norm 0.0000 (0.0000) [2022-10-12 06:23:39 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 241 training takes 0:06:56 [2022-10-12 06:23:43 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.484 (3.484) Loss 0.9754 (0.9754) Acc@1 78.027 (78.027) Acc@5 93.359 (93.359) [2022-10-12 06:23:55 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.542 Acc@5 94.614 [2022-10-12 06:23:55 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.5% [2022-10-12 06:23:55 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.80% [2022-10-12 06:23:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [242/300][0/1251] eta 1:09:50 lr 0.000001 time 3.3498 (3.3498) loss 3.4001 (3.4001) grad_norm 0.0000 (0.0000) [2022-10-12 06:24:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [242/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3334 (0.3669) loss 3.4964 (3.2438) grad_norm 0.0000 (0.0000) [2022-10-12 06:25:05 swin_tiny_patch4_window7_224] (main.py 163): 
INFO Train: [242/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3432 (0.3501) loss 3.4831 (3.2411) grad_norm 0.0000 (0.0000) [2022-10-12 06:25:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [242/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3407 (0.3443) loss 3.0607 (3.2474) grad_norm 0.0000 (0.0000) [2022-10-12 06:26:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [242/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3402 (0.3410) loss 3.1829 (3.2464) grad_norm 0.0000 (0.0000) [2022-10-12 06:26:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [242/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3355 (0.3390) loss 3.4572 (3.2462) grad_norm 0.0000 (0.0000) [2022-10-12 06:27:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [242/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3273 (0.3380) loss 3.3244 (3.2488) grad_norm 0.0000 (0.0000) [2022-10-12 06:27:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [242/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3268 (0.3369) loss 3.2773 (3.2461) grad_norm 0.0000 (0.0000) [2022-10-12 06:28:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [242/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3152 (0.3362) loss 3.0526 (3.2452) grad_norm 0.0000 (0.0000) [2022-10-12 06:28:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [242/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3207 (0.3357) loss 3.0194 (3.2445) grad_norm 0.0000 (0.0000) [2022-10-12 06:29:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [242/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3532 (0.3353) loss 3.3836 (3.2455) grad_norm 0.0000 (0.0000) [2022-10-12 06:30:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [242/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3410 (0.3349) loss 2.7621 (3.2443) grad_norm 0.0000 (0.0000) [2022-10-12 06:30:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [242/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3523 (0.3346) loss 3.0304 
(3.2425) grad_norm 0.0000 (0.0000) [2022-10-12 06:30:53 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 242 training takes 0:06:58 [2022-10-12 06:30:56 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.320 (3.320) Loss 0.9333 (0.9333) Acc@1 80.566 (80.566) Acc@5 94.238 (94.238) [2022-10-12 06:31:08 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.750 Acc@5 94.556 [2022-10-12 06:31:08 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.8% [2022-10-12 06:31:08 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.80% [2022-10-12 06:31:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [243/300][0/1251] eta 1:12:45 lr 0.000001 time 3.4895 (3.4895) loss 3.2798 (3.2798) grad_norm 0.0000 (0.0000) [2022-10-12 06:31:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [243/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3338 (0.3687) loss 3.3926 (3.2579) grad_norm 0.0000 (0.0000) [2022-10-12 06:32:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [243/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3301 (0.3509) loss 3.2018 (3.2379) grad_norm 0.0000 (0.0000) [2022-10-12 06:32:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [243/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3097 (0.3448) loss 3.3493 (3.2296) grad_norm 0.0000 (0.0000) [2022-10-12 06:33:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [243/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3299 (0.3417) loss 3.2183 (3.2345) grad_norm 0.0000 (0.0000) [2022-10-12 06:33:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [243/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3181 (0.3401) loss 3.2575 (3.2372) grad_norm 0.0000 (0.0000) [2022-10-12 06:34:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [243/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3407 (0.3386) loss 3.1341 (3.2378) grad_norm 0.0000 (0.0000) [2022-10-12 06:35:05 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [243/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3510 (0.3379) loss 3.1238 (3.2358) grad_norm 0.0000 (0.0000) [2022-10-12 06:35:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [243/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3480 (0.3369) loss 3.2007 (3.2379) grad_norm 0.0000 (0.0000) [2022-10-12 06:36:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [243/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3381 (0.3362) loss 3.0803 (3.2354) grad_norm 0.0000 (0.0000) [2022-10-12 06:36:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [243/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3377 (0.3356) loss 3.0651 (3.2377) grad_norm 0.0000 (0.0000) [2022-10-12 06:37:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [243/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3366 (0.3354) loss 3.4402 (3.2372) grad_norm 0.0000 (0.0000) [2022-10-12 06:37:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [243/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3539 (0.3351) loss 3.0632 (3.2374) grad_norm 0.0000 (0.0000) [2022-10-12 06:38:07 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 243 training takes 0:06:58 [2022-10-12 06:38:10 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.494 (3.494) Loss 0.9184 (0.9184) Acc@1 80.664 (80.664) Acc@5 94.629 (94.629) [2022-10-12 06:38:22 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.874 Acc@5 94.640 [2022-10-12 06:38:22 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.9% [2022-10-12 06:38:22 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.87% [2022-10-12 06:38:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [244/300][0/1251] eta 1:11:46 lr 0.000001 time 3.4427 (3.4427) loss 2.9843 (2.9843) grad_norm 0.0000 (0.0000) [2022-10-12 06:38:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [244/300][100/1251] 
eta 0:07:03 lr 0.000001 time 0.3127 (0.3680) loss 3.0718 (3.2334) grad_norm 0.0000 (0.0000) [2022-10-12 06:39:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [244/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3337 (0.3506) loss 3.0414 (3.2343) grad_norm 0.0000 (0.0000) [2022-10-12 06:40:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [244/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3224 (0.3440) loss 3.2061 (3.2345) grad_norm 0.0000 (0.0000) [2022-10-12 06:40:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [244/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3053 (0.3411) loss 3.2093 (3.2310) grad_norm 0.0000 (0.0000) [2022-10-12 06:41:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [244/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3274 (0.3389) loss 3.1693 (3.2300) grad_norm 0.0000 (0.0000) [2022-10-12 06:41:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [244/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3341 (0.3374) loss 3.5434 (3.2314) grad_norm 0.0000 (0.0000) [2022-10-12 06:42:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [244/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3288 (0.3364) loss 3.1855 (3.2297) grad_norm 0.0000 (0.0000) [2022-10-12 06:42:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [244/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3283 (0.3358) loss 3.0360 (3.2288) grad_norm 0.0000 (0.0000) [2022-10-12 06:43:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [244/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3238 (0.3350) loss 3.4564 (3.2309) grad_norm 0.0000 (0.0000) [2022-10-12 06:43:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [244/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3117 (0.3346) loss 3.0239 (3.2347) grad_norm 0.0000 (0.0000) [2022-10-12 06:44:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [244/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3470 (0.3341) loss 3.5503 (3.2331) grad_norm 0.0000 (0.0000) 
[2022-10-12 06:45:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [244/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3227 (0.3338) loss 3.1740 (3.2349) grad_norm 0.0000 (0.0000) [2022-10-12 06:45:19 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 244 training takes 0:06:57 [2022-10-12 06:45:22 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 2.977 (2.977) Loss 0.9549 (0.9549) Acc@1 77.930 (77.930) Acc@5 94.141 (94.141) [2022-10-12 06:45:34 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.694 Acc@5 94.706 [2022-10-12 06:45:34 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.7% [2022-10-12 06:45:34 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.87% [2022-10-12 06:45:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [245/300][0/1251] eta 1:12:18 lr 0.000001 time 3.4679 (3.4679) loss 3.1749 (3.1749) grad_norm 0.0000 (0.0000) [2022-10-12 06:46:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [245/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3555 (0.3652) loss 3.1928 (3.2153) grad_norm 0.0000 (0.0000) [2022-10-12 06:46:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [245/300][200/1251] eta 0:06:05 lr 0.000001 time 0.3276 (0.3479) loss 3.1019 (3.2217) grad_norm 0.0000 (0.0000) [2022-10-12 06:47:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [245/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3097 (0.3428) loss 3.1911 (3.2227) grad_norm 0.0000 (0.0000) [2022-10-12 06:47:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [245/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3218 (0.3405) loss 3.3576 (3.2219) grad_norm 0.0000 (0.0000) [2022-10-12 06:48:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [245/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3150 (0.3387) loss 2.8845 (3.2222) grad_norm 0.0000 (0.0000) [2022-10-12 06:48:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[245/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3328 (0.3369) loss 3.2179 (3.2197) grad_norm 0.0000 (0.0000) [2022-10-12 06:49:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [245/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3171 (0.3364) loss 3.4328 (3.2220) grad_norm 0.0000 (0.0000) [2022-10-12 06:50:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [245/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3104 (0.3361) loss 3.1344 (3.2205) grad_norm 0.0000 (0.0000) [2022-10-12 06:50:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [245/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3235 (0.3354) loss 3.1392 (3.2203) grad_norm 0.0000 (0.0000) [2022-10-12 06:51:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [245/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3515 (0.3350) loss 3.2577 (3.2204) grad_norm 0.0000 (0.0000) [2022-10-12 06:51:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [245/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3264 (0.3347) loss 3.1213 (3.2197) grad_norm 0.0000 (0.0000) [2022-10-12 06:52:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [245/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3393 (0.3347) loss 3.4493 (3.2227) grad_norm 0.0000 (0.0000) [2022-10-12 06:52:33 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 245 training takes 0:06:58 [2022-10-12 06:52:36 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.462 (3.462) Loss 0.8238 (0.8238) Acc@1 82.422 (82.422) Acc@5 96.094 (96.094) [2022-10-12 06:52:48 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.816 Acc@5 94.746 [2022-10-12 06:52:48 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.8% [2022-10-12 06:52:48 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.87% [2022-10-12 06:52:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [246/300][0/1251] eta 1:15:16 lr 0.000001 time 3.6105 (3.6105) loss 3.3370 
(3.3370) grad_norm 0.0000 (0.0000) [2022-10-12 06:53:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [246/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3461 (0.3674) loss 3.1055 (3.2203) grad_norm 0.0000 (0.0000) [2022-10-12 06:53:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [246/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3330 (0.3496) loss 3.1768 (3.2152) grad_norm 0.0000 (0.0000) [2022-10-12 06:54:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [246/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3337 (0.3435) loss 3.4196 (3.2269) grad_norm 0.0000 (0.0000) [2022-10-12 06:55:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [246/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3158 (0.3404) loss 3.1885 (3.2219) grad_norm 0.0000 (0.0000) [2022-10-12 06:55:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [246/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3066 (0.3389) loss 3.0970 (3.2254) grad_norm 0.0000 (0.0000) [2022-10-12 06:56:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [246/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3284 (0.3376) loss 3.3894 (3.2231) grad_norm 0.0000 (0.0000) [2022-10-12 06:56:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [246/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3354 (0.3368) loss 3.2138 (3.2198) grad_norm 0.0000 (0.0000) [2022-10-12 06:57:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [246/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3449 (0.3366) loss 3.3926 (3.2229) grad_norm 0.0000 (0.0000) [2022-10-12 06:57:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [246/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3329 (0.3359) loss 3.2433 (3.2222) grad_norm 0.0000 (0.0000) [2022-10-12 06:58:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [246/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3146 (0.3355) loss 3.2315 (3.2219) grad_norm 0.0000 (0.0000) [2022-10-12 06:58:57 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [246/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3148 (0.3354) loss 3.3163 (3.2216) grad_norm 0.0000 (0.0000) [2022-10-12 06:59:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [246/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3286 (0.3353) loss 3.5350 (3.2236) grad_norm 0.0000 (0.0000) [2022-10-12 06:59:47 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 246 training takes 0:06:59 [2022-10-12 06:59:51 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.296 (3.296) Loss 0.8633 (0.8633) Acc@1 81.348 (81.348) Acc@5 94.824 (94.824) [2022-10-12 07:00:02 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.754 Acc@5 94.714 [2022-10-12 07:00:02 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.8% [2022-10-12 07:00:02 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.87% [2022-10-12 07:00:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [247/300][0/1251] eta 1:12:04 lr 0.000001 time 3.4571 (3.4571) loss 3.0300 (3.0300) grad_norm 0.0000 (0.0000) [2022-10-12 07:00:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [247/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3236 (0.3667) loss 3.4817 (3.1983) grad_norm 0.0000 (0.0000) [2022-10-12 07:01:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [247/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3450 (0.3508) loss 3.5320 (3.2169) grad_norm 0.0000 (0.0000) [2022-10-12 07:01:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [247/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3250 (0.3448) loss 3.3127 (3.2189) grad_norm 0.0000 (0.0000) [2022-10-12 07:02:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [247/300][400/1251] eta 0:04:51 lr 0.000001 time 0.3175 (0.3426) loss 3.2457 (3.2177) grad_norm 0.0000 (0.0000) [2022-10-12 07:02:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [247/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3223 
(0.3407) loss 3.1967 (3.2133) grad_norm 0.0000 (0.0000) [2022-10-12 07:03:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [247/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3133 (0.3393) loss 3.2307 (3.2143) grad_norm 0.0000 (0.0000) [2022-10-12 07:03:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [247/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3386 (0.3382) loss 3.0011 (3.2143) grad_norm 0.0000 (0.0000) [2022-10-12 07:04:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [247/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3280 (0.3382) loss 3.0685 (3.2166) grad_norm 0.0000 (0.0000) [2022-10-12 07:05:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [247/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3438 (0.3373) loss 3.0622 (3.2170) grad_norm 0.0000 (0.0000) [2022-10-12 07:05:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [247/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3207 (0.3366) loss 3.2806 (3.2187) grad_norm 0.0000 (0.0000) [2022-10-12 07:06:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [247/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3362 (0.3367) loss 3.2032 (3.2180) grad_norm 0.0000 (0.0000) [2022-10-12 07:06:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [247/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3540 (0.3367) loss 3.0406 (3.2165) grad_norm 0.0000 (0.0000) [2022-10-12 07:07:03 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 247 training takes 0:07:01 [2022-10-12 07:07:07 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.440 (3.440) Loss 0.9124 (0.9124) Acc@1 80.957 (80.957) Acc@5 94.434 (94.434) [2022-10-12 07:07:19 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.934 Acc@5 94.698 [2022-10-12 07:07:19 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.9% [2022-10-12 07:07:19 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.93% [2022-10-12 07:07:22 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [248/300][0/1251] eta 1:11:12 lr 0.000001 time 3.4149 (3.4149) loss 3.3213 (3.3213) grad_norm 0.0000 (0.0000) [2022-10-12 07:07:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [248/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3286 (0.3678) loss 3.0786 (3.2036) grad_norm 0.0000 (0.0000) [2022-10-12 07:08:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [248/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3219 (0.3503) loss 3.4130 (3.2204) grad_norm 0.0000 (0.0000) [2022-10-12 07:09:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [248/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3353 (0.3442) loss 2.9703 (3.2167) grad_norm 0.0000 (0.0000) [2022-10-12 07:09:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [248/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3245 (0.3413) loss 3.4178 (3.2181) grad_norm 0.0000 (0.0000) [2022-10-12 07:10:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [248/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3224 (0.3395) loss 3.5052 (3.2226) grad_norm 0.0000 (0.0000) [2022-10-12 07:10:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [248/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3382 (0.3385) loss 3.3093 (3.2227) grad_norm 0.0000 (0.0000) [2022-10-12 07:11:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [248/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3337 (0.3371) loss 3.2433 (3.2232) grad_norm 0.0000 (0.0000) [2022-10-12 07:11:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [248/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3126 (0.3366) loss 3.2314 (3.2194) grad_norm 0.0000 (0.0000) [2022-10-12 07:12:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [248/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3355 (0.3360) loss 3.3274 (3.2208) grad_norm 0.0000 (0.0000) [2022-10-12 07:12:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [248/300][1000/1251] eta 0:01:24 lr 0.000001 
time 0.3383 (0.3357) loss 3.1170 (3.2230) grad_norm 0.0000 (0.0000) [2022-10-12 07:13:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [248/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3462 (0.3355) loss 3.3260 (3.2242) grad_norm 0.0000 (0.0000) [2022-10-12 07:14:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [248/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3449 (0.3352) loss 3.4853 (3.2242) grad_norm 0.0000 (0.0000) [2022-10-12 07:14:18 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 248 training takes 0:06:59 [2022-10-12 07:14:21 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 2.712 (2.712) Loss 0.9057 (0.9057) Acc@1 81.250 (81.250) Acc@5 95.020 (95.020) [2022-10-12 07:14:33 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.846 Acc@5 94.738 [2022-10-12 07:14:33 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.8% [2022-10-12 07:14:33 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.93% [2022-10-12 07:14:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [249/300][0/1251] eta 1:10:59 lr 0.000001 time 3.4049 (3.4049) loss 3.1359 (3.1359) grad_norm 0.0000 (0.0000) [2022-10-12 07:15:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [249/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3328 (0.3667) loss 3.4303 (3.1930) grad_norm 0.0000 (0.0000) [2022-10-12 07:15:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [249/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3098 (0.3488) loss 3.2267 (3.2017) grad_norm 0.0000 (0.0000) [2022-10-12 07:16:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [249/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3616 (0.3433) loss 3.1082 (3.2108) grad_norm 0.0000 (0.0000) [2022-10-12 07:16:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [249/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3461 (0.3400) loss 3.2866 (3.2041) grad_norm 0.0000 (0.0000) [2022-10-12 
07:17:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [249/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3333 (0.3385) loss 3.3465 (3.2039) grad_norm 0.0000 (0.0000) [2022-10-12 07:17:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [249/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3495 (0.3375) loss 3.2049 (3.2056) grad_norm 0.0000 (0.0000) [2022-10-12 07:18:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [249/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3414 (0.3365) loss 3.2797 (3.2129) grad_norm 0.0000 (0.0000) [2022-10-12 07:19:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [249/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3326 (0.3357) loss 3.1093 (3.2115) grad_norm 0.0000 (0.0000) [2022-10-12 07:19:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [249/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3372 (0.3354) loss 3.2239 (3.2136) grad_norm 0.0000 (0.0000) [2022-10-12 07:20:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [249/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3168 (0.3353) loss 3.3243 (3.2145) grad_norm 0.0000 (0.0000) [2022-10-12 07:20:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [249/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3275 (0.3349) loss 3.1698 (3.2154) grad_norm 0.0000 (0.0000) [2022-10-12 07:21:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [249/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3074 (0.3347) loss 3.2722 (3.2153) grad_norm 0.0000 (0.0000) [2022-10-12 07:21:32 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 249 training takes 0:06:58 [2022-10-12 07:21:35 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.505 (3.505) Loss 0.8715 (0.8715) Acc@1 81.152 (81.152) Acc@5 95.508 (95.508) [2022-10-12 07:21:47 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.946 Acc@5 94.738 [2022-10-12 07:21:47 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test 
images: 79.9% [2022-10-12 07:21:47 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.95% [2022-10-12 07:21:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [250/300][0/1251] eta 1:13:54 lr 0.000001 time 3.5446 (3.5446) loss 3.2292 (3.2292) grad_norm 0.0000 (0.0000) [2022-10-12 07:22:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [250/300][100/1251] eta 0:07:00 lr 0.000001 time 0.3451 (0.3657) loss 3.4867 (3.2071) grad_norm 0.0000 (0.0000) [2022-10-12 07:22:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [250/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3086 (0.3490) loss 3.3279 (3.2081) grad_norm 0.0000 (0.0000) [2022-10-12 07:23:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [250/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3100 (0.3431) loss 3.3099 (3.1992) grad_norm 0.0000 (0.0000) [2022-10-12 07:24:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [250/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3700 (0.3402) loss 3.3285 (3.2069) grad_norm 0.0000 (0.0000) [2022-10-12 07:24:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [250/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3111 (0.3385) loss 3.2936 (3.2103) grad_norm 0.0000 (0.0000) [2022-10-12 07:25:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [250/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3379 (0.3373) loss 3.1486 (3.2129) grad_norm 0.0000 (0.0000) [2022-10-12 07:25:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [250/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3225 (0.3360) loss 3.2221 (3.2146) grad_norm 0.0000 (0.0000) [2022-10-12 07:26:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [250/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3233 (0.3354) loss 3.0699 (3.2140) grad_norm 0.0000 (0.0000) [2022-10-12 07:26:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [250/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3892 (0.3353) loss 3.5365 (3.2177) grad_norm 0.0000 
(0.0000) [2022-10-12 07:27:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [250/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3617 (0.3350) loss 3.1757 (3.2159) grad_norm 0.0000 (0.0000) [2022-10-12 07:27:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [250/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3356 (0.3347) loss 3.1837 (3.2148) grad_norm 0.0000 (0.0000) [2022-10-12 07:28:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [250/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3274 (0.3344) loss 3.4386 (3.2141) grad_norm 0.0000 (0.0000) [2022-10-12 07:28:45 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 250 training takes 0:06:58 [2022-10-12 07:28:45 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_250 saving...... [2022-10-12 07:28:46 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_250 saved !!! [2022-10-12 07:28:49 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.168 (3.168) Loss 0.8394 (0.8394) Acc@1 81.641 (81.641) Acc@5 95.996 (95.996) [2022-10-12 07:29:01 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.912 Acc@5 94.662 [2022-10-12 07:29:01 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 79.9% [2022-10-12 07:29:01 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 79.95% [2022-10-12 07:29:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [251/300][0/1251] eta 1:16:19 lr 0.000001 time 3.6605 (3.6605) loss 3.3002 (3.3002) grad_norm 0.0000 (0.0000) [2022-10-12 07:29:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [251/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3705 (0.3685) loss 3.1353 (3.2127) grad_norm 0.0000 (0.0000) [2022-10-12 07:30:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [251/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3137 (0.3503) loss 3.2122 
(3.2120) grad_norm 0.0000 (0.0000) [2022-10-12 07:30:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [251/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3555 (0.3441) loss 2.5992 (3.2023) grad_norm 0.0000 (0.0000) [2022-10-12 07:31:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [251/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3623 (0.3408) loss 3.2093 (3.1963) grad_norm 0.0000 (0.0000) [2022-10-12 07:31:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [251/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3132 (0.3390) loss 3.3596 (3.1992) grad_norm 0.0000 (0.0000) [2022-10-12 07:32:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [251/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3313 (0.3375) loss 3.0968 (3.1974) grad_norm 0.0000 (0.0000) [2022-10-12 07:32:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [251/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3530 (0.3364) loss 3.2570 (3.2002) grad_norm 0.0000 (0.0000) [2022-10-12 07:33:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [251/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3326 (0.3360) loss 3.2035 (3.2022) grad_norm 0.0000 (0.0000) [2022-10-12 07:34:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [251/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3408 (0.3359) loss 3.0307 (3.2009) grad_norm 0.0000 (0.0000) [2022-10-12 07:34:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [251/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3317 (0.3356) loss 3.0505 (3.2004) grad_norm 0.0000 (0.0000) [2022-10-12 07:35:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [251/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3318 (0.3353) loss 2.9072 (3.2005) grad_norm 0.0000 (0.0000) [2022-10-12 07:35:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [251/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3289 (0.3349) loss 3.4230 (3.2016) grad_norm 0.0000 (0.0000) [2022-10-12 07:36:00 swin_tiny_patch4_window7_224] (main.py 
171): INFO EPOCH 251 training takes 0:06:58 [2022-10-12 07:36:03 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.204 (3.204) Loss 0.9149 (0.9149) Acc@1 80.957 (80.957) Acc@5 95.117 (95.117) [2022-10-12 07:36:15 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.094 Acc@5 94.778 [2022-10-12 07:36:15 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.1% [2022-10-12 07:36:15 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.09% [2022-10-12 07:36:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [252/300][0/1251] eta 1:13:15 lr 0.000001 time 3.5132 (3.5132) loss 3.1885 (3.1885) grad_norm 0.0000 (0.0000) [2022-10-12 07:36:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [252/300][100/1251] eta 0:07:05 lr 0.000001 time 0.3226 (0.3700) loss 3.2996 (3.2140) grad_norm 0.0000 (0.0000) [2022-10-12 07:37:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [252/300][200/1251] eta 0:06:10 lr 0.000001 time 0.3646 (0.3525) loss 3.3474 (3.2042) grad_norm 0.0000 (0.0000) [2022-10-12 07:37:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [252/300][300/1251] eta 0:05:29 lr 0.000001 time 0.3205 (0.3463) loss 3.4515 (3.2036) grad_norm 0.0000 (0.0000) [2022-10-12 07:38:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [252/300][400/1251] eta 0:04:51 lr 0.000001 time 0.3349 (0.3426) loss 3.2313 (3.2046) grad_norm 0.0000 (0.0000) [2022-10-12 07:39:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [252/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3406 (0.3405) loss 3.2900 (3.1984) grad_norm 0.0000 (0.0000) [2022-10-12 07:39:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [252/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3482 (0.3390) loss 3.2241 (3.1994) grad_norm 0.0000 (0.0000) [2022-10-12 07:40:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [252/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3351 
(0.3382) loss 3.0707 (3.1965) grad_norm 0.0000 (0.0000) [2022-10-12 07:40:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [252/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3254 (0.3378) loss 2.9844 (3.1964) grad_norm 0.0000 (0.0000) [2022-10-12 07:41:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [252/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3256 (0.3376) loss 3.0343 (3.1981) grad_norm 0.0000 (0.0000) [2022-10-12 07:41:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [252/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3428 (0.3376) loss 3.1662 (3.1969) grad_norm 0.0000 (0.0000) [2022-10-12 07:42:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [252/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3031 (0.3373) loss 3.1069 (3.1962) grad_norm 0.0000 (0.0000) [2022-10-12 07:43:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [252/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3323 (0.3371) loss 3.1535 (3.1963) grad_norm 0.0000 (0.0000) [2022-10-12 07:43:16 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 252 training takes 0:07:01 [2022-10-12 07:43:20 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.435 (3.435) Loss 0.9237 (0.9237) Acc@1 80.664 (80.664) Acc@5 94.922 (94.922) [2022-10-12 07:43:32 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.006 Acc@5 94.780 [2022-10-12 07:43:32 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.0% [2022-10-12 07:43:32 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.09% [2022-10-12 07:43:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [253/300][0/1251] eta 1:12:13 lr 0.000001 time 3.4637 (3.4637) loss 3.3271 (3.3271) grad_norm 0.0000 (0.0000) [2022-10-12 07:44:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [253/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3403 (0.3669) loss 3.1541 (3.1926) grad_norm 0.0000 (0.0000) [2022-10-12 07:44:42 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [253/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3339 (0.3483) loss 2.9678 (3.1840) grad_norm 0.0000 (0.0000) [2022-10-12 07:45:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [253/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3524 (0.3431) loss 3.3761 (3.1913) grad_norm 0.0000 (0.0000) [2022-10-12 07:45:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [253/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3332 (0.3405) loss 3.2351 (3.1919) grad_norm 0.0000 (0.0000) [2022-10-12 07:46:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [253/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3407 (0.3387) loss 3.1101 (3.1926) grad_norm 0.0000 (0.0000) [2022-10-12 07:46:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [253/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3196 (0.3376) loss 3.3478 (3.1954) grad_norm 0.0000 (0.0000) [2022-10-12 07:47:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [253/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3493 (0.3368) loss 3.3061 (3.1977) grad_norm 0.0000 (0.0000) [2022-10-12 07:48:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [253/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3713 (0.3366) loss 3.0830 (3.1954) grad_norm 0.0000 (0.0000) [2022-10-12 07:48:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [253/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3363 (0.3362) loss 3.0369 (3.1963) grad_norm 0.0000 (0.0000) [2022-10-12 07:49:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [253/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3523 (0.3358) loss 3.0964 (3.1955) grad_norm 0.0000 (0.0000) [2022-10-12 07:49:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [253/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3089 (0.3354) loss 3.1922 (3.1932) grad_norm 0.0000 (0.0000) [2022-10-12 07:50:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [253/300][1200/1251] eta 0:00:17 lr 
0.000001 time 0.3323 (0.3351) loss 3.0115 (3.1921) grad_norm 0.0000 (0.0000) [2022-10-12 07:50:31 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 253 training takes 0:06:58 [2022-10-12 07:50:34 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.050 (3.050) Loss 0.9312 (0.9312) Acc@1 78.613 (78.613) Acc@5 94.434 (94.434) [2022-10-12 07:50:46 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.132 Acc@5 94.758 [2022-10-12 07:50:46 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.1% [2022-10-12 07:50:46 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.13% [2022-10-12 07:50:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [254/300][0/1251] eta 1:16:27 lr 0.000001 time 3.6671 (3.6671) loss 3.4843 (3.4843) grad_norm 0.0000 (0.0000) [2022-10-12 07:51:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [254/300][100/1251] eta 0:07:02 lr 0.000001 time 0.2999 (0.3674) loss 2.9805 (3.2027) grad_norm 0.0000 (0.0000) [2022-10-12 07:51:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [254/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3206 (0.3496) loss 3.3697 (3.2021) grad_norm 0.0000 (0.0000) [2022-10-12 07:52:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [254/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3263 (0.3440) loss 2.6742 (3.1917) grad_norm 0.0000 (0.0000) [2022-10-12 07:53:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [254/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3300 (0.3409) loss 3.1995 (3.1920) grad_norm 0.0000 (0.0000) [2022-10-12 07:53:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [254/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3876 (0.3391) loss 2.9695 (3.1893) grad_norm 0.0000 (0.0000) [2022-10-12 07:54:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [254/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3440 (0.3381) loss 3.1786 (3.1836) grad_norm 0.0000 (0.0000) 
[2022-10-12 07:54:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [254/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3166 (0.3377) loss 3.3694 (3.1840) grad_norm 0.0000 (0.0000) [2022-10-12 07:55:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [254/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3503 (0.3371) loss 3.0058 (3.1789) grad_norm 0.0000 (0.0000) [2022-10-12 07:55:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [254/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3165 (0.3366) loss 3.1784 (3.1811) grad_norm 0.0000 (0.0000) [2022-10-12 07:56:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [254/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3624 (0.3364) loss 3.1884 (3.1833) grad_norm 0.0000 (0.0000) [2022-10-12 07:56:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [254/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3570 (0.3362) loss 3.0331 (3.1814) grad_norm 0.0000 (0.0000) [2022-10-12 07:57:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [254/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3044 (0.3359) loss 3.1553 (3.1825) grad_norm 0.0000 (0.0000) [2022-10-12 07:57:46 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 254 training takes 0:07:00 [2022-10-12 07:57:50 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.339 (3.339) Loss 0.9000 (0.9000) Acc@1 80.664 (80.664) Acc@5 95.117 (95.117) [2022-10-12 07:58:02 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 79.984 Acc@5 94.698 [2022-10-12 07:58:02 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.0% [2022-10-12 07:58:02 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.13% [2022-10-12 07:58:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [255/300][0/1251] eta 1:13:22 lr 0.000001 time 3.5189 (3.5189) loss 3.1970 (3.1970) grad_norm 0.0000 (0.0000) [2022-10-12 07:58:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[255/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3202 (0.3659) loss 3.1398 (3.2164) grad_norm 0.0000 (0.0000) [2022-10-12 07:59:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [255/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3372 (0.3485) loss 3.1148 (3.1970) grad_norm 0.0000 (0.0000) [2022-10-12 07:59:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [255/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3198 (0.3429) loss 3.1213 (3.1930) grad_norm 0.0000 (0.0000) [2022-10-12 08:00:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [255/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3533 (0.3402) loss 3.3339 (3.1932) grad_norm 0.0000 (0.0000) [2022-10-12 08:00:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [255/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3303 (0.3390) loss 3.1164 (3.1920) grad_norm 0.0000 (0.0000) [2022-10-12 08:01:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [255/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3205 (0.3381) loss 3.1809 (3.1915) grad_norm 0.0000 (0.0000) [2022-10-12 08:01:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [255/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3340 (0.3376) loss 3.1959 (3.1933) grad_norm 0.0000 (0.0000) [2022-10-12 08:02:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [255/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3456 (0.3370) loss 3.1578 (3.1905) grad_norm 0.0000 (0.0000) [2022-10-12 08:03:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [255/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3134 (0.3363) loss 3.0369 (3.1897) grad_norm 0.0000 (0.0000) [2022-10-12 08:03:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [255/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3385 (0.3358) loss 3.1641 (3.1913) grad_norm 0.0000 (0.0000) [2022-10-12 08:04:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [255/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3275 (0.3356) loss 3.4307 (3.1917) grad_norm 
0.0000 (0.0000) [2022-10-12 08:04:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [255/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3167 (0.3353) loss 3.2444 (3.1923) grad_norm 0.0000 (0.0000) [2022-10-12 08:05:01 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 255 training takes 0:06:59 [2022-10-12 08:05:04 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.212 (3.212) Loss 0.8933 (0.8933) Acc@1 82.129 (82.129) Acc@5 94.141 (94.141) [2022-10-12 08:05:16 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.194 Acc@5 94.762 [2022-10-12 08:05:16 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.2% [2022-10-12 08:05:16 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.19% [2022-10-12 08:05:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [256/300][0/1251] eta 1:09:26 lr 0.000001 time 3.3303 (3.3303) loss 3.4655 (3.4655) grad_norm 0.0000 (0.0000) [2022-10-12 08:05:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [256/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3483 (0.3669) loss 3.1649 (3.1965) grad_norm 0.0000 (0.0000) [2022-10-12 08:06:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [256/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3369 (0.3497) loss 3.3942 (3.1927) grad_norm 0.0000 (0.0000) [2022-10-12 08:07:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [256/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3324 (0.3439) loss 3.2941 (3.1938) grad_norm 0.0000 (0.0000) [2022-10-12 08:07:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [256/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3511 (0.3413) loss 2.9092 (3.1893) grad_norm 0.0000 (0.0000) [2022-10-12 08:08:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [256/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3329 (0.3400) loss 3.3361 (3.1885) grad_norm 0.0000 (0.0000) [2022-10-12 08:08:40 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [256/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3324 (0.3387) loss 3.1980 (3.1871) grad_norm 0.0000 (0.0000) [2022-10-12 08:09:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [256/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3302 (0.3377) loss 3.1308 (3.1864) grad_norm 0.0000 (0.0000) [2022-10-12 08:09:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [256/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3457 (0.3374) loss 3.0153 (3.1877) grad_norm 0.0000 (0.0000) [2022-10-12 08:10:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [256/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3470 (0.3371) loss 3.3573 (3.1893) grad_norm 0.0000 (0.0000) [2022-10-12 08:10:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [256/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3122 (0.3369) loss 3.4280 (3.1875) grad_norm 0.0000 (0.0000) [2022-10-12 08:11:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [256/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3444 (0.3366) loss 3.1974 (3.1882) grad_norm 0.0000 (0.0000) [2022-10-12 08:12:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [256/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3475 (0.3365) loss 3.2683 (3.1876) grad_norm 0.0000 (0.0000) [2022-10-12 08:12:17 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 256 training takes 0:07:00 [2022-10-12 08:12:21 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.463 (3.463) Loss 0.9108 (0.9108) Acc@1 80.469 (80.469) Acc@5 94.531 (94.531) [2022-10-12 08:12:32 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.176 Acc@5 94.820 [2022-10-12 08:12:32 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.2% [2022-10-12 08:12:32 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.19% [2022-10-12 08:12:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [257/300][0/1251] eta 1:17:55 lr 0.000001 time 3.7378 
(3.7378) loss 3.1225 (3.1225) grad_norm 0.0000 (0.0000) [2022-10-12 08:13:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [257/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3322 (0.3664) loss 3.0575 (3.1796) grad_norm 0.0000 (0.0000) [2022-10-12 08:13:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [257/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3317 (0.3494) loss 3.0389 (3.1789) grad_norm 0.0000 (0.0000) [2022-10-12 08:14:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [257/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3395 (0.3427) loss 3.2416 (3.1807) grad_norm 0.0000 (0.0000) [2022-10-12 08:14:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [257/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3900 (0.3401) loss 3.2771 (3.1770) grad_norm 0.0000 (0.0000) [2022-10-12 08:15:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [257/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3295 (0.3387) loss 3.0495 (3.1798) grad_norm 0.0000 (0.0000) [2022-10-12 08:15:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [257/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3610 (0.3381) loss 3.0427 (3.1784) grad_norm 0.0000 (0.0000) [2022-10-12 08:16:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [257/300][700/1251] eta 0:03:05 lr 0.000001 time 0.2980 (0.3371) loss 3.4516 (3.1782) grad_norm 0.0000 (0.0000) [2022-10-12 08:17:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [257/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3281 (0.3365) loss 3.1029 (3.1801) grad_norm 0.0000 (0.0000) [2022-10-12 08:17:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [257/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3424 (0.3358) loss 3.5412 (3.1814) grad_norm 0.0000 (0.0000) [2022-10-12 08:18:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [257/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3594 (0.3354) loss 3.2548 (3.1817) grad_norm 0.0000 (0.0000) [2022-10-12 08:18:41 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [257/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3161 (0.3351) loss 3.2147 (3.1819) grad_norm 0.0000 (0.0000) [2022-10-12 08:19:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [257/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3017 (0.3349) loss 3.4482 (3.1842) grad_norm 0.0000 (0.0000) [2022-10-12 08:19:31 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 257 training takes 0:06:58 [2022-10-12 08:19:34 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.211 (3.211) Loss 0.8808 (0.8808) Acc@1 80.762 (80.762) Acc@5 95.703 (95.703) [2022-10-12 08:19:47 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.042 Acc@5 94.766 [2022-10-12 08:19:47 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.0% [2022-10-12 08:19:47 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.19% [2022-10-12 08:19:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [258/300][0/1251] eta 1:12:53 lr 0.000001 time 3.4959 (3.4959) loss 3.2222 (3.2222) grad_norm 0.0000 (0.0000) [2022-10-12 08:20:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [258/300][100/1251] eta 0:07:05 lr 0.000001 time 0.3556 (0.3693) loss 3.1359 (3.1836) grad_norm 0.0000 (0.0000) [2022-10-12 08:20:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [258/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3149 (0.3509) loss 3.4719 (3.1811) grad_norm 0.0000 (0.0000) [2022-10-12 08:21:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [258/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3321 (0.3443) loss 3.2450 (3.1840) grad_norm 0.0000 (0.0000) [2022-10-12 08:22:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [258/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3503 (0.3415) loss 3.1154 (3.1774) grad_norm 0.0000 (0.0000) [2022-10-12 08:22:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [258/300][500/1251] 
eta 0:04:15 lr 0.000001 time 0.3773 (0.3404) loss 2.7758 (3.1729) grad_norm 0.0000 (0.0000) [2022-10-12 08:23:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [258/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3391 (0.3392) loss 3.1857 (3.1737) grad_norm 0.0000 (0.0000) [2022-10-12 08:23:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [258/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3321 (0.3382) loss 3.1271 (3.1740) grad_norm 0.0000 (0.0000) [2022-10-12 08:24:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [258/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3222 (0.3374) loss 3.2297 (3.1743) grad_norm 0.0000 (0.0000) [2022-10-12 08:24:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [258/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3350 (0.3366) loss 3.0777 (3.1739) grad_norm 0.0000 (0.0000) [2022-10-12 08:25:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [258/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3245 (0.3363) loss 3.1767 (3.1731) grad_norm 0.0000 (0.0000) [2022-10-12 08:25:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [258/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3425 (0.3361) loss 3.2491 (3.1742) grad_norm 0.0000 (0.0000) [2022-10-12 08:26:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [258/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3175 (0.3360) loss 3.1113 (3.1728) grad_norm 0.0000 (0.0000) [2022-10-12 08:26:47 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 258 training takes 0:07:00 [2022-10-12 08:26:50 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.340 (3.340) Loss 0.9524 (0.9524) Acc@1 77.539 (77.539) Acc@5 94.629 (94.629) [2022-10-12 08:27:02 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.228 Acc@5 94.832 [2022-10-12 08:27:02 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.2% [2022-10-12 08:27:02 swin_tiny_patch4_window7_224] (main.py 121): INFO Max 
accuracy: 80.23% [2022-10-12 08:27:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [259/300][0/1251] eta 1:07:04 lr 0.000001 time 3.2173 (3.2173) loss 2.9144 (2.9144) grad_norm 0.0000 (0.0000) [2022-10-12 08:27:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [259/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3628 (0.3677) loss 3.1181 (3.1589) grad_norm 0.0000 (0.0000) [2022-10-12 08:28:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [259/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3147 (0.3508) loss 3.0452 (3.1609) grad_norm 0.0000 (0.0000) [2022-10-12 08:28:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [259/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3415 (0.3450) loss 3.5530 (3.1647) grad_norm 0.0000 (0.0000) [2022-10-12 08:29:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [259/300][400/1251] eta 0:04:51 lr 0.000001 time 0.3295 (0.3421) loss 3.0108 (3.1703) grad_norm 0.0000 (0.0000) [2022-10-12 08:29:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [259/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3307 (0.3407) loss 3.1010 (3.1702) grad_norm 0.0000 (0.0000) [2022-10-12 08:30:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [259/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3448 (0.3393) loss 3.2442 (3.1694) grad_norm 0.0000 (0.0000) [2022-10-12 08:30:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [259/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3157 (0.3384) loss 3.1826 (3.1663) grad_norm 0.0000 (0.0000) [2022-10-12 08:31:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [259/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3321 (0.3377) loss 3.1803 (3.1689) grad_norm 0.0000 (0.0000) [2022-10-12 08:32:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [259/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3260 (0.3371) loss 3.2519 (3.1670) grad_norm 0.0000 (0.0000) [2022-10-12 08:32:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[259/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3738 (0.3367) loss 2.9407 (3.1679) grad_norm 0.0000 (0.0000) [2022-10-12 08:33:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [259/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3386 (0.3365) loss 3.2379 (3.1685) grad_norm 0.0000 (0.0000) [2022-10-12 08:33:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [259/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3185 (0.3360) loss 2.8164 (3.1696) grad_norm 0.0000 (0.0000) [2022-10-12 08:34:03 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 259 training takes 0:07:00 [2022-10-12 08:34:06 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.197 (3.197) Loss 0.9819 (0.9819) Acc@1 80.078 (80.078) Acc@5 93.262 (93.262) [2022-10-12 08:34:18 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.172 Acc@5 94.804 [2022-10-12 08:34:18 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.2% [2022-10-12 08:34:18 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.23% [2022-10-12 08:34:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [260/300][0/1251] eta 1:15:15 lr 0.000001 time 3.6095 (3.6095) loss 3.0354 (3.0354) grad_norm 0.0000 (0.0000) [2022-10-12 08:34:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [260/300][100/1251] eta 0:07:06 lr 0.000001 time 0.3449 (0.3710) loss 3.0402 (3.1627) grad_norm 0.0000 (0.0000) [2022-10-12 08:35:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [260/300][200/1251] eta 0:06:09 lr 0.000001 time 0.3172 (0.3520) loss 3.1348 (3.1629) grad_norm 0.0000 (0.0000) [2022-10-12 08:36:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [260/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3111 (0.3454) loss 2.9221 (3.1535) grad_norm 0.0000 (0.0000) [2022-10-12 08:36:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [260/300][400/1251] eta 0:04:52 lr 0.000001 time 0.3334 (0.3432) loss 3.2178 
(3.1550) grad_norm 0.0000 (0.0000) [2022-10-12 08:37:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [260/300][500/1251] eta 0:04:16 lr 0.000001 time 0.3279 (0.3410) loss 2.9778 (3.1600) grad_norm 0.0000 (0.0000) [2022-10-12 08:37:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [260/300][600/1251] eta 0:03:41 lr 0.000001 time 0.3310 (0.3397) loss 3.1043 (3.1613) grad_norm 0.0000 (0.0000) [2022-10-12 08:38:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [260/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3333 (0.3388) loss 3.0884 (3.1612) grad_norm 0.0000 (0.0000) [2022-10-12 08:38:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [260/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3878 (0.3378) loss 2.7309 (3.1609) grad_norm 0.0000 (0.0000) [2022-10-12 08:39:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [260/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3285 (0.3373) loss 3.0481 (3.1625) grad_norm 0.0000 (0.0000) [2022-10-12 08:39:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [260/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3677 (0.3367) loss 3.0759 (3.1626) grad_norm 0.0000 (0.0000) [2022-10-12 08:40:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [260/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3310 (0.3362) loss 3.1281 (3.1632) grad_norm 0.0000 (0.0000) [2022-10-12 08:41:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [260/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3611 (0.3360) loss 3.1072 (3.1641) grad_norm 0.0000 (0.0000) [2022-10-12 08:41:18 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 260 training takes 0:07:00 [2022-10-12 08:41:18 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_260 saving...... [2022-10-12 08:41:18 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_260 saved !!! 
[2022-10-12 08:41:21 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.125 (3.125) Loss 0.9007 (0.9007) Acc@1 81.055 (81.055) Acc@5 95.117 (95.117) [2022-10-12 08:41:33 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.074 Acc@5 94.806 [2022-10-12 08:41:33 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.1% [2022-10-12 08:41:33 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.23% [2022-10-12 08:41:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [261/300][0/1251] eta 1:12:04 lr 0.000001 time 3.4565 (3.4565) loss 2.9588 (2.9588) grad_norm 0.0000 (0.0000) [2022-10-12 08:42:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [261/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3335 (0.3666) loss 2.9474 (3.1382) grad_norm 0.0000 (0.0000) [2022-10-12 08:42:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [261/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3525 (0.3491) loss 3.1761 (3.1473) grad_norm 0.0000 (0.0000) [2022-10-12 08:43:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [261/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3524 (0.3432) loss 3.0779 (3.1647) grad_norm 0.0000 (0.0000) [2022-10-12 08:43:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [261/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3160 (0.3405) loss 2.9670 (3.1650) grad_norm 0.0000 (0.0000) [2022-10-12 08:44:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [261/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3174 (0.3384) loss 3.1696 (3.1607) grad_norm 0.0000 (0.0000) [2022-10-12 08:44:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [261/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3375 (0.3371) loss 3.1628 (3.1601) grad_norm 0.0000 (0.0000) [2022-10-12 08:45:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [261/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3403 (0.3363) loss 3.3771 (3.1636) grad_norm 0.0000 
(0.0000) [2022-10-12 08:46:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [261/300][800/1251] eta 0:02:31 lr 0.000001 time 0.4043 (0.3358) loss 3.0808 (3.1628) grad_norm 0.0000 (0.0000) [2022-10-12 08:46:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [261/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3276 (0.3353) loss 3.1814 (3.1661) grad_norm 0.0000 (0.0000) [2022-10-12 08:47:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [261/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3265 (0.3350) loss 3.2870 (3.1672) grad_norm 0.0000 (0.0000) [2022-10-12 08:47:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [261/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3355 (0.3348) loss 3.1520 (3.1706) grad_norm 0.0000 (0.0000) [2022-10-12 08:48:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [261/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3114 (0.3345) loss 3.0790 (3.1716) grad_norm 0.0000 (0.0000) [2022-10-12 08:48:32 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 261 training takes 0:06:58 [2022-10-12 08:48:35 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 2.996 (2.996) Loss 0.8924 (0.8924) Acc@1 81.445 (81.445) Acc@5 94.727 (94.727) [2022-10-12 08:48:47 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.066 Acc@5 94.708 [2022-10-12 08:48:47 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.1% [2022-10-12 08:48:47 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.23% [2022-10-12 08:48:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [262/300][0/1251] eta 1:17:13 lr 0.000001 time 3.7037 (3.7037) loss 3.1380 (3.1380) grad_norm 0.0000 (0.0000) [2022-10-12 08:49:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [262/300][100/1251] eta 0:07:05 lr 0.000001 time 0.3376 (0.3693) loss 3.0660 (3.1605) grad_norm 0.0000 (0.0000) [2022-10-12 08:49:57 swin_tiny_patch4_window7_224] (main.py 163): 
INFO Train: [262/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3373 (0.3503) loss 3.4802 (3.1790) grad_norm 0.0000 (0.0000) [2022-10-12 08:50:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [262/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3383 (0.3451) loss 3.1615 (3.1713) grad_norm 0.0000 (0.0000) [2022-10-12 08:51:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [262/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3164 (0.3414) loss 2.9796 (3.1691) grad_norm 0.0000 (0.0000) [2022-10-12 08:51:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [262/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3263 (0.3397) loss 3.2732 (3.1650) grad_norm 0.0000 (0.0000) [2022-10-12 08:52:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [262/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3441 (0.3382) loss 3.2858 (3.1643) grad_norm 0.0000 (0.0000) [2022-10-12 08:52:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [262/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3288 (0.3373) loss 3.0429 (3.1616) grad_norm 0.0000 (0.0000) [2022-10-12 08:53:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [262/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3049 (0.3368) loss 3.2833 (3.1614) grad_norm 0.0000 (0.0000) [2022-10-12 08:53:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [262/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3648 (0.3363) loss 2.9978 (3.1614) grad_norm 0.0000 (0.0000) [2022-10-12 08:54:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [262/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.2951 (0.3358) loss 3.1516 (3.1628) grad_norm 0.0000 (0.0000) [2022-10-12 08:54:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [262/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3120 (0.3354) loss 3.0334 (3.1611) grad_norm 0.0000 (0.0000) [2022-10-12 08:55:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [262/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3412 (0.3351) loss 3.0320 
(3.1604) grad_norm 0.0000 (0.0000) [2022-10-12 08:55:46 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 262 training takes 0:06:59 [2022-10-12 08:55:49 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.329 (3.329) Loss 0.9298 (0.9298) Acc@1 78.516 (78.516) Acc@5 94.824 (94.824) [2022-10-12 08:56:01 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.116 Acc@5 94.794 [2022-10-12 08:56:01 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.1% [2022-10-12 08:56:01 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.23% [2022-10-12 08:56:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [263/300][0/1251] eta 1:09:13 lr 0.000001 time 3.3203 (3.3203) loss 3.0230 (3.0230) grad_norm 0.0000 (0.0000) [2022-10-12 08:56:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [263/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3182 (0.3671) loss 3.2140 (3.1302) grad_norm 0.0000 (0.0000) [2022-10-12 08:57:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [263/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3338 (0.3505) loss 2.8727 (3.1366) grad_norm 0.0000 (0.0000) [2022-10-12 08:57:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [263/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3500 (0.3453) loss 3.3482 (3.1433) grad_norm 0.0000 (0.0000) [2022-10-12 08:58:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [263/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3389 (0.3419) loss 3.2295 (3.1467) grad_norm 0.0000 (0.0000) [2022-10-12 08:58:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [263/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3160 (0.3397) loss 3.3545 (3.1489) grad_norm 0.0000 (0.0000) [2022-10-12 08:59:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [263/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3300 (0.3382) loss 2.7478 (3.1514) grad_norm 0.0000 (0.0000) [2022-10-12 08:59:58 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [263/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3328 (0.3372) loss 3.1960 (3.1584) grad_norm 0.0000 (0.0000) [2022-10-12 09:00:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [263/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3139 (0.3368) loss 3.0768 (3.1594) grad_norm 0.0000 (0.0000) [2022-10-12 09:01:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [263/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3420 (0.3366) loss 3.1585 (3.1580) grad_norm 0.0000 (0.0000) [2022-10-12 09:01:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [263/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3256 (0.3363) loss 3.1093 (3.1586) grad_norm 0.0000 (0.0000) [2022-10-12 09:02:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [263/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3348 (0.3360) loss 2.9777 (3.1595) grad_norm 0.0000 (0.0000) [2022-10-12 09:02:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [263/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3434 (0.3357) loss 3.1143 (3.1597) grad_norm 0.0000 (0.0000) [2022-10-12 09:03:01 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 263 training takes 0:06:59 [2022-10-12 09:03:04 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.173 (3.173) Loss 0.8800 (0.8800) Acc@1 81.152 (81.152) Acc@5 95.410 (95.410) [2022-10-12 09:03:16 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.284 Acc@5 94.826 [2022-10-12 09:03:16 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-10-12 09:03:16 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.28% [2022-10-12 09:03:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [264/300][0/1251] eta 1:16:01 lr 0.000001 time 3.6465 (3.6465) loss 3.3763 (3.3763) grad_norm 0.0000 (0.0000) [2022-10-12 09:03:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [264/300][100/1251] 
eta 0:07:06 lr 0.000001 time 0.3558 (0.3709) loss 2.7940 (3.1457) grad_norm 0.0000 (0.0000) [2022-10-12 09:04:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [264/300][200/1251] eta 0:06:11 lr 0.000001 time 0.3305 (0.3530) loss 3.4171 (3.1422) grad_norm 0.0000 (0.0000) [2022-10-12 09:05:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [264/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3533 (0.3455) loss 3.2663 (3.1520) grad_norm 0.0000 (0.0000) [2022-10-12 09:05:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [264/300][400/1251] eta 0:04:51 lr 0.000001 time 0.3228 (0.3423) loss 2.8633 (3.1434) grad_norm 0.0000 (0.0000) [2022-10-12 09:06:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [264/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3452 (0.3404) loss 3.3146 (3.1461) grad_norm 0.0000 (0.0000) [2022-10-12 09:06:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [264/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3217 (0.3389) loss 3.0616 (3.1466) grad_norm 0.0000 (0.0000) [2022-10-12 09:07:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [264/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3173 (0.3380) loss 3.4290 (3.1457) grad_norm 0.0000 (0.0000) [2022-10-12 09:07:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [264/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3147 (0.3373) loss 2.9162 (3.1446) grad_norm 0.0000 (0.0000) [2022-10-12 09:08:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [264/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3441 (0.3366) loss 3.1224 (3.1460) grad_norm 0.0000 (0.0000) [2022-10-12 09:08:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [264/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3282 (0.3361) loss 3.1383 (3.1498) grad_norm 0.0000 (0.0000) [2022-10-12 09:09:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [264/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3201 (0.3358) loss 3.1338 (3.1484) grad_norm 0.0000 (0.0000) 
[2022-10-12 09:09:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [264/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3408 (0.3356) loss 3.0647 (3.1485) grad_norm 0.0000 (0.0000) [2022-10-12 09:10:16 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 264 training takes 0:06:59 [2022-10-12 09:10:19 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.089 (3.089) Loss 0.8679 (0.8679) Acc@1 81.934 (81.934) Acc@5 94.531 (94.531) [2022-10-12 09:10:31 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.126 Acc@5 94.792 [2022-10-12 09:10:31 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.1% [2022-10-12 09:10:31 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.28% [2022-10-12 09:10:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [265/300][0/1251] eta 1:15:31 lr 0.000001 time 3.6227 (3.6227) loss 3.2767 (3.2767) grad_norm 0.0000 (0.0000) [2022-10-12 09:11:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [265/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3296 (0.3691) loss 3.2891 (3.1585) grad_norm 0.0000 (0.0000) [2022-10-12 09:11:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [265/300][200/1251] eta 0:06:09 lr 0.000001 time 0.3477 (0.3514) loss 3.1502 (3.1496) grad_norm 0.0000 (0.0000) [2022-10-12 09:12:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [265/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3238 (0.3446) loss 3.0316 (3.1450) grad_norm 0.0000 (0.0000) [2022-10-12 09:12:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [265/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3236 (0.3411) loss 3.2058 (3.1465) grad_norm 0.0000 (0.0000) [2022-10-12 09:13:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [265/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3365 (0.3393) loss 2.9547 (3.1465) grad_norm 0.0000 (0.0000) [2022-10-12 09:13:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[265/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3262 (0.3381) loss 3.2330 (3.1455) grad_norm 0.0000 (0.0000) [2022-10-12 09:14:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [265/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3252 (0.3373) loss 3.2052 (3.1470) grad_norm 0.0000 (0.0000) [2022-10-12 09:15:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [265/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3325 (0.3366) loss 3.1538 (3.1499) grad_norm 0.0000 (0.0000) [2022-10-12 09:15:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [265/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3328 (0.3362) loss 3.1672 (3.1512) grad_norm 0.0000 (0.0000) [2022-10-12 09:16:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [265/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3524 (0.3356) loss 3.2821 (3.1503) grad_norm 0.0000 (0.0000) [2022-10-12 09:16:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [265/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3361 (0.3353) loss 3.2991 (3.1516) grad_norm 0.0000 (0.0000) [2022-10-12 09:17:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [265/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3496 (0.3352) loss 3.2861 (3.1523) grad_norm 0.0000 (0.0000) [2022-10-12 09:17:30 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 265 training takes 0:06:59 [2022-10-12 09:17:34 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.306 (3.306) Loss 0.9845 (0.9845) Acc@1 78.027 (78.027) Acc@5 94.141 (94.141) [2022-10-12 09:17:46 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.200 Acc@5 94.750 [2022-10-12 09:17:46 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.2% [2022-10-12 09:17:46 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.28% [2022-10-12 09:17:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [266/300][0/1251] eta 1:03:43 lr 0.000001 time 3.0561 (3.0561) loss 3.1925 
(3.1925) grad_norm 0.0000 (0.0000) [2022-10-12 09:18:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [266/300][100/1251] eta 0:07:07 lr 0.000001 time 0.3280 (0.3717) loss 3.0498 (3.1413) grad_norm 0.0000 (0.0000) [2022-10-12 09:18:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [266/300][200/1251] eta 0:06:09 lr 0.000001 time 0.3016 (0.3515) loss 2.9347 (3.1507) grad_norm 0.0000 (0.0000) [2022-10-12 09:19:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [266/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3663 (0.3450) loss 3.1987 (3.1504) grad_norm 0.0000 (0.0000) [2022-10-12 09:20:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [266/300][400/1251] eta 0:04:52 lr 0.000001 time 0.3565 (0.3434) loss 3.1080 (3.1524) grad_norm 0.0000 (0.0000) [2022-10-12 09:20:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [266/300][500/1251] eta 0:04:16 lr 0.000001 time 0.3529 (0.3418) loss 3.2752 (3.1533) grad_norm 0.0000 (0.0000) [2022-10-12 09:21:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [266/300][600/1251] eta 0:03:41 lr 0.000001 time 0.3307 (0.3404) loss 3.1293 (3.1527) grad_norm 0.0000 (0.0000) [2022-10-12 09:21:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [266/300][700/1251] eta 0:03:07 lr 0.000001 time 0.3389 (0.3396) loss 3.2051 (3.1530) grad_norm 0.0000 (0.0000) [2022-10-12 09:22:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [266/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3462 (0.3389) loss 3.2023 (3.1482) grad_norm 0.0000 (0.0000) [2022-10-12 09:22:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [266/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3292 (0.3381) loss 3.3903 (3.1503) grad_norm 0.0000 (0.0000) [2022-10-12 09:23:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [266/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3307 (0.3374) loss 3.0468 (3.1474) grad_norm 0.0000 (0.0000) [2022-10-12 09:23:57 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [266/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3144 (0.3370) loss 3.0339 (3.1496) grad_norm 0.0000 (0.0000) [2022-10-12 09:24:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [266/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3254 (0.3367) loss 3.1015 (3.1479) grad_norm 0.0000 (0.0000) [2022-10-12 09:24:47 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 266 training takes 0:07:00 [2022-10-12 09:24:50 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.603 (3.603) Loss 0.9191 (0.9191) Acc@1 80.957 (80.957) Acc@5 94.434 (94.434) [2022-10-12 09:25:02 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.076 Acc@5 94.744 [2022-10-12 09:25:02 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.1% [2022-10-12 09:25:02 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.28% [2022-10-12 09:25:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [267/300][0/1251] eta 1:22:41 lr 0.000001 time 3.9660 (3.9660) loss 3.1823 (3.1823) grad_norm 0.0000 (0.0000) [2022-10-12 09:25:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [267/300][100/1251] eta 0:07:10 lr 0.000001 time 0.3447 (0.3741) loss 3.1306 (3.1295) grad_norm 0.0000 (0.0000) [2022-10-12 09:26:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [267/300][200/1251] eta 0:06:12 lr 0.000001 time 0.3420 (0.3546) loss 3.3626 (3.1454) grad_norm 0.0000 (0.0000) [2022-10-12 09:26:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [267/300][300/1251] eta 0:05:31 lr 0.000001 time 0.3498 (0.3488) loss 2.9126 (3.1413) grad_norm 0.0000 (0.0000) [2022-10-12 09:27:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [267/300][400/1251] eta 0:04:53 lr 0.000001 time 0.3290 (0.3453) loss 3.0053 (3.1361) grad_norm 0.0000 (0.0000) [2022-10-12 09:27:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [267/300][500/1251] eta 0:04:17 lr 0.000001 time 0.3289 
(0.3428) loss 2.9562 (3.1352) grad_norm 0.0000 (0.0000) [2022-10-12 09:28:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [267/300][600/1251] eta 0:03:41 lr 0.000001 time 0.3487 (0.3409) loss 3.1442 (3.1339) grad_norm 0.0000 (0.0000) [2022-10-12 09:29:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [267/300][700/1251] eta 0:03:07 lr 0.000001 time 0.3166 (0.3400) loss 3.1653 (3.1366) grad_norm 0.0000 (0.0000) [2022-10-12 09:29:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [267/300][800/1251] eta 0:02:33 lr 0.000001 time 0.3226 (0.3396) loss 3.1572 (3.1352) grad_norm 0.0000 (0.0000) [2022-10-12 09:30:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [267/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3399 (0.3390) loss 2.9995 (3.1380) grad_norm 0.0000 (0.0000) [2022-10-12 09:30:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [267/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3566 (0.3385) loss 3.1718 (3.1395) grad_norm 0.0000 (0.0000) [2022-10-12 09:31:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [267/300][1100/1251] eta 0:00:51 lr 0.000001 time 0.3464 (0.3381) loss 3.1181 (3.1404) grad_norm 0.0000 (0.0000) [2022-10-12 09:31:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [267/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.2969 (0.3377) loss 3.0271 (3.1394) grad_norm 0.0000 (0.0000) [2022-10-12 09:32:04 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 267 training takes 0:07:02 [2022-10-12 09:32:08 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.450 (3.450) Loss 0.9226 (0.9226) Acc@1 79.590 (79.590) Acc@5 94.824 (94.824) [2022-10-12 09:32:20 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.074 Acc@5 94.714 [2022-10-12 09:32:20 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.1% [2022-10-12 09:32:20 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.28% [2022-10-12 09:32:24 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [268/300][0/1251] eta 1:18:01 lr 0.000001 time 3.7425 (3.7425) loss 3.1513 (3.1513) grad_norm 0.0000 (0.0000) [2022-10-12 09:32:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [268/300][100/1251] eta 0:07:08 lr 0.000001 time 0.3436 (0.3720) loss 3.0907 (3.1369) grad_norm 0.0000 (0.0000) [2022-10-12 09:33:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [268/300][200/1251] eta 0:06:10 lr 0.000001 time 0.3134 (0.3530) loss 2.5887 (3.1352) grad_norm 0.0000 (0.0000) [2022-10-12 09:34:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [268/300][300/1251] eta 0:05:29 lr 0.000001 time 0.3431 (0.3462) loss 3.2152 (3.1400) grad_norm 0.0000 (0.0000) [2022-10-12 09:34:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [268/300][400/1251] eta 0:04:51 lr 0.000001 time 0.3260 (0.3421) loss 3.3319 (3.1423) grad_norm 0.0000 (0.0000) [2022-10-12 09:35:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [268/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3183 (0.3398) loss 3.2618 (3.1413) grad_norm 0.0000 (0.0000) [2022-10-12 09:35:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [268/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3292 (0.3382) loss 3.2510 (3.1408) grad_norm 0.0000 (0.0000) [2022-10-12 09:36:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [268/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3171 (0.3371) loss 3.1715 (3.1418) grad_norm 0.0000 (0.0000) [2022-10-12 09:36:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [268/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3385 (0.3362) loss 3.2468 (3.1409) grad_norm 0.0000 (0.0000) [2022-10-12 09:37:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [268/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3294 (0.3357) loss 3.1207 (3.1415) grad_norm 0.0000 (0.0000) [2022-10-12 09:37:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [268/300][1000/1251] eta 0:01:24 lr 0.000001 
time 0.3554 (0.3352) loss 3.1412 (3.1417) grad_norm 0.0000 (0.0000) [2022-10-12 09:38:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [268/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3270 (0.3348) loss 3.3506 (3.1428) grad_norm 0.0000 (0.0000) [2022-10-12 09:39:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [268/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3376 (0.3344) loss 3.0535 (3.1426) grad_norm 0.0000 (0.0000) [2022-10-12 09:39:18 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 268 training takes 0:06:57 [2022-10-12 09:39:21 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.407 (3.407) Loss 0.7929 (0.7929) Acc@1 82.812 (82.812) Acc@5 96.191 (96.191) [2022-10-12 09:39:33 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.202 Acc@5 94.798 [2022-10-12 09:39:33 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.2% [2022-10-12 09:39:33 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.28% [2022-10-12 09:39:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [269/300][0/1251] eta 1:09:58 lr 0.000001 time 3.3564 (3.3564) loss 3.3338 (3.3338) grad_norm 0.0000 (0.0000) [2022-10-12 09:40:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [269/300][100/1251] eta 0:07:05 lr 0.000001 time 0.3802 (0.3700) loss 3.3181 (3.1041) grad_norm 0.0000 (0.0000) [2022-10-12 09:40:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [269/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3209 (0.3510) loss 2.9655 (3.1135) grad_norm 0.0000 (0.0000) [2022-10-12 09:41:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [269/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3707 (0.3457) loss 3.3004 (3.1205) grad_norm 0.0000 (0.0000) [2022-10-12 09:41:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [269/300][400/1251] eta 0:04:51 lr 0.000001 time 0.3121 (0.3424) loss 3.1455 (3.1162) grad_norm 0.0000 (0.0000) [2022-10-12 
09:42:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [269/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3385 (0.3405) loss 3.0637 (3.1229) grad_norm 0.0000 (0.0000) [2022-10-12 09:42:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [269/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3334 (0.3390) loss 3.0886 (3.1243) grad_norm 0.0000 (0.0000) [2022-10-12 09:43:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [269/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3263 (0.3381) loss 3.3008 (3.1281) grad_norm 0.0000 (0.0000) [2022-10-12 09:44:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [269/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3411 (0.3374) loss 3.1494 (3.1311) grad_norm 0.0000 (0.0000) [2022-10-12 09:44:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [269/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3677 (0.3373) loss 3.0055 (3.1325) grad_norm 0.0000 (0.0000) [2022-10-12 09:45:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [269/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3445 (0.3369) loss 2.7554 (3.1327) grad_norm 0.0000 (0.0000) [2022-10-12 09:45:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [269/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3331 (0.3365) loss 2.8839 (3.1347) grad_norm 0.0000 (0.0000) [2022-10-12 09:46:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [269/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3501 (0.3362) loss 3.1421 (3.1356) grad_norm 0.0000 (0.0000) [2022-10-12 09:46:33 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 269 training takes 0:07:00 [2022-10-12 09:46:37 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.265 (3.265) Loss 0.9357 (0.9357) Acc@1 81.152 (81.152) Acc@5 94.531 (94.531) [2022-10-12 09:46:49 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.304 Acc@5 94.824 [2022-10-12 09:46:49 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test 
images: 80.3% [2022-10-12 09:46:49 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.30% [2022-10-12 09:46:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [270/300][0/1251] eta 1:10:44 lr 0.000001 time 3.3927 (3.3927) loss 3.2319 (3.2319) grad_norm 0.0000 (0.0000) [2022-10-12 09:47:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [270/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3249 (0.3688) loss 3.0558 (3.1211) grad_norm 0.0000 (0.0000) [2022-10-12 09:47:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [270/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3434 (0.3505) loss 3.1314 (3.1285) grad_norm 0.0000 (0.0000) [2022-10-12 09:48:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [270/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3647 (0.3453) loss 3.1463 (3.1214) grad_norm 0.0000 (0.0000) [2022-10-12 09:49:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [270/300][400/1251] eta 0:04:51 lr 0.000001 time 0.3720 (0.3423) loss 2.9950 (3.1232) grad_norm 0.0000 (0.0000) [2022-10-12 09:49:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [270/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3322 (0.3401) loss 3.0513 (3.1237) grad_norm 0.0000 (0.0000) [2022-10-12 09:50:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [270/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3196 (0.3389) loss 2.9452 (3.1236) grad_norm 0.0000 (0.0000) [2022-10-12 09:50:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [270/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3206 (0.3379) loss 3.2346 (3.1244) grad_norm 0.0000 (0.0000) [2022-10-12 09:51:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [270/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3276 (0.3371) loss 3.1921 (3.1277) grad_norm 0.0000 (0.0000) [2022-10-12 09:51:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [270/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3306 (0.3367) loss 3.1502 (3.1289) grad_norm 0.0000 
(0.0000) [2022-10-12 09:52:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [270/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3192 (0.3364) loss 2.9321 (3.1279) grad_norm 0.0000 (0.0000) [2022-10-12 09:52:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [270/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3287 (0.3362) loss 3.1537 (3.1304) grad_norm 0.0000 (0.0000) [2022-10-12 09:53:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [270/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3474 (0.3359) loss 3.1309 (3.1329) grad_norm 0.0000 (0.0000) [2022-10-12 09:53:49 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 270 training takes 0:06:59 [2022-10-12 09:53:49 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_270 saving...... [2022-10-12 09:53:49 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_270 saved !!! [2022-10-12 09:53:52 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.429 (3.429) Loss 0.8744 (0.8744) Acc@1 81.348 (81.348) Acc@5 94.824 (94.824) [2022-10-12 09:54:04 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.382 Acc@5 94.746 [2022-10-12 09:54:04 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-12 09:54:04 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.38% [2022-10-12 09:54:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [271/300][0/1251] eta 1:16:01 lr 0.000001 time 3.6460 (3.6460) loss 3.1296 (3.1296) grad_norm 0.0000 (0.0000) [2022-10-12 09:54:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [271/300][100/1251] eta 0:07:05 lr 0.000001 time 0.3187 (0.3696) loss 3.0608 (3.1233) grad_norm 0.0000 (0.0000) [2022-10-12 09:55:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [271/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3498 (0.3505) loss 3.1259 
(3.1218) grad_norm 0.0000 (0.0000) [2022-10-12 09:55:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [271/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3306 (0.3442) loss 3.1531 (3.1258) grad_norm 0.0000 (0.0000) [2022-10-12 09:56:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [271/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3379 (0.3411) loss 3.1273 (3.1262) grad_norm 0.0000 (0.0000) [2022-10-12 09:56:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [271/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3572 (0.3394) loss 2.8892 (3.1196) grad_norm 0.0000 (0.0000) [2022-10-12 09:57:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [271/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3536 (0.3383) loss 3.4294 (3.1228) grad_norm 0.0000 (0.0000) [2022-10-12 09:58:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [271/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3506 (0.3371) loss 2.9563 (3.1278) grad_norm 0.0000 (0.0000) [2022-10-12 09:58:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [271/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3473 (0.3366) loss 3.3097 (3.1283) grad_norm 0.0000 (0.0000) [2022-10-12 09:59:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [271/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3617 (0.3362) loss 3.2695 (3.1294) grad_norm 0.0000 (0.0000) [2022-10-12 09:59:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [271/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3248 (0.3358) loss 3.0204 (3.1270) grad_norm 0.0000 (0.0000) [2022-10-12 10:00:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [271/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3351 (0.3355) loss 3.4368 (3.1274) grad_norm 0.0000 (0.0000) [2022-10-12 10:00:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [271/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3268 (0.3352) loss 3.2617 (3.1254) grad_norm 0.0000 (0.0000) [2022-10-12 10:01:03 swin_tiny_patch4_window7_224] (main.py 
171): INFO EPOCH 271 training takes 0:06:59 [2022-10-12 10:01:06 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.104 (3.104) Loss 0.9564 (0.9564) Acc@1 78.418 (78.418) Acc@5 93.945 (93.945) [2022-10-12 10:01:19 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.310 Acc@5 94.790 [2022-10-12 10:01:19 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-10-12 10:01:19 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.38% [2022-10-12 10:01:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [272/300][0/1251] eta 1:15:23 lr 0.000001 time 3.6160 (3.6160) loss 3.0600 (3.0600) grad_norm 0.0000 (0.0000) [2022-10-12 10:01:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [272/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3182 (0.3667) loss 3.2191 (3.1148) grad_norm 0.0000 (0.0000) [2022-10-12 10:02:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [272/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3421 (0.3496) loss 3.3404 (3.1166) grad_norm 0.0000 (0.0000) [2022-10-12 10:03:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [272/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3339 (0.3447) loss 2.8211 (3.1206) grad_norm 0.0000 (0.0000) [2022-10-12 10:03:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [272/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3330 (0.3416) loss 3.2167 (3.1252) grad_norm 0.0000 (0.0000) [2022-10-12 10:04:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [272/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3252 (0.3395) loss 2.8362 (3.1236) grad_norm 0.0000 (0.0000) [2022-10-12 10:04:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [272/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3160 (0.3386) loss 3.0420 (3.1273) grad_norm 0.0000 (0.0000) [2022-10-12 10:05:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [272/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3171 
(0.3377) loss 3.0613 (3.1264) grad_norm 0.0000 (0.0000) [2022-10-12 10:05:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [272/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3313 (0.3372) loss 3.0492 (3.1222) grad_norm 0.0000 (0.0000) [2022-10-12 10:06:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [272/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3331 (0.3367) loss 3.1829 (3.1201) grad_norm 0.0000 (0.0000) [2022-10-12 10:06:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [272/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3364 (0.3363) loss 3.1000 (3.1221) grad_norm 0.0000 (0.0000) [2022-10-12 10:07:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [272/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3299 (0.3358) loss 3.1466 (3.1240) grad_norm 0.0000 (0.0000) [2022-10-12 10:08:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [272/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3217 (0.3355) loss 3.2522 (3.1230) grad_norm 0.0000 (0.0000) [2022-10-12 10:08:18 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 272 training takes 0:06:59 [2022-10-12 10:08:22 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.293 (3.293) Loss 0.9908 (0.9908) Acc@1 78.809 (78.809) Acc@5 94.238 (94.238) [2022-10-12 10:08:34 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.204 Acc@5 94.774 [2022-10-12 10:08:34 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.2% [2022-10-12 10:08:34 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.38% [2022-10-12 10:08:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [273/300][0/1251] eta 1:16:00 lr 0.000001 time 3.6451 (3.6451) loss 3.2336 (3.2336) grad_norm 0.0000 (0.0000) [2022-10-12 10:09:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [273/300][100/1251] eta 0:07:06 lr 0.000001 time 0.3415 (0.3702) loss 3.2848 (3.1282) grad_norm 0.0000 (0.0000) [2022-10-12 10:09:44 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [273/300][200/1251] eta 0:06:09 lr 0.000001 time 0.3236 (0.3518) loss 3.4398 (3.1377) grad_norm 0.0000 (0.0000) [2022-10-12 10:10:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [273/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3321 (0.3456) loss 3.1434 (3.1385) grad_norm 0.0000 (0.0000) [2022-10-12 10:10:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [273/300][400/1251] eta 0:04:51 lr 0.000001 time 0.3383 (0.3425) loss 3.0526 (3.1345) grad_norm 0.0000 (0.0000) [2022-10-12 10:11:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [273/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3507 (0.3403) loss 2.9510 (3.1353) grad_norm 0.0000 (0.0000) [2022-10-12 10:11:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [273/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3226 (0.3389) loss 3.0437 (3.1335) grad_norm 0.0000 (0.0000) [2022-10-12 10:12:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [273/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3317 (0.3380) loss 3.3402 (3.1332) grad_norm 0.0000 (0.0000) [2022-10-12 10:13:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [273/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3081 (0.3373) loss 3.0057 (3.1295) grad_norm 0.0000 (0.0000) [2022-10-12 10:13:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [273/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3382 (0.3367) loss 2.9565 (3.1283) grad_norm 0.0000 (0.0000) [2022-10-12 10:14:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [273/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3231 (0.3365) loss 3.3716 (3.1265) grad_norm 0.0000 (0.0000) [2022-10-12 10:14:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [273/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3253 (0.3361) loss 3.0630 (3.1265) grad_norm 0.0000 (0.0000) [2022-10-12 10:15:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [273/300][1200/1251] eta 0:00:17 lr 
0.000001 time 0.3229 (0.3359) loss 3.0843 (3.1257) grad_norm 0.0000 (0.0000) [2022-10-12 10:15:34 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 273 training takes 0:07:00 [2022-10-12 10:15:37 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.260 (3.260) Loss 0.8827 (0.8827) Acc@1 81.641 (81.641) Acc@5 95.215 (95.215) [2022-10-12 10:15:49 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.288 Acc@5 94.860 [2022-10-12 10:15:49 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-10-12 10:15:49 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.38% [2022-10-12 10:15:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [274/300][0/1251] eta 1:13:56 lr 0.000001 time 3.5465 (3.5465) loss 3.2198 (3.2198) grad_norm 0.0000 (0.0000) [2022-10-12 10:16:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [274/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3264 (0.3673) loss 3.2031 (3.1112) grad_norm 0.0000 (0.0000) [2022-10-12 10:16:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [274/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3542 (0.3502) loss 3.0151 (3.1060) grad_norm 0.0000 (0.0000) [2022-10-12 10:17:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [274/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3535 (0.3443) loss 3.0626 (3.1091) grad_norm 0.0000 (0.0000) [2022-10-12 10:18:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [274/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3166 (0.3418) loss 2.7557 (3.1107) grad_norm 0.0000 (0.0000) [2022-10-12 10:18:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [274/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3187 (0.3399) loss 3.2158 (3.1187) grad_norm 0.0000 (0.0000) [2022-10-12 10:19:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [274/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3218 (0.3390) loss 3.0485 (3.1202) grad_norm 0.0000 (0.0000) 
[2022-10-12 10:19:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [274/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3393 (0.3383) loss 3.0229 (3.1226) grad_norm 0.0000 (0.0000) [2022-10-12 10:20:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [274/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3320 (0.3376) loss 3.3544 (3.1261) grad_norm 0.0000 (0.0000) [2022-10-12 10:20:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [274/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3366 (0.3371) loss 3.0759 (3.1239) grad_norm 0.0000 (0.0000) [2022-10-12 10:21:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [274/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3552 (0.3365) loss 3.0327 (3.1248) grad_norm 0.0000 (0.0000) [2022-10-12 10:21:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [274/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3363 (0.3362) loss 2.9067 (3.1253) grad_norm 0.0000 (0.0000) [2022-10-12 10:22:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [274/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3099 (0.3360) loss 3.0640 (3.1264) grad_norm 0.0000 (0.0000) [2022-10-12 10:22:49 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 274 training takes 0:07:00 [2022-10-12 10:22:52 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.352 (3.352) Loss 0.9013 (0.9013) Acc@1 81.152 (81.152) Acc@5 95.020 (95.020) [2022-10-12 10:23:04 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.402 Acc@5 94.868 [2022-10-12 10:23:04 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-12 10:23:04 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.40% [2022-10-12 10:23:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [275/300][0/1251] eta 1:13:05 lr 0.000001 time 3.5053 (3.5053) loss 3.2055 (3.2055) grad_norm 0.0000 (0.0000) [2022-10-12 10:23:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[275/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3112 (0.3665) loss 3.1978 (3.1113) grad_norm 0.0000 (0.0000) [2022-10-12 10:24:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [275/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3396 (0.3492) loss 3.0724 (3.1118) grad_norm 0.0000 (0.0000) [2022-10-12 10:24:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [275/300][300/1251] eta 0:05:25 lr 0.000001 time 0.3180 (0.3427) loss 2.8725 (3.1161) grad_norm 0.0000 (0.0000) [2022-10-12 10:25:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [275/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3005 (0.3401) loss 3.4082 (3.1175) grad_norm 0.0000 (0.0000) [2022-10-12 10:25:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [275/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3422 (0.3385) loss 3.0095 (3.1193) grad_norm 0.0000 (0.0000) [2022-10-12 10:26:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [275/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3048 (0.3371) loss 2.9782 (3.1203) grad_norm 0.0000 (0.0000) [2022-10-12 10:27:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [275/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3467 (0.3365) loss 2.8394 (3.1201) grad_norm 0.0000 (0.0000) [2022-10-12 10:27:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [275/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3197 (0.3357) loss 2.8145 (3.1185) grad_norm 0.0000 (0.0000) [2022-10-12 10:28:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [275/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3237 (0.3353) loss 3.0388 (3.1168) grad_norm 0.0000 (0.0000) [2022-10-12 10:28:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [275/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3613 (0.3352) loss 3.1385 (3.1173) grad_norm 0.0000 (0.0000) [2022-10-12 10:29:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [275/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3838 (0.3349) loss 3.1348 (3.1190) grad_norm 
0.0000 (0.0000) [2022-10-12 10:29:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [275/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3165 (0.3347) loss 3.2929 (3.1217) grad_norm 0.0000 (0.0000) [2022-10-12 10:30:03 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 275 training takes 0:06:58 [2022-10-12 10:30:06 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.301 (3.301) Loss 0.9312 (0.9312) Acc@1 78.906 (78.906) Acc@5 94.336 (94.336) [2022-10-12 10:30:18 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.370 Acc@5 94.890 [2022-10-12 10:30:18 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-12 10:30:18 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.40% [2022-10-12 10:30:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [276/300][0/1251] eta 1:12:21 lr 0.000001 time 3.4704 (3.4704) loss 3.1153 (3.1153) grad_norm 0.0000 (0.0000) [2022-10-12 10:30:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [276/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3093 (0.3672) loss 3.0127 (3.0990) grad_norm 0.0000 (0.0000) [2022-10-12 10:31:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [276/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3051 (0.3500) loss 3.1352 (3.1206) grad_norm 0.0000 (0.0000) [2022-10-12 10:32:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [276/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3388 (0.3447) loss 3.1486 (3.1137) grad_norm 0.0000 (0.0000) [2022-10-12 10:32:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [276/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3361 (0.3417) loss 2.8928 (3.1214) grad_norm 0.0000 (0.0000) [2022-10-12 10:33:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [276/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3379 (0.3395) loss 2.9284 (3.1215) grad_norm 0.0000 (0.0000) [2022-10-12 10:33:41 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [276/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3163 (0.3386) loss 3.2180 (3.1228) grad_norm 0.0000 (0.0000) [2022-10-12 10:34:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [276/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3140 (0.3375) loss 3.2099 (3.1173) grad_norm 0.0000 (0.0000) [2022-10-12 10:34:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [276/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3412 (0.3371) loss 3.0002 (3.1196) grad_norm 0.0000 (0.0000) [2022-10-12 10:35:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [276/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3335 (0.3364) loss 3.1547 (3.1185) grad_norm 0.0000 (0.0000) [2022-10-12 10:35:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [276/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3367 (0.3359) loss 2.7244 (3.1170) grad_norm 0.0000 (0.0000) [2022-10-12 10:36:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [276/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3199 (0.3355) loss 3.1247 (3.1191) grad_norm 0.0000 (0.0000) [2022-10-12 10:37:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [276/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3592 (0.3353) loss 3.1957 (3.1192) grad_norm 0.0000 (0.0000) [2022-10-12 10:37:17 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 276 training takes 0:06:59 [2022-10-12 10:37:20 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.347 (3.347) Loss 0.8073 (0.8073) Acc@1 82.520 (82.520) Acc@5 96.094 (96.094) [2022-10-12 10:37:32 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.366 Acc@5 94.794 [2022-10-12 10:37:32 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-12 10:37:32 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.40% [2022-10-12 10:37:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [277/300][0/1251] eta 1:14:45 lr 0.000001 time 3.5857 
(3.5857) loss 3.1738 (3.1738) grad_norm 0.0000 (0.0000) [2022-10-12 10:38:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [277/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3235 (0.3673) loss 3.2837 (3.1440) grad_norm 0.0000 (0.0000) [2022-10-12 10:38:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [277/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3215 (0.3500) loss 3.2715 (3.1377) grad_norm 0.0000 (0.0000) [2022-10-12 10:39:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [277/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3275 (0.3437) loss 3.2860 (3.1301) grad_norm 0.0000 (0.0000) [2022-10-12 10:39:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [277/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3436 (0.3408) loss 3.2876 (3.1323) grad_norm 0.0000 (0.0000) [2022-10-12 10:40:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [277/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3267 (0.3388) loss 3.0786 (3.1248) grad_norm 0.0000 (0.0000) [2022-10-12 10:40:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [277/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3329 (0.3376) loss 3.0967 (3.1208) grad_norm 0.0000 (0.0000) [2022-10-12 10:41:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [277/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3388 (0.3369) loss 3.1402 (3.1194) grad_norm 0.0000 (0.0000) [2022-10-12 10:42:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [277/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3437 (0.3361) loss 3.2129 (3.1189) grad_norm 0.0000 (0.0000) [2022-10-12 10:42:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [277/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3436 (0.3358) loss 3.2363 (3.1168) grad_norm 0.0000 (0.0000) [2022-10-12 10:43:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [277/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3154 (0.3353) loss 2.9250 (3.1176) grad_norm 0.0000 (0.0000) [2022-10-12 10:43:41 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [277/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3436 (0.3351) loss 3.1731 (3.1195) grad_norm 0.0000 (0.0000) [2022-10-12 10:44:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [277/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3355 (0.3349) loss 3.1216 (3.1201) grad_norm 0.0000 (0.0000) [2022-10-12 10:44:31 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 277 training takes 0:06:58 [2022-10-12 10:44:34 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.164 (3.164) Loss 0.8722 (0.8722) Acc@1 81.152 (81.152) Acc@5 95.508 (95.508) [2022-10-12 10:44:46 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.366 Acc@5 94.858 [2022-10-12 10:44:46 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-12 10:44:46 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.40% [2022-10-12 10:44:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [278/300][0/1251] eta 1:10:09 lr 0.000001 time 3.3653 (3.3653) loss 3.2587 (3.2587) grad_norm 0.0000 (0.0000) [2022-10-12 10:45:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [278/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3016 (0.3669) loss 3.1848 (3.1195) grad_norm 0.0000 (0.0000) [2022-10-12 10:45:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [278/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3217 (0.3499) loss 3.1320 (3.1125) grad_norm 0.0000 (0.0000) [2022-10-12 10:46:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [278/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3483 (0.3439) loss 3.2904 (3.1109) grad_norm 0.0000 (0.0000) [2022-10-12 10:47:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [278/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3689 (0.3408) loss 3.1552 (3.1165) grad_norm 0.0000 (0.0000) [2022-10-12 10:47:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [278/300][500/1251] 
eta 0:04:14 lr 0.000001 time 0.3075 (0.3391) loss 2.7882 (3.1126) grad_norm 0.0000 (0.0000) [2022-10-12 10:48:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [278/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3479 (0.3378) loss 3.2596 (3.1107) grad_norm 0.0000 (0.0000) [2022-10-12 10:48:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [278/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3374 (0.3370) loss 3.3917 (3.1109) grad_norm 0.0000 (0.0000) [2022-10-12 10:49:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [278/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3217 (0.3364) loss 3.3436 (3.1126) grad_norm 0.0000 (0.0000) [2022-10-12 10:49:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [278/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3252 (0.3356) loss 3.0716 (3.1143) grad_norm 0.0000 (0.0000) [2022-10-12 10:50:22 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [278/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3325 (0.3351) loss 3.1681 (3.1142) grad_norm 0.0000 (0.0000) [2022-10-12 10:50:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [278/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3606 (0.3350) loss 3.1784 (3.1139) grad_norm 0.0000 (0.0000) [2022-10-12 10:51:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [278/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3231 (0.3349) loss 3.2195 (3.1142) grad_norm 0.0000 (0.0000) [2022-10-12 10:51:45 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 278 training takes 0:06:58 [2022-10-12 10:51:48 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.396 (3.396) Loss 0.8910 (0.8910) Acc@1 81.250 (81.250) Acc@5 95.020 (95.020) [2022-10-12 10:52:00 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.332 Acc@5 94.774 [2022-10-12 10:52:00 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-10-12 10:52:00 swin_tiny_patch4_window7_224] (main.py 121): INFO Max 
accuracy: 80.40% [2022-10-12 10:52:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [279/300][0/1251] eta 1:13:47 lr 0.000001 time 3.5394 (3.5394) loss 3.0307 (3.0307) grad_norm 0.0000 (0.0000) [2022-10-12 10:52:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [279/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3472 (0.3688) loss 2.8627 (3.1092) grad_norm 0.0000 (0.0000) [2022-10-12 10:53:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [279/300][200/1251] eta 0:06:09 lr 0.000001 time 0.3414 (0.3513) loss 3.0021 (3.1143) grad_norm 0.0000 (0.0000) [2022-10-12 10:53:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [279/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3326 (0.3447) loss 3.2098 (3.1147) grad_norm 0.0000 (0.0000) [2022-10-12 10:54:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [279/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3461 (0.3416) loss 2.9898 (3.1132) grad_norm 0.0000 (0.0000) [2022-10-12 10:54:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [279/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3086 (0.3397) loss 3.2462 (3.1086) grad_norm 0.0000 (0.0000) [2022-10-12 10:55:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [279/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3342 (0.3386) loss 3.1154 (3.1144) grad_norm 0.0000 (0.0000) [2022-10-12 10:55:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [279/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3208 (0.3378) loss 3.1474 (3.1154) grad_norm 0.0000 (0.0000) [2022-10-12 10:56:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [279/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3285 (0.3369) loss 3.0780 (3.1153) grad_norm 0.0000 (0.0000) [2022-10-12 10:57:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [279/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3233 (0.3363) loss 3.1360 (3.1162) grad_norm 0.0000 (0.0000) [2022-10-12 10:57:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[279/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3126 (0.3357) loss 3.2886 (3.1144) grad_norm 0.0000 (0.0000) [2022-10-12 10:58:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [279/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3292 (0.3354) loss 3.1487 (3.1141) grad_norm 0.0000 (0.0000) [2022-10-12 10:58:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [279/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3425 (0.3351) loss 2.9749 (3.1157) grad_norm 0.0000 (0.0000) [2022-10-12 10:58:59 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 279 training takes 0:06:58 [2022-10-12 10:59:03 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.383 (3.383) Loss 0.9447 (0.9447) Acc@1 79.395 (79.395) Acc@5 94.336 (94.336) [2022-10-12 10:59:14 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.294 Acc@5 94.856 [2022-10-12 10:59:14 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-10-12 10:59:14 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.40% [2022-10-12 10:59:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [280/300][0/1251] eta 1:13:36 lr 0.000001 time 3.5307 (3.5307) loss 3.0138 (3.0138) grad_norm 0.0000 (0.0000) [2022-10-12 10:59:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [280/300][100/1251] eta 0:07:06 lr 0.000001 time 0.3366 (0.3707) loss 2.9540 (3.0922) grad_norm 0.0000 (0.0000) [2022-10-12 11:00:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [280/300][200/1251] eta 0:06:10 lr 0.000001 time 0.3741 (0.3527) loss 3.1025 (3.0886) grad_norm 0.0000 (0.0000) [2022-10-12 11:00:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [280/300][300/1251] eta 0:05:29 lr 0.000001 time 0.3497 (0.3463) loss 3.1763 (3.1025) grad_norm 0.0000 (0.0000) [2022-10-12 11:01:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [280/300][400/1251] eta 0:04:51 lr 0.000001 time 0.3405 (0.3427) loss 2.8987 
(3.1053) grad_norm 0.0000 (0.0000) [2022-10-12 11:02:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [280/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3318 (0.3409) loss 3.1841 (3.1033) grad_norm 0.0000 (0.0000) [2022-10-12 11:02:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [280/300][600/1251] eta 0:03:41 lr 0.000001 time 0.3262 (0.3395) loss 2.9375 (3.1046) grad_norm 0.0000 (0.0000) [2022-10-12 11:03:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [280/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3306 (0.3384) loss 3.0083 (3.1062) grad_norm 0.0000 (0.0000) [2022-10-12 11:03:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [280/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3259 (0.3375) loss 3.2102 (3.1021) grad_norm 0.0000 (0.0000) [2022-10-12 11:04:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [280/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3341 (0.3368) loss 2.9840 (3.1022) grad_norm 0.0000 (0.0000) [2022-10-12 11:04:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [280/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3093 (0.3364) loss 2.9336 (3.1011) grad_norm 0.0000 (0.0000) [2022-10-12 11:05:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [280/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3073 (0.3359) loss 3.2293 (3.1004) grad_norm 0.0000 (0.0000) [2022-10-12 11:05:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [280/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3316 (0.3356) loss 3.1323 (3.0982) grad_norm 0.0000 (0.0000) [2022-10-12 11:06:14 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 280 training takes 0:06:59 [2022-10-12 11:06:14 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_280 saving...... [2022-10-12 11:06:14 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_280 saved !!! 
[2022-10-12 11:06:17 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.364 (3.364) Loss 0.8871 (0.8871) Acc@1 80.957 (80.957) Acc@5 95.410 (95.410) [2022-10-12 11:06:29 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.360 Acc@5 94.822 [2022-10-12 11:06:29 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-12 11:06:29 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.40% [2022-10-12 11:06:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [281/300][0/1251] eta 1:09:13 lr 0.000001 time 3.3203 (3.3203) loss 2.9692 (2.9692) grad_norm 0.0000 (0.0000) [2022-10-12 11:07:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [281/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3256 (0.3665) loss 3.1294 (3.1328) grad_norm 0.0000 (0.0000) [2022-10-12 11:07:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [281/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3322 (0.3496) loss 2.9664 (3.1054) grad_norm 0.0000 (0.0000) [2022-10-12 11:08:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [281/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3499 (0.3437) loss 3.1384 (3.1109) grad_norm 0.0000 (0.0000) [2022-10-12 11:08:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [281/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3158 (0.3412) loss 2.9722 (3.1112) grad_norm 0.0000 (0.0000) [2022-10-12 11:09:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [281/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3200 (0.3396) loss 3.4086 (3.1117) grad_norm 0.0000 (0.0000) [2022-10-12 11:09:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [281/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3336 (0.3381) loss 3.0749 (3.1089) grad_norm 0.0000 (0.0000) [2022-10-12 11:10:26 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [281/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3376 (0.3373) loss 3.1137 (3.1079) grad_norm 0.0000 
(0.0000) [2022-10-12 11:10:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [281/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3279 (0.3364) loss 3.2102 (3.1121) grad_norm 0.0000 (0.0000) [2022-10-12 11:11:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [281/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3069 (0.3357) loss 3.0762 (3.1121) grad_norm 0.0000 (0.0000) [2022-10-12 11:12:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [281/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3393 (0.3354) loss 3.0631 (3.1097) grad_norm 0.0000 (0.0000) [2022-10-12 11:12:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [281/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3092 (0.3350) loss 3.0570 (3.1085) grad_norm 0.0000 (0.0000) [2022-10-12 11:13:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [281/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3154 (0.3349) loss 2.9591 (3.1084) grad_norm 0.0000 (0.0000) [2022-10-12 11:13:28 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 281 training takes 0:06:58 [2022-10-12 11:13:31 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.381 (3.381) Loss 0.9620 (0.9620) Acc@1 79.102 (79.102) Acc@5 94.141 (94.141) [2022-10-12 11:13:43 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.278 Acc@5 94.864 [2022-10-12 11:13:43 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-10-12 11:13:43 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.40% [2022-10-12 11:13:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [282/300][0/1251] eta 1:14:46 lr 0.000001 time 3.5867 (3.5867) loss 3.0609 (3.0609) grad_norm 0.0000 (0.0000) [2022-10-12 11:14:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [282/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3210 (0.3666) loss 2.9500 (3.0978) grad_norm 0.0000 (0.0000) [2022-10-12 11:14:54 swin_tiny_patch4_window7_224] (main.py 163): 
INFO Train: [282/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3119 (0.3497) loss 3.0687 (3.0985) grad_norm 0.0000 (0.0000) [2022-10-12 11:15:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [282/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3282 (0.3440) loss 3.1929 (3.1133) grad_norm 0.0000 (0.0000) [2022-10-12 11:16:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [282/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3556 (0.3411) loss 3.0651 (3.1094) grad_norm 0.0000 (0.0000) [2022-10-12 11:16:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [282/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3487 (0.3393) loss 2.8819 (3.1074) grad_norm 0.0000 (0.0000) [2022-10-12 11:17:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [282/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3240 (0.3379) loss 2.9842 (3.1059) grad_norm 0.0000 (0.0000) [2022-10-12 11:17:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [282/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3381 (0.3368) loss 3.1017 (3.1050) grad_norm 0.0000 (0.0000) [2022-10-12 11:18:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [282/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3279 (0.3360) loss 3.2090 (3.1059) grad_norm 0.0000 (0.0000) [2022-10-12 11:18:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [282/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3497 (0.3355) loss 3.1198 (3.1074) grad_norm 0.0000 (0.0000) [2022-10-12 11:19:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [282/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3288 (0.3351) loss 3.2852 (3.1083) grad_norm 0.0000 (0.0000) [2022-10-12 11:19:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [282/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3507 (0.3346) loss 2.9549 (3.1080) grad_norm 0.0000 (0.0000) [2022-10-12 11:20:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [282/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3122 (0.3343) loss 3.0432 
(3.1079) grad_norm 0.0000 (0.0000) [2022-10-12 11:20:41 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 282 training takes 0:06:57 [2022-10-12 11:20:44 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.264 (3.264) Loss 0.8985 (0.8985) Acc@1 81.152 (81.152) Acc@5 94.531 (94.531) [2022-10-12 11:20:57 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.324 Acc@5 94.796 [2022-10-12 11:20:57 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-10-12 11:20:57 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.40% [2022-10-12 11:21:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [283/300][0/1251] eta 1:10:01 lr 0.000001 time 3.3583 (3.3583) loss 2.8084 (2.8084) grad_norm 0.0000 (0.0000) [2022-10-12 11:21:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [283/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3498 (0.3686) loss 3.1094 (3.1010) grad_norm 0.0000 (0.0000) [2022-10-12 11:22:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [283/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3394 (0.3507) loss 3.1129 (3.1069) grad_norm 0.0000 (0.0000) [2022-10-12 11:22:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [283/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3169 (0.3454) loss 2.9086 (3.1075) grad_norm 0.0000 (0.0000) [2022-10-12 11:23:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [283/300][400/1251] eta 0:04:51 lr 0.000001 time 0.3255 (0.3423) loss 3.4784 (3.1070) grad_norm 0.0000 (0.0000) [2022-10-12 11:23:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [283/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3555 (0.3404) loss 3.0838 (3.1032) grad_norm 0.0000 (0.0000) [2022-10-12 11:24:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [283/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3583 (0.3391) loss 3.0777 (3.1007) grad_norm 0.0000 (0.0000) [2022-10-12 11:24:54 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [283/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3121 (0.3382) loss 2.9727 (3.0988) grad_norm 0.0000 (0.0000) [2022-10-12 11:25:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [283/300][800/1251] eta 0:02:32 lr 0.000001 time 0.2954 (0.3372) loss 3.1427 (3.0978) grad_norm 0.0000 (0.0000) [2022-10-12 11:26:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [283/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3320 (0.3369) loss 3.0294 (3.0949) grad_norm 0.0000 (0.0000) [2022-10-12 11:26:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [283/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3238 (0.3363) loss 3.0325 (3.0968) grad_norm 0.0000 (0.0000) [2022-10-12 11:27:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [283/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3184 (0.3358) loss 2.7354 (3.0966) grad_norm 0.0000 (0.0000) [2022-10-12 11:27:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [283/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3207 (0.3354) loss 3.1539 (3.0995) grad_norm 0.0000 (0.0000) [2022-10-12 11:27:56 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 283 training takes 0:06:59 [2022-10-12 11:27:59 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.149 (3.149) Loss 0.9348 (0.9348) Acc@1 80.273 (80.273) Acc@5 94.141 (94.141) [2022-10-12 11:28:11 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.276 Acc@5 94.778 [2022-10-12 11:28:11 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-10-12 11:28:11 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.40% [2022-10-12 11:28:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [284/300][0/1251] eta 1:14:51 lr 0.000001 time 3.5901 (3.5901) loss 3.0954 (3.0954) grad_norm 0.0000 (0.0000) [2022-10-12 11:28:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [284/300][100/1251] 
eta 0:07:02 lr 0.000001 time 0.3460 (0.3674) loss 3.2330 (3.0926) grad_norm 0.0000 (0.0000) [2022-10-12 11:29:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [284/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3087 (0.3494) loss 3.0805 (3.0965) grad_norm 0.0000 (0.0000) [2022-10-12 11:29:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [284/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3191 (0.3439) loss 3.2699 (3.1040) grad_norm 0.0000 (0.0000) [2022-10-12 11:30:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [284/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3251 (0.3417) loss 2.9132 (3.0942) grad_norm 0.0000 (0.0000) [2022-10-12 11:31:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [284/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3306 (0.3403) loss 2.9460 (3.0879) grad_norm 0.0000 (0.0000) [2022-10-12 11:31:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [284/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3419 (0.3391) loss 3.2110 (3.0909) grad_norm 0.0000 (0.0000) [2022-10-12 11:32:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [284/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3724 (0.3384) loss 3.1654 (3.0900) grad_norm 0.0000 (0.0000) [2022-10-12 11:32:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [284/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3286 (0.3378) loss 3.2228 (3.0909) grad_norm 0.0000 (0.0000) [2022-10-12 11:33:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [284/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3259 (0.3370) loss 2.9344 (3.0919) grad_norm 0.0000 (0.0000) [2022-10-12 11:33:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [284/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3413 (0.3365) loss 2.9251 (3.0939) grad_norm 0.0000 (0.0000) [2022-10-12 11:34:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [284/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3356 (0.3361) loss 3.0548 (3.0942) grad_norm 0.0000 (0.0000) 
[2022-10-12 11:34:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [284/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3294 (0.3357) loss 3.1483 (3.0935) grad_norm 0.0000 (0.0000) [2022-10-12 11:35:11 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 284 training takes 0:06:59 [2022-10-12 11:35:15 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.523 (3.523) Loss 0.9660 (0.9660) Acc@1 79.004 (79.004) Acc@5 94.141 (94.141) [2022-10-12 11:35:26 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.370 Acc@5 94.858 [2022-10-12 11:35:26 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-12 11:35:26 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.40% [2022-10-12 11:35:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [285/300][0/1251] eta 1:06:53 lr 0.000001 time 3.2079 (3.2079) loss 3.4525 (3.4525) grad_norm 0.0000 (0.0000) [2022-10-12 11:36:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [285/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3027 (0.3666) loss 3.1656 (3.0946) grad_norm 0.0000 (0.0000) [2022-10-12 11:36:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [285/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3293 (0.3503) loss 3.1723 (3.0837) grad_norm 0.0000 (0.0000) [2022-10-12 11:37:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [285/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3493 (0.3447) loss 2.8153 (3.0806) grad_norm 0.0000 (0.0000) [2022-10-12 11:37:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [285/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3548 (0.3416) loss 2.9783 (3.0856) grad_norm 0.0000 (0.0000) [2022-10-12 11:38:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [285/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3300 (0.3401) loss 3.1276 (3.0861) grad_norm 0.0000 (0.0000) [2022-10-12 11:38:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[285/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3072 (0.3387) loss 3.2973 (3.0866) grad_norm 0.0000 (0.0000) [2022-10-12 11:39:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [285/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3388 (0.3377) loss 3.2424 (3.0854) grad_norm 0.0000 (0.0000) [2022-10-12 11:39:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [285/300][800/1251] eta 0:02:32 lr 0.000001 time 0.3427 (0.3372) loss 3.1203 (3.0867) grad_norm 0.0000 (0.0000) [2022-10-12 11:40:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [285/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3490 (0.3364) loss 3.1835 (3.0864) grad_norm 0.0000 (0.0000) [2022-10-12 11:41:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [285/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3446 (0.3358) loss 3.2273 (3.0894) grad_norm 0.0000 (0.0000) [2022-10-12 11:41:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [285/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3337 (0.3353) loss 3.2319 (3.0898) grad_norm 0.0000 (0.0000) [2022-10-12 11:42:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [285/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3255 (0.3350) loss 3.3850 (3.0912) grad_norm 0.0000 (0.0000) [2022-10-12 11:42:25 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 285 training takes 0:06:58 [2022-10-12 11:42:28 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.084 (3.084) Loss 0.9225 (0.9225) Acc@1 77.734 (77.734) Acc@5 95.117 (95.117) [2022-10-12 11:42:40 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.402 Acc@5 94.854 [2022-10-12 11:42:40 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-12 11:42:40 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.40% [2022-10-12 11:42:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [286/300][0/1251] eta 1:02:14 lr 0.000001 time 2.9850 (2.9850) loss 3.1912 
(3.1912) grad_norm 0.0000 (0.0000) [2022-10-12 11:43:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [286/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3336 (0.3667) loss 2.9877 (3.1221) grad_norm 0.0000 (0.0000) [2022-10-12 11:43:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [286/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3165 (0.3496) loss 3.0375 (3.0936) grad_norm 0.0000 (0.0000) [2022-10-12 11:44:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [286/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3187 (0.3441) loss 3.3797 (3.0928) grad_norm 0.0000 (0.0000) [2022-10-12 11:44:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [286/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3262 (0.3413) loss 3.3569 (3.0932) grad_norm 0.0000 (0.0000) [2022-10-12 11:45:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [286/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3364 (0.3396) loss 3.1733 (3.0920) grad_norm 0.0000 (0.0000) [2022-10-12 11:46:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [286/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3577 (0.3386) loss 3.3640 (3.0916) grad_norm 0.0000 (0.0000) [2022-10-12 11:46:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [286/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3331 (0.3378) loss 3.3018 (3.0936) grad_norm 0.0000 (0.0000) [2022-10-12 11:47:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [286/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3253 (0.3370) loss 3.2226 (3.0903) grad_norm 0.0000 (0.0000) [2022-10-12 11:47:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [286/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3388 (0.3363) loss 3.0486 (3.0902) grad_norm 0.0000 (0.0000) [2022-10-12 11:48:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [286/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3414 (0.3359) loss 3.0714 (3.0902) grad_norm 0.0000 (0.0000) [2022-10-12 11:48:50 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [286/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3457 (0.3357) loss 2.9606 (3.0906) grad_norm 0.0000 (0.0000) [2022-10-12 11:49:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [286/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3294 (0.3354) loss 3.1015 (3.0908) grad_norm 0.0000 (0.0000) [2022-10-12 11:49:40 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 286 training takes 0:06:59 [2022-10-12 11:49:43 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.433 (3.433) Loss 0.9221 (0.9221) Acc@1 81.055 (81.055) Acc@5 94.336 (94.336) [2022-10-12 11:49:55 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.384 Acc@5 94.858 [2022-10-12 11:49:55 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-12 11:49:55 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.40% [2022-10-12 11:49:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [287/300][0/1251] eta 1:13:19 lr 0.000001 time 3.5170 (3.5170) loss 3.0369 (3.0369) grad_norm 0.0000 (0.0000) [2022-10-12 11:50:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [287/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3264 (0.3680) loss 3.3109 (3.0608) grad_norm 0.0000 (0.0000) [2022-10-12 11:51:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [287/300][200/1251] eta 0:06:09 lr 0.000001 time 0.3207 (0.3513) loss 2.8783 (3.0643) grad_norm 0.0000 (0.0000) [2022-10-12 11:51:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [287/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3116 (0.3442) loss 3.1122 (3.0755) grad_norm 0.0000 (0.0000) [2022-10-12 11:52:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [287/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3420 (0.3418) loss 3.3416 (3.0769) grad_norm 0.0000 (0.0000) [2022-10-12 11:52:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [287/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3365 
(0.3401) loss 3.1689 (3.0846) grad_norm 0.0000 (0.0000) [2022-10-12 11:53:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [287/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3095 (0.3382) loss 3.2364 (3.0838) grad_norm 0.0000 (0.0000) [2022-10-12 11:53:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [287/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3218 (0.3374) loss 3.2971 (3.0864) grad_norm 0.0000 (0.0000) [2022-10-12 11:54:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [287/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3313 (0.3368) loss 3.1104 (3.0871) grad_norm 0.0000 (0.0000) [2022-10-12 11:54:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [287/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3514 (0.3364) loss 3.0043 (3.0892) grad_norm 0.0000 (0.0000) [2022-10-12 11:55:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [287/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3171 (0.3358) loss 3.0112 (3.0900) grad_norm 0.0000 (0.0000) [2022-10-12 11:56:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [287/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3365 (0.3356) loss 2.9668 (3.0881) grad_norm 0.0000 (0.0000) [2022-10-12 11:56:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [287/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3538 (0.3354) loss 2.9193 (3.0884) grad_norm 0.0000 (0.0000) [2022-10-12 11:56:54 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 287 training takes 0:06:59 [2022-10-12 11:56:58 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.182 (3.182) Loss 0.9919 (0.9919) Acc@1 78.027 (78.027) Acc@5 94.922 (94.922) [2022-10-12 11:57:10 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.326 Acc@5 94.834 [2022-10-12 11:57:10 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-10-12 11:57:10 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.40% [2022-10-12 11:57:13 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [288/300][0/1251] eta 1:12:02 lr 0.000001 time 3.4551 (3.4551) loss 2.9636 (2.9636) grad_norm 0.0000 (0.0000) [2022-10-12 11:57:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [288/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3413 (0.3671) loss 3.1805 (3.0769) grad_norm 0.0000 (0.0000) [2022-10-12 11:58:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [288/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3039 (0.3502) loss 3.1195 (3.0846) grad_norm 0.0000 (0.0000) [2022-10-12 11:58:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [288/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3330 (0.3449) loss 3.0659 (3.0837) grad_norm 0.0000 (0.0000) [2022-10-12 11:59:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [288/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3202 (0.3418) loss 3.2135 (3.0863) grad_norm 0.0000 (0.0000) [2022-10-12 12:00:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [288/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3374 (0.3400) loss 3.1909 (3.0902) grad_norm 0.0000 (0.0000) [2022-10-12 12:00:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [288/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3343 (0.3386) loss 2.9489 (3.0929) grad_norm 0.0000 (0.0000) [2022-10-12 12:01:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [288/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3410 (0.3375) loss 3.2725 (3.0946) grad_norm 0.0000 (0.0000) [2022-10-12 12:01:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [288/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3207 (0.3369) loss 2.6122 (3.0929) grad_norm 0.0000 (0.0000) [2022-10-12 12:02:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [288/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3204 (0.3363) loss 3.1067 (3.0962) grad_norm 0.0000 (0.0000) [2022-10-12 12:02:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [288/300][1000/1251] eta 0:01:24 lr 0.000001 
time 0.3321 (0.3359) loss 2.9188 (3.0951) grad_norm 0.0000 (0.0000) [2022-10-12 12:03:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [288/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3286 (0.3358) loss 3.3219 (3.0938) grad_norm 0.0000 (0.0000) [2022-10-12 12:03:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [288/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3318 (0.3353) loss 3.0413 (3.0910) grad_norm 0.0000 (0.0000) [2022-10-12 12:04:09 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 288 training takes 0:06:59 [2022-10-12 12:04:12 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.035 (3.035) Loss 0.8376 (0.8376) Acc@1 81.738 (81.738) Acc@5 95.410 (95.410) [2022-10-12 12:04:24 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.388 Acc@5 94.836 [2022-10-12 12:04:24 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-12 12:04:24 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.40% [2022-10-12 12:04:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [289/300][0/1251] eta 1:16:03 lr 0.000001 time 3.6476 (3.6476) loss 2.6483 (2.6483) grad_norm 0.0000 (0.0000) [2022-10-12 12:05:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [289/300][100/1251] eta 0:07:03 lr 0.000001 time 0.3522 (0.3683) loss 2.8711 (3.0636) grad_norm 0.0000 (0.0000) [2022-10-12 12:05:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [289/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3630 (0.3509) loss 2.9314 (3.0873) grad_norm 0.0000 (0.0000) [2022-10-12 12:06:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [289/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3508 (0.3446) loss 3.1264 (3.0877) grad_norm 0.0000 (0.0000) [2022-10-12 12:06:41 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [289/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3100 (0.3414) loss 3.3722 (3.0869) grad_norm 0.0000 (0.0000) [2022-10-12 
12:07:14 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [289/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3521 (0.3398) loss 3.1459 (3.0856) grad_norm 0.0000 (0.0000) [2022-10-12 12:07:47 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [289/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3269 (0.3383) loss 2.9811 (3.0910) grad_norm 0.0000 (0.0000) [2022-10-12 12:08:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [289/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3496 (0.3369) loss 3.2365 (3.0890) grad_norm 0.0000 (0.0000) [2022-10-12 12:08:53 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [289/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3132 (0.3362) loss 3.0662 (3.0889) grad_norm 0.0000 (0.0000) [2022-10-12 12:09:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [289/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3235 (0.3359) loss 2.9592 (3.0906) grad_norm 0.0000 (0.0000) [2022-10-12 12:10:00 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [289/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3581 (0.3356) loss 3.0709 (3.0908) grad_norm 0.0000 (0.0000) [2022-10-12 12:10:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [289/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3233 (0.3353) loss 3.0131 (3.0921) grad_norm 0.0000 (0.0000) [2022-10-12 12:11:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [289/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3398 (0.3348) loss 3.1624 (3.0903) grad_norm 0.0000 (0.0000) [2022-10-12 12:11:23 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 289 training takes 0:06:58 [2022-10-12 12:11:26 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.232 (3.232) Loss 0.9010 (0.9010) Acc@1 81.055 (81.055) Acc@5 95.508 (95.508) [2022-10-12 12:11:38 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.362 Acc@5 94.852 [2022-10-12 12:11:38 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test 
images: 80.4% [2022-10-12 12:11:38 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.40% [2022-10-12 12:11:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [290/300][0/1251] eta 0:50:43 lr 0.000001 time 2.4325 (2.4325) loss 3.0054 (3.0054) grad_norm 0.0000 (0.0000) [2022-10-12 12:12:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [290/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3281 (0.3660) loss 3.0341 (3.0907) grad_norm 0.0000 (0.0000) [2022-10-12 12:12:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [290/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3273 (0.3484) loss 3.0266 (3.1008) grad_norm 0.0000 (0.0000) [2022-10-12 12:13:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [290/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3623 (0.3434) loss 3.0279 (3.0968) grad_norm 0.0000 (0.0000) [2022-10-12 12:13:55 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [290/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3392 (0.3410) loss 2.9489 (3.1055) grad_norm 0.0000 (0.0000) [2022-10-12 12:14:28 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [290/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3142 (0.3394) loss 3.1762 (3.1047) grad_norm 0.0000 (0.0000) [2022-10-12 12:15:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [290/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3378 (0.3382) loss 3.0915 (3.1025) grad_norm 0.0000 (0.0000) [2022-10-12 12:15:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [290/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3441 (0.3374) loss 3.1070 (3.1021) grad_norm 0.0000 (0.0000) [2022-10-12 12:16:07 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [290/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3291 (0.3363) loss 3.0219 (3.0986) grad_norm 0.0000 (0.0000) [2022-10-12 12:16:40 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [290/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3171 (0.3357) loss 3.1359 (3.0986) grad_norm 0.0000 
(0.0000) [2022-10-12 12:17:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [290/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3196 (0.3352) loss 3.0771 (3.0943) grad_norm 0.0000 (0.0000) [2022-10-12 12:17:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [290/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3347 (0.3348) loss 2.9731 (3.0933) grad_norm 0.0000 (0.0000) [2022-10-12 12:18:20 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [290/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3528 (0.3346) loss 3.0779 (3.0921) grad_norm 0.0000 (0.0000) [2022-10-12 12:18:36 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 290 training takes 0:06:58 [2022-10-12 12:18:36 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_290 saving...... [2022-10-12 12:18:37 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_290 saved !!! [2022-10-12 12:18:40 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.372 (3.372) Loss 0.8955 (0.8955) Acc@1 80.566 (80.566) Acc@5 95.215 (95.215) [2022-10-12 12:18:52 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.392 Acc@5 94.824 [2022-10-12 12:18:52 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-12 12:18:52 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.40% [2022-10-12 12:18:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [291/300][0/1251] eta 1:01:48 lr 0.000001 time 2.9646 (2.9646) loss 3.2274 (3.2274) grad_norm 0.0000 (0.0000) [2022-10-12 12:19:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [291/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3297 (0.3672) loss 3.1694 (3.0961) grad_norm 0.0000 (0.0000) [2022-10-12 12:20:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [291/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3231 (0.3497) loss 2.7628 
(3.0857) grad_norm 0.0000 (0.0000) [2022-10-12 12:20:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [291/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3076 (0.3439) loss 3.0640 (3.0910) grad_norm 0.0000 (0.0000) [2022-10-12 12:21:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [291/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3486 (0.3414) loss 3.0809 (3.0870) grad_norm 0.0000 (0.0000) [2022-10-12 12:21:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [291/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3409 (0.3395) loss 3.2191 (3.0861) grad_norm 0.0000 (0.0000) [2022-10-12 12:22:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [291/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3194 (0.3383) loss 2.7605 (3.0826) grad_norm 0.0000 (0.0000) [2022-10-12 12:22:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [291/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3249 (0.3374) loss 3.1009 (3.0858) grad_norm 0.0000 (0.0000) [2022-10-12 12:23:21 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [291/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3152 (0.3366) loss 3.0782 (3.0883) grad_norm 0.0000 (0.0000) [2022-10-12 12:23:54 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [291/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3139 (0.3360) loss 3.2250 (3.0856) grad_norm 0.0000 (0.0000) [2022-10-12 12:24:27 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [291/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3397 (0.3356) loss 3.1432 (3.0870) grad_norm 0.0000 (0.0000) [2022-10-12 12:25:01 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [291/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3241 (0.3353) loss 3.2547 (3.0877) grad_norm 0.0000 (0.0000) [2022-10-12 12:25:34 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [291/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3208 (0.3351) loss 3.1885 (3.0854) grad_norm 0.0000 (0.0000) [2022-10-12 12:25:50 swin_tiny_patch4_window7_224] (main.py 
171): INFO EPOCH 291 training takes 0:06:58 [2022-10-12 12:25:54 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.299 (3.299) Loss 0.9022 (0.9022) Acc@1 80.957 (80.957) Acc@5 95.703 (95.703) [2022-10-12 12:26:06 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.370 Acc@5 94.844 [2022-10-12 12:26:06 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-12 12:26:06 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.40% [2022-10-12 12:26:09 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [292/300][0/1251] eta 1:14:13 lr 0.000001 time 3.5598 (3.5598) loss 3.3466 (3.3466) grad_norm 0.0000 (0.0000) [2022-10-12 12:26:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [292/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3235 (0.3686) loss 3.2895 (3.0791) grad_norm 0.0000 (0.0000) [2022-10-12 12:27:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [292/300][200/1251] eta 0:06:09 lr 0.000001 time 0.3166 (0.3514) loss 2.9112 (3.0753) grad_norm 0.0000 (0.0000) [2022-10-12 12:27:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [292/300][300/1251] eta 0:05:28 lr 0.000001 time 0.3476 (0.3450) loss 3.0597 (3.0748) grad_norm 0.0000 (0.0000) [2022-10-12 12:28:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [292/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3279 (0.3415) loss 3.0213 (3.0808) grad_norm 0.0000 (0.0000) [2022-10-12 12:28:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [292/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3214 (0.3399) loss 3.0652 (3.0809) grad_norm 0.0000 (0.0000) [2022-10-12 12:29:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [292/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3312 (0.3385) loss 3.1198 (3.0817) grad_norm 0.0000 (0.0000) [2022-10-12 12:30:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [292/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3233 
(0.3373) loss 3.2624 (3.0795) grad_norm 0.0000 (0.0000) [2022-10-12 12:30:35 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [292/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3446 (0.3365) loss 3.1116 (3.0766) grad_norm 0.0000 (0.0000) [2022-10-12 12:31:08 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [292/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3693 (0.3360) loss 2.9915 (3.0768) grad_norm 0.0000 (0.0000) [2022-10-12 12:31:42 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [292/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3465 (0.3358) loss 3.1555 (3.0769) grad_norm 0.0000 (0.0000) [2022-10-12 12:32:15 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [292/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3279 (0.3353) loss 3.1734 (3.0779) grad_norm 0.0000 (0.0000) [2022-10-12 12:32:48 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [292/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3241 (0.3351) loss 3.1653 (3.0802) grad_norm 0.0000 (0.0000) [2022-10-12 12:33:05 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 292 training takes 0:06:58 [2022-10-12 12:33:08 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.244 (3.244) Loss 0.8394 (0.8394) Acc@1 81.641 (81.641) Acc@5 95.410 (95.410) [2022-10-12 12:33:20 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.358 Acc@5 94.808 [2022-10-12 12:33:20 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-12 12:33:20 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.40% [2022-10-12 12:33:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [293/300][0/1251] eta 1:13:40 lr 0.000001 time 3.5339 (3.5339) loss 3.2700 (3.2700) grad_norm 0.0000 (0.0000) [2022-10-12 12:33:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [293/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3466 (0.3691) loss 3.2662 (3.0864) grad_norm 0.0000 (0.0000) [2022-10-12 12:34:30 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [293/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3367 (0.3511) loss 3.1100 (3.0728) grad_norm 0.0000 (0.0000) [2022-10-12 12:35:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [293/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3374 (0.3446) loss 3.1848 (3.0657) grad_norm 0.0000 (0.0000) [2022-10-12 12:35:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [293/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3233 (0.3417) loss 3.0080 (3.0681) grad_norm 0.0000 (0.0000) [2022-10-12 12:36:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [293/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3243 (0.3400) loss 2.9424 (3.0704) grad_norm 0.0000 (0.0000) [2022-10-12 12:36:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [293/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3594 (0.3382) loss 3.2091 (3.0794) grad_norm 0.0000 (0.0000) [2022-10-12 12:37:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [293/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3390 (0.3373) loss 3.2379 (3.0791) grad_norm 0.0000 (0.0000) [2022-10-12 12:37:49 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [293/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3608 (0.3366) loss 3.1811 (3.0782) grad_norm 0.0000 (0.0000) [2022-10-12 12:38:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [293/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3101 (0.3362) loss 2.9226 (3.0778) grad_norm 0.0000 (0.0000) [2022-10-12 12:38:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [293/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3294 (0.3358) loss 3.0198 (3.0774) grad_norm 0.0000 (0.0000) [2022-10-12 12:39:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [293/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3681 (0.3354) loss 3.1071 (3.0784) grad_norm 0.0000 (0.0000) [2022-10-12 12:40:02 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [293/300][1200/1251] eta 0:00:17 lr 
0.000001 time 0.3249 (0.3351) loss 3.0706 (3.0795) grad_norm 0.0000 (0.0000) [2022-10-12 12:40:19 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 293 training takes 0:06:58 [2022-10-12 12:40:22 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.493 (3.493) Loss 0.8724 (0.8724) Acc@1 80.371 (80.371) Acc@5 95.508 (95.508) [2022-10-12 12:40:34 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.434 Acc@5 94.822 [2022-10-12 12:40:34 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-12 12:40:34 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.43% [2022-10-12 12:40:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [294/300][0/1251] eta 1:10:25 lr 0.000001 time 3.3780 (3.3780) loss 2.8549 (2.8549) grad_norm 0.0000 (0.0000) [2022-10-12 12:41:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [294/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3190 (0.3672) loss 2.9956 (3.0842) grad_norm 0.0000 (0.0000) [2022-10-12 12:41:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [294/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3452 (0.3505) loss 3.2514 (3.0864) grad_norm 0.0000 (0.0000) [2022-10-12 12:42:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [294/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3371 (0.3446) loss 3.0482 (3.0858) grad_norm 0.0000 (0.0000) [2022-10-12 12:42:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [294/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3315 (0.3412) loss 2.9177 (3.0890) grad_norm 0.0000 (0.0000) [2022-10-12 12:43:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [294/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3586 (0.3393) loss 2.8978 (3.0797) grad_norm 0.0000 (0.0000) [2022-10-12 12:43:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [294/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3235 (0.3381) loss 2.9746 (3.0777) grad_norm 0.0000 (0.0000) 
[2022-10-12 12:44:30 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [294/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3237 (0.3371) loss 3.2453 (3.0777) grad_norm 0.0000 (0.0000) [2022-10-12 12:45:03 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [294/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3139 (0.3363) loss 3.0246 (3.0780) grad_norm 0.0000 (0.0000) [2022-10-12 12:45:36 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [294/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3340 (0.3356) loss 3.0261 (3.0779) grad_norm 0.0000 (0.0000) [2022-10-12 12:46:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [294/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3217 (0.3352) loss 3.2733 (3.0778) grad_norm 0.0000 (0.0000) [2022-10-12 12:46:43 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [294/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3371 (0.3349) loss 3.0088 (3.0764) grad_norm 0.0000 (0.0000) [2022-10-12 12:47:16 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [294/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3301 (0.3345) loss 3.1994 (3.0784) grad_norm 0.0000 (0.0000) [2022-10-12 12:47:32 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 294 training takes 0:06:58 [2022-10-12 12:47:36 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.391 (3.391) Loss 0.9329 (0.9329) Acc@1 80.176 (80.176) Acc@5 93.945 (93.945) [2022-10-12 12:47:48 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.344 Acc@5 94.820 [2022-10-12 12:47:48 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-10-12 12:47:48 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.43% [2022-10-12 12:47:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [295/300][0/1251] eta 1:18:17 lr 0.000001 time 3.7553 (3.7553) loss 2.8192 (2.8192) grad_norm 0.0000 (0.0000) [2022-10-12 12:48:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[295/300][100/1251] eta 0:07:04 lr 0.000001 time 0.3558 (0.3690) loss 3.0027 (3.0760) grad_norm 0.0000 (0.0000) [2022-10-12 12:48:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [295/300][200/1251] eta 0:06:07 lr 0.000001 time 0.3334 (0.3501) loss 3.1367 (3.0775) grad_norm 0.0000 (0.0000) [2022-10-12 12:49:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [295/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3121 (0.3437) loss 3.1902 (3.0816) grad_norm 0.0000 (0.0000) [2022-10-12 12:50:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [295/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3325 (0.3410) loss 2.8818 (3.0865) grad_norm 0.0000 (0.0000) [2022-10-12 12:50:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [295/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3415 (0.3396) loss 3.0018 (3.0882) grad_norm 0.0000 (0.0000) [2022-10-12 12:51:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [295/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3426 (0.3380) loss 2.9228 (3.0869) grad_norm 0.0000 (0.0000) [2022-10-12 12:51:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [295/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3314 (0.3373) loss 3.0127 (3.0881) grad_norm 0.0000 (0.0000) [2022-10-12 12:52:17 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [295/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3531 (0.3364) loss 2.8978 (3.0847) grad_norm 0.0000 (0.0000) [2022-10-12 12:52:50 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [295/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3057 (0.3357) loss 2.9598 (3.0849) grad_norm 0.0000 (0.0000) [2022-10-12 12:53:23 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [295/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3250 (0.3353) loss 2.9215 (3.0828) grad_norm 0.0000 (0.0000) [2022-10-12 12:53:56 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [295/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3320 (0.3348) loss 2.9617 (3.0823) grad_norm 
0.0000 (0.0000) [2022-10-12 12:54:29 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [295/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3450 (0.3346) loss 3.0884 (3.0839) grad_norm 0.0000 (0.0000) [2022-10-12 12:54:46 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 295 training takes 0:06:58 [2022-10-12 12:54:49 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.222 (3.222) Loss 0.8915 (0.8915) Acc@1 80.957 (80.957) Acc@5 94.824 (94.824) [2022-10-12 12:55:01 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.310 Acc@5 94.778 [2022-10-12 12:55:01 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-10-12 12:55:01 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.43% [2022-10-12 12:55:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [296/300][0/1251] eta 1:12:21 lr 0.000001 time 3.4707 (3.4707) loss 3.0212 (3.0212) grad_norm 0.0000 (0.0000) [2022-10-12 12:55:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [296/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3489 (0.3660) loss 2.9188 (3.0892) grad_norm 0.0000 (0.0000) [2022-10-12 12:56:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [296/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3539 (0.3492) loss 2.8828 (3.0799) grad_norm 0.0000 (0.0000) [2022-10-12 12:56:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [296/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3068 (0.3438) loss 3.1321 (3.0809) grad_norm 0.0000 (0.0000) [2022-10-12 12:57:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [296/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3384 (0.3413) loss 3.1495 (3.0807) grad_norm 0.0000 (0.0000) [2022-10-12 12:57:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [296/300][500/1251] eta 0:04:15 lr 0.000001 time 0.3366 (0.3399) loss 2.9746 (3.0857) grad_norm 0.0000 (0.0000) [2022-10-12 12:58:25 swin_tiny_patch4_window7_224] (main.py 
163): INFO Train: [296/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3183 (0.3387) loss 3.1268 (3.0830) grad_norm 0.0000 (0.0000) [2022-10-12 12:58:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [296/300][700/1251] eta 0:03:06 lr 0.000001 time 0.3040 (0.3378) loss 3.0297 (3.0848) grad_norm 0.0000 (0.0000) [2022-10-12 12:59:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [296/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3394 (0.3370) loss 3.0864 (3.0835) grad_norm 0.0000 (0.0000) [2022-10-12 13:00:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [296/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3244 (0.3365) loss 3.1524 (3.0808) grad_norm 0.0000 (0.0000) [2022-10-12 13:00:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [296/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3271 (0.3359) loss 3.1349 (3.0817) grad_norm 0.0000 (0.0000) [2022-10-12 13:01:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [296/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3449 (0.3356) loss 3.0907 (3.0799) grad_norm 0.0000 (0.0000) [2022-10-12 13:01:44 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [296/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3512 (0.3352) loss 3.1882 (3.0807) grad_norm 0.0000 (0.0000) [2022-10-12 13:02:00 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 296 training takes 0:06:59 [2022-10-12 13:02:03 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.320 (3.320) Loss 0.8832 (0.8832) Acc@1 81.445 (81.445) Acc@5 94.629 (94.629) [2022-10-12 13:02:15 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.384 Acc@5 94.768 [2022-10-12 13:02:15 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-12 13:02:15 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.43% [2022-10-12 13:02:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [297/300][0/1251] eta 1:13:06 lr 0.000001 time 3.5065 
(3.5065) loss 2.9657 (2.9657) grad_norm 0.0000 (0.0000) [2022-10-12 13:02:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [297/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3408 (0.3672) loss 3.0923 (3.1092) grad_norm 0.0000 (0.0000) [2022-10-12 13:03:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [297/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3689 (0.3491) loss 2.9692 (3.0936) grad_norm 0.0000 (0.0000) [2022-10-12 13:03:59 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [297/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3300 (0.3439) loss 2.8595 (3.0915) grad_norm 0.0000 (0.0000) [2022-10-12 13:04:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [297/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3195 (0.3413) loss 2.9648 (3.0888) grad_norm 0.0000 (0.0000) [2022-10-12 13:05:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [297/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3407 (0.3395) loss 3.0782 (3.0852) grad_norm 0.0000 (0.0000) [2022-10-12 13:05:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [297/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3254 (0.3381) loss 3.2075 (3.0855) grad_norm 0.0000 (0.0000) [2022-10-12 13:06:11 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [297/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3008 (0.3373) loss 2.6364 (3.0845) grad_norm 0.0000 (0.0000) [2022-10-12 13:06:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [297/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3166 (0.3366) loss 3.1641 (3.0821) grad_norm 0.0000 (0.0000) [2022-10-12 13:07:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [297/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3246 (0.3360) loss 3.0456 (3.0816) grad_norm 0.0000 (0.0000) [2022-10-12 13:07:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [297/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3121 (0.3356) loss 3.0483 (3.0823) grad_norm 0.0000 (0.0000) [2022-10-12 13:08:24 
swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [297/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3295 (0.3352) loss 2.9783 (3.0818) grad_norm 0.0000 (0.0000) [2022-10-12 13:08:57 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [297/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3244 (0.3348) loss 3.0149 (3.0835) grad_norm 0.0000 (0.0000) [2022-10-12 13:09:14 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 297 training takes 0:06:58 [2022-10-12 13:09:17 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.191 (3.191) Loss 0.8953 (0.8953) Acc@1 80.176 (80.176) Acc@5 96.191 (96.191) [2022-10-12 13:09:29 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.384 Acc@5 94.844 [2022-10-12 13:09:29 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-12 13:09:29 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.43% [2022-10-12 13:09:33 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [298/300][0/1251] eta 1:12:17 lr 0.000001 time 3.4672 (3.4672) loss 3.2636 (3.2636) grad_norm 0.0000 (0.0000) [2022-10-12 13:10:06 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [298/300][100/1251] eta 0:07:02 lr 0.000001 time 0.3195 (0.3675) loss 3.0279 (3.0767) grad_norm 0.0000 (0.0000) [2022-10-12 13:10:39 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [298/300][200/1251] eta 0:06:08 lr 0.000001 time 0.3209 (0.3502) loss 3.3205 (3.0801) grad_norm 0.0000 (0.0000) [2022-10-12 13:11:13 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [298/300][300/1251] eta 0:05:27 lr 0.000001 time 0.3312 (0.3445) loss 3.2933 (3.0829) grad_norm 0.0000 (0.0000) [2022-10-12 13:11:46 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [298/300][400/1251] eta 0:04:50 lr 0.000001 time 0.3430 (0.3414) loss 3.1527 (3.0850) grad_norm 0.0000 (0.0000) [2022-10-12 13:12:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [298/300][500/1251] 
eta 0:04:14 lr 0.000001 time 0.3066 (0.3393) loss 2.9668 (3.0822) grad_norm 0.0000 (0.0000) [2022-10-12 13:12:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [298/300][600/1251] eta 0:03:39 lr 0.000001 time 0.3065 (0.3379) loss 2.9742 (3.0869) grad_norm 0.0000 (0.0000) [2022-10-12 13:13:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [298/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3217 (0.3368) loss 3.2496 (3.0826) grad_norm 0.0000 (0.0000) [2022-10-12 13:13:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [298/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3502 (0.3360) loss 2.9602 (3.0804) grad_norm 0.0000 (0.0000) [2022-10-12 13:14:31 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [298/300][900/1251] eta 0:01:57 lr 0.000001 time 0.3353 (0.3352) loss 3.1018 (3.0789) grad_norm 0.0000 (0.0000) [2022-10-12 13:15:04 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [298/300][1000/1251] eta 0:01:23 lr 0.000001 time 0.3174 (0.3346) loss 3.1923 (3.0771) grad_norm 0.0000 (0.0000) [2022-10-12 13:15:37 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [298/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3390 (0.3341) loss 3.2943 (3.0764) grad_norm 0.0000 (0.0000) [2022-10-12 13:16:10 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [298/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3050 (0.3339) loss 3.2711 (3.0780) grad_norm 0.0000 (0.0000) [2022-10-12 13:16:27 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 298 training takes 0:06:57 [2022-10-12 13:16:30 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.397 (3.397) Loss 0.9066 (0.9066) Acc@1 80.664 (80.664) Acc@5 94.824 (94.824) [2022-10-12 13:16:42 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.346 Acc@5 94.860 [2022-10-12 13:16:42 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-10-12 13:16:42 swin_tiny_patch4_window7_224] (main.py 121): INFO Max 
accuracy: 80.43% [2022-10-12 13:16:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [299/300][0/1251] eta 1:09:45 lr 0.000001 time 3.3459 (3.3459) loss 3.1599 (3.1599) grad_norm 0.0000 (0.0000) [2022-10-12 13:17:19 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [299/300][100/1251] eta 0:07:01 lr 0.000001 time 0.3169 (0.3664) loss 3.0045 (3.0882) grad_norm 0.0000 (0.0000) [2022-10-12 13:17:52 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [299/300][200/1251] eta 0:06:06 lr 0.000001 time 0.3411 (0.3491) loss 2.9634 (3.0885) grad_norm 0.0000 (0.0000) [2022-10-12 13:18:25 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [299/300][300/1251] eta 0:05:26 lr 0.000001 time 0.3315 (0.3435) loss 2.8858 (3.0798) grad_norm 0.0000 (0.0000) [2022-10-12 13:18:58 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [299/300][400/1251] eta 0:04:49 lr 0.000001 time 0.3195 (0.3404) loss 3.1983 (3.0754) grad_norm 0.0000 (0.0000) [2022-10-12 13:19:32 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [299/300][500/1251] eta 0:04:14 lr 0.000001 time 0.3481 (0.3393) loss 3.2009 (3.0746) grad_norm 0.0000 (0.0000) [2022-10-12 13:20:05 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [299/300][600/1251] eta 0:03:40 lr 0.000001 time 0.3350 (0.3383) loss 3.1027 (3.0735) grad_norm 0.0000 (0.0000) [2022-10-12 13:20:38 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [299/300][700/1251] eta 0:03:05 lr 0.000001 time 0.3184 (0.3375) loss 3.0566 (3.0731) grad_norm 0.0000 (0.0000) [2022-10-12 13:21:12 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [299/300][800/1251] eta 0:02:31 lr 0.000001 time 0.3470 (0.3370) loss 3.1179 (3.0743) grad_norm 0.0000 (0.0000) [2022-10-12 13:21:45 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [299/300][900/1251] eta 0:01:58 lr 0.000001 time 0.3668 (0.3363) loss 3.2659 (3.0748) grad_norm 0.0000 (0.0000) [2022-10-12 13:22:18 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: 
[299/300][1000/1251] eta 0:01:24 lr 0.000001 time 0.3430 (0.3357) loss 3.2511 (3.0738) grad_norm 0.0000 (0.0000) [2022-10-12 13:22:51 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [299/300][1100/1251] eta 0:00:50 lr 0.000001 time 0.3666 (0.3354) loss 2.8953 (3.0749) grad_norm 0.0000 (0.0000) [2022-10-12 13:23:24 swin_tiny_patch4_window7_224] (main.py 163): INFO Train: [299/300][1200/1251] eta 0:00:17 lr 0.000001 time 0.3143 (0.3350) loss 2.8619 (3.0743) grad_norm 0.0000 (0.0000) [2022-10-12 13:23:41 swin_tiny_patch4_window7_224] (main.py 171): INFO EPOCH 299 training takes 0:06:58 [2022-10-12 13:23:41 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_299 saving...... [2022-10-12 13:23:41 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_graph_fp32/model_299 saved !!! [2022-10-12 13:23:44 swin_tiny_patch4_window7_224] (main.py 232): INFO Test: [0/49] Time 3.190 (3.190) Loss 0.9814 (0.9814) Acc@1 79.102 (79.102) Acc@5 94.336 (94.336) [2022-10-12 13:23:56 swin_tiny_patch4_window7_224] (main.py 239): INFO * Acc@1 80.358 Acc@5 94.834 [2022-10-12 13:23:56 swin_tiny_patch4_window7_224] (main.py 119): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-12 13:23:56 swin_tiny_patch4_window7_224] (main.py 121): INFO Max accuracy: 80.43% [2022-10-12 13:23:56 swin_tiny_patch4_window7_224] (main.py 125): INFO Training time 1 day, 12:06:43