This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

Loss does not decrease #21

Closed
GuoQuanhao opened this issue Dec 7, 2021 · 1 comment

Comments

@GuoQuanhao

I am training on 4x 1080 Ti GPUs with this command:

python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --model xcit_nano_12_p8 --batch-size 64 --drop-path 0.05 --output_dir ./experiments/xcit_nano_12_p8/ --epochs 100

This is the training log I get; the loss does not decrease:

Epoch: [1]  [  0/390]  eta: 0:27:35  lr: 0.000001  loss: 6.8866 (6.8866)  time: 4.2456  data: 2.6132  max mem: 6128
Epoch: [1]  [ 10/390]  eta: 0:07:40  lr: 0.000001  loss: 6.9260 (6.9281)  time: 1.2121  data: 0.2377  max mem: 6128
Epoch: [1]  [ 20/390]  eta: 0:06:37  lr: 0.000001  loss: 6.9244 (6.9252)  time: 0.9165  data: 0.0001  max mem: 6128
Epoch: [1]  [ 30/390]  eta: 0:06:09  lr: 0.000001  loss: 6.9226 (6.9256)  time: 0.9227  data: 0.0001  max mem: 6128
Epoch: [1]  [ 40/390]  eta: 0:05:49  lr: 0.000001  loss: 6.9321 (6.9277)  time: 0.9192  data: 0.0001  max mem: 6128
Epoch: [1]  [ 50/390]  eta: 0:05:35  lr: 0.000001  loss: 6.9378 (6.9297)  time: 0.9253  data: 0.0001  max mem: 6128
Epoch: [1]  [ 60/390]  eta: 0:05:21  lr: 0.000001  loss: 6.9406 (6.9320)  time: 0.9267  data: 0.0001  max mem: 6128
Epoch: [1]  [ 70/390]  eta: 0:05:10  lr: 0.000001  loss: 6.9406 (6.9321)  time: 0.9265  data: 0.0001  max mem: 6128
Epoch: [1]  [ 80/390]  eta: 0:04:58  lr: 0.000001  loss: 6.9337 (6.9325)  time: 0.9315  data: 0.0001  max mem: 6128
Epoch: [1]  [ 90/390]  eta: 0:04:48  lr: 0.000001  loss: 6.9340 (6.9328)  time: 0.9339  data: 0.0001  max mem: 6128
Epoch: [1]  [100/390]  eta: 0:04:38  lr: 0.000001  loss: 6.9246 (6.9323)  time: 0.9438  data: 0.0001  max mem: 6128
Epoch: [1]  [110/390]  eta: 0:04:27  lr: 0.000001  loss: 6.9255 (6.9317)  time: 0.9371  data: 0.0001  max mem: 6128
Epoch: [1]  [120/390]  eta: 0:04:17  lr: 0.000001  loss: 6.9293 (6.9317)  time: 0.9224  data: 0.0001  max mem: 6128
Epoch: [1]  [130/390]  eta: 0:04:07  lr: 0.000001  loss: 6.9322 (6.9319)  time: 0.9176  data: 0.0001  max mem: 6128
Epoch: [1]  [140/390]  eta: 0:03:57  lr: 0.000001  loss: 6.9306 (6.9320)  time: 0.9286  data: 0.0001  max mem: 6128
Epoch: [1]  [150/390]  eta: 0:03:47  lr: 0.000001  loss: 6.9294 (6.9319)  time: 0.9332  data: 0.0001  max mem: 6128
Epoch: [1]  [160/390]  eta: 0:03:38  lr: 0.000001  loss: 6.9265 (6.9313)  time: 0.9317  data: 0.0001  max mem: 6128
Epoch: [1]  [170/390]  eta: 0:03:28  lr: 0.000001  loss: 6.9265 (6.9318)  time: 0.9253  data: 0.0001  max mem: 6128
Epoch: [1]  [180/390]  eta: 0:03:18  lr: 0.000001  loss: 6.9412 (6.9319)  time: 0.9212  data: 0.0001  max mem: 6128
Epoch: [1]  [190/390]  eta: 0:03:08  lr: 0.000001  loss: 6.9273 (6.9319)  time: 0.9292  data: 0.0001  max mem: 6128
Epoch: [1]  [200/390]  eta: 0:02:59  lr: 0.000001  loss: 6.9255 (6.9318)  time: 0.9226  data: 0.0001  max mem: 6128
Epoch: [1]  [210/390]  eta: 0:02:49  lr: 0.000001  loss: 6.9305 (6.9317)  time: 0.9275  data: 0.0001  max mem: 6128
Epoch: [1]  [220/390]  eta: 0:02:40  lr: 0.000001  loss: 6.9295 (6.9314)  time: 0.9360  data: 0.0001  max mem: 6128
Epoch: [1]  [230/390]  eta: 0:02:30  lr: 0.000001  loss: 6.9290 (6.9312)  time: 0.9446  data: 0.0001  max mem: 6128
Epoch: [1]  [240/390]  eta: 0:02:21  lr: 0.000001  loss: 6.9229 (6.9305)  time: 0.9421  data: 0.0001  max mem: 6128
Epoch: [1]  [250/390]  eta: 0:02:11  lr: 0.000001  loss: 6.9263 (6.9310)  time: 0.9283  data: 0.0001  max mem: 6128
Epoch: [1]  [260/390]  eta: 0:02:02  lr: 0.000001  loss: 6.9225 (6.9305)  time: 0.9232  data: 0.0001  max mem: 6128
Epoch: [1]  [270/390]  eta: 0:01:52  lr: 0.000001  loss: 6.9225 (6.9307)  time: 0.9220  data: 0.0001  max mem: 6128
Epoch: [1]  [280/390]  eta: 0:01:43  lr: 0.000001  loss: 6.9359 (6.9309)  time: 0.9205  data: 0.0001  max mem: 6128
Epoch: [1]  [290/390]  eta: 0:01:33  lr: 0.000001  loss: 6.9323 (6.9307)  time: 0.9232  data: 0.0001  max mem: 6128
Epoch: [1]  [300/390]  eta: 0:01:24  lr: 0.000001  loss: 6.9245 (6.9304)  time: 0.9327  data: 0.0001  max mem: 6128
Epoch: [1]  [310/390]  eta: 0:01:15  lr: 0.000001  loss: 6.9237 (6.9304)  time: 0.9280  data: 0.0001  max mem: 6128
Epoch: [1]  [320/390]  eta: 0:01:05  lr: 0.000001  loss: 6.9333 (6.9307)  time: 0.9234  data: 0.0001  max mem: 6128
Epoch: [1]  [330/390]  eta: 0:00:56  lr: 0.000001  loss: 6.9372 (6.9308)  time: 0.9362  data: 0.0001  max mem: 6128
Epoch: [1]  [340/390]  eta: 0:00:46  lr: 0.000001  loss: 6.9314 (6.9306)  time: 0.9338  data: 0.0001  max mem: 6128
Epoch: [1]  [350/390]  eta: 0:00:37  lr: 0.000001  loss: 6.9309 (6.9308)  time: 0.9319  data: 0.0001  max mem: 6128
Epoch: [1]  [360/390]  eta: 0:00:28  lr: 0.000001  loss: 6.9258 (6.9307)  time: 0.9357  data: 0.0001  max mem: 6128
Epoch: [1]  [370/390]  eta: 0:00:18  lr: 0.000001  loss: 6.9239 (6.9303)  time: 0.9330  data: 0.0002  max mem: 6128
Epoch: [1]  [380/390]  eta: 0:00:09  lr: 0.000001  loss: 6.9205 (6.9301)  time: 0.9424  data: 0.0001  max mem: 6128
Epoch: [1]  [389/390]  eta: 0:00:00  lr: 0.000001  loss: 6.9206 (6.9300)  time: 0.9389  data: 0.0001  max mem: 6128
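
For anyone reading a log like the one above, a quick sanity check is to compare the running average loss with the chance-level cross-entropy, ln(C) for C classes. The sketch below is only illustrative: the class count of 1000 is an assumption (the issue does not say which dataset is used), and `parse_log` and `chance_level_loss` are hypothetical helper names, not part of the repository. With that assumption, ln(1000) ≈ 6.908, so an average around 6.93 would mean the model is still predicting roughly at chance, which is not surprising while the logged learning rate is 0.000001. Regularization such as label smoothing or mixup, if enabled, could also keep the reported value slightly above ln(C).

```python
import math
import re

# Two lines copied verbatim from the log above; paste more lines as needed.
LOG = """\
Epoch: [1]  [  0/390]  eta: 0:27:35  lr: 0.000001  loss: 6.8866 (6.8866)  time: 4.2456  data: 2.6132  max mem: 6128
Epoch: [1]  [389/390]  eta: 0:00:00  lr: 0.000001  loss: 6.9206 (6.9300)  time: 0.9389  data: 0.0001  max mem: 6128
"""

# Captures the learning rate, the current loss, and the running average in parentheses.
LINE_RE = re.compile(
    r"lr: (?P<lr>[\d.eE+-]+)\s+loss: (?P<loss>[\d.]+) \((?P<avg>[\d.]+)\)"
)


def parse_log(text):
    """Return one (lr, loss, running_avg) tuple per matching log line."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m:
            rows.append((float(m["lr"]), float(m["loss"]), float(m["avg"])))
    return rows


def chance_level_loss(num_classes):
    """Cross-entropy of a uniform prediction over num_classes classes."""
    return math.log(num_classes)


NUM_CLASSES = 1000  # assumption: the issue does not state the number of classes

rows = parse_log(LOG)
baseline = chance_level_loss(NUM_CLASSES)
for lr, loss, avg in rows:
    # A gap near zero means the model is still close to uniform guessing.
    print(f"lr={lr:.0e}  loss={loss:.4f}  avg={avg:.4f}  avg - ln(C) = {avg - baseline:+.4f}")
```

If the gap stays near zero after the learning rate has risen well above its initial value, the data pipeline and labels would be the next things to check.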
@GuoQuanhao
Author

It works now.
