
UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. #283

Closed
1 of 2 tasks
jiyuwangbupt opened this issue Jan 12, 2023 · 17 comments
Labels
bug Something isn't working non-reproducible Bug is not reproducible Stale wontfix This will not be worked on

Comments

@jiyuwangbupt

jiyuwangbupt commented Jan 12, 2023

Search before asking

  • I have searched the YOLOv8 issues and found no similar bug report.

YOLOv8 Component

No response

Bug

UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule.

class_loss is inf, box_loss is 0, dfl_loss is 0.

By the way, this same environment runs YOLOv5 fine without any warnings. The YOLOv8 environment was built on top of the YOLOv5 environment by running pip install ultralytics (no errors were produced during installation).
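
For reference, this is the call order the PyTorch warning refers to. A minimal, generic training-loop sketch with a placeholder model and dummy data (not YOLOv8 code):

```python
import torch
import torch.nn as nn

# Minimal sketch of the call order the warning is about (placeholder model/data).
model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.9)

for epoch in range(3):
    for _ in range(5):  # dummy batches
        optimizer.zero_grad()
        loss = model(torch.randn(4, 10)).sum()
        loss.backward()
        optimizer.step()   # optimizer step first...
    scheduler.step()       # ...then the scheduler, once per epoch
```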

Environment

Ultralytics YOLOv8.0.3 🚀 Python-3.8.5 torch-1.11.0+cu102 CUDA:0 (Tesla P40, 12288MiB)

Minimal Reproducible Example

No response

Additional

No response

Are you willing to submit a PR?

  • Yes I'd like to help by submitting a PR!
@Laughing-q
Member

Laughing-q commented Jan 12, 2023

@jiyuwangbupt hi, is the loss you mentioned the val loss or the train loss?

EDIT: if it's the val loss, then this should be solved by #279, which we'll merge into main later today.

@Laughing-q
Member

@jiyuwangbupt also, I actually reproduced the UserWarning you mentioned with the code from PR #279, and my training and val losses look good.

@mehran66
Contributor

I see the same warning when running training on COCO:
nohup yolo task=detect mode=train model=yolov8x.pt epochs=300 batch=64 cache=True workers=20 name='pt_od_4' device='0,1,2,3,4,5,6,7' > ./runs/pt_od_4.out &

@jiyuwangbupt
Author

jiyuwangbupt commented Jan 13, 2023

@jiyuwangbupt hi, is the loss you mentioned the val loss or the train loss?

EDIT: if it's the val loss, then this should be solved by #279, which we'll merge into main later today.

training loss
[screenshot of training losses]

yolo task=init --config-name helmethyp.yaml --config-path /nfs/volume-622-1/lanzhixiong/project/smoking/code/yolov8/
yolo task=detect mode=train model=yolov8n.yaml data=/nfs/volume-622-1/lanzhixiong/project/smoking/code/yolov8/helmet640.yaml device=0 batch=20 workers=0 --config-name=helmethyp.yaml --config-path=/nfs/volume-622-1/lanzhixiong/project/smoking/code/yolov8

@Laughing-q
Member

Laughing-q commented Jan 13, 2023

@mehran66 @jiyuwangbupt hey guys, can you try replacing the following line with self.optimizer.step() and restart training to check whether the losses look good? Thanks!

self.scaler.step(self.optimizer)
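
For context, under AMP scaler.step(optimizer) silently skips the underlying optimizer.step() when the scaled gradients contain inf/NaN, which is exactly what makes PyTorch emit this warning once the scheduler steps afterwards. A minimal, generic AMP loop with a placeholder model and dummy data (not the Ultralytics trainer; requires a CUDA device):

```python
import torch

model = torch.nn.Linear(10, 2).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.9)
scaler = torch.cuda.amp.GradScaler()

for epoch in range(3):
    for _ in range(5):  # dummy batches
        optimizer.zero_grad()
        with torch.cuda.amp.autocast():
            loss = model(torch.randn(4, 10, device="cuda")).sum()
        scaler.scale(loss).backward()
        # scaler.step() calls optimizer.step() internally, but skips it when the
        # unscaled gradients contain inf/NaN; a skipped step is what triggers the
        # "lr_scheduler.step() before optimizer.step()" warning.
        scaler.step(optimizer)
        scaler.update()
    scheduler.step()  # per-epoch LR update
```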

@RuoLv

RuoLv commented Jan 19, 2023

@mehran66 @jiyuwangbupt hey guys, can you try replacing the following line with self.optimizer.step() and restart training to check whether the losses look good? Thanks!

self.scaler.step(self.optimizer)

It still does not work for me. I have the same issue. I'm using a dataset downloaded directly from Roboflow, whose structure is a bit different from previous YOLO versions.

My env:
(yolo) C:\Users\Yewen\Desktop\workspace\yolo\rok>yolo checks
Ultralytics YOLOv8.0.11 Python-3.10.9 torch-1.13.1+cu116 CUDA:0 (NVIDIA GeForce RTX 2070 Super, 8192MiB)
Setup complete (16 CPUs, 31.8 GB RAM, 141.6/934.7 GB disk)

@Haibara-z

I have the same problem, and it still does not work.

@GraBerry

I have the same problem, but it looks harmless. I don't know what impact it will have; please let me know when you get a result.

@glenn-jocher
Member

@GraBerry @Zhu000 this appears sometimes in certain versions of torch, but it's just a warning and you can ignore it. It usually does not show up in later versions of torch; e.g. you can see that our Colab notebook training with torch==1.13.1 does not show the warning, and torch==2.0.0 doesn't either:
https://colab.research.google.com/github/ultralytics/ultralytics/blob/main/examples/tutorial.ipynb

[screenshot of the Colab notebook training output]

@glenn-jocher glenn-jocher added wontfix This will not be worked on non-reproducible Bug is not reproducible labels Mar 26, 2023
@AdamMayor2018

Tried torch 1.13; it shows the exact same warning.

@glenn-jocher
Member

@AdamMayor2018 @Zhu000 This warning can be safely ignored as it does not affect the training process. It appears with some versions of PyTorch, but it typically does not show up with later versions like torch==1.13.1 or torch==2.0.0. So, if you have the latest version of PyTorch, you should not see this warning.

@YIN319

YIN319 commented May 10, 2023

I have the same problem; try this: set amp: False in ultralytics\yolo\cfg\default.yaml.

@glenn-jocher
Member

@YIN319 thank you for reporting your issue. The issue you are experiencing is related to the order of the lr_scheduler and optimizer calls, which can show up as a warning in some PyTorch versions. However, this warning does not affect the training process and can be safely ignored.

One possible workaround to suppress the warning is to set amp: False in your ultralytics\yolo\cfg\default.yaml file.
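
A minimal sketch of that workaround via the Python API, assuming amp is accepted as a train-time override of the corresponding default.yaml key (the model, dataset, and epoch count below are placeholders):

```python
from ultralytics import YOLO

# Sketch: disable AMP for training to avoid the scheduler warning.
# Assumes `amp` can be passed as a train-time override of the default.yaml key;
# alternatively, set `amp: False` directly in ultralytics\yolo\cfg\default.yaml.
model = YOLO("yolov8n.pt")                              # placeholder model
model.train(data="coco128.yaml", epochs=3, amp=False)   # placeholder dataset/epochs
```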

Please let us know if you have any further questions or concerns.

@apiszcz

apiszcz commented May 19, 2023

+1 same warnings

@glenn-jocher
Member

@apiszcz thank you for raising this issue. It appears that you are experiencing the lr_scheduler warning in PyTorch. This warning can safely be ignored as it does not affect the training process.

However, if the warning is bothersome, one possible workaround is setting amp: False in your ultralytics\yolo\cfg\default.yaml file to suppress the warning.

Please let us know if you have any further questions or concerns related to this issue, and we will be happy to assist you further.

@github-actions

github-actions bot commented Aug 9, 2023

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

@github-actions github-actions bot added the Stale label Aug 9, 2023
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Aug 20, 2023

10 participants