Predicted boxes or scores contain Inf/NaN. Training has diverged #4

Closed
henanwg opened this issue Oct 15, 2021 · 6 comments

Comments

henanwg commented Oct 15, 2021

When I tried to run the training code, I got the following error:
image
Before that, my steps were:

  • Following the README, installed torch, detectron, and the other dependencies in a conda environment

  • Data preparation: used the VOC data described in the README
    image

  • Fixed runtime errors:
    1. In run_*.sh, made this change:
    image
    2. Made the change shown in the picture (see https://giters.com/fanq15/FewX/issues/40?amp=1):
    image

  • Extracted part of the training code and tried to start training
    image

Then I hit the error: Predicted boxes or scores contain Inf/NaN. Training has diverged. I got the same error on two different servers (CUDA 10.2 and CUDA 11.1, both with torch 1.8).

I tried changing the learning-rate settings in the configuration file (values ranging from 0.00005 to 0.95), but it had no effect.
image
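
A rough sketch of one way to narrow this down (not part of the repository's code): record the loss at each iteration and find where it first stops being finite. If that happens almost immediately no matter which learning rate is set, that may point away from the optimizer settings and toward the environment or the data. The function name `first_non_finite` and the sample loss values are made up for illustration.

```python
import math


def first_non_finite(values):
    """Return the index of the first Inf/NaN entry in `values`, or None if all are finite."""
    for i, v in enumerate(values):
        if not math.isfinite(v):
            return i
    return None


# Hypothetical per-iteration loss values, for illustration only.
losses = [2.31, 1.87, 5.4e3, float("inf"), float("nan")]
print(first_non_finite(losses))  # index of the first non-finite loss
```

In a real run the list would be filled from the training log; the sketch only shows the finiteness check itself.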

henanwg commented Oct 15, 2021

It may be a problem with the detectron version. I will try detectron 0.3 next.

henanwg commented Oct 15, 2021

Problem solved.

00why00 commented Oct 21, 2021

> problem solved

I am facing the same problem as you. Can you tell me how you solved it?

henanwg commented Oct 24, 2021

I used detectron 0.3; the default version is 0.5.
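
Since the fix comes down to which version is installed, a quick check of the active environment can help before retraining. The following is a minimal sketch using only the standard library. It assumes the package is installed under the distribution name `detectron2`, which this thread does not confirm, and the helper names are hypothetical.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def matches_release(version, wanted):
    """True if `version` is the wanted release, ignoring local tags such as '+cu102'."""
    if version is None:
        return False
    public = version.split("+", 1)[0]
    return public == wanted or public.startswith(wanted + ".")


if __name__ == "__main__":
    # "detectron2" is an assumed distribution name; adjust it to match your install.
    found = installed_version("detectron2")
    print("installed:", found, "| matches 0.3:", matches_release(found, "0.3"))
```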

@all-for-code

> using detectron0.3, the default version is 0.5

How do I install detectron 0.3 on a server with CUDA 11.1?

Ellohiye commented Jul 5, 2023

> using detectron0.3, the default version is 0.5
>
> How to install detectron0.3 when I use the server with cuda11.1

I ran into this problem too! How did you solve it?
