Numerical error messages: real problem or just terminal noise? #1621

Closed · fredcallaway opened this issue May 20, 2021 · 7 comments

@fredcallaway
I'm running a very simple model inferring a Gamma distribution over response times in a memory task.

When I run HMC (or NUTS) I get hundreds, maybe thousands, of "Warning: The current proposal will be rejected due to numerical error(s)." messages. As a new user coming from pymc, this was quite disconcerting. Is something terribly wrong with my model?
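For context, a minimal sketch of the kind of model I mean (the priors and the `rts` data vector here are placeholders, not my exact setup):

```julia
using Turing

# Response times assumed Gamma-distributed, with placeholder
# truncated-normal priors on the shape and scale parameters.
@model function rt_model(rt)
    α ~ truncated(Normal(2, 1), 0, Inf)   # shape
    θ ~ truncated(Normal(1, 1), 0, Inf)   # scale
    rt .~ Gamma(α, θ)
end

chain = sample(rt_model(rts), NUTS(), 1000)  # rts: vector of response times
```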

But upon looking further, it seems that these messages show up all over the place, even in one of the official tutorials (without any explanation). So this makes me think that some amount of numerical error is really not something to worry about. But then, maybe we shouldn't be filling up the user's scrollback history with these messages? For example, maybe a running tally could be kept and reported once along with the other diagnostics? Or maybe this should be a @debug-level message?

@devmotion (Member)

The messages are caused by AdvancedHMC, which unfortunately does not respect the verbosity levels of Turing. The errors can be real and indicate a problem with the model and/or the sampler settings, but they can be ignored in the initial phase when the step size is tuned: depending on the model and the initialization, too large a step size can result in a non-finite gradient of the log density.

Related: #1398, TuringLang/AdvancedHMC.jl#217, #1493

@fredcallaway (Author)

Thanks! It sounds like the AdvancedHMC devs don't want to fix this, so maybe the solution is (as @xukai92 suggested in TuringLang/AdvancedHMC.jl#217) to wrap the logging messages in Turing? Ideally Turing would run some diagnostics to figure out whether the errors are safe to ignore (I have no idea whether they're happening in the initial tuning phase or not) and print a warning message if they aren't.

If one of the first things a user encounters when using a library is a huge wall of cryptic warnings, they are less likely to continue using that library. So there is a very real cost here, beyond just the inconvenience of having to wrap your sample calls in a Logging.with_logger call.
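For reference, the workaround I mean looks roughly like this (a sketch using the Logging stdlib; note it raises the minimum level to Error for the whole call, so it hides all warnings, not just these ones):

```julia
using Logging

# Run sampling under a logger that only lets Error and above through,
# suppressing the repeated numerical-error warnings in the process.
chain = Logging.with_logger(SimpleLogger(stderr, Logging.Error)) do
    sample(model, NUTS(), 1000)
end
```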

@devmotion (Member)

If one of the first things a user encounters when using a library is a huge wall of cryptic warnings, they are less likely to continue using that library.

In defense of Turing and AdvancedHMC: in many cases these warnings do not show up, and they are not always noise; they can help identify problems in your model, initialization, or sampling hyperparameters. So I'm not sure the correct approach is to remove these warnings completely by default; I am worried that it might be just as annoying for users if they do not realize that there are problems, or why there are problems. Instead, maybe it would be helpful to

  • explain more clearly in the message what these warnings indicate, what causes them, when they can be ignored, and how to disable them (maybe with the longer explanation shown only once, similar to deprecation warnings)
  • by default, ignore the warnings while the step size is being tuned

I assume it would be useful to address both points in AdvancedHMC, since I imagine this could also be helpful when one uses AdvancedHMC directly, without Turing. What do you think, @xukai92?

@xukai92 (Member) commented May 23, 2021

explain more clearly in the message what these warnings indicate, what causes them, when they can be ignored, and how to disable them (maybe the longer explanation only shown once, similar to deprecation warnings)

I'm worried that explaining it thoroughly in the warning itself could be hard. How about we add a page to our docs explaining this, and link to it from the warning message? That way we can give more detailed guidance.

by default ignore the warnings when the step size is tuned

That's a good idea.

@ElOceanografo (Contributor)

Silencing warnings by default during tuning is a good idea. The running tally and summary message is also a good idea; R prints out "There were 50 or more warnings (use warnings() to see the first 50)" in similar situations.

In practice, I totally ignore these warnings when there are just a few of them, but if there's a cascading wall of text while sampling, I take it to mean there's something wrong with my model. Maybe capture them during sampling, then on completion print out a message like

Warning: 666 proposed samples were rejected due to numerical errors (80% of total).
Too many numerical errors may reflect problems in the model and/or sampler
(increasing the number of tuning samples or reducing step size may help).
Type xxxxx for details, or yyyyy to disable future warnings.
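A rough sketch of how the capture-and-summarize idea could work with the Logging stdlib (`TallyLogger` is a hypothetical name, and matching on the warning text is a stand-in for however AdvancedHMC would actually flag these):

```julia
using Logging

# Hypothetical logger that swallows the numerical-error warnings,
# counting them instead of printing, and forwards everything else.
mutable struct TallyLogger <: AbstractLogger
    inner::AbstractLogger
    count::Int
end
TallyLogger(inner::AbstractLogger) = TallyLogger(inner, 0)

Logging.min_enabled_level(l::TallyLogger) = Logging.min_enabled_level(l.inner)
Logging.shouldlog(l::TallyLogger, args...) = Logging.shouldlog(l.inner, args...)
Logging.catch_exceptions(l::TallyLogger) = Logging.catch_exceptions(l.inner)

function Logging.handle_message(l::TallyLogger, level, msg, args...; kwargs...)
    if level == Logging.Warn && occursin("numerical error", string(msg))
        l.count += 1   # tally instead of printing
    else
        Logging.handle_message(l.inner, level, msg, args...; kwargs...)
    end
end

# Usage: sample under the tallying logger, then report once at the end.
logger = TallyLogger(global_logger())
chain = with_logger(logger) do
    sample(model, NUTS(), 1000)
end
logger.count > 0 &&
    @warn "$(logger.count) proposed samples were rejected due to numerical errors."
```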

@ParadaCarleton (Member) commented Jul 17, 2022

Warning: 666 proposed samples were rejected due to numerical errors (80% of total).
Too many numerical errors may reflect problems in the model and/or sampler
(increasing the number of tuning samples or reducing step size may help).
Type xxxxx for details, or yyyyy to disable future warnings.

I think this is a great message. It would also be helpful to break down the errors by whether they occurred during tuning or during sampling.

@yebai (Member) commented Nov 13, 2022

Duplicate of #1891

@yebai yebai marked this as a duplicate of #1891 Nov 13, 2022
@yebai yebai closed this as completed Nov 13, 2022