BUG (?): box-cox related `BracketError`-s downstream in sktime
#18761
Comments
@mdhaber, not sure whether the root of the problem is in The upstream change condition is fully localized within |
Oops, right. You wrote:
Do you mean that
(My guess is that you mean 1; the SciPy documentation was confused about this point before. In the stack trace I see Do you know what the optimal value of the parameter ( |
@mdhaber, no, I mean a bracket (in the sense of 2), not bounds. You can see me grappling with the confusion in this post: sktime/sktime#4770 (comment) The |
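To make the bracket-versus-bounds distinction concrete, here is a minimal sketch on a toy objective. The quadratic and the interval `(0, 1)` are illustrative assumptions, not taken from the report. A two-point `brack` only seeds a downhill bracket search in `optimize.brent`, so the returned minimizer may lie outside it, whereas `bounds` in `optimize.minimize_scalar(..., method="bounded")` constrain the result.

```python
from scipy import optimize


def objective(lmb):
    # Toy objective with its minimum at lmb = 5, well outside (0, 1).
    return (lmb - 5.0) ** 2


# brack=(0, 1) is a starting interval for the bracket search, not a constraint.
x_brack = optimize.brent(objective, brack=(0.0, 1.0))

# bounds=(0, 1) restrict the search, so the result stays inside the interval.
res = optimize.minimize_scalar(objective, bounds=(0.0, 1.0), method="bounded")
x_bounded = res.x

print(x_brack, x_bounded)
```

With a bracket, the optimizer is free to walk to the true minimum; with bounds, it stops at the edge of the allowed interval.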
OK. The change in behavior is due to #17704. Is it possible for you to post the values of the arguments that |
Do you mean the current (both result in the same error, albeit with a different traceback in |
Ultimately what I want to do locally is replicate the call to |
uh-oh, I just spotted that the array passed contains negative values, which could be upsetting the box-cox transform. Passing positive values does not cause the error. Still I wonder why it did not result in an exception, pre-1.11.0. Arguments of
import numpy as np
x = np.array([0.24835708, -0.06913215, 0.32384427, 0.76151493, -0.11707669, -0.11706848, 0.78960641, 0.38371736, -0.23473719, 0.27128002])
method = "mle"
brack = None |
Sorry, is that the call to SciPy's Yeah, with that data, |
Ok here is what I have:
import numpy as np
from scipy import stats, optimize
x = np.array([0.24835708, -0.06913215, 0.32384427, 0.76151493, -0.11707669, -0.11706848, 0.78960641, 0.38371736, -0.23473719, 0.27128002])
from scipy.stats import boxcox_llf
def _eval_mle(lmb, data):
    # function to minimize
    return -boxcox_llf(lmb, data)
optimize.brent(_eval_mle, brack=(-2, 2), args=(x,))  # 8.472135811722177 |
Well, I get |
I think it doesn't, because there are no infs or nans - only negative values. Just tested it, I think it doesn't in any of the recent |
Right, in SciPy 1.11.0 you get that. I mean previously. I'm just checking that I have the call to
lmbda = optimize.brent(_eval_mle, brack=(-2, 2), args=(x,))  # 8.472135811722177
_eval_mle(lmbda, x)  # nan
So it looks like that "optimal" |
yes, the garbage is due to negative values in the input. The reason that this was not spotted is that this just was test data in tests for interface conformance, return type and such - the resulting lambda is not tested for sensibility but only for the right type. Now, here's the prize question: why did it produce 8.47 in the first place, making it look like non-garbage? |
Negative values cause NaNs in
import numpy as np
from scipy import stats
x = np.array([0.24835708, -0.06913215, 0.32384427, 0.76151493, -0.11707669, -0.11706848, 0.78960641, 0.38371736, -0.23473719, 0.27128002])
stats.boxcox_normmax(x)
# ValueError: array must not contain infs or NaNs
# because
stats.boxcox_llf(8.47, x)  # nan |
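As a rough illustration of where the NaN comes from, the Box-Cox log-likelihood contains a `log(x)` term, which is undefined for negative inputs. The function below is a hand-written sketch of the documented formula, not SciPy's actual implementation; the name `boxcox_llf_sketch` is hypothetical.

```python
import numpy as np


def boxcox_llf_sketch(lmb, data):
    """Hand-written Box-Cox log-likelihood (illustrative, not SciPy's code)."""
    data = np.asarray(data, dtype=float)
    n = data.size
    # Suppress the "invalid value" warnings so the NaN can be inspected directly.
    with np.errstate(invalid="ignore"):
        logdata = np.log(data)  # NaN for every negative entry
        if lmb == 0:
            variance = np.var(logdata)
        else:
            variance = np.var(data**lmb / lmb)
        return (lmb - 1) * np.sum(logdata) - n / 2 * np.log(variance)


x = np.array([0.24835708, -0.06913215, 0.32384427, 0.76151493, -0.11707669,
              -0.11706848, 0.78960641, 0.38371736, -0.23473719, 0.27128002])

llf_negative = boxcox_llf_sketch(8.47, x)          # NaN: log of negative values
llf_positive = boxcox_llf_sketch(8.47, np.abs(x))  # finite once data are positive
print(llf_negative, llf_positive)
```

A single negative entry is enough to turn the whole sum into NaN, so any optimizer minimizing this objective is working with an undefined function.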
ah, yes, it does |
Haha I have so many good answers for this : ) but I'll keep it tame: because the bug reported by gh-14858 was not fixed until gh-17704. Does this resolve the issue? Thanks for the fast iteration. |
Yes, I think it does, we will have to see whether enforcing positive values fixes this in all occurrences throughout the tests. Here is my attempt at a fix: sktime/sktime#4770 Btw, some feedback from this: the error message is not very user friendly - instead of complaining about the intermediate result (there are infs and nans! but there aren't in any user-visible object), |
Yes. That could be fixed by catching and re-raising. Care to submit a PR? If not, I can do that for you. |
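A minimal sketch of the catch-and-re-raise idea, assuming a generic `fitter` callable; both `fit_lambda_with_clear_error` and the stand-in `failing_fitter` are hypothetical names for illustration and do not reflect what the eventual SciPy PR does.

```python
import numpy as np


def fit_lambda_with_clear_error(x, fitter):
    """Call fitter(x); if it fails on non-positive data, say so explicitly."""
    x = np.asarray(x, dtype=float)
    try:
        return fitter(x)
    except ValueError as err:
        if np.any(x <= 0):
            # Re-raise with a message about the user's input, keeping the cause.
            raise ValueError(
                "Box-Cox fitting requires strictly positive data."
            ) from err
        raise


def failing_fitter(x):
    # Stand-in that mimics the confusing message quoted in the thread.
    raise ValueError("array must not contain infs or NaNs")


try:
    fit_lambda_with_clear_error([0.2, -0.1, 0.3], failing_fitter)
except ValueError as err:
    message = str(err)
    cause = str(err.__cause__)
    print(message)
```

Chaining with `from err` keeps the original traceback available while the top-level message points at the actual problem in the input.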
Sure, why not: #18764 |
What happens locally? |
Locally, with two elements in
import numpy as np
from scipy import stats
x = [1, 2]
y = [0, np.nan]
stats.pearsonr(x, y)  # PearsonRResult(statistic=nan, pvalue=1.0)
as it should, if it's consistent with most other stats functions, whereas
import numpy as np
from scipy import stats
x = [1, 2, 3]
y = [0, np.nan, 1]
stats.pearsonr(x, y)  # ValueError: array must not contain infs or NaNs
With your test, there are two elements. |
Ok, @mdhaber, so can you confirm your suggestion: I should just make the negative examples length 3? |
happy for this to be closed, as this turned out not to be a bug in |
Yup, I think that's right. When the bug in SciPy was fixed, it revealed the issue in the test input. Thanks @fkiraly! |
Fixes #4769 by using `scipy`'s on-board box-cox-fitter for methods `"mle"` and `"pearsonr"` and enforcing positive values.

The reason for the failure was the use of negative values in some interface test cases, which previously would produce garbage results without `scipy` complaining. As of 1.11.0, it complains (albeit with a confusing error message that makes it hard to track down that the cause is in fact negative values in some test cases). See further discussion here: scipy/scipy#18761

This is now solved by introducing an `enforce_positive` argument to `BoxCoxTransformer`, which ensures positive values before fitting. This can be turned off to obtain the previous behaviour, but for positive values it will be the same (although with an additional `np.sign` and `np.abs` calculation that should be fast).

Also makes the following changes, while we're cleaning this up:
* The `sktime` box-cox-fitter in `BoxCoxTransformer` was not using `scipy`'s since the latter did not have settable bounds when the former was originally implemented. I have hence replaced it with `scipy`'s own box-cox-fitter, using a bounded optimizer if bounds are passed, from `scipy` 1.7.0 on where it exists in the form we need. Pre-1.7.0, I've left the old behaviour.
* improved docstring - general improvements, removed some wrong math statements, made clear what the choice of method means and when it interfaces `scipy`
* added a `"fixed"` method, which allows using the transformer with a fixed parameter - or tuning it via grid search.
* input checks on `method`

Note: the fix is behaviour changing, but imo does not require deprecation because it changes some behaviour from buggy/misleading (only in the sub-case where negative values were passed) to sensible.
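For readers unfamiliar with the bounded-optimizer route mentioned above, the following is a minimal sketch, assuming SciPy 1.7.0 or later where `stats.boxcox_normmax` accepts an `optimizer` callable. The helper name `bounded_optimizer`, the bounds `(-2, 2)`, and the use of absolute values of the thread's sample data are illustrative assumptions, not the code in the sktime PR.

```python
import numpy as np
from scipy import optimize, stats

# Sample data from the thread, made positive for illustration only.
x = np.abs(np.array([0.24835708, -0.06913215, 0.32384427, 0.76151493,
                     -0.11707669, -0.11706848, 0.78960641, 0.38371736,
                     -0.23473719, 0.27128002]))


def bounded_optimizer(fun, bounds=(-2.0, 2.0)):
    # boxcox_normmax calls this with the objective as its single argument
    # and reads the optimum from the returned result's `x` attribute.
    return optimize.minimize_scalar(fun, bounds=bounds, method="bounded")


lmbda = stats.boxcox_normmax(x, method="mle", optimizer=bounded_optimizer)
print(lmbda)
```

Because the bounds are enforced by the optimizer itself, the fitted parameter cannot wander outside the requested interval, unlike with a starting bracket.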
Describe your issue.
Since 1.11.0, the `BoxCoxTransformer` in `sktime` is failing with a `BracketError`, which ultimately comes from `optimize.brent` and `optimize.fminbound`.
We would appreciate help with diagnosing the issue, as none of the more obvious fixes have helped.
Further details:
`boxcox_normmax` in `scipy`. Replacing it with current `boxcox_normmax` and/or using custom `optimize.minimize_scalar` does not fix the error, see [BUG] fix `BoxCoxTransformer` failure after `scipy` 1.11.0 sktime/sktime#4770.
`< 1.11.0`
`(1.0, 2.0)`) does not seem to help either.
PS: the docstrings in the current `main` version were not too clear about the "should", i.e., when should `scipy` be called and when a de-novo implementation. The PR sktime/sktime#4770 also fixes that.
Reproducing Code Example
With `sktime 0.20.0` or earlier versions or current `main`, and `scipy 1.11.0`:
`sktime` bug report: sktime/sktime#4769
Error message
BracketError: The algorithm terminated without finding a valid bracket. Consider trying different initial points.
Full traceback:
SciPy/NumPy/Python version and system information