[BUG] `StatsForecastAutoArima` behaves differently when using sktime's `evaluate` vs. `temporal_train_test_split` #5894
Comments
Imo a bug, clearly - but not so clear where it is located. What is odd is that the bug depends on which estimator is used, i.e., the condition seems to be a condition of the estimator. |
a wild guess: it could be that sth produces incorrect indices internally - either the fh, or the adapter estimator. Then, loc or integer based indexing might retrieve - from the data - or request - from a prediction - the wrong values. FYI @yarnabrina, any better or other guess? |
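To make the guess above concrete: if some internal step produces the wrong indices, label-based (`loc`) and position-based (`iloc`) indexing will silently retrieve different rows from the exogenous data, without raising any error. A minimal pandas illustration (values and indices invented):

```python
import pandas as pd

# exogenous data X indexed by absolute time labels
X = pd.DataFrame({"x": [10.0, 20.0, 30.0, 40.0]}, index=[100, 101, 102, 103])

fh_labels = [102, 103]    # correct absolute horizon labels
wrong_positions = [0, 1]  # what an off-by-cutoff positional lookup might use

by_label = X.loc[fh_labels, "x"].tolist()
by_position = X.iloc[wrong_positions]["x"].tolist()

assert by_label == [30.0, 40.0]
assert by_position == [10.0, 20.0]  # wrong rows retrieved, but no error raised
```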
I don't use `evaluate` myself, so I'm not very familiar with its codebase, but if direct use is working, it's unlikely to be in the adapter. I don't have any guess to debug, but I'd try to see if it's statsforecast specific or if it is happening with a few other estimators as well, possibly ones with some common tag, etc.
I use CV of course, so it's possibly affecting my work as well. I'll take a detailed look tomorrow to see if I can find any pattern or any other guesses.
@yarnabrina,
I'd like to work on this issue! I've found the cause of the problem, will make a fix and PR tomorrow perhaps.
nice!
@Abhay-Lejith, may I kindly ask where we left off with #6029? Did you abandon working on this? Which is fine if it were the case, just want to understand the status.
Yes, I am no longer working on that issue.
did we understand the "deeper reason" for this? It looks like you had a fix, but it broke other parts of the codebase.
Sorry, I did not investigate the matter further. The fix was causing other errors because I changed the splitting logic in the `evaluate` method, and this seemed to be causing problems elsewhere. Since you had told me the issue was a lot deeper and harder to fix, I did not look into it after that.
Can you explain perhaps why you changed things in `evaluate`?
I modified how the splitting is done in `evaluate`, so that the test data passed to the forecaster coincides with the forecasting horizon.
Ahhhh, yes, thanks for explaining it again to me. Summarizing for the next person looking at this: the problem is that in `evaluate`, the exogenous data `X` passed at prediction time covers the whole test window, which need not coincide with the forecasting horizon, and the statsforecast adapter uses all the rows it receives. There are two approaches to fix this:
1. subset `X` to the forecasting horizon before it reaches the wrapped model, in the adapter (or base class)
2. change the splitting logic in `evaluate` so that the test `X` coincides exactly with the forecasting horizon
@Abhay-Lejith took approach no. 2: changing the split to be exactly the horizon (as done manually by @Abhay-Lejith upon correct diagnosis in the PR #6029) fixes this case, but broke other parts of the codebase. I think that's where our investigation got stuck, i.e., why those specific other places in the code break - and that would be the entry point for further work on approach no. 2.
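A toy sketch of what approach no. 2 amounts to, assuming a simple expanding-window setup (function and parameter names are invented for illustration, not sktime's actual splitter API): each fold's test indices are exactly `cutoff + fh`, rather than a possibly wider contiguous test window.

```python
import pandas as pd

def split_aligned_with_fh(y_index, initial_window, fh, step=1):
    """Yield (train_idx, test_idx) folds where test_idx == cutoff + fh."""
    cutoff = y_index[initial_window - 1]
    while cutoff + max(fh) <= y_index[-1]:
        train_idx = [i for i in y_index if i <= cutoff]
        test_idx = [cutoff + h for h in fh]  # exactly the horizon, no extras
        yield train_idx, test_idx
        cutoff += step

y_index = pd.RangeIndex(10)
folds = list(split_aligned_with_fh(y_index, initial_window=6, fh=[1, 3]))

# first fold: train up to cutoff 5, test exactly at 6 and 8 (not 6, 7, 8)
assert folds[0] == (list(range(6)), [6, 8])
```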
@yarnabrina, for approach no. 1, we would have to add subsetting in the adapter. |
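A minimal sketch of approach no. 1, assuming the adapter receives the full test-window `X`: subset `X` to exactly the horizon's absolute index before handing it to the wrapped statsforecast model. The helper name and integer-index setup are invented for illustration.

```python
import pandas as pd

def subset_X_to_horizon(X, cutoff, fh):
    """Return only the rows of X at cutoff + fh (hypothetical helper)."""
    horizon_index = pd.Index([cutoff + h for h in fh])
    return X.loc[horizon_index]

# X as produced by a CV split: a test window wider than the horizon
X = pd.DataFrame({"x": range(8)}, index=pd.RangeIndex(10, 18))
X_pred = subset_X_to_horizon(X, cutoff=12, fh=[1, 2, 3])

assert list(X_pred.index) == [13, 14, 15]  # only the horizon rows survive
```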
It's a bit difficult to specifically filter with the horizon there. However, I am wondering whether it would make sense to do this subsetting in the base class itself or not, in case other forecasters also may be doing such things but it was never captured? Maybe conditional on a tag? Also, after I make the change in the adapter, how do we test if it's fixed or not? Should we add a test for statsforecast models or for `evaluate`? |
@yarnabrina, my attempt to do this in the base class, conditional on a tag, is here: #6044. It is incomplete, but you are welcome to have a look at it. Indeed, I am also suspecting that this may be happening elsewhere too, though it is hard to check, since it impacts primarily logic and is not captured by type checks.
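A rough sketch of the tag-conditional base-class idea, with invented tag and class names (not sktime's actual internals): a boolean tag decides whether the base class trims `X` to the horizon labels before delegating to the concrete forecaster.

```python
import pandas as pd

class BaseForecasterSketch:
    _tags = {"X-needs-fh-subsetting": False}

    def predict(self, fh, X=None):
        # trim X to the horizon only for forecasters that opt in via the tag
        if X is not None and self._tags.get("X-needs-fh-subsetting", False):
            X = X.loc[pd.Index(fh)]
        return self._predict(fh, X)

class StatsForecastLikeSketch(BaseForecasterSketch):
    _tags = {"X-needs-fh-subsetting": True}

    def _predict(self, fh, X=None):
        return len(X)  # report how many exogenous rows reached the model

X = pd.DataFrame({"x": range(5)}, index=[10, 11, 12, 13, 14])
assert StatsForecastLikeSketch().predict(fh=[12, 13], X=X) == 2
```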
Good question - both, perhaps? We can also magic-mock the estimator.
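A sketch of what such a mock-based test could look like: a dummy forecaster records the `X` it receives in `predict`, so the test can assert whether more rows than the horizon were passed through. Class and method names mimic the fit/predict pattern but are invented, not sktime's actual test utilities.

```python
import pandas as pd

class RecordingForecaster:
    """Dummy forecaster that records the X passed to predict."""

    def fit(self, y, X=None):
        self.cutoff = y.index[-1]
        return self

    def predict(self, fh, X=None):
        self.seen_X = X  # record for later inspection by the test
        return pd.Series(0.0, index=[self.cutoff + h for h in fh])

y = pd.Series(range(20), index=pd.RangeIndex(20))
X = pd.DataFrame({"x": range(25)}, index=pd.RangeIndex(25))

f = RecordingForecaster().fit(y.iloc[:15], X=X.iloc[:15])
f.predict(fh=[1, 2, 3], X=X.iloc[15:])  # pass the full remaining X

# the test can then check whether predict saw more rows than the 3-step horizon
assert len(f.seen_X) == 10  # wider than the horizon -> subsetting would be needed
```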
Bug report from discussion forum.
Discussed in #5893
Originally posted by markross1 February 5, 2024
Hello,
I've been experimenting with SKTime's `evaluate` function to cross-validate forecasting models. I've encountered differing results when using the Statsforecast AutoArima output from `evaluate` compared to results obtained using the `temporal_train_test_split` function.
The discrepancy becomes more pronounced when I include an exogenous regressor. In my tests, the SKTime AutoArima model behaves as expected, but the Statsforecast AutoArima model produces unexpected results. However, when using Statsforecast AutoArima with `temporal_train_test_split`, I get results that match closely with SKTime's AutoArima. This issue seems to specifically arise with the Statsforecast AutoArima with the `evaluate` function and an exogenous regressor.
SKTime with evaluate: (plot not preserved)
StatsForecast with evaluate: (plot not preserved)
StatsForecast with train-test-split: (plot not preserved)
Does anyone have insights on why the Statsforecast AutoArima model behaves differently with the evaluate function with exogenous regressors in this example?
Below is an example that should demonstrate this (unless this is computer-specific somehow):
Thank you in advance for your help.
-Marko