auto_arima's error_action="ignore" does not work when alternative training methods are specified #312
Comments
Interesting, can you share the stacktrace? |
Performing stepwise search to minimize aic
ValueError Traceback (most recent call last)
~\AppData\Local\Continuum\anaconda3\envs\pmdarima_v153\lib\site-packages\pmdarima\arima\auto.py in auto_arima(y, exogenous, start_p, d, start_q, max_p, max_d, max_q, start_P, D, start_Q, max_P, max_D, max_Q, max_order, m, seasonal, stationary, information_criterion, alpha, test, seasonal_test, stepwise, n_jobs, start_params, trend, method, maxiter, offset_test_args, seasonal_test_args, suppress_warnings, error_action, trace, random, random_state, n_fits, return_valid_fits, out_of_sample_size, scoring, scoring_args, with_intercept, sarimax_kwargs, **fit_args)
~\AppData\Local\Continuum\anaconda3\envs\pmdarima_v153\lib\site-packages\pmdarima\arima\auto.py in _post_ppc_arima(a)
ValueError: Could not successfully fit ARIMA to input data. It is likely your data is non-stationary. Please induce stationarity or try a different range of model order params. If your data is seasonal, check the period (m) of the data |
Is the error just that no fits at all occurred? Perhaps this is an issue with CSS fitting in statsmodels... |
When I try to run one of those models outside of the auto_arima loop, I get: ValueError: Unknown fit method css Your method is invalid. I'll admit, the confusion is probably the default err message auto_arima is returning. The valid methods are documented here
|
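To illustrate why the final error message can hide the real cause, here is a minimal sketch of a search loop that swallows per-candidate errors. Everything in it (`stepwise_search`, `always_invalid_fit`, the message text) is hypothetical and written only for illustration; it is not pmdarima's actual implementation.

```python
def stepwise_search(candidates, fit_fn, error_action="ignore"):
    """Hypothetical search loop; NOT pmdarima's actual implementation."""
    fitted = []
    for order in candidates:
        try:
            fitted.append((order, fit_fn(order)))
        except ValueError:
            if error_action == "raise":
                raise
            # error_action == "ignore": drop this candidate and keep going
    if not fitted:
        # Every candidate failed, so only a generic message is left to report
        raise ValueError("Could not successfully fit any candidate model.")
    return fitted


def always_invalid_fit(order):
    # Stand-in for a fit call that rejects the requested method every time
    raise ValueError("Unknown fit method css")


try:
    stepwise_search([(1, 0, 0), (0, 0, 1)], always_invalid_fit)
except ValueError as exc:
    final_message = str(exc)

print(final_message)
```

With errors ignored, the per-candidate "Unknown fit method" message never reaches the caller; only the generic failure does, which matches the confusion described above.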
Aha! Okay thanks, forgive me I thought I was specifying method as in statsmodels. However, even using the format I think you're supposed to use, I still cannot get conditional sum of squares (CSS) fitting to work. See the following:
fit_kwargs = {'method': 'css'}
model = pmdarima.auto_arima(y=y, error_action="ignore", suppress_warnings=True, trace=True, **fit_kwargs)
I get the same kind of trace where no models are fit. The reason I think is that the
fit_args = {'method': 'css'}
model = pmdarima.auto_arima(y=y, error_action="ignore", suppress_warnings=True, trace=True, method='lbfgs', **fit_args)
then I get an error for multiple values for keyword argument.
If such fitting methods aren't possible (at the very least MLE should be as it has a likelihood and thus an AIC score can be computed -- default is CSS-MLE, where CSS is used as a starting point for computing MLE), please let me know. |
In Python, this: do_foo(x=1) is functionally the same as: do_foo(**{"x": 1}) So by passing
CSS and CSS-MLE are not supported by the SARIMAX class. |
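The keyword-unpacking point above can be checked with plain Python. The sketch below uses a hypothetical `fit_model` function (not part of pmdarima) that has an explicit `method` parameter plus `**fit_args`, loosely mirroring the shape of the `auto_arima` signature shown in the traceback.

```python
def fit_model(y, method="lbfgs", **fit_args):
    """Hypothetical stand-in for a function with an explicit `method` parameter."""
    return {"method": method, "extra": fit_args}


# Passing a keyword directly and unpacking it from a dict are equivalent:
direct = fit_model([1, 2, 3], method="css")
unpacked = fit_model([1, 2, 3], **{"method": "css"})

# Supplying the same keyword both ways is a collision and raises TypeError:
try:
    fit_model([1, 2, 3], method="lbfgs", **{"method": "css"})
except TypeError as exc:
    collision_message = str(exc)

print(direct == unpacked)
print(collision_message)
```

So a dict entry named `method` never reaches `**fit_args`; it binds to the explicit `method` parameter, or collides with it if that parameter is also given.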
Hi, Thanks for your comments. Indeed it seems I missed what options would be passed to what fit method. I was looking at https://www.statsmodels.org/stable/generated/statsmodels.tsa.arima_model.ARIMA.fit.html which certainly seems to support training method 'CSS' which is much faster for long time series, though it has some issues associated with it outlined here: https://cran.r-project.org/doc/Rnews/Rnews_2002-2.pdf Forgive me if I am missing something in the API, but unless there is a means to actually specify CSS training, perhaps this means that pmdarima does not replicate R's auto.arima in a very meaningful way. Namely, see the following documentation: https://www.rdocumentation.org/packages/forecast/versions/8.11/topics/auto.arima Auto.arima's "approximation" field triggers CSS training:
It's possible that the above describes what CSS-MLE is supposed to be, but I'm unsure. From the same API linked above:
Can you verify that these are equivalent? If not, shall I close this thread and instead create a feature request? |
As I mentioned above, we use SARIMAX under the hood, not the statsmodels ARIMA class. This was changed several minor versions ago for a number of reasons, not the least of which being that the statsmodels ARIMA class is functionally deprecated. No need to open a feature request; we use statsmodels, so we're at their mercy for supported optimization methods.
If the lack of CSS poses a problem for you, perhaps you should consider submitting such a feature in statsmodels' SARIMAX class. |
Or perhaps you could implement the |
Hi @tgsmith61591, Firstly, thanks for your help and correspondence. Some background on my situation may be helpful... I work for Microsoft right now and I'm looking into speeding up ARIMA training for AutoML; we use pmdarima currently. Hence my interest in looking into alternative training methods, some of which may be faster for use in our platform. There's obviously a bit of a learning curve for me as I'm more comfortable with R than Python (I come from a stats background), so forgive me for any confusion regarding holes in my Python knowledge. We are tied to pmdarima version 1.1.1, whereas I think the discussion in this thread applied to 1.5.3; this is my fault -- I hadn't changed the conda environment to the correct one when reporting the above, so forgive me for that please. Note however that indeed the options I mentioned were available in 1.1.1: https://alkaline-ml.com/pmdarima/1.1.1/_modules/pmdarima/arima/auto.html The issue nevertheless persists in 1.1.1. Here is a minimal example to reproduce:
This produces the following error:
I believe, however, that this error is just issue 294, since it happens irrespective of the method argument (or whether it's supplied). At any rate, I am looking into these options purely to investigate speed. If you have recommendations regarding the "fastest" means of training (as opposed to the most accurate), that'd be helpful and appreciated. In my experience with R, specifying CSS worked a bit better for larger datasets. There also was another approach used by McLeod in the FitAR package outlined in this paper; I haven't seen this in any Python packages. Right now I'm alternatively checking out the two state-space representations in SARIMAX for version 1.5.3, though we cannot upgrade to that because the most recent version of scikit-learn breaks other things in our pipeline (and thus we are tied to earlier versions of pmdarima until these problems are fixed on our end). Nevertheless, recommendations for fitting speed for the most recent version are appreciated for when we are able to upgrade. Thanks again for your help! |
Hi @tgsmith61591, I just wrote a notebook to compare the various fitting options for pmdarima v1.1.1 on the NYC energy data used in this notebook. I was able to fit on a subset of the data (the first 10k samples) without the errors described above (strangely, those seemed to be an issue for short series or just the wine series?). For each of the three options for method (CSS, CSS-MLE, and MLE), I repeated the fit 10 times on those first 10k samples and stored the min, median, mean, and max times. As I suspected, CSS training is appreciably faster (about 7 times) than CSS-MLE, at least on this data for my machine:
Incidentally, runtimes didn't vary much due to any processes going on in my system. I'm now running a similar timings notebook for the two statespace options in statsmodels (via pmdarima v1.5.3 -- I presume this works given that SARIMAX provides them as options) to see whether they make a difference (Hamilton vs. Harvey). See information on the two representations here: https://www.statsmodels.org/stable/generated/statsmodels.tsa.statespace.sarimax.SARIMAX.html I don't think there are any other obvious knobs to turn, other than optimizers which I don't want to touch b/c in my experience the quasinewton method BFGS (or its limited memory variant) works very well. The upshot of all of this is that for use for large data, it may be the case that removing the CSS training option is problematic. Even if CSS isn't a maximum likelihood fit, it at least is some fit in a principled manner which may make sense for certain applications. I'll report back here tomorrow with complete timings for v1.5.3 with the two state-space representations. I'll happily upload the notebooks, the data csv, the csv files exported by the notebooks, etc. once those runs have completed. |
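The timing procedure described in this comment (repeat each fit 10 times, then record min, median, mean, and max) could be sketched roughly as follows. This is not the attached notebook: `time_fit` and `dummy_fit` are hypothetical names, and `dummy_fit` is a placeholder workload standing in for a real model fit.

```python
import statistics
import time


def time_fit(fit_fn, n_repeats=10):
    """Call fit_fn n_repeats times and summarize wall-clock durations in seconds."""
    durations = []
    for _ in range(n_repeats):
        start = time.perf_counter()
        fit_fn()
        durations.append(time.perf_counter() - start)
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "mean": statistics.mean(durations),
        "max": max(durations),
    }


def dummy_fit():
    # Placeholder workload (assumption); swap in the real fit call being timed
    sum(i * i for i in range(10_000))


summary = time_fit(dummy_fit, n_repeats=10)
print(summary)
```

To compare fitting options, one would pass a zero-argument callable per option (for example, a `lambda` wrapping each fit call) and compare the resulting summaries.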
Hi @tgsmith61591, Unfortunately it seems there is indeed a major degradation in speed when CSS is omitted from the supported training methods. I ran a similar notebook pegged to pmdarima v1.5.3 using the Hamilton vs. Harvey options for state-space model representation. I'm not sure they actually got passed to SARIMAX correctly (timings look very similar, and hopefully I'm not doing anything naive due to my relative lack of proficiency in Python). Nevertheless, the timings were about 156 seconds, worse than CSS-MLE on ARIMA, and far worse than CSS on ARIMA. Attached are the notebooks, data, timings, etc., in a zip. Feel free to verify on your own or let me know if there's a better approach with SARIMAX that you may be aware of (forgive me if I've missed it!). Shall I raise this as an issue? I personally believe this is a major degradation in performance that merits the old method being supported, but ultimately it is your package. Let me know what you think! Thanks... EDIT: There are some copy and paste typos in the markdown of the description of the attached notebooks. Apologies for this. |
I appreciate the great lengths you've gone to create some detailed timings. This is super helpful. Something you said that I want to address:
I want to point out again, it's not that we chose to stop supporting CSS/CSS-MLE; the What it really comes down to is statsmodels' use of scipy to solve likelihood models. They only support a subset of the Now to actually get to your problem... I agree that we need a way to approximate for longer series. It would seem that we are limited by the scipy/statsmodels gods in the way of not having CSS at our disposal. I don't have a great solution for that right off the top of my head; I am always open to ideas. I also think that staying pinned to an old version of the package is not the best solution. This seems like an issue that should be raised with statsmodels. I'll look into the paper you linked, but I would imagine we'll suffer the same problem there. This would be something we could get around if statsmodels allowed us to specify our own optimizer 🤔 |
Hi @tgsmith61591. I must correct a misconception you have so we're operating on the same page here. CSS is not a solver. It's an objective function (as is MLE). CSS-MLE is not a single objective function, but actually two different objective functions used in sequence (the solution to the first optimization routine w/ the CSS objective function is used as a starting point for the second routine w/ MLE as the objective function). The solver in the CSS case is still (at least in R anyway) one of the many optimization routines one can use for non-linear programming (e.g., BFGS, Nelder-Mead, etc.). I recognize that you're just wrapping statsmodels, but this is a crucial distinction. See, for example, the documentation for https://stat.ethz.ch/R-manual/R-devel/library/stats/html/arima.html Note the fields https://www.nuffield.ox.ac.uk/economics/Papers/1997/w6/ma.pdf I have little idea what is going on in statsmodels world, and I appreciate your attempt to provide clarity. However, given the misconception above, can you restate it? That way perhaps I can direct my timings to the statsmodels folks. Incidentally, I performed the same timings in R's auto.arima. I'm not quite sure I mimicked the exact same setup or if you do slightly different things than Hyndman does, because I wound up getting ARIMA(2, 1, 3) models rather than ARIMA(5, 1, 5). R was about 5 times faster for each of the objective function options (i.e., each of the Thanks for your help!
|
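The objective-versus-solver distinction made above can be shown on a toy case. The sketch below fits a zero-mean AR(1) model with two different objectives (conditional sum of squares and the exact Gaussian likelihood with the noise variance concentrated out) while reusing one and the same solver. All names are hypothetical, the golden-section search is only a stand-in solver for this one-parameter problem (not what R or statsmodels use), and the data are simulated.

```python
import math
import random


def simulate_ar1(phi, n, seed=0):
    """Simulate a zero-mean stationary AR(1) series with unit-variance noise."""
    rng = random.Random(seed)
    y = [rng.gauss(0.0, 1.0) / math.sqrt(1.0 - phi ** 2)]  # stationary start
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y


def css_objective(phi, y):
    """Conditional sum of squares: conditions on y[0], ignoring its distribution."""
    return sum((y[t] - phi * y[t - 1]) ** 2 for t in range(1, len(y)))


def exact_neg_loglik(phi, y):
    """Exact Gaussian negative log-likelihood (up to constants), sigma^2 profiled out."""
    n = len(y)
    s = (1.0 - phi ** 2) * y[0] ** 2 + css_objective(phi, y)
    return 0.5 * (n * math.log(s / n) - math.log(1.0 - phi ** 2))


def golden_section_minimize(f, lo, hi, tol=1e-8):
    """Generic 1-D solver; it knows nothing about which objective it is given."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while abs(b - a) > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return (a + b) / 2.0


y = simulate_ar1(phi=0.6, n=500)
phi_css = golden_section_minimize(lambda p: css_objective(p, y), -0.99, 0.99)
phi_mle = golden_section_minimize(lambda p: exact_neg_loglik(p, y), -0.99, 0.99)
print(phi_css, phi_mle)
```

The two estimates differ only because the objectives differ; the solver is identical. The CSS objective skips the likelihood contribution of the first observation, which is one reason it tends to be cheaper to evaluate.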
I'll refer you to my previous comment:
I don't know how to better state this. If you follow the links I littered throughout my comments, you'll see that it's not supported anymore by them, save for in ARMA models. You might also actually take a look at how the LikelihoodModel in statsmodels is minimizing the objective function. There is no way to parameterize it, currently. I feel like this thread has devolved into passive aggressive, academic one-upmanship, and nothing I'm saying is really moving it forward, so I'm going to lock it. I don't feel like this is making any progress towards solving your problem at this point, as I've stated and restated the above in at least 3 different ways now. I'll leave the following question (the root question of the matter): what is your suggestion for how we should handle approximation of long time series, given the statsmodels limitations we're bumping into? Open source depends on the collaborative input from multiple folks, and that means brainstorming solutions. I would think the easiest place for us to solve this here is by bringing this to the attention of the statsmodels developers, but I am open to suggestions (read: proposed solutions, which is not what this thread has garnered thus far). My proposed solution would be that statsmodels allows parameterizing the objective. Perhaps you have a better one. In any case, my email address can be located on my Git page. Please feel free to send me a note to continue this conversation. |
Describe the bug
If I specify error_action="ignore" in auto_arima when method is anything other than the default of CSS-MLE, errors from statsmodels are no longer ignored.
To Reproduce
Execute the following python code (this is using pmdarima 1.5.3)
Versions
System:
python: 3.6.10 |Anaconda, Inc.| (default, Jan 7 2020, 15:18:16) [MSC v.1916 64 bit (AMD64)]
executable: C:\Users\adgustaf\AppData\Local\Continuum\anaconda3\envs\pmdarima_v153\python.exe
machine: Windows-10-10.0.18362-SP0
Python dependencies:
pip: 20.0.2
setuptools: 46.0.0.post20200309
sklearn: 0.22.2.post1
statsmodels: 0.11.1
numpy: 1.18.1
scipy: 1.4.1
Cython: 0.29.15
pandas: 1.0.1
joblib: 0.14.1
pmdarima: 1.5.3
Expected behavior
I expect the training to complete just as when the default method of CSS-MLE is applied.
Actual behavior
Fitting fails
Additional context
None