Data container
pmdarima assumes an array of shape=(n_obs,) for y (target) and an array of shape=(n_obs, n_features) for X (exogenous), crucially with time series observations in rows.
by contrast, we use a nested pandas Series or DataFrame, with rows representing iid samples (in the classical forecasting setting, a single row) and columns representing different kinds of measurements (i.e. features); the format is nested because each cell may contain not only primitive types but also arrays or pandas Series of shape=(n_obs,).
we chose this data container because it allows a unified interface across different time series learning settings, including forecasting but also panel/longitudinal data settings such as time series classification/regression or supervised/panel forecasting.
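To make the difference between the two layouts concrete, here is a minimal sketch. The variable names and the column names feature_0 and feature_1 are made up for illustration, and the unnesting step is just one possible way to get back to flat arrays, not an existing helper.

```python
import numpy as np
import pandas as pd

n_obs = 5

# Flat layout as described above: time series observations in rows.
y_flat = np.arange(n_obs, dtype=float)   # shape (n_obs,)
X_flat = np.ones((n_obs, 2))             # shape (n_obs, n_features)

# Nested layout: one row per iid sample, one column per kind of measurement,
# and each cell holds a whole series of shape (n_obs,).
y_nested = pd.Series([pd.Series(y_flat)])
X_nested = pd.DataFrame({
    "feature_0": pd.Series([pd.Series(X_flat[:, 0])]),  # hypothetical column name
    "feature_1": pd.Series([pd.Series(X_flat[:, 1])]),  # hypothetical column name
})

# Unnest the single row to recover the flat arrays.
first_row = X_nested.iloc[0]
y_back = y_nested.iloc[0].to_numpy()
X_back = np.column_stack([first_row[col].to_numpy() for col in X_nested.columns])
```

In the classical forecasting setting the nested containers have exactly one row, so converting between the two layouts is cheap.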
Forecasting horizon
we allow the forecasting horizon to be an array of steps ahead, not only the number of steps ahead (n_periods); this allows for gaps, e.g. predicting only the third and fifth step ahead with fh = np.array([3, 5]), which may not make a difference for ARIMA but will for reduction strategies
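For a model that can only forecast a contiguous block of n_periods steps, an array horizon could be supported by forecasting up to the largest step and then picking out the requested ones. The sketch below assumes this approach; select_horizon is a hypothetical helper and step_forecasts is a stand-in for whatever the underlying model returns.

```python
import numpy as np


def select_horizon(step_forecasts, fh):
    """Pick the forecasts for the requested steps ahead.

    Assumes step_forecasts[i] is the forecast for step i + 1.
    """
    step_forecasts = np.asarray(step_forecasts)
    fh = np.asarray(fh, dtype=int)
    if fh.min() < 1:
        raise ValueError("this sketch only handles out-of-sample steps (fh >= 1)")
    if fh.max() > len(step_forecasts):
        raise ValueError("not enough forecasted steps to cover fh")
    return step_forecasts[fh - 1]


fh = np.array([3, 5])
n_periods = int(fh.max())

# Stand-in for the contiguous forecasts of steps 1..n_periods.
step_forecasts = np.arange(1, n_periods + 1) * 10.0

selected = select_horizon(step_forecasts, fh)
```

For ARIMA this only discards the steps that were not asked for; for reduction strategies the horizon can change which models get fitted at all.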
Fit method
some methods already require the forecasting horizon in fit (e.g. non-dynamic reduction to time series regression, where one time series regressor is fitted for each step of the forecasting horizon)
pmdarima currently takes the forecasting horizon only in predict
if the forecasting horizon is passed in fit, should we allow it to be changed in predict?
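To illustrate why some strategies need the horizon at fit time, here is a rough sketch of a non-dynamic (direct) reduction. DirectReductionForecaster is a hypothetical class, not an existing API, and plain least squares stands in for an arbitrary regressor. One regressor is fitted per step in fh, so predict can only serve steps that were seen in fit; raising an error for other steps is just one possible answer to the open question above.

```python
import numpy as np


class DirectReductionForecaster:
    """Hypothetical direct reduction: one linear regressor per step ahead."""

    def __init__(self, window_length=3):
        self.window_length = window_length

    def fit(self, y, fh):
        y = np.asarray(y, dtype=float)
        w = self.window_length
        self.fh_ = np.asarray(fh, dtype=int)
        self.coefs_ = {}
        for h in self.fh_:
            h = int(h)
            n_windows = len(y) - w - h + 1
            if n_windows < 1:
                raise ValueError(f"series too short for step {h}")
            # Each row is a lagged window plus an intercept term.
            X = np.array([y[i:i + w] for i in range(n_windows)])
            X = np.column_stack([X, np.ones(n_windows)])
            # The target is the value h steps after the end of each window.
            target = y[w + h - 1:w + h - 1 + n_windows]
            self.coefs_[h], *_ = np.linalg.lstsq(X, target, rcond=None)
        # Keep only the last window (plus intercept term) for prediction.
        self.last_window_ = np.append(y[-w:], 1.0)
        return self

    def predict(self, fh=None):
        fh = self.fh_ if fh is None else np.asarray(fh, dtype=int)
        missing = [int(h) for h in fh if int(h) not in self.coefs_]
        if missing:
            raise ValueError(f"no regressor was fitted for steps {missing}")
        return np.array([self.last_window_ @ self.coefs_[int(h)] for h in fh])


y = np.arange(20.0)
forecaster = DirectReductionForecaster(window_length=3).fit(y, fh=[3, 5])
forecasts = forecaster.predict()
```

A subset of the fitted horizon, e.g. forecaster.predict(fh=[5]), still works in this sketch, while a step that was never fitted does not.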
Update method
pmdarima adds the new data to the old data (kept in the model) and refits the model, using the previously fitted params as start params
is the new data always in the future?
alternatively, could we use the new data to update the already fitted parameters using Kalman smoothing/filtering, ideally using only the new data so that the data seen in training does not have to be stored in the model?
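As a rough illustration of the second option, the sketch below runs a Kalman filter for a simple local level model (a random walk observed with noise), which is an assumption chosen for brevity and not what pmdarima or statsmodels do internally. It only updates the filtered state and keeps the variance parameters obs_var and level_var fixed, so it does not cover re-estimating the fitted parameters. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LocalLevelState:
    level: float     # filtered estimate of the current level
    variance: float  # uncertainty of that estimate


def kalman_update(state, new_obs, obs_var=1.0, level_var=0.1):
    """Update the filtered state using only the new observations."""
    level, variance = state.level, state.variance
    for y in new_obs:
        # Predict step: the level follows a random walk, so only uncertainty grows.
        variance = variance + level_var
        # Correct step: move the level towards the new observation.
        gain = variance / (variance + obs_var)
        level = level + gain * (y - level)
        variance = (1.0 - gain) * variance
    return LocalLevelState(level, variance)


# A vague initial state, then a first batch and a later batch of observations.
initial = LocalLevelState(level=0.0, variance=1e6)
after_fit = kalman_update(initial, [1.0, 2.0])
after_update = kalman_update(after_fit, [3.0, 4.0])

# Filtering everything in one go gives the same state.
all_at_once = kalman_update(initial, [1.0, 2.0, 3.0, 4.0])
```

The point is that the state summarises everything seen so far, so the training data itself does not have to be kept in the model. This assumes the new data follows directly after the data seen so far.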
In-sample forecasts
pmdarima has a separate method for in-sample forecasts
currently, we don't have a good way to do in-sample forecasts, since our forecasting horizon is relative to the end of the series seen in training; perhaps allowing negative values would be one option?
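One possible convention for negative values is sketched below. It assumes that fh = 1 is the first step after the end of the training series, fh = 0 is the last training observation, and negative values point further back; both this convention and the helper name fh_to_absolute are hypothetical.

```python
import numpy as np


def fh_to_absolute(fh, n_train):
    """Map a relative horizon to zero-based positions in the full timeline.

    Returns the positions and a boolean mask marking the in-sample ones.
    """
    fh = np.asarray(fh, dtype=int)
    positions = n_train - 1 + fh
    if positions.min() < 0:
        raise ValueError("fh reaches back before the start of the training series")
    in_sample = positions < n_train
    return positions, in_sample


positions, in_sample = fh_to_absolute([-2, 0, 1, 3], n_train=10)
```

With a mapping like this, a single predict could serve both in-sample and out-of-sample steps instead of needing a separate method.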
Forecast confidence intervals
pmdarima returns intervals from the predict method via an optional kwarg that changes the output of predict
should we return intervals from a separate method instead, to keep a stable method signature for predict?
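A separate method could look roughly like the sketch below. NaiveForecaster and predict_interval are hypothetical names, and the interval formula is the textbook one for a random walk forecast (standard deviation growing with the square root of the step), used here only to keep the example small.

```python
from statistics import NormalDist

import numpy as np


class NaiveForecaster:
    """Hypothetical last-value forecaster with intervals in a separate method."""

    def fit(self, y):
        y = np.asarray(y, dtype=float)
        self.last_ = y[-1]
        # Standard deviation of one-step changes, used for the interval width.
        self.sigma_ = np.diff(y).std(ddof=1)
        return self

    def predict(self, fh):
        # Always returns point forecasts only, one per requested step.
        fh = np.asarray(fh, dtype=int)
        return np.full(len(fh), self.last_)

    def predict_interval(self, fh, alpha=0.05):
        # Returns an array of shape (len(fh), 2) with lower and upper bounds.
        fh = np.asarray(fh, dtype=int)
        z = NormalDist().inv_cdf(1 - alpha / 2)
        half_width = z * self.sigma_ * np.sqrt(fh)
        point = self.predict(fh)
        return np.column_stack([point - half_width, point + half_width])


forecaster = NaiveForecaster().fit([1.0, 2.0, 4.0, 3.0, 5.0])
point = forecaster.predict([1, 4])
intervals = forecaster.predict_interval([1, 4], alpha=0.05)
```

With this split, predict always returns the same kind of object, and callers who want intervals ask for them explicitly.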
Great discussion points, definitely some we need to consider on the pmdarima side to unify interfaces, though I'm not sure how we'd handle paneled data.
pmdarima adds the new data to the old data (kept in the model) and refits the model using the previously fitted params as start params,
FWIW a lot of the design decisions we've made are due to statsmodels' idiosyncrasies. This is a design decision I wish we could have made differently, but unless we move off of statsmodels as the underlying ARIMA implementation (we've had that discussion), I'm not sure we can do anything about it ...
Composition
TransformedTargetForecaster