IndexError When lags is greater than number of steps #164

Closed
hdattada opened this issue Jun 10, 2022 · 3 comments
Labels
bug (Something isn't working), duplicate (This issue or pull request already exists)

Comments

@hdattada

I am using Skforecast for the first time and I am having trouble forecasting when the number of lags is larger than the number of steps. Below is my sample dataframe with 13 historic values.

Python Version: 3.8
skforecast version: 0.4.3

            historic_data
2022-01-01           77.0
2022-01-02           77.0
2022-01-03           77.0
2022-01-04           77.0
2022-01-05           77.0
2022-01-06           77.0
2022-01-07           77.0
2022-01-08           77.0
2022-01-09           77.0
2022-01-10           77.0
2022-01-11           77.0
2022-01-12           77.0
2022-01-13           77.0

Forecaster Object after fitting

ForecasterAutoreg 
================= 
Regressor: XGBRegressor(base_score=0.5, booster='gbtree', colsample_bylevel=1,
             colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
             gamma=0, gpu_id=-1, importance_type=None,
             interaction_constraints='', learning_rate=0.300000012,
             max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
             monotone_constraints='()', n_estimators=100, n_jobs=16,
             num_parallel_tree=1, predictor='auto', random_state=123,
             reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
             tree_method='exact', validate_parameters=1, verbosity=0) 
Lags: [ 1  2  3  4  5  6  7  8  9 10 11 12] 
Window size: 12 
Included exogenous: False 
Type of exogenous variable: None 
Exogenous variables names: None 
Training range: [Timestamp('2022-01-01 00:00:00'), Timestamp('2022-01-13 00:00:00')] 
Training index type: DatetimeIndex 
Training index frequency: D 
Regressor parameters: {'objective': 'reg:squarederror', 'base_score': 0.5, 'booster': 'gbtree', 'colsample_bylevel': 1, 'colsample_bynode': 1, 'colsample_bytree': 1, 'enable_categorical': False, 'gamma': 0, 'gpu_id': -1, 'importance_type': None, 'interaction_constraints': '', 'learning_rate': 0.300000012, 'max_delta_step': 0, 'max_depth': 6, 'min_child_weight': 1, 'missing': nan, 'monotone_constraints': '()', 'n_estimators': 100, 'n_jobs': 16, 'num_parallel_tree': 1, 'predictor': 'auto', 'random_state': 123, 'reg_alpha': 0, 'reg_lambda': 1, 'scale_pos_weight': 1, 'subsample': 1, 'tree_method': 'exact', 'validate_parameters': 1, 'verbosity': 0} 
Creation date: 2022-06-10 11:16:13 
Last fit date: 2022-06-10 11:16:15 
Skforecast version: 0.4.3 

Code used for fitting and prediction

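from skforecast.ForecasterAutoreg import ForecasterAutoreg
from xgboost import XGBRegressor

# historic_data_df is the 13-row dataframe shown above.
forecaster = ForecasterAutoreg(
    regressor=XGBRegressor(random_state=123, verbosity=0),
    lags=12
)
forecaster.fit(y=historic_data_df.loc[:, 'historic_data'])
predicted = forecaster.predict(steps=6)  # raises the IndexError below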

Error:

self = ================= 
ForecasterAutoreg 
================= 
Regressor: XGBRegressor(base_score=0.5, booster='gbtree', col...1, 'verbosity': 0} 
Creation date: 2022-06-10 11:17:54 
Last fit date: 2022-06-10 11:17:54 
Skforecast version: 0.4.3 

steps = 6, last_window = array([77.]), exog = None

    def _recursive_predict(
        self,
        steps: int,
        last_window: np.array,
        exog: np.array
    ) -> pd.Series:
        '''
        Predict n steps ahead. It is an iterative process in which, each prediction,
        is used as a predictor for the next step.
    
        Parameters
        ----------
        steps : int
            Number of future steps predicted.
    
        last_window : numpy ndarray
            Values of the series used to create the predictors (lags) need in the
            first iteration of prediction (t + 1).
    
        exog : numpy ndarray, pandas DataFrame
            Exogenous variable/s included as predictor/s.
    
        Returns
        -------
        predictions : numpy ndarray
            Predicted values.
    
        '''
    
        predictions = np.full(shape=steps, fill_value=np.nan)
    
        for i in range(steps):
>           X = last_window[-self.lags].reshape(1, -1)
E           IndexError: index -2 is out of bounds for axis 0 with size 1
@hdattada
Author

I think I know what's happening, but it would be great to get confirmation. The training window has length_of_dataset - num_of_lags values, so in my case, with a dataset of 13 values and 12 lags, only 1 value was being added to the last window. Is that understanding right?
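The arithmetic above can be reproduced with plain NumPy, without skforecast installed. This is only a sketch of the suspected mechanism: the names `y_train`, `short_window`, `full_window` and `build_predictors` are hypothetical, and the assumption that the stored window is cut from the training targets rather than from the full series is an illustration of the description above, not a copy of the library code. Only the indexing expression `last_window[-lags].reshape(1, -1)` comes from the traceback.

```python
import numpy as np

# Values taken from the issue: 13 observations, lags 1..12.
y = np.full(13, 77.0)
lags = np.arange(1, 13)
max_lag = int(lags.max())

# Each training row needs max_lag earlier values, so 13 - 12 = 1 row remains.
n_train_rows = len(y) - max_lag

# Assumption (hypothetical): the window is sliced from the training targets,
# which hold a single value, instead of from the full series.
y_train = y[max_lag:]
short_window = y_train[-max_lag:]  # shape (1,)
full_window = y[-max_lag:]         # shape (12,)


def build_predictors(last_window, lags):
    """Pick the lagged values from the end of the window as one row."""
    return last_window[-lags].reshape(1, -1)


try:
    build_predictors(short_window, lags)
    error_message = None
except IndexError as exc:
    # The one-value window cannot serve lag 2, so indexing fails.
    error_message = str(exc)

# A window holding the last max_lag observations works as expected.
X = build_predictors(full_window, lags)

print("training rows:", n_train_rows)
print("short window error:", error_message)
print("predictor row shape:", X.shape)
```

With a window of 12 values the same expression yields one row of 12 predictors, which suggests that the length of the stored window, not the value of `steps`, is what triggers the error.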

@JavierEscobarOrtiz
Collaborator

Hello @hdattada,

Yes, that is a bug we found in version 0.4.3. You can read a full description in this issue.

We fixed it in version 0.5.0. We are still developing this version, but you can install it from GitHub by running the following in the shell:

pip install git+https://github.com/JoaquinAmatRodrigo/skforecast@0.5.x 

Please let us know if this fixes your problem.

Thank you very much!
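After upgrading, it may help to confirm which version is actually active in the environment. The snippet below is a minimal sketch using only the standard library (`importlib.metadata`, available from Python 3.8). The helpers `parse_version`, `has_fix` and `installed_skforecast_version` are hypothetical names introduced here, and the comparison is deliberately loose: it reads only the leading digits of each component, so a development build whose version string starts with 0.5 is treated as 0.5.0, which is an assumption about how such builds are labeled.

```python
from importlib import metadata


def parse_version(text):
    """Turn a string such as '0.4.3' into a tuple such as (0, 4, 3).

    Parsing stops at the first component without leading digits, and the
    result is padded with zeros to at least three components.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def has_fix(installed, fixed_in="0.5.0"):
    """Return True if the installed version is at least the fixed version."""
    return parse_version(installed) >= parse_version(fixed_in)


def installed_skforecast_version():
    """Return the installed skforecast version string, or None if absent."""
    try:
        return metadata.version("skforecast")
    except metadata.PackageNotFoundError:
        return None


installed = installed_skforecast_version()
if installed is None:
    print("skforecast is not installed in this environment")
elif has_fix(installed):
    print(installed, "should include the fix")
else:
    print(installed, "predates the fix; upgrade to 0.5.0 or later")
```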

@JavierEscobarOrtiz added the bug and duplicate labels on Jun 20, 2022
@JoaquinAmatRodrigo
Owner

Fixed it in version 0.5.0.
