[BUG] Error with XGBModel and Encoders #1991

Closed
negodfre opened this issue Sep 12, 2023 · 3 comments · Fixed by #2034
Labels
bug Something isn't working

negodfre commented Sep 12, 2023

Describe the bug
I get a TypeError when trying to fit sparse data with encoders and XGBModel from the darts package.
If I remove the encoders and lags_past_covariates the bug goes away, but I'm not sure why.

To Reproduce

import pandas as pd
import numpy as np
import darts

from datetime import datetime

from darts import TimeSeries
from darts.models import XGBModel

values = np.array([3.5, 0, 6.5, 0, 0, 0, 0, 0, 0])

dates = pd.date_range(start=datetime(2021, 5, 1),
              end=datetime(2022, 1, 1),
              freq='MS')

data = {'Date':dates,
        'Values':values}

encoders = {'cyclic': {'past':['month']}}
model_kwargs = {'lags':[-6, -3, -2],
                'lags_past_covariates':[-6],
                'add_encoders':encoders
               }

model = XGBModel(**model_kwargs)

df = pd.DataFrame(data)
ts = TimeSeries.from_dataframe(df, time_col='Date', value_cols=['Values'])

model.fit(ts)

Expected behavior
I expected the model to fit without error, but instead I get the following error:
TypeError: unsupported operand type(s) for //: 'Timedelta' and 'pandas._libs.tslibs.offsets.MonthBegin'
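
The message points at a floor division of a Timedelta by a month-start offset. The sketch below reproduces that operation with pandas alone; it is not taken from the darts source, and the variable names are only illustrative. The assumption is that a calendar-based offset such as MonthBegin has no fixed length, so pandas cannot divide a duration by it, whereas a fixed-length step works.

```python
import pandas as pd

# Illustrative values only, not taken from the darts code base.
elapsed = pd.Timedelta(days=61)
month_start = pd.offsets.MonthBegin()  # the offset behind freq='MS'

try:
    steps = elapsed // month_start
    raised = False
except TypeError as err:
    # A month has no fixed duration, so this division is not defined.
    raised = True
    print(err)

# For contrast, dividing by a fixed-length step works.
n_days = elapsed // pd.Timedelta(days=1)
```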

System (please complete the following information):

  • Python version: [3.9.4]
  • darts version: [0.24.0]

negodfre added the bug and triage labels Sep 12, 2023
@madtoinou (Collaborator)

Hi @negodfre,

This might be a duplicate of #1875, which has been moved to the top of the backlog. We'll try to fix it as soon as possible.

madtoinou removed the triage label Sep 14, 2023

gvas7 commented Oct 4, 2023

Can I ask a question here? I am also getting the same error, TypeError: unsupported operand type(s) for //: 'Timedelta' and 'pandas._libs.tslibs.offsets.MonthBegin', when trying to use past encoders with datetime_attribute. However, when I set retrain to False, I get a result. Why would that make a difference? My backtest is just the following:

    backtest = model.historical_forecasts(
        target_transformed, start=0.6, forecast_horizon=3, verbose=True, retrain=False
    )

A more general question on this too: if there is no limit on time or computing power, is the best practice retrain=True or retrain=False?

@madtoinou (Collaborator)

Hi @gvas7,

The bug seems to occur only during training, hence the error when retrain=True (as historical_forecasts calls fit() internally).

If you have time and computational power, I would recommend retrain=True but it ultimately depends on what you are trying to measure:

  • retrain=True will ensure the model is trained with as much data as possible, making it more "up-to-date" with respect to the period to forecast.
  • retrain=False is probably closer to a real-life scenario: the model is not retrained before inference. However, depending on the training dataset used, there could be either some data leakage or the model's training dataset could be considered "old" compared to the "recent" forecasted values.
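
To make the difference concrete, here is a toy expanding-window backtest. It is only a sketch and not how darts implements historical_forecasts: MeanModel and backtest are hypothetical helpers, and the series is made up. With retrain=True the model is refit on all data seen so far before each forecast; with retrain=False the initial fit is reused throughout.

```python
import numpy as np


class MeanModel:
    """Hypothetical toy forecaster: predicts the mean of its training data."""

    def fit(self, history):
        self.mean_ = float(np.mean(history))
        return self

    def predict(self, horizon):
        return np.full(horizon, self.mean_)


def backtest(series, start, horizon, retrain):
    """Expanding-window backtest; returns one forecast per forecast point."""
    model = MeanModel().fit(series[:start])  # initial fit on data before `start`
    forecasts = []
    for t in range(start, len(series) - horizon + 1):
        if retrain:
            model.fit(series[:t])  # refit on everything observed before t
        forecasts.append(model.predict(horizon))
    return np.array(forecasts)


# Made-up series with a level shift halfway through.
series = np.array([1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0])
with_retrain = backtest(series, start=4, horizon=1, retrain=True)
without_retrain = backtest(series, start=4, horizon=1, retrain=False)
```

In this toy setup the retrained forecasts drift toward the new level as more recent points enter the training window, while the non-retrained forecasts stay at the level seen before the split.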
