[BUG] Number of folds reported in ForecastingGridSearchCV seems to be incorrect #632

Closed
ngupta23 opened this issue Jan 26, 2021 · 3 comments
Labels
bug Something isn't working

Comments

@ngupta23
Contributor

Describe the bug
The calculation of the number of splits appears to ignore initial_window. In ForecastingGridSearchCV, since the window data gets appended to the initial_window, the number of splits should be far smaller than the number reported in the example below.

To Reproduce

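import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sktime.datasets import load_airline
from sktime.forecasting.compose import ReducedRegressionForecaster
from sktime.forecasting.model_selection import (
    ForecastingGridSearchCV,
    SlidingWindowSplitter,
    temporal_train_test_split,
)

y = load_airline()

# Hold out the last 12 points as the test set
fh = np.arange(1, 13)
y_train, y_test = temporal_train_test_split(y, test_size=len(fh))
print(y.shape, y_train.shape[0], y_test.shape[0])

# Use the first half of the training data as the initial window
cv = SlidingWindowSplitter(initial_window=int(len(y_train) * 0.5), start_with_window=True)

regressor = RandomForestRegressor(random_state=42)
forecaster = ReducedRegressionForecaster(regressor=regressor, strategy="recursive")

param_grid = {"regressor__n_estimators": [200, 300]}

gscv = ForecastingGridSearchCV(forecaster=forecaster, cv=cv, param_grid=param_grid, verbose=1)
gscv.fit(y_train)

Output:

(144,) 132 12
Fitting 122 folds for each of 2 candidates, totalling 244 fits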

Expected behavior

With an internal window length of 10 in use, I would have expected the following:

Number of folds
= len(validation data) - window_length
= len(y_train) - initial_window - window_length
= 132 - 66 - 10
= 56 folds (expected), but the output shows 122 folds.

The image below is for illustration only (it does not match the exact number of points in this example).

image
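The fold arithmetic above can be checked by enumerating windows directly. The following is a minimal sketch, not sktime's implementation: `sliding_splits` is a hypothetical helper, and it assumes a one-step-ahead horizon and a step length of 1, neither of which is stated in the report. Under those assumptions, starting the windows at index 0 (ignoring initial_window) reproduces the reported 122 folds, while starting them after the initial window gives the expected 56.

```python
def sliding_splits(n_timepoints, window_length, fh=1, start=0):
    """Yield (train_indices, test_index) pairs for a simplified sliding window.

    Hypothetical helper for illustration only; it is not sktime's
    SlidingWindowSplitter and assumes a step length of 1.
    """
    # The last valid window must leave room for the point `fh` steps ahead.
    for begin in range(start, n_timepoints - window_length - fh + 1):
        end = begin + window_length
        train = list(range(begin, end))  # indices of the training window
        test = end + fh - 1              # index of the single forecast point
        yield train, test


n_train = 132                        # len(y_train) from the example
initial_window = int(n_train * 0.5)  # 66
window_length = 10                   # window length quoted in the issue

# Windows slide over the whole training series (initial_window ignored).
ignoring_initial_window = len(list(sliding_splits(n_train, window_length)))

# Windows slide only over the data after the initial window.
respecting_initial_window = len(
    list(sliding_splits(n_train, window_length, start=initial_window))
)

print(ignoring_initial_window, respecting_initial_window)  # 122 56
```

The match with 122 is consistent with the fold count being computed as len(y_train) - window_length, but that reading is an inference from the numbers rather than something confirmed in this thread.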

Versions

System:
python: 3.6.12 |Anaconda, Inc.| (default, Sep 9 2020, 00:29:25) [MSC v.1916 64 bit (AMD64)]
executable: C:\Users\Nikhil\.conda\envs\sktime_dev\python.exe
machine: Windows-10-10.0.18362-SP0

Python dependencies:
pip: 20.3.3
setuptools: 51.0.0.post20201207
sklearn: 0.24.0
sktime: 0.5.1
statsmodels: 0.12.1
numpy: 1.19.4
scipy: 1.5.4
Cython: 0.29.17
pandas: 1.1.5
matplotlib: 3.3.3
joblib: 1.0.0
numba: 0.52.0
pmdarima: 1.8.0
tsfresh: 0.17.0

@ngupta23 added the bug label Jan 26, 2021
@mloning
Contributor

mloning commented Jan 26, 2021

Yes that's right, good catch! Would appreciate a PR!

@ngupta23
Contributor Author

OK, let me give it a shot and submit a PR.

@mloning
Contributor

mloning commented Mar 23, 2021

Closed by #690

@mloning mloning closed this as completed Mar 23, 2021