[BUG] Number of folds reported in ForecastingGridSearchCV seems to be incorrect #632

ngupta23 · 2021-01-26T12:10:57Z

Describe the bug
It seems that the calculation of the number of splits ignores initial_window. In ForecastingGridSearchCV, since the window data gets appended to the initial_window, the number of splits should be far lower than the one reported in the example below.

To Reproduce

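import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sktime.datasets import load_airline
from sktime.forecasting.compose import ReducedRegressionForecaster
from sktime.forecasting.model_selection import (
    ForecastingGridSearchCV,
    SlidingWindowSplitter,
    temporal_train_test_split,
)

y = load_airline()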

fh = np.arange(1, 13)
y_train, y_test = temporal_train_test_split(y, test_size=len(fh))
print(y.shape, y_train.shape[0], y_test.shape[0])

cv = SlidingWindowSplitter(initial_window=int(len(y_train) * 0.5), start_with_window=True)

regressor = RandomForestRegressor(random_state=42)
forecaster = ReducedRegressionForecaster(regressor=regressor, strategy="recursive")

param_grid = {"regressor__n_estimators": [200, 300]}

gscv = ForecastingGridSearchCV(forecaster=forecaster, cv=cv, param_grid=param_grid, verbose=1)
gscv.fit(y_train)

(144,) 132 12
Fitting 122 folds for each of 2 candidates, totalling 244 fits

Expected behavior

With an internal window length of 10 being used, I would have expected the following

Number of folds
= len(validation data) - window_length
= len(y_train) - initial_window - window_length
= 132 - 66 - 10
= 56 folds (expected) but output shows 122 folds.
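The arithmetic above can be sketched in plain Python. This is only an illustration of the two counts discussed in this report, not sktime's actual implementation; the helper names `reported_folds` and `expected_folds` are made up for this example. Note that the reported 122 happens to equal len(y_train) - window_length, which is consistent with initial_window being ignored.

```python
def reported_folds(n_train: int, window_length: int) -> int:
    """Fold count when initial_window is ignored (hypothetical helper)."""
    return n_train - window_length


def expected_folds(n_train: int, initial_window: int, window_length: int) -> int:
    """Fold count per the formula in this report (hypothetical helper)."""
    return n_train - initial_window - window_length


n_train = 132                        # len(y_train) in the example
initial_window = int(n_train * 0.5)  # 66, as passed to SlidingWindowSplitter
window_length = 10                   # internal window length used in the example

print(reported_folds(n_train, window_length))                  # 122
print(expected_folds(n_train, initial_window, window_length))  # 56
```

The difference between the two counts is exactly initial_window (66), which is what one would expect if the initial window is simply left out of the split calculation.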

The image below is for illustration only (it does not match the exact number of points in this example).

Versions

System:
python: 3.6.12 |Anaconda, Inc.| (default, Sep 9 2020, 00:29:25) [MSC v.1916 64 bit (AMD64)]
executable: C:\Users\Nikhil\.conda\envs\sktime_dev\python.exe
machine: Windows-10-10.0.18362-SP0

Python dependencies:
pip: 20.3.3
setuptools: 51.0.0.post20201207
sklearn: 0.24.0
sktime: 0.5.1
statsmodels: 0.12.1
numpy: 1.19.4
scipy: 1.5.4
Cython: 0.29.17
pandas: 1.1.5
matplotlib: 3.3.3
joblib: 1.0.0
numba: 0.52.0
pmdarima: 1.8.0
tsfresh: 0.17.0

mloning · 2021-01-26T17:06:02Z

Yes that's right, good catch! Would appreciate a PR!

ngupta23 · 2021-01-26T21:37:19Z

OK, let me give it a shot and submit a PR.

mloning · 2021-03-23T10:09:30Z

Closed by #690

ngupta23 added the bug label Jan 26, 2021

mloning closed this as completed Mar 23, 2021
