[ENH] skforecast integration for time series hyperparameter tuning
#208
Open
Omswastik-11 wants to merge 9 commits into hyperactive-project:main from Omswastik-11:skforecast-integration
+609
−1
Commits
ab1768d Integrated Skforecast (Omswastik-11)
bbb153f added get_test_params method and improved docstrings (Omswastik-11)
fbcbba3 Corrected no disk space problem in CI builds and added skforecast to … (Omswastik-11)
3d92878 changed the versions (Omswastik-11)
7c26c0c added tags and inferred metrices in constructor (Omswastik-11)
87da9db update skforecast_forecasting.py (Omswastik-11)
ca013a4 added a 'higher_is_better' parameter in constructor (Omswastik-11)
3e7f96a revert the changes in CI (Omswastik-11)
59d05ce revert depset changes and clean runner before testing (Omswastik-11)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,63 @@ | ||
| """ | ||
| Skforecast Integration Example - Hyperparameter Tuning for Time Series Forecasting | ||
|
|
||
| This example demonstrates how to use Hyperactive to tune hyperparameters of a | ||
| skforecast ForecasterRecursive model. It uses the SkforecastOptCV class which | ||
| provides a familiar sklearn-like API for integrating skforecast models with | ||
| Hyperactive's optimization algorithms. | ||
|
|
||
| Characteristics: | ||
| - Integration with skforecast's backtesting functionality | ||
| - Tuning of regressor hyperparameters (e.g., RandomForestRegressor) | ||
| - Uses HillClimbing optimizer (can be swapped for any Hyperactive optimizer) | ||
| - Time series cross-validation via backtesting | ||
| """ | ||
|
|
||
| import numpy as np | ||
| import pandas as pd | ||
| from skforecast.recursive import ForecasterRecursive | ||
| from sklearn.ensemble import RandomForestRegressor | ||
| from hyperactive.opt import HillClimbing | ||
| from hyperactive.integrations.skforecast import SkforecastOptCV | ||
|
|
||
| # Generate synthetic data | ||
| data = pd.Series( | ||
| np.random.randn(100), | ||
| index=pd.date_range(start="2020-01-01", periods=100, freq="D"), | ||
| name="y", | ||
| ) | ||
|
|
||
| # Define forecaster | ||
| forecaster = ForecasterRecursive( | ||
| regressor=RandomForestRegressor(random_state=123), lags=5 | ||
| ) | ||
|
|
||
| # Define optimizer | ||
| optimizer = HillClimbing( | ||
| search_space={ | ||
| "n_estimators": list(range(10, 100, 10)), | ||
| "max_depth": list(range(2, 10)), | ||
| }, | ||
| n_iter=10, | ||
| ) | ||
|
|
||
| # Define SkforecastOptCV | ||
| opt_cv = SkforecastOptCV( | ||
| forecaster=forecaster, | ||
| optimizer=optimizer, | ||
| steps=5, | ||
| metric="mean_squared_error", | ||
| initial_train_size=50, | ||
| verbose=True, | ||
| ) | ||
|
|
||
| # Fit | ||
| print("Fitting...") | ||
| opt_cv.fit(y=data) | ||
|
|
||
| # Predict | ||
| print("Predicting...") | ||
| predictions = opt_cv.predict(steps=5) | ||
| print("Predictions:") | ||
| print(predictions) | ||
| print("Best params:", opt_cv.best_params_) |
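For readers unfamiliar with backtesting, the fold layout implied by `initial_train_size=50` and `steps=5` on the 100-point series above can be sketched in plain Python. This is a simplified model, not skforecast's implementation: the helper `backtesting_folds` is hypothetical, and the way it treats `gap`, `fixed_train_size`, and `allow_incomplete_fold` is an assumption based on the parameter descriptions in this PR.

```python
def backtesting_folds(
    n_samples,
    initial_train_size,
    steps,
    fixed_train_size=False,
    gap=0,
    allow_incomplete_fold=True,
):
    """Return a list of ((train_start, train_end), (test_start, test_end)).

    Hypothetical helper for illustration only; all ranges are half-open
    positional index ranges. skforecast's TimeSeriesFold may differ in detail.
    """
    folds = []
    train_end = initial_train_size
    while True:
        # Assumption: `gap` samples are skipped between train and test.
        test_start = train_end + gap
        if test_start >= n_samples:
            break
        test_end = min(test_start + steps, n_samples)
        # Assumption: a short final fold is dropped unless explicitly allowed.
        if test_end - test_start < steps and not allow_incomplete_fold:
            break
        # Expanding window by default; rolling window if fixed_train_size.
        train_start = train_end - initial_train_size if fixed_train_size else 0
        folds.append(((train_start, train_end), (test_start, test_end)))
        train_end += steps
    return folds


if __name__ == "__main__":
    for train, test in backtesting_folds(100, 50, 5):
        print("train", train, "test", test)
```

With the example's settings this yields 10 folds of 5 test points each, the first testing on positions 50 to 54 and the last on positions 95 to 99.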
230 changes: 230 additions & 0 deletions
src/hyperactive/experiment/integrations/skforecast_forecasting.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,230 @@ | ||
| """Experiment adapter for skforecast backtesting experiments.""" | ||
| # copyright: hyperactive developers, MIT License (see LICENSE file) | ||
|
|
||
| import copy | ||
|
|
||
| from hyperactive.base import BaseExperiment | ||
|
|
||
|
|
||
| class SkforecastExperiment(BaseExperiment): | ||
| """Experiment adapter for skforecast backtesting experiments. | ||
|
|
||
| This class is used to perform backtesting experiments using a given | ||
| skforecast forecaster. It allows for hyperparameter tuning and evaluation of | ||
| the model's performance. | ||
|
|
||
| Parameters | ||
| ---------- | ||
| forecaster : skforecast forecaster | ||
| skforecast forecaster to benchmark. | ||
|
|
||
| y : pandas Series | ||
| Target time series used in the evaluation experiment. | ||
|
|
||
| exog : pandas Series or DataFrame, default=None | ||
| Exogenous variable(s) used in the evaluation experiment. | ||
|
|
||
| steps : int | ||
| Number of steps to predict. | ||
|
|
||
| metric : str or callable | ||
| Metric used to quantify the goodness of fit of the model. | ||
| If string, it must be a metric name allowed by skforecast | ||
| (e.g., 'mean_squared_error'). | ||
| If callable, it must take (y_true, y_pred) and return a float. | ||
|
|
||
| initial_train_size : int | ||
| Number of samples in the initial training set. | ||
|
|
||
| refit : bool, default=False | ||
| Whether to re-fit the forecaster in each iteration. | ||
|
|
||
| fixed_train_size : bool, default=False | ||
| If True, the train size doesn't increase but moves by `steps` in each iteration. | ||
|
|
||
| gap : int, default=0 | ||
| Number of samples to skip between the end of each training set and the | ||
| start of the corresponding test set. | ||
|
|
||
| allow_incomplete_fold : bool, default=True | ||
| If True, the last fold is allowed to have fewer samples than `steps`. | ||
|
|
||
| return_best : bool, default=False | ||
| If True, the best model is returned. | ||
|
|
||
| n_jobs : int or 'auto', default="auto" | ||
| Number of jobs to run in parallel. | ||
|
|
||
| verbose : bool, default=False | ||
| Print summary figures. | ||
|
|
||
| show_progress : bool, default=False | ||
| Whether to show a progress bar. | ||
|
|
||
| higher_is_better : bool, default=False | ||
| Whether higher metric values indicate better performance. | ||
| Set to False (default) for error metrics like MSE, MAE, MAPE where | ||
| lower values are better. Set to True for metrics like R2 where | ||
| higher values indicate better model performance. | ||
| """ | ||
|
|
||
|
||
| _tags = { | ||
| "authors": ["Omswastik-11", "JoaquinAmatRodrigo"], | ||
| "maintainers": ["Omswastik-11", "fkiraly", "JoaquinAmatRodrigo", "SimonBlanke"], | ||
| "python_dependencies": "skforecast", | ||
| } | ||
|
|
||
| def __init__( | ||
| self, | ||
| forecaster, | ||
| y, | ||
| steps, | ||
| metric, | ||
| initial_train_size, | ||
| exog=None, | ||
| refit=False, | ||
| fixed_train_size=False, | ||
| gap=0, | ||
| allow_incomplete_fold=True, | ||
| return_best=False, | ||
| n_jobs="auto", | ||
| verbose=False, | ||
| show_progress=False, | ||
| higher_is_better=False, | ||
| ): | ||
| self.forecaster = forecaster | ||
| self.y = y | ||
| self.steps = steps | ||
| self.metric = metric | ||
| self.initial_train_size = initial_train_size | ||
| self.exog = exog | ||
| self.refit = refit | ||
| self.fixed_train_size = fixed_train_size | ||
| self.gap = gap | ||
| self.allow_incomplete_fold = allow_incomplete_fold | ||
| self.return_best = return_best | ||
| self.n_jobs = n_jobs | ||
| self.verbose = verbose | ||
| self.show_progress = show_progress | ||
| self.higher_is_better = higher_is_better | ||
|
|
||
| super().__init__() | ||
|
|
||
| # Set the optimization direction based on higher_is_better parameter | ||
| higher_or_lower = "higher" if higher_is_better else "lower" | ||
| self.set_tags(**{"property:higher_or_lower_is_better": higher_or_lower}) | ||
|
|
||
| @classmethod | ||
| def get_test_params(cls, parameter_set="default"): | ||
| """Return testing parameter settings for the estimator. | ||
|
|
||
| Parameters | ||
| ---------- | ||
| parameter_set : str, default="default" | ||
| Name of the parameter set to return. | ||
|
|
||
| Returns | ||
| ------- | ||
| params : dict or list of dict, default = {} | ||
| Parameters to create testing instances of the class. | ||
| Each dict contains parameters to construct an "interesting" test instance, | ||
| i.e., MyClass(**params) or MyClass(**params[i]) creates a valid test | ||
| instance. | ||
| create_test_instance uses the first (or only) dictionary in `params`. | ||
| """ | ||
| from skbase.utils.dependencies import _check_soft_dependencies | ||
|
|
||
| if not _check_soft_dependencies("skforecast", severity="none"): | ||
| return [] | ||
|
|
||
| import numpy as np | ||
|
||
| import pandas as pd | ||
| from skforecast.recursive import ForecasterRecursive | ||
| from sklearn.ensemble import RandomForestRegressor | ||
|
|
||
| forecaster = ForecasterRecursive( | ||
| regressor=RandomForestRegressor(random_state=123), | ||
| lags=2, | ||
| ) | ||
|
|
||
| y = pd.Series( | ||
| np.random.randn(20), | ||
| index=pd.date_range(start="2020-01-01", periods=20, freq="D"), | ||
| name="y", | ||
| ) | ||
|
|
||
| params = { | ||
| "forecaster": forecaster, | ||
| "y": y, | ||
| "steps": 3, | ||
| "metric": "mean_squared_error", | ||
| "initial_train_size": 10, | ||
| } | ||
| return [params] | ||
|
|
||
| @classmethod | ||
| def _get_score_params(cls): | ||
| """Return settings for testing score/evaluate functions. Used in tests only. | ||
|
|
||
| Returns a list, the i-th element should be valid arguments for | ||
| self.evaluate and self.score, of an instance constructed with | ||
| self.get_test_params()[i]. | ||
|
|
||
| Returns | ||
| ------- | ||
| list of dict | ||
| The parameters to be used for scoring. | ||
| """ | ||
| return [{"n_estimators": 5}] | ||
|
|
||
| def _evaluate(self, params): | ||
| """Evaluate the parameters. | ||
|
|
||
| Parameters | ||
| ---------- | ||
| params : dict with string keys | ||
| Parameters to evaluate. | ||
|
|
||
| Returns | ||
| ------- | ||
| float | ||
| The metric value obtained by backtesting with the given parameters. | ||
| dict | ||
| Additional metadata about the search. | ||
| """ | ||
| from skforecast.model_selection import TimeSeriesFold, backtesting_forecaster | ||
|
|
||
| forecaster = copy.deepcopy(self.forecaster) | ||
| forecaster.set_params(params) | ||
|
|
||
| cv = TimeSeriesFold( | ||
| steps=self.steps, | ||
| initial_train_size=self.initial_train_size, | ||
| refit=self.refit, | ||
| fixed_train_size=self.fixed_train_size, | ||
| gap=self.gap, | ||
| allow_incomplete_fold=self.allow_incomplete_fold, | ||
| ) | ||
|
|
||
| results, _ = backtesting_forecaster( | ||
| forecaster=forecaster, | ||
| y=self.y, | ||
| cv=cv, | ||
| metric=self.metric, | ||
| exog=self.exog, | ||
| n_jobs=self.n_jobs, | ||
| verbose=self.verbose, | ||
| show_progress=self.show_progress, | ||
| ) | ||
|
|
||
| if isinstance(self.metric, str): | ||
| metric_name = self.metric | ||
| else: | ||
| metric_name = ( | ||
| self.metric.__name__ if hasattr(self.metric, "__name__") else "score" | ||
| ) | ||
|
|
||
| # the first return value of backtesting_forecaster is the metrics DataFrame | ||
| res_float = results[metric_name].iloc[0] | ||
|
|
||
| return res_float, {"results": results} | ||
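Two pieces of bookkeeping in `SkforecastExperiment` are easy to miss: how the metric column is looked up in the backtesting results, and how the optimization direction is derived from `higher_is_better`. The sketch below restates that logic as standalone functions. The names `metric_column_name` and `direction_tag` are hypothetical and used only for illustration. Whether skforecast actually labels a column `score` for a callable that has no `__name__` is not established by this diff, so that fallback path may deserve a test.

```python
import functools


def metric_column_name(metric):
    """Mirror the column lookup used in `_evaluate` (illustrative helper)."""
    if isinstance(metric, str):
        return metric
    # Callables are looked up by their __name__, with "score" as a fallback.
    return getattr(metric, "__name__", "score")


def direction_tag(higher_is_better):
    """Mirror the tag value set in `__init__` (illustrative helper)."""
    return "higher" if higher_is_better else "lower"


def my_mse(y_true, y_pred):
    """Example callable metric taking (y_true, y_pred) and returning a float."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    print(metric_column_name("mean_squared_error"))
    print(metric_column_name(my_mse))
    # functools.partial objects carry no __name__, so the fallback applies.
    print(metric_column_name(functools.partial(my_mse)))
    print(direction_tag(False))
```

Error metrics such as MSE keep the default `higher_is_better=False`, which maps to the tag value `"lower"`; a metric such as R2 would need `higher_is_better=True`.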
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| """Skforecast integration package.""" | ||
| # copyright: hyperactive developers, MIT License (see LICENSE file) | ||
|
|
||
| from hyperactive.integrations.skforecast.skforecast_opt_cv import SkforecastOptCV | ||
|
|
||
| __all__ = ["SkforecastOptCV"] |