[ENH] `SkoptForecastingCV` - hyperparameter tuning using `scikit-optimize` #4580

hazrulakmal · 2023-05-12T11:24:20Z

Reference Issues/PRs

Fixes #3390
See also #4188 & #3359
Continuation from #4251

What does this implement/fix? Explain your changes.

Implementing hyperparameter optimisation algorithm from scikit-optimize into sktime forecastingCV. This is a draft PR so works are in progress.

Does your contribution introduce a new dependency? If yes, which one?

Yes, this PR introduces soft dependency, scikit-optimize

Did you add any tests for the change?

In the process of creating tests.

Architecture design decision

Sktime Side
Decision: subclassing BaseGridCV rather than directly from _DelegatedForecaster

to inherit usable methods like _get_fitted_params , _update and all the __init__ setup
_fit method has to be overwritten because the BaseGridCV is restrive nature, it assumes that the collection of hyperparameter combinations is defined beforehand. this is not possible for iterative or sequential algorithm where the the collection of hyperparameter combinations depends on previous iterations.

other things to consider:

ensure that implementation also supports pipeline object.

Skopt Side
there are two possibilities

(high-level abstraction) a high-level interface to various pre-configured optimizers. i.e gp_minimize etc
(low-level abstaction) a low-level interface that controls optimizer directly. optimizer is a black box algorithm that tells which hyperparameter combo to search for.

Decision: go with the second option so that sktime users can interact directly with the optimizer and control how it behaves.

To-do lists

Some notes for myself.

get the optimizer initialisation ready for each dict
_evaluate_step method - for fitting and evaluating a sample of hyperparameter from optimizer at each iteration. keep updating the cv result for every step.
_run_search method - running sequential hyperparameter optimisation. at each iteration, optimizer will suggest the next-best hyperparameters to search.
_fit method - reranking and updating the attributes
handle optimisation direction internally
pytest
docstring writing
quality_check on param_distribution
add an interface for custom iteration for each parameter space (when users give a list of dicts and want to customise the num of iterations for each parameter space)

pyproject.toml

sktime/forecasting/model_selection/_tune.py

hazrulakmal · 2023-06-10T09:19:11Z

Hi @yarnabrina, I'd like to bring this PR to your attention and may need some help.

So it seems that the soft-dependency package, scikit-optimize, is using a depreciated attribute from numpy in their codebase and so whenever tests are run, there is a warning message about it. Although there's no harm to it, it's a bit annoying to see this warning message every time I run the pytest locally, is there a way to silence the warning message or is that not a good thing to do?

Also, I'm not quite sure why the tests only fail in version 3.11. I suspect the package is not yet compatible with python 3.11. can that be the reason?

yarnabrina · 2023-06-10T13:15:20Z

Are you talking about these?

FAILED sktime/tests/test_all_estimators.py::TestAllEstimators::test_fit_updates_state[ForecastingSkoptSearchCV-1-ForecasterFitPredictHierarchicalSimple] - AttributeError: module 'numpy' has no attribute 'int'.
np.int was a deprecated alias for the builtin int. To avoid this error in existing code, use int by itself. Doing this will not modify any behavior and is safe. When replacing np.int, you may wish to use e.g. np.int64 or np.int32 to specify the precision. If you wish to review your current use, check the release note link for additional information.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

I'm afraid these are actual errors, and not warnings. These attributes were deprecated in numpy version 1.20, and removed in version 1.24. Please take a look at this.

These can not be ignored, must be handled.

I suspect the package is not yet compatible with python 3.11. can that be the reason?

If you see the installed dependencies for all the jobs (there is a Show Dependencies step), numpy gets resolved to 1.24 only for 3.11 environment and not for 3.10 - 3.7, where it gets resolved to lower versions, where these deprecations are not yet failures and raised warnings instead of errors.

fkiraly · 2023-06-10T16:51:57Z

Hm, I see!
Also see my comment here: #4188 (comment)

I think a way to deal with this would be to limit the python version or the numpy version then in the tags (not sure what is better).

The tags are python_version and python_dependencies.

For estimator specific tests, we can use pytest.skipif with _check_estimator_deps.

sktime/forecasting/model_selection/tests/test_tune.py

hazrulakmal · 2023-06-18T16:04:33Z

@yarnabrina I believe that a recent change was made to the pre-commit that is not being captured in this working branch. This is causing a failure in the CI, although the pre-commit ran successfully on my local machine.

Would merging the current main branch into this working branch solve this issue? I am considering doing something like this:

git fetch upstream
git merge upstream/bayescv

fkiraly · 2023-06-21T00:33:21Z

FYI, I added a _check_python_version at the start of __init__ to deal with the 3.11 failures.

We should probably think how to address the malfunction of the dependency system, this happens whenever (a) the package and import string are different, and (b) a python version bound is given.

hazrulakmal · 2023-07-03T08:33:04Z

_check_python_version at the start of init to deal with the 3.11 failures.

This is weird. I was baffled as to why this estimator did not capture the Python version dependency even after specifying it in the tags before you added _check_python_version. Shouldn't Python version dependencies be checked by _check_estimator_deps, which was placed in the parent class BaseForecaster?

My suspicion (not entirely sure) is that it has to do with the placement of _check_estimator_deps in BaseForecaster. Shouldn't it be placed before calling the parent classes of BaseForecaster (i.e., before super() is called)?

Anyway, it seems like the reason for the failure of the test-window (3.8) was not SkoptForecastingCV, but rather MovingBlockBootstrapTransformer. or did I miss something here? Also, any idea why there is no test log for test-window (3.9)?

fkiraly · 2023-07-03T15:57:59Z

@hazrulakmal, not sure which failures you are referring to - two CI elements are failing, but that seemed unrelated to your PR, one of the sporadic failures (pytest-xdist or memory related)

hazrulakmal · 2023-07-03T22:53:38Z

not sure which failures you are referring to

I was referring to CI failures in 56b6db9 that failed due to python version dependency in test python 3.11 which then was resolved (thanks to you) after you made some edits here d0d97ee by adding the python dependency check manually in the __init__.

fkiraly · 2023-07-03T23:08:07Z

I was referring to CI failures in 56b6db9 that failed due to python version dependency in test python 3.11

Ah, I see.
Indeed, the base class should take care of that, but unfortunately it doesn't work the intended way if the package has an alias.

Have we opened an issue for this general problem? I vaguely remember we wanted to but cannot find it.

hazrulakmal · 2023-07-04T10:05:08Z

I think my previous sentence may have misled you, my bad. What I meant was python version, not python dependencies. Upon reviewing the test log report, I found that the failure was not due to scikit-optimize, but rather due to the detection of the Python version. The error message is as follows:

FAILED sktime/tests/test_softdeps.py::test_python_error[ForecastingSkoptSearchCV] - RuntimeError: Estimator ForecastingSkoptSearchCV has python version bound <3.11 according to tags, but does not raise an appropriate error message on init for incompatible python environments. Likely reason is that init does not call super(cls).init.

this is what makes me wonder, shouldn't this python version bound error message be handled by base class via the tags? or did I miss something here?

but unfortunately it doesn't work the intended way if the package has an alias.

Agreed. The current workaround is to add _check_soft_dependency to handle that alias manually, rather than relying on the tagging system. This is more of a patch to the tagging system's inability to handle packages with aliases - something we want to discuss further. I haven't yet raised an issue for this because the initial discussion was done at the end of the week before I left for holiday.

Anyway, this PR is ready for review and merge. I decided to pin down both Python and NumPy to be safe and conservative - open for a better solution if there is any. The former is to avoid any failures in CI tests, while the latter is because scikit-optimize does not support NumPy version 1.24 or higher.

fkiraly · 2023-07-04T16:05:30Z

Just updated from main, we should check that this does not constrain pandas < 2 on 3.8-3.10 (or other unexpected constraints).

hazrulakmal · 2023-07-05T08:50:41Z

it seems that scikit-optimize is not being installed as part of dependencies in unix-pandas1, do you happen to know why is that the case?

…bayescv

fkiraly · 2023-07-06T14:46:31Z

Hm, could there be a clash with soft dependencies that imply numpy>=1.24? As skopt implies numpy<1.24.

hazrulakmal added 5 commits May 9, 2023 21:59

base class

0c6efb8

docstring edits pre-commit

643faf0

another codestyling

1f8d59e

remove touching baseclass

f484eff

completed algo

850137d

fkiraly reviewed May 15, 2023

View reviewed changes

pyproject.toml Outdated Show resolved Hide resolved

fkiraly reviewed May 15, 2023

View reviewed changes

sktime/forecasting/model_selection/_tune.py Outdated Show resolved Hide resolved

hazrulakmal added 6 commits May 30, 2023 12:41

add docstrings, typing, metric direction

a3dad35

reorder method

9150521

docsting & get test params

b058478

softdeps tests

df02c4e

pytest

de9b1f9

fixed pytests

9973f1a

yarnabrina mentioned this pull request Jun 10, 2023

[ENH] advanced hyper-parameter tuning for forecasting - interfacing hyperopt, optuna, skopt, tune-sklearn etc #4188

Open

4 tasks

hazrulakmal added 5 commits June 10, 2023 21:42

docsting edits

8eab24e

resolve deps

5484bae

limit numpy

15c729a

resolved python deps tag

372dcc3

add python limit

a480bb3

fkiraly reviewed Jun 17, 2023

View reviewed changes

sktime/forecasting/model_selection/tests/test_tune.py Outdated Show resolved Hide resolved

hazrulakmal added 6 commits June 17, 2023 11:07

add check input

b945410

add uncommited commits

2a94db3

group same steps together

eb7b8e7

add custom n_iter

be033b7

add skipif test

d704935

check precommit

41e6fce

fkiraly added 3 commits June 21, 2023 02:04

linting

cc795b7

Merge branch 'main' into pr/4580

2b30580

linting

ca556ad

hazrulakmal mentioned this pull request Jul 4, 2023

[ENH] Extend tagging system to handle dependencies with alias #4820

Closed

Merge branch 'main' into pr/4580

28a03eb

Merge branch 'main' into pr/4580

1d5cd92

hazrulakmal added 3 commits July 6, 2023 11:43

Merge remote-tracking branch 'origin/main' into bayescv

66d3810

add scikit-optimize to all_extras_ pandas2

673c1bb

Merge branch 'bayescv' of https://github.com/hazrulakmal/sktime into …

d26da62

…bayescv

hazrulakmal requested review from achieveordie and benHeid as code owners July 6, 2023 10:44

hazrulakmal added 5 commits July 6, 2023 20:21

exclude skopt from main test add check_estimator

9f302d2

Merge remote-tracking branch 'origin/main' into bayescv

63acf89

update soft dep handling

5b3b78e

remove unusable test

7586d7c

add skoptcv to api docs

7df8ee6

fkiraly added implementing algorithms Implementing algorithms, estimators, objects native to sktime module:forecasting forecasting module: forecasting, incl probabilistic and hierarchical forecasting enhancement Adding new functionality labels Jul 13, 2023

fkiraly merged commit 5e16db8 into sktime:main Jul 13, 2023
24 checks passed

fkiraly changed the title ~~[ENH] SkoptForecastingCV Integration~~ [ENH] SkoptForecastingCV - hyperparameter tuning using scikit-optimize Jul 13, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ENH] `SkoptForecastingCV` - hyperparameter tuning using `scikit-optimize` #4580

[ENH] `SkoptForecastingCV` - hyperparameter tuning using `scikit-optimize` #4580

hazrulakmal commented May 12, 2023 •

edited

hazrulakmal commented Jun 10, 2023

yarnabrina commented Jun 10, 2023

fkiraly commented Jun 10, 2023

hazrulakmal commented Jun 18, 2023

fkiraly commented Jun 21, 2023

hazrulakmal commented Jul 3, 2023

fkiraly commented Jul 3, 2023

hazrulakmal commented Jul 3, 2023 •

edited

fkiraly commented Jul 3, 2023

hazrulakmal commented Jul 4, 2023 •

edited

fkiraly commented Jul 4, 2023

hazrulakmal commented Jul 5, 2023

fkiraly commented Jul 6, 2023

[ENH] SkoptForecastingCV - hyperparameter tuning using scikit-optimize #4580

[ENH] SkoptForecastingCV - hyperparameter tuning using scikit-optimize #4580

Conversation

hazrulakmal commented May 12, 2023 • edited

Reference Issues/PRs

What does this implement/fix? Explain your changes.

Does your contribution introduce a new dependency? If yes, which one?

Did you add any tests for the change?

Architecture design decision

To-do lists

hazrulakmal commented Jun 10, 2023

yarnabrina commented Jun 10, 2023

fkiraly commented Jun 10, 2023

hazrulakmal commented Jun 18, 2023

fkiraly commented Jun 21, 2023

hazrulakmal commented Jul 3, 2023

fkiraly commented Jul 3, 2023

hazrulakmal commented Jul 3, 2023 • edited

fkiraly commented Jul 3, 2023

hazrulakmal commented Jul 4, 2023 • edited

fkiraly commented Jul 4, 2023

hazrulakmal commented Jul 5, 2023

fkiraly commented Jul 6, 2023

[ENH] `SkoptForecastingCV` - hyperparameter tuning using `scikit-optimize` #4580

[ENH] `SkoptForecastingCV` - hyperparameter tuning using `scikit-optimize` #4580

hazrulakmal commented May 12, 2023 •

edited

hazrulakmal commented Jul 3, 2023 •

edited

hazrulakmal commented Jul 4, 2023 •

edited