[ENH] SkoptForecastingCV - hyperparameter tuning using scikit-optimize #4580
Hi @yarnabrina, I'd like to bring this PR to your attention and may need some help. So it seems that the soft-dependency package, Also, I'm not quite sure why the tests only fail in version 3.11. I suspect the package is not yet compatible with Python 3.11. Could that be the reason?
Are you talking about these?
I'm afraid these are actual errors, not warnings. These attributes were deprecated in These cannot be ignored and must be handled.
If you see the installed dependencies for all the jobs (there is a
Hm, I see! I think a way to deal with this would be to limit the Python version or the NumPy version in the tags (not sure which is better). The tags are For estimator specific tests, we can use
@yarnabrina I believe that a recent change was made to the pre-commit that is not being captured in this working branch. This is causing a failure in the CI, although the pre-commit ran successfully on my local machine. Would merging the current main branch into this working branch solve this issue? I am considering doing something like this:
FYI, I added a We should probably think about how to address the malfunction of the dependency system. This happens whenever (a) the package and import string are different, and (b) a Python version bound is given.
This is weird. I was baffled as to why this estimator did not capture the Python version dependency even after specifying it in the tags before you added My suspicion (not entirely sure) is that it has to do with the placement of Anyway, it seems like the reason for the failure of the test-window (3.8) was not
@hazrulakmal, not sure which failures you are referring to. Two CI elements are failing, but that seems unrelated to your PR; it looks like one of the sporadic failures (pytest-xdist or memory related).
Ah, I see. Have we opened an issue for this general problem? I vaguely remember we wanted to but cannot find it.
I think my previous sentence may have misled you, my bad. What I meant was Python version, not Python dependencies. Upon reviewing the test log report, I found that the failure was not due to scikit-optimize, but rather due to the detection of the Python version. The error message is as follows:
This is what makes me wonder: shouldn't this Python version bound error message be handled by the base class via the tags? Or did I miss something here?
Agreed. The current workaround is to add Anyway, this PR is ready for review and merge. I decided to pin both Python and NumPy to be safe and conservative, and I am open to a better solution if there is one. The former is to avoid any failures in CI tests, while the latter is because scikit-optimize does not support NumPy version 1.24 or higher.
Just updated from
It seems that scikit-optimize is not being installed as part of the dependencies in unix-pandas1. Do you happen to know why that is the case?
Hm, could there be a clash with soft dependencies that imply
SkoptForecastingCV - hyperparameter tuning using scikit-optimize
Reference Issues/PRs
Fixes #3390
See also #4188 & #3359
Continuation from #4251
What does this implement/fix? Explain your changes.
Implementing hyperparameter optimisation algorithms from scikit-optimize in sktime forecastingCV. This is a draft PR, so work is in progress.
Does your contribution introduce a new dependency? If yes, which one?
Yes, this PR introduces a soft dependency, scikit-optimize.
Did you add any tests for the change?
In the process of creating tests.
Architecture design decision
Sktime Side
Decision: subclass BaseGridCV rather than inheriting directly from _DelegatedForecaster.
- _get_fitted_params, _update, and all the __init__ setup can be reused.
- The _fit method has to be overridden because BaseGridCV is restrictive by nature: it assumes that the collection of hyperparameter combinations is defined beforehand. This is not possible for an iterative or sequential algorithm, where the collection of hyperparameter combinations depends on previous iterations.
Other things to consider:
- pipeline object.
Skopt Side
There are two possibilities:
- gp_minimize etc.
- optimizer directly.
optimizer is a black-box algorithm that tells which hyperparameter combination to search next.
Decision: go with the second option so that sktime users can interact directly with the optimizer and control how it behaves.
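To illustrate why driving the optimizer directly gives the caller control of the loop, here is a minimal ask/tell sketch. It deliberately does not import scikit-optimize: RandomAskTellOptimizer is a hypothetical stand-in that samples at random, and the objective and parameter names (window_length, sp) are made up for illustration. skopt's Optimizer exposes ask() and tell(x, y) in the same spirit, but nothing below should be read as its implementation.

```python
import random


class RandomAskTellOptimizer:
    """Hypothetical stand-in for an ask/tell optimizer (not skopt itself)."""

    def __init__(self, dimensions, random_state=0):
        self.dimensions = dimensions  # list of (low, high) integer bounds
        self.rng = random.Random(random_state)
        self.history = []  # (params, score) pairs reported so far

    def ask(self):
        # A real sequential optimizer would consult self.history here;
        # this stand-in simply samples uniformly within the bounds.
        return [self.rng.randint(low, high) for low, high in self.dimensions]

    def tell(self, params, score):
        # Record the observed score for the suggested point.
        self.history.append((params, score))


def objective(params):
    # Toy loss, minimal at window_length=7 and sp=12 (illustrative only).
    window_length, sp = params
    return (window_length - 7) ** 2 + (sp - 12) ** 2


optimizer = RandomAskTellOptimizer(dimensions=[(1, 20), (1, 24)])

# The caller owns the loop: ask for a candidate, evaluate it, report back.
for _ in range(25):
    candidate = optimizer.ask()
    optimizer.tell(candidate, objective(candidate))

best_params, best_score = min(optimizer.history, key=lambda item: item[1])
```

With a one-shot function such as gp_minimize, the loop lives inside the library call; with ask/tell, the evaluation step can be anything the caller likes, for example fitting a forecaster and scoring it with cross-validation.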
To-do lists
Some notes for myself.
- _evaluate_step method - for fitting and evaluating a sample of hyperparameters from optimizer at each iteration. Keep updating the CV results at every step.
- _run_search method - running sequential hyperparameter optimisation. At each iteration, optimizer will suggest the next-best hyperparameters to search.
- _fit method - reranking and updating the attributes.
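The division of work between the three methods might look roughly like the plain-Python sketch below. Everything in it is an assumption for illustration: the function names mirror the to-do items but are not the PR's code, suggest is a toy rule standing in for the optimizer, and the result keys (params, mean_test_score, rank_test_score) are illustrative.

```python
def evaluate_step(params, score_fn, cv_results):
    """Score one candidate and append it to the running results."""
    cv_results.append({"params": params, "mean_test_score": score_fn(params)})


def run_search(suggest, score_fn, n_iter):
    """Sequential search: each suggestion may depend on earlier results."""
    cv_results = []
    for _ in range(n_iter):
        params = suggest(cv_results)
        evaluate_step(params, score_fn, cv_results)
    return cv_results


def rerank(cv_results):
    """Rank results (lower score is better) and return the best entry."""
    ordered = sorted(cv_results, key=lambda row: row["mean_test_score"])
    for rank, row in enumerate(ordered, start=1):
        row["rank_test_score"] = rank
    return ordered[0]


def suggest(cv_results):
    # Toy rule standing in for the optimizer: step one past the best so far.
    if not cv_results:
        return {"window_length": 1}
    best = min(cv_results, key=lambda row: row["mean_test_score"])
    return {"window_length": best["params"]["window_length"] + 1}


def score_fn(params):
    # Toy loss, minimal at window_length=4 (illustrative only).
    return (params["window_length"] - 4) ** 2


results = run_search(suggest, score_fn, n_iter=6)
best = rerank(results)
```

Because each suggestion is computed from the results accumulated so far, the full set of candidates cannot be listed up front, which is the reason a predefined-grid base class does not fit without overriding the fitting logic.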