Skip to content
This repository has been archived by the owner on Feb 28, 2024. It is now read-only.

Remove BayesSearchCV(iid=) parameter deprecated in sklearn 0.24 #988

Merged
merged 7 commits into from May 4, 2021

Conversation

kernc
Copy link
Contributor

@kernc kernc commented Jan 22, 2021

Fixes #978
Fixes #987
Fixes #927
Fixes #990
Fixes #981 via updated scikit-learn

Fixes #1006

Fixes #718

Closes #951

skopt/searchcv.py Outdated Show resolved Hide resolved
@betatim
Copy link
Member

betatim commented Jan 26, 2021

LGTM except for that one language suggestion.

kernc and others added 2 commits January 26, 2021 13:52
@andreimargeloiu
Copy link

Please merge this!

@bole1
Copy link
Contributor

bole1 commented Feb 12, 2021

my commit on this merge request. Sorry, but Ive got 2 troubles with this merge request on sklearn 0.24

@thomasfrederikhoeck
Copy link

@trinebrockhoff

…alidation._fit_and_score

* Update searchcv.py

Further improvements for kernc previous commits.  iid totally removed to prevent troubles with sklearn utils prettyprinting. prettyprinting looks for params when print to repl. Sklearn 0.24 has changed return value after cv, so i changed dict destructuring(a litttle bit dirty)

* Update searchcv.py

* Revert unrelated changes

* PEP8 format; add comment

* Revert reverting "unrelated changes"

This is required to pass tests/test_searchcv.py
with scikit-learn 0.24+.

Co-authored-by: Kernc <kerncece@gmail.com>
@kernc kernc requested a review from betatim February 23, 2021 13:41
@kernc
Copy link
Contributor Author

kernc commented Feb 23, 2021

@betatim We need to ship this fix. Afterwards, we need someone to tackle #718.

@JonasTriki
Copy link

Any update on when this PR will get merged? @betatim

@kernc
Copy link
Contributor Author

kernc commented Apr 7, 2021

I suppose not as long as it's failing the CI. I'm tackling related years-overdue #718. Give me a few more days. 😅

@kernc
Copy link
Contributor Author

kernc commented Apr 11, 2021

Looks promising!

With the latest commit, I've migrated the class to use sklearn's BaseSearchCV._run_search() API (#718). Thanks to @homarb's sketch, not a lot was needed. Unfortunate it didn't find its way in already in 2018.

@glouppe
Copy link
Member

glouppe commented Apr 11, 2021

Feel free to merge this when this looks OK and then move on with all the other patches :-) Thanks for the work @kernc !

@kernc
Copy link
Contributor Author

kernc commented Apr 12, 2021

@glouppe Would the notable change go into the 0.8.2 or the 0.9.0 whatsnew, in your opinion?

@glouppe
Copy link
Member

glouppe commented Apr 12, 2021

Time for 0.9!

@glouppe
Copy link
Member

glouppe commented Apr 16, 2021

Shall we merge? :)

@kernc
Copy link
Contributor Author

kernc commented Apr 16, 2021

I believe so. I think also a prompt release would be well accepted. This PR closes 7 issues with tens of subscribers. 😬

git log v0.8.1..HEAD shows a few more notes to add to 0.9.0 changelog.

@hsorsky
Copy link

hsorsky commented Apr 30, 2021

Is there anything blocking this? It appears to be passing CI with the author (@kernc) and a skopt contributor (@glouppe) agreeing that its ready to merge

@QuentinSoubeyran
Copy link
Contributor

QuentinSoubeyran commented May 3, 2021

I suggest adding support for multimetric scoring to BayesSearchCV to provide an interface closer to that of GridSearchCV and RandomSearchCV. Currently, trying multimetric scoring (e.g. because GridSearchCV was replaced by BayesSearchCV) yields an obscure KeyError at L411, due to the result dict naming scheme changing between multimetric and single metric.

This doesn't appear too hard, as only L411 and L471 need to be adapted. I believe this is sufficient:

  • adding a new parameter to BayesSearchCV, for instance optimized_scoring, that defaults to "score". If multimetric scoring is used, this parameter must be set to the key of the scorer to use, similar to the refit argument (see below why the refit argument is not used)
  • adding a check for the above requirement to specify the additional argument. It is possible to use None as the default value, check for multimetric scoring by using the same code as BaseSearchCV.fit(), and defaulting or raising if no value was passed. Or we simply check for "score", assuming no metric is named "score" in a multimetric setting.
  • changing L411 to local_results = all_results["mean_test_" + self.optimized_scoring][-len(params):] or local_results = all_results["mean_test_" + (self.optimized_scoring or "score")][-len(params):] if a None default is used
  • adapting L471 to handle the multimetric case. This check is easier than the one for the new parameter, because at that point self.multimetric_ -- provided by BaseSearchCV -- is available

Using refit for this purpose requires handling callable refit (at least to raise an error if unsupported) and might change the effect of that parameter compared to BayesSearchCV and RandomSearchCV, so I think having a dedicated parameter is easier to implement and provide more functionality.

If @kernc and @glouppe approve this idea, I am happy to make a small PR.

@kernc
Copy link
Contributor Author

kernc commented May 3, 2021

  • adding a new parameter to BayesSearchCV, for instance optimized_scoring

The multimetric feature is more thoroughly discussed in #797. I'm leaning to reusing refit= as far as possible. If you can make a PR that effectively fixes #797, that'd be welcome.

@QuentinSoubeyran
Copy link
Contributor

Sorry for missing #797 !
I think I have a working PR, using refit as indicated. I tried to keep as much of GridSearchCV's functionalities as I could, and to provide clear errors for the few bits that are not supported (callable refit for instance).
Since the changes depend on your PR, should I PR on your fork @kernc , or should I make a separate PR in this repo ?

@kernc
Copy link
Contributor Author

kernc commented May 4, 2021

PR on your fork

I think that would be the easiest wrt integration.

@glouppe
Copy link
Member

glouppe commented May 4, 2021

@kernc Feel free to click the green button and merge this PR. The work comes mainly from you :) (It will then be easier for everyone else to build upon...)

@kernc kernc merged commit 9461bfe into scikit-optimize:master May 4, 2021
@kernc
Copy link
Contributor Author

kernc commented May 4, 2021

@QuentinSoubeyran You should now make a fix against master directly. 👍

@kernc kernc deleted the fix_sklearn branch May 8, 2021 02:11
kernc added a commit to kernc/scikit-optimize that referenced this pull request May 8, 2021
Removed in 9461bfe
"Remove BayesSearchCV(iid=) parameter deprecated in sklearn 0.24 (scikit-optimize#988)"
glouppe pushed a commit that referenced this pull request May 8, 2021
Removed in 9461bfe
"Remove BayesSearchCV(iid=) parameter deprecated in sklearn 0.24 (#988)"
@wundermahn
Copy link

Is this merged into master? I just installed today, still a problem :(

@glouppe
Copy link
Member

glouppe commented May 17, 2021

Yes, this has been merged. Can you report the issue you have?

@grudloff
Copy link

Hello, is there a schedule for a stable release including this fix? Currently, release 0.8.1 is still broken.

@wundermahn
Copy link

@glouppe This is what I was referring to. When I use 0.8.1 and still get this problem with sklearn version 0.24.2

@donmegamuffin
Copy link

Same here, still an issue with sklearn 0.24.1 and skopt 0.8.1. Line 432 of searchcv.py throws a wobbly unpacking the zip(*out)

@wundermahn
Copy link

Can anyone post an example? I'm away from home at the moment.

@indrajitsg
Copy link

Even I am facing this issue.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Miniconda3\envs\pyga\lib\site-packages\skopt\searchcv.py", line 692, in fit
    optim_result = self._step(
  File "C:\Miniconda3\envs\pyga\lib\site-packages\skopt\searchcv.py", line 579, in _step
    self._fit(X, y, groups, params_dict)
  File "C:\Miniconda3\envs\pyga\lib\site-packages\skopt\searchcv.py", line 431, in _fit
    (test_scores, test_sample_counts,
ValueError: too many values to unpack (expected 5)

@serhii-datajob
Copy link

Hi!
Are there any plans to fix the issue with unpacking the zip(*out)?
Thanks!

@QuentinSoubeyran
Copy link
Contributor

While this PR was merged into the master branch, scikit-optimize 0.9 has not been released yet. The scikit-optimize version on PyPi (0.8.1 at the time of writing) is still incompatible with scikit-learn 0.24+.

For anyone needing the compatibility, you can install scikit-optimize directly from source with

pip install git+https://github.com/scikit-optimize/scikit-optimize.git

If you also need multimetric BayesSearch, you need to install from my fork until #1030 is merged:

pip install git+https://github.com/QuentinSoubeyran/scikit-optimize.git@multimetric

Hope this helps.

@glouppe
Copy link
Member

glouppe commented Jul 29, 2021

@kernc Could you make a release? I think that would help a lot of users :-)

@glouppe
Copy link
Member

glouppe commented Jul 29, 2021

(And in general, we need help from all of you with the project! No one is actively maintaining it anymore, unfortunately.)

@wundermahn
Copy link

wundermahn commented Jul 29, 2021 via email

@serhii-datajob
Copy link

For anyone needing the compatibility, you can install scikit-optimize directly from source with

@QuentinSoubeyran , thanks a lot!
Your version from source is not rising the deprecated iid or zip unpacking errors anymore!

@gwerbin
Copy link

gwerbin commented Aug 13, 2021

(And in general, we need help from all of you with the project! No one is actively maintaining it anymore, unfortunately.)

@glouppe is the problem fixed? Do you just need someone to publish a v0.9 release to PyPI? Or does more work need to happen still? I might be willing to answer a "call for maintainers", but it'd be helpful to know what's involved.

I'm also very interested in @QuentinSoubeyran's multimetric version, and if/how that can be merged upstream.

@jhn-nt
Copy link

jhn-nt commented Sep 11, 2021

Any news on when 9.0 with sklearn >0.24 support will be released?

@QuentinSoubeyran
Copy link
Contributor

@wundermahn My understanding is that no one is regularly taking the time to answer and fix issues, review and merge PR and make releases on PyPI

@RNarayan73
Copy link

Hello
Is the multi-metric feature included in the latest release 0.9?
It doesn't appear in the change log (https://scikit-optimize.github.io/dev/whats_new/v0.9.html)
But is in a 0.9rc1 here (#1071)
Thanks
Narayan

@QuentinSoubeyran
Copy link
Contributor

@RNarayan73
The multimetric feature is not part of 0.9.0, #1062 hasnt been merged in master yet.

The 0.9.0rc1 you mention is completely unofficial: it is a branch I made on my fork to test things out. Since the release of the official 0.9.0 by kernc, I haven't rebase my fork on the current master branch, so it is out-of-sync with scikit-optimize.
#1071 was just a proposal, and should probably be closed with the release of 0.9.0.

@RNarayan73
Copy link

Ok. Thanks for clarifying.
I notice that your branch mock0.9rc1 has the multimetric that I can install.
Narayan

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.