Allow get_config to return the same config more than once #487

Merged
merged 3 commits into from Jan 23, 2023

Collaborator

@mseeger mseeger commented Jan 4, 2023

Issue #, if available: 415

Description of changes:
Introduces the flag allow_duplicates for searchers, which allows get_config to return the same config more than once. Also adds a new test that checks whether searchers properly implement allow_duplicates=False (the default).

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@@ -122,6 +127,11 @@ def __init__(

        self._hp_ranges = make_hyperparameter_ranges(self.config_space)
        self._excl_list = ExclusionList.empty_list(self._hp_ranges)
        self._allow_duplicates = allow_duplicates
        # Maps ``trial_id`` to configuration. This is used to blacklist
Collaborator Author

This is a technicality:

  • If not allow_duplicates, then _excl_list collects all suggested configs, so none is suggested twice
  • If allow_duplicates, then _excl_list only collects configs for which the trial failed. In this case, we need to know the config for a trial, because evaluation_failed only receives the trial_id.

The same code is used in other places, and the same comment applies there.
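The two cases above can be illustrated with a minimal sketch. `ToySearcher` and its candidate list are hypothetical and are not the Syne Tune implementation; in particular, the sketch records the config for a trial directly in `get_config`, whereas the PR does this in `register_pending`.

```python
from typing import Dict, List, Optional, Set, Tuple


class ToySearcher:
    """Hypothetical stand-in for a searcher; not the Syne Tune code."""

    def __init__(self, candidates: List[dict], allow_duplicates: bool = False):
        self._candidates = candidates
        self._allow_duplicates = allow_duplicates
        # Plays the role of ``_excl_list``: configs that must not be suggested
        self._excluded: Set[Tuple] = set()
        # Plays the role of ``_config_for_trial_id``
        self._config_for_trial_id: Dict[str, dict] = {}
        self._next = 0

    @staticmethod
    def _key(config: dict) -> Tuple:
        # Hashable representation of a config
        return tuple(sorted(config.items()))

    def get_config(self, trial_id: str) -> Optional[dict]:
        # Cycle through the candidates, skipping excluded ones
        for offset in range(len(self._candidates)):
            idx = (self._next + offset) % len(self._candidates)
            config = self._candidates[idx]
            if self._key(config) not in self._excluded:
                self._next = idx + 1
                if self._allow_duplicates:
                    # Only remember the config; exclude it later if it fails
                    self._config_for_trial_id[trial_id] = config
                else:
                    # Exclude immediately, so it is never suggested twice
                    self._excluded.add(self._key(config))
                return config
        return None  # Every candidate is excluded

    def evaluation_failed(self, trial_id: str):
        # Only ``trial_id`` is passed, hence the lookup table
        if self._allow_duplicates and trial_id in self._config_for_trial_id:
            self._excluded.add(self._key(self._config_for_trial_id[trial_id]))


configs = [{"lr": 0.1}, {"lr": 0.01}]
unique = ToySearcher(configs)  # allow_duplicates=False
suggested = [unique.get_config(str(i)) for i in range(3)]
# suggested == [{"lr": 0.1}, {"lr": 0.01}, None]
```

With `allow_duplicates=True`, the same searcher keeps cycling through both configs until `evaluation_failed` is called for a trial, after which that trial's config is no longer suggested.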

    def evaluation_failed(self, trial_id: str):
        if self._allow_duplicates and trial_id in self._config_for_trial_id:
            # Blacklist this configuration
            self._excl_list.add(self._config_for_trial_id[trial_id])
Collaborator Author

We only get the trial_id here, but need to know the config. This is why _config_for_trial_id is needed, and only if allow_duplicates is set.

@mseeger mseeger requested review from wistuba and removed request for wesk January 11, 2023 14:44
Comment on lines 150 to 167
    def register_pending(
        self,
        trial_id: str,
        config: Optional[dict] = None,
        milestone: Optional[int] = None,
    ):
        if self._allow_duplicates and trial_id not in self._config_for_trial_id:
            if config is not None:
                self._config_for_trial_id[trial_id] = config
            else:
                logger.warning(
                    f"register_pending called for trial_id {trial_id} without passing config"
                )

    def evaluation_failed(self, trial_id: str):
        if self._allow_duplicates and trial_id in self._config_for_trial_id:
            # Blacklist this configuration
            self._excl_list.add(self._config_for_trial_id[trial_id])
Collaborator

I've now seen these functions a couple of times. Is it possible to move them up in the class hierarchy, or to use a utils function instead?

Collaborator Author

Yes, doing that

        (KDE, False),
    ],
)
def test_allow_duplicates_or_not(scheduler_cls, multifid):
Collaborator

seems natural to move allow_duplicates here

Collaborator Author

What do you mean?

Collaborator

add another parametrize line with allow_duplicates before the function
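A minimal sketch of what this suggestion might look like, assuming pytest's stacked `parametrize` decorators. The scheduler names are hypothetical string placeholders (the real test parametrizes over scheduler classes), and the body is a stand-in for the actual checks.

```python
import pytest


# Stacking a second ``parametrize`` runs the test for every combination of
# ``allow_duplicates`` with each ``(scheduler_cls, multifid)`` pair.
@pytest.mark.parametrize("allow_duplicates", [False, True])
@pytest.mark.parametrize(
    "scheduler_cls, multifid",
    [
        ("FIFO", False),  # Hypothetical placeholder entries
        ("KDE", False),
    ],
)
def test_allow_duplicates_or_not(scheduler_cls, multifid, allow_duplicates):
    # Placeholder body: the real test would build the scheduler and check
    # whether duplicate configs are suggested or not
    assert isinstance(scheduler_cls, str)
    assert isinstance(multifid, bool)
    assert isinstance(allow_duplicates, bool)
```

With two entries in each list, pytest would collect four test cases from this one function.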

Collaborator Author

mseeger commented Jan 17, 2023

Hi @wistuba, I am still working on this. I'd like to extend the unit test to the case where all trials fail: the failed configs should then be filtered out even if allow_duplicates == True. But this does not work right now.

Collaborator Author

mseeger commented Jan 20, 2023

OK, I am done with this now

Collaborator

@wistuba wistuba left a comment

Multiple inheritance is not a good choice here. What you could do instead is make it part of the base searcher. Searchers that don't support it can simply hard-code allow_duplicates to False.

@@ -290,6 +290,9 @@ def _suggest(self, trial_id: int) -> Optional[TrialSuggestion]:
        config = self.searcher.get_config(trial_id=str(trial_id))
        if config is not None:
            config = cast_config_values(config, self.config_space)
            self.searcher.register_pending(
Collaborator

are these changes related to the feature?

Collaborator Author

Yes, this is something that was missing and came to light because of the new test

@@ -450,3 +451,72 @@ def _restore_from_state(self, state: dict):

    def set_random_state(self, random_state: np.random.RandomState):
        self.random_state = random_state


class FilterDuplicatesMixin:
Collaborator

the mixin should not have functions overlapping with BaseSearcher

Collaborator Author

Yes, so this would basically be multiple inheritance then.

The problem with moving this into BaseSearcher:

  • BaseSearcher is really supposed to be just an API. David would rightly reject putting code in there
  • Some of the searchers handle allow_duplicates in a different way

Is multiple inheritance really so evil?

Collaborator Author

OK, I think what I can do is integrate this code into SearcherWithRandomSeed. This way, BaseSearcher remains clean.

Collaborator Author

Or better: make it a subclass of that one, this is best
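The layering discussed in this thread might be sketched as follows. All class names here (`BaseSearcherAPI`, `SeededSearcher`, `FilterDuplicatesSearcher`, `ToyRandomSearcher`) are hypothetical stand-ins rather than the classes merged in this PR. The sketch assumes single inheritance, an API-only base class, and an intermediate class that owns all `allow_duplicates` bookkeeping. For self-containment, `get_config` calls `register_pending` itself, which in the PR is done by the scheduler.

```python
import random
from typing import Dict, List, Optional, Set, Tuple


def _key(config: dict) -> Tuple:
    # Hashable representation of a config
    return tuple(sorted(config.items()))


class BaseSearcherAPI:
    """Hypothetical API-only base class: no duplicate handling here."""

    def get_config(self, trial_id: str) -> Optional[dict]:
        raise NotImplementedError

    def register_pending(self, trial_id: str, config: Optional[dict] = None):
        pass

    def evaluation_failed(self, trial_id: str):
        pass


class SeededSearcher(BaseSearcherAPI):
    """Hypothetical analogue of a searcher that owns a random state."""

    def __init__(self, random_seed: int = 0):
        self.random_state = random.Random(random_seed)


class FilterDuplicatesSearcher(SeededSearcher):
    """Hypothetical subclass holding all ``allow_duplicates`` bookkeeping."""

    def __init__(self, random_seed: int = 0, allow_duplicates: Optional[bool] = None):
        super().__init__(random_seed)
        # The ``None -> False`` default is resolved once, in the parent
        self.allow_duplicates = False if allow_duplicates is None else allow_duplicates
        self._excluded: Set[Tuple] = set()
        self._config_for_trial_id: Dict[str, dict] = {}

    def should_not_suggest(self, config: dict) -> bool:
        return _key(config) in self._excluded

    def register_pending(self, trial_id: str, config: Optional[dict] = None):
        if config is None:
            return
        if self.allow_duplicates:
            # Remember the config so that a later failure can exclude it
            self._config_for_trial_id.setdefault(trial_id, config)
        else:
            self._excluded.add(_key(config))  # Should not be suggested again

    def evaluation_failed(self, trial_id: str):
        if self.allow_duplicates and trial_id in self._config_for_trial_id:
            self._excluded.add(_key(self._config_for_trial_id[trial_id]))


class ToyRandomSearcher(FilterDuplicatesSearcher):
    """Concrete searcher: only the sampling logic lives here."""

    def __init__(self, candidates: List[dict], **kwargs):
        super().__init__(**kwargs)
        self._candidates = candidates

    def get_config(self, trial_id: str) -> Optional[dict]:
        allowed = [c for c in self._candidates if not self.should_not_suggest(c)]
        if not allowed:
            return None
        config = self.random_state.choice(allowed)
        self.register_pending(trial_id, config)
        return config
```

In this arrangement, a concrete searcher neither resolves the default for `allow_duplicates` nor touches the exclusion set directly; both are delegated to the intermediate class.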

@mseeger mseeger force-pushed the allow_duplicates branch 3 times, most recently from 29a9bc1 to 1b6b396 Compare January 21, 2023 12:31
Collaborator

@wistuba wistuba left a comment

There are a few lines of code that might be better suited to the newly added class.

Comment on lines 88 to 89
        if allow_duplicates is None:
            allow_duplicates = False
Collaborator

deal with this in parent class

Collaborator Author

OK

Comment on lines 247 to 248
            if not self.allow_duplicates:
                self._excl_list.add(config)  # Should not be suggested again
Collaborator

can't we delegate to parent class as well?

Collaborator Author

I'll try something

Comment on lines 96 to 97
        if allow_duplicates is None:
            allow_duplicates = False
Collaborator

same

@wistuba wistuba self-requested a review January 23, 2023 09:11
wistuba previously approved these changes Jan 23, 2023
@mseeger mseeger merged commit 894ad84 into main Jan 23, 2023
@mseeger mseeger deleted the allow_duplicates branch January 23, 2023 09:34