This repository has been archived by the owner on Nov 14, 2023. It is now read-only.

Allow any searcher, fixes #198

Merged: 8 commits merged into ray-project:master from allow_any_searcher on Apr 29, 2021

Conversation

@Yard1 (Member) commented Mar 23, 2021

This PR:

  • Allows the user to pass any Searcher subclass to TuneSearchCV, making it possible to use Searchers other than the predefined ones
  • Fixes an issue with multimetric scoring where values for the refit parameter were not being enforced as intended
  • Moves metric and mode from Searchers/Schedulers to tune.run where possible

There are no API breaking changes and CI should pass.

The reason a Searcher subclass is passed instead of an object is that I can't see a good way of handling the various parameters inside it. For example, someone may pass an object with _metric and _mode already set, which would cause issues if those keys were also passed to tune.run - but it's also not possible to just check for those attributes, as there's no requirement for them to be there. Forcing the user to pass a class instead (while still allowing them to instantiate it with custom kwargs) seemed like the best solution.

@Yard1 requested a review from richardliaw on March 23, 2021 at 21:01
@richardliaw (Collaborator) left a comment:

The main concern I have is with regard to dependencies. Please take a look!
cc @krfricke

Comment on lines 335 to 341

if (search_optimization not in set(available_optimizations.values())
    ) and (not isinstance(search_optimization, type)
           or not issubclass(search_optimization, Searcher)):
    raise ValueError(
        "Search optimization must be one of "
        f"{', '.join(list(available_optimizations.values()))} "
        "or a ray.tune.suggest.Searcher class.")
Collaborator:

Hmm, does this mean we are not allowed to pass a custom Searcher object?

@Yard1 (Member, Author), Mar 31, 2021:

I couldn't figure out a good way to pass an already initialized object, as I detailed in the description - tl;dr there is no enforcement of attributes on Searcher subclasses, so while a solution using objects would be possible with the Searchers present in Tune, it may not work with user-defined ones - unless we do the enforcement ourselves here, by forcing the object to have _metric and _mode attributes.

Contributor:

Hm, so Tune uses set_search_properties() to set the search space, metric, and mode of a search algorithm. The Searcher base class also has _metric and _mode attributes. Can't we assume that custom searchers always implement that interface?
Custom searchers could not be used with tune-sklearn before, right? So now would be a good time to define and enforce the properties these custom searchers must have.
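
For reference, a minimal sketch of such a custom searcher (hypothetical MyRandomSearcher; assuming the ray.tune.suggest.Searcher interface of that time, where the base __init__ stores _metric/_mode and tune.run calls set_search_properties(metric, mode, config)):

import random

from ray.tune.suggest import Searcher


class MyRandomSearcher(Searcher):
    """Toy searcher that samples uniformly from (low, high) tuples."""

    def __init__(self, space=None, metric=None, mode=None):
        # The base class stores metric/mode as self._metric and self._mode.
        super().__init__(metric=metric, mode=mode)
        self._space = space

    def set_search_properties(self, metric, mode, config):
        # Called by tune.run. Returning False signals that the searcher
        # was already configured and must not be overridden.
        if self._space:
            return False
        self._space = config
        if metric:
            self._metric = metric
        if mode:
            self._mode = mode
        return True

    def suggest(self, trial_id):
        return {name: random.uniform(low, high)
                for name, (low, high) in self._space.items()}

    def on_trial_complete(self, trial_id, result=None, error=False):
        pass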

@Yard1 (Member, Author):

I'd think that would need to be enforced in Ray Tune itself, but we can check it here for now. In that case, I'll make it so that an object is passed instead of a class.

Comment on lines +679 to +690
if run_args["scheduler"]:
    if hasattr(run_args["scheduler"], "_metric") and hasattr(
            run_args["scheduler"], "_mode"):
        run_args["scheduler"]._metric = search_kwargs["metric"]
        run_args["scheduler"]._mode = search_kwargs["mode"]
    else:
        warnings.warn(
            "Could not set `_metric` and `_mode` attributes "
            f"on Scheduler {run_args['scheduler']}. "
            "This may cause an exception later! "
            "Ensure your Scheduler initializes with those "
            "attributes.", UserWarning)
Collaborator:

is there some way of setting this, like with set_search_properties?

@Yard1 (Member, Author):

We can't use set_search_properties, as that would cause issues when the object is passed to tune.run, which calls it itself and expects an object on which it has not been called before.

@richardliaw (Collaborator):

Sorry for the slow reply!

> The reason why a Searcher subclass is passed instead of an object is that I can't see a good way of handling various parameters inside it. For example, someone may pass an object with _metric and _mode already set which would cause issues if those keys were passed to tune.run

If I had a custom Searcher, how would I pass it in to TuneSearchCV with custom arguments?

I actually think we could just check if object.metric / object.mode is set, and fail if they are?

@Yard1 (Member, Author) commented Apr 3, 2021

@richardliaw You'd pass custom Searcher args as kwargs in TuneSearchCV - however, now that we've talked it through, it will probably be much cleaner to pass Searcher objects instead and enforce those attributes on them. I'll change it after the holidays here :)

@Yard1 (Member, Author) commented Apr 5, 2021

I had a look at it and using objects is not as straightforward.

The issue is that if a user passes an algo-specific search space, it can't be passed directly to tune.run (like the ConfigSpace object for BOHB). The handling for that usually lives in the Searcher's __init__ - which also sets attributes like metric_op that vary from algo to algo and cannot all be accounted for here. It is thus not possible to simply change the _metric and _mode attributes and pass the changed object to tune.run.
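
For context, here is a hedged sketch of such an algo-specific space (assuming the ConfigSpace and TuneBOHB APIs available at the time; the hyperparameter and metric names are made up):

import ConfigSpace as CS
from ray.tune.suggest.bohb import TuneBOHB

# An algo-specific search space: a ConfigSpace object, not a Tune dict,
# so tune.run(config=...) cannot consume it directly.
config_space = CS.ConfigurationSpace()
config_space.add_hyperparameter(
    CS.UniformFloatHyperparameter("alpha", lower=1e-4, upper=1e-1))

# Only the Searcher's __init__ knows how to handle it, which is also where
# metric/mode and any derived attributes end up being set.
searcher = TuneBOHB(space=config_space, metric="my_metric", mode="max")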

Therefore, if we want to allow users to pass Searcher instances instead of classes (as I first proposed), we would need to force the user to use only Tune search spaces with them.

What are your thoughts on solving this? I agree that passing classes instead of instances is not intuitive, but I don't see a solution that would not force us to drop support for algo-specific search space objects.

This is how passing a Searcher class with arguments would look:

search = TuneSearchCV(search_optimization=MySearcher, searcher_argument_1=True) # kwargs are passed to search_optimization __init__

CC @krfricke @richardliaw

@richardliaw (Collaborator):

@Yard1 could you help me with the following questions about the object searcher proposal?

  1. Will users be able to provide a string argument for the searcher?
  2. If using a custom searcher object, why would the user not be able to just pass the custom config space via the searcher object?
  3. What are the breaking changes made to user experience by using the object searcher proposal?

@Yard1 (Member, Author) commented Apr 5, 2021

@richardliaw

  1. Yes, that is unchanged in either case (no API breaking changes)
  2. They can, but they will also need to initialize the object with metric and mode alongside the space. All of those attributes would then be detached from the space, metric and mode arguments in TuneSearchCV, as it is not possible to set them afterwards in a way that covers all possible cases. There are a lot of algo-specific interactions and derived attributes that are set on initialization - for example, many Searchers set a metric_op attribute during initialization depending on the _mode attribute (see the sketch after this list). We could technically special-case every Searcher class in Ray Tune, but that would be a nightmare to maintain and would not account for the possibility of a user making their own Searcher.
  3. As described in 2, it would completely detach the parameters set in TuneSearchCV from what is set on the Searcher object. I can imagine someone forgetting to initialize a new Searcher instance with a different metric and changing it only in TuneSearchCV. For the same reason, ensuring consistency just by raising exceptions on a mismatch will not be fully robust either (though it is more doable than setting the attributes, as derived attributes would not need to be accounted for).
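
To illustrate the derived-attribute problem from point 2, here is a simplified sketch (hypothetical ExampleSearcher, not actual Ray Tune code) of why overwriting _mode after __init__ is not enough:

from ray.tune.suggest import Searcher


class ExampleSearcher(Searcher):
    def __init__(self, metric=None, mode=None):
        super().__init__(metric=metric, mode=mode)
        # Derived attribute computed once from the mode at init time,
        # as many searchers do for backends that only minimize.
        self.metric_op = -1.0 if mode == "max" else 1.0


searcher = ExampleSearcher(metric="score", mode="min")
# Overwriting the private attribute later...
searcher._mode = "max"
# ...leaves metric_op stale (still 1.0), silently flipping the optimization
# direction. This is why TuneSearchCV cannot simply patch these attributes.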

@richardliaw (Collaborator):

One use case I'd like to support is search algorithm restoration. This allows people to use learnings from past optimization jobs to speed up new ones.

This would require access to the search algorithm object (Searcher.restore()).

Ideally, we would have the following:

# 1. Base usage if using a custom searcher
searcher = Searcher()
TuneSearchCV(searcher, config=config, metric=X, mode=Y)

# 2. RAISES ERROR if metrics are set in searcher
searcher = Searcher(metric=X, mode=Y)
TuneSearchCV(searcher, metric=X, mode=Y)

# 3. custom search space, OK
searcher = Searcher(config=config)  # is this possible?
TuneSearchCV(searcher, metric=X, mode=Y)

# 4. RAISES ERROR if config is set in both places.
searcher = Searcher(config=config) 
TuneSearchCV(config=config, searcher=searcher, metric=X, mode=Y)

# 5. Restore from previous run
searcher = Searcher()
searcher.restore("path")
TuneSearchCV(config=config, searcher=searcher, metric=X, mode=Y)

My understanding is that 3 is the controversial one? Though actually a quick skim of the code tells me that you can pass a space without setting metric/mode for many search algs.

@Yard1 (Member, Author) commented Apr 6, 2021

So there are two ways of configuring searchers: either through set_search_properties called by tune.run, or by specifying the necessary parameters in __init__. The issue is that for all the Tune searchers, if the instance is initialized with the space, then set_search_properties will fail and tune.run will raise an exception. You can see this behavior in e.g. the Optuna searcher:

class OptunaSearch(Searcher):
    def __init__(self,
                 space: Optional[Union[Dict, List[Tuple]]] = None,
                 metric: Optional[str] = None,
                 mode: Optional[str] = None,
                 points_to_evaluate: Optional[List[Dict]] = None,
                 sampler: Optional[BaseSampler] = None):
        # ...

        if isinstance(space, dict) and space:
            resolved_vars, domain_vars, grid_vars = parse_spec_vars(space)
            if domain_vars or grid_vars:
                logger.warning(
                    UNRESOLVED_SEARCH_SPACE.format(
                        par="space", cls=type(self)))
                space = self.convert_search_space(space)

        self._space = space
        # ...
        if self._space:
            self._setup_study(mode)

    def set_search_properties(self, metric: Optional[str], mode: Optional[str],
                              config: Dict) -> bool:
        if self._space:
            return False
        space = self.convert_search_space(config)
        self._space = space
        if metric:
            self._metric = metric
        if mode:
            self._mode = mode

        self._setup_study(mode)
        return True

If the space is set in __init__, then it is not possible to set metric and mode through set_search_properties (it returns False, which causes tune.run to raise an exception). All the other Searchers have the same behavior. Therefore, there are only two ways to configure them:

  1. If space, metric and mode are to be set through set_search_properties called by tune.run, then the Searcher has to be initialized without specifying the space, so that set_search_properties can set all three,
  2. Alternatively, all three attributes must be set in __init__.

It is not possible to e.g. set just the space in __init__ and then metric and mode through tune.run, as that will raise an exception. On the other hand, tune.run only supports dicts as search spaces, so it is not possible to pass algo-specific search space objects to it, which leaves only the second option for those.
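
Concretely, a sketch of the only two working configurations (a hedged illustration assuming the tune.run and OptunaSearch signatures of that time, with a placeholder trainable and metric name):

from ray import tune
from ray.tune.suggest.optuna import OptunaSearch


def trainable(config):
    # Placeholder objective for illustration only.
    tune.report(score=config["alpha"])


# Option 1: bare searcher - tune.run sets space, metric and mode via
# set_search_properties. Only Tune-style dict search spaces work here.
tune.run(
    trainable,
    config={"alpha": tune.loguniform(1e-4, 1e-1)},
    metric="score",
    mode="max",
    search_alg=OptunaSearch())

# Option 2: everything set in the searcher's __init__ - tune.run is given
# no config/metric/mode of its own, otherwise set_search_properties
# returns False and tune.run raises.
tune.run(
    trainable,
    search_alg=OptunaSearch(
        space={"alpha": tune.loguniform(1e-4, 1e-1)},
        metric="score",
        mode="max"))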

I see two solutions:

  1. Force everything to go through tune.run - this is the simplest, most straightforward and easiest-to-maintain option, but it makes it impossible to e.g. pass a ConfigSpace object directly to BOHB, forcing the user to recreate it as a dict of Tune search spaces.
  2. If the searcher has already been initialized with a space, raise an exception if the metric and mode in TuneSearchCV differ - this is not 100% robust, but should work fine for most cases.

I hope this clears it up, sorry for the confusion. The gist is that it's only possible to either set space, metric and mode all at once in __init__ of a Searcher, or none at all.

@richardliaw (Collaborator):

Hmm ok thanks for the clarification!

What if we go with:

> If the searcher has already been initialized with a space, raise an exception if the metric and mode in TuneSearchCV differ - this is not 100% robust, but should work fine for most cases.

But also make it so that we don't actually need to specify it twice (i.e., you can leave the one in TuneSearchCV empty?)

@Yard1 (Member, Author) commented Apr 12, 2021

So the properties defined in the Searcher should overwrite the ones in TuneSearchCV? The ones in TuneSearchCV are also used for early stopping (Schedulers).

@richardliaw (Collaborator):

> So the properties defined in the Searcher should overwrite the ones in TuneSearchCV? The ones in TuneSearchCV are also used for early stopping (Schedulers).

Ah, no. OK, to make it simple, how about you just make sure they all have to match for now, and we can think about overriding later?

TuneSearchCV metric/mode has to be set, and the Scheduler/Searcher ones will also have to be set if they are passed in.

@Yard1 (Member, Author) commented Apr 27, 2021

Sure, let me get on that

@Yard1 (Member, Author) commented Apr 27, 2021

It's now using instances while enforcing consistency between TuneSearchCV and the Searcher.
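
Roughly, the enforcement amounts to something like the following hypothetical helper (an illustrative sketch only, not the actual code in tune_sklearn/tune_search.py):

def _check_searcher_consistency(searcher, metric, mode):
    # If metric/mode were already set on the user-supplied Searcher
    # instance, they must match what TuneSearchCV was configured with.
    for attr, expected in (("_metric", metric), ("_mode", mode)):
        current = getattr(searcher, attr, None)
        if current is not None and current != expected:
            raise ValueError(
                f"Searcher {attr}={current!r} conflicts with the value "
                f"{expected!r} set in TuneSearchCV.")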

Comment on lines +28 to +35
searcher = HEBOSearch()

# It is also possible to use user-defined Searchers, as long as
# they inherit from ray.tune.suggest.Searcher and have the following
# attributes: _space, _metric, _mode

tune_search = TuneSearchCV(
    clf, param_distributions, n_trials=3, search_optimization=searcher)
Collaborator:

so in this case you don't need to specify metric/mode (because it's automatic)?

@Yard1 (Member, Author):

If metric, mode and space are unspecified in the Searcher, then the ones defined in TuneSearchCV will be used (and it has defaults for those).

Comment on lines +706 to +709
if hasattr(run_args["scheduler"], "_metric") and hasattr(
        run_args["scheduler"], "_mode"):
    run_args["scheduler"]._metric = search_kwargs["metric"]
    run_args["scheduler"]._mode = search_kwargs["mode"]
Collaborator:

let's raise an issue on ray-project/ray for this

@richardliaw (Collaborator) left a comment:

Nice this looks great!

@richardliaw (Collaborator):

@Yard1 could you look into the test and see if this is related?

@Yard1 (Member, Author) commented Apr 29, 2021

@richardliaw CI is failing due to Optuna - I believe the version is wrong, I'll check - but it shouldn't be related to this.

@Yard1 (Member, Author) commented Apr 29, 2021

@richardliaw Mode is now configurable and release CI passes (it was installing a version of Optuna that was too old)

@richardliaw merged commit ed088f4 into ray-project:master on Apr 29, 2021
@richardliaw (Collaborator):

Great work @Yard1!

@Yard1 deleted the allow_any_searcher branch on April 29, 2021 at 19:49