Allow user to specify a selection heuristic #75
Commits: more changes; typos; docs: changes to the form of history and output of report; typos; more typos; minor typos; implement changes in docs - first attempt with passing tests; add forgotten file; typo; rename the default selection heurisic; add weights to it
cc @azev77
I'm not going to say outright that this is definitely OK for us (I haven't done more than a cursory look yet), but it should be OK. Most of our implementation is in […]. As I said, this is an early "quick glance" conclusion based on what you've written here.
Thanks for the feedback, @iqml. As I say, I'm happy to check this out myself and make the necessary PR. Working on this today.
Okay, it's possible to implement the new API at TreeParzen. This branch passes tests in conjunction with the current PR on my machine under Julia 1.4. Below is a sketch of the necessary changes to TreeParzen.jl.

@yalwan-iqvia I can give more details in the final PR if you are happy with the general direction.
The rest of […] Previously there was no separate trial history in the state, but one needed to create one on every call to […]. A difference to note is that the trial history is no longer exposed to the user in the reported history. If desired, this could be remedied either by implementing […]

@yalwan-iqvia If you are happy with this I will merge the current PR, and later make a formal PR to TreeParzen.
@ablaom I've opened what you did as a PR against TreeParzen. There are a few comments there, but nothing to do with the direction of the work. This looks fine to me.
Context: JuliaAI/MLJ.jl#487 #40
Replaces: #74
Currently the implementer of a new strategy implements a `best` method for deciding how to extract the history entry corresponding to the "best" (optimal) model. All the current strategies apply the "naive" heuristic, which simply minimizes (or maximizes) the user-specified `measure` evaluations, aggregated over all samples (folds) in resampling.

This PR adds an interface point for the user to specify the "selection heuristic", with the tuning strategy surrendering that responsibility. The idea is that most "selection heuristics" would apply generically (i.e., to any tuning strategy), but the PR leaves open the possibility that some future heuristics might be strategy-specific.
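
To make the "naive" heuristic concrete, here is a minimal, self-contained sketch. It does not use the MLJTuning types: the history layout (a vector of named tuples with `model` and `per_fold` fields), the toy numbers, and the helper name `naive_best` are all assumptions made for illustration only.

```julia
using Statistics: mean

# Hypothetical history: one named tuple per evaluated model. `per_fold` holds
# the measure evaluated on each resampling fold (illustrative field names).
history = [
    (model = "depth=2", per_fold = [0.31, 0.29, 0.33]),
    (model = "depth=4", per_fold = [0.22, 0.25, 0.21]),
    (model = "depth=8", per_fold = [0.27, 0.35, 0.30]),
]

# "Naive" selection: aggregate each entry over its folds, then return the
# history entry with the smallest (or largest) aggregated value.
function naive_best(history; maximize = false)
    scores = [mean(entry.per_fold) for entry in history]
    index = maximize ? argmax(scores) : argmin(scores)
    return history[index]
end

best_entry = naive_best(history)
println(best_entry.model)
```

Because a rule of this kind only needs the history, it can in principle be shared by any tuning strategy, which is the motivation for moving it out of the strategy.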
Formally, a selection heuristic is a subtype of a new abstract type `SelectionHeuristic`, and a new concrete subtype implements a `best` method (and a trait `supports_heuristic` to indicate whether it is generic or strategy-specific). A moderately sophisticated user could add their own custom heuristics; see here.

Only a single selection heuristic, `NaiveSelection`, is introduced in the current PR.

This PR will break the externally implemented strategy `TreeParzen`. I will be happy to make the necessary PR to TreeParzen.jl to fix this. The built-in strategies `Grid` and `RandomSearch` are fixed in this PR.

Edit: The `models!` method is replaced with a non-mutating `models` method to make the whole interface "functional" (`state` objects can now be immutable, and `models` adds `new_state` to its return value).

New for users
Adds keyword `selection_heuristic=...` to the constructor of `TunedModel` instances. The default is `NaiveSelection()`, which gives the existing behaviour (with an option to specify `weights` for multiple measures).

Breaking for users
This PR also tweaks the reporting for `TunedModel` instances. If `mach` is a machine trained on a `TunedModel` instance, and `r = report(mach)`, then:
- `r.best_result` no longer exists (replaced with `r.best_history_entry`)
- `r.history` is now a vector of named tuples (true also for the internal representation of the history)
- `r.history` excludes implementation-specific "model metadata" (necessarily included in the internal history)

Main changes for the tuning strategy API
- `best` method to implement
- `result` method is renamed `extras` method with a simplified role (no need to handle "model metadata", which is automatically included in the history under the hood)
- `tuning_report` has a simplified role (no need to generate the user form of history)
- `models!` is replaced by a non-mutating `models` method with identical signature but `new_state` added to the return value (now a tuple, `(vector_of_models, new_state)`, instead of `vector_of_models`).

For details refer to the revised README.md.
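
The move from a mutating `models!` to a non-mutating `models` can be illustrated with a toy example. This is only a sketch of the pattern, not the MLJTuning signature: `ToyState`, `next_models`, and the string "models" are hypothetical names chosen for illustration.

```julia
# Hypothetical immutable state for a toy strategy that walks a fixed list of
# candidate models. `ToyState` and `next_models` are illustrative names only.
struct ToyState
    candidates::Vector{String}
    position::Int   # index of the next candidate to hand out
end

# Non-mutating: returns the batch AND a fresh state, leaving `state` untouched.
function next_models(state::ToyState, n::Int)
    stop = min(state.position + n - 1, length(state.candidates))
    batch = state.candidates[state.position:stop]
    new_state = ToyState(state.candidates, stop + 1)
    return batch, new_state
end

state = ToyState(["a", "b", "c"], 1)
batch1, state1 = next_models(state, 2)    # first two candidates
batch2, state2 = next_models(state1, 2)   # whatever remains
```

The caller threads the returned state into the next call, so the state type itself never needs to be mutable.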
To do before merging
cc @yalwan-iqvia