
Implement AxClient.get_best_parameters, returning the parameter configurations on the Pareto frontier #656

Closed
josalhor opened this issue Aug 6, 2021 · 5 comments
Labels: enhancement (New feature or request), wishlist (Long-term wishlist feature requests)


josalhor commented Aug 6, 2021

Hello again!

I've reviewed 22acc17 and every change is excellent! I do, however, find one thing missing from the API: I cannot find an obvious way to get the points on the Pareto frontier. On the one hand, get_best_parameters is deprecated for multi-objective optimization, returning a random point on the Pareto frontier. On the other hand, compute_posterior_pareto_frontier requires primary and secondary metrics (the reason for which I am not sure I fully understand).

Am I missing a method/function?

Certainly, one could manually compute the Pareto frontier with get_trials_dataframe, but that doesn't look like an optimal solution.
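
Something along these lines works as a stopgap in the meantime (a rough sketch: the metric names my_metric_a / my_metric_b are placeholders, and both objectives are assumed to be maximized):

import numpy as np

# get_trials_dataframe exposes one column per metric; the metric names
# below are placeholders for your own objectives.
df = ax_client.get_trials_dataframe()
objs = df[["my_metric_a", "my_metric_b"]].to_numpy()

# A trial is on the frontier if no other trial is at least as good in
# every objective and strictly better in at least one (both maximized).
on_frontier = np.array([
    not any(np.all(q >= p) and np.any(q > p) for q in objs)
    for p in objs
])
pareto_trials = df[on_frontier]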

Edit: For reference, in #654 the example uses compute_posterior_pareto_frontier with only two objectives.

@lena-kashtelyan lena-kashtelyan changed the title Get best parameters on Pareto frontier Implement AxClient.get_best_parameters, returning the parameter configurations on the Pareto frontier Aug 6, 2021
@lena-kashtelyan lena-kashtelyan added the enhancement New feature or request label Aug 6, 2021
lena-kashtelyan commented Aug 6, 2021

> On the other hand, compute_posterior_pareto_frontier requires primary and secondary metrics (the reason for which I am not sure I fully understand)

I believe that is just so we know which two metrics we are computing the Pareto frontier for (@sdaulton, @Balandat, correct me if I'm wrong here). If so, @josalhor, you should just be able to do this as a workaround for now:

from ax.plot.pareto_utils import compute_posterior_pareto_frontier

pareto_frontier_results = compute_posterior_pareto_frontier(
    experiment=ax_client.experiment,
    primary_objective=ax_client.experiment.metrics.get("my_metric_a"),
    secondary_objective=ax_client.experiment.metrics.get("my_metric_b"),
)

Of course, this is limited to two objectives, and we should look into supporting this more conveniently in the Service API (renamed the issue accordingly).
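
If it helps, the resulting frontier can then be rendered in a notebook; I believe the plotting utilities work roughly like this (double-check the imports against your Ax version):

from ax.plot.pareto_frontier import plot_pareto_frontier
from ax.utils.notebook.plotting import render

render(plot_pareto_frontier(pareto_frontier_results, CI_level=0.90))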

@lena-kashtelyan lena-kashtelyan self-assigned this Aug 6, 2021

josalhor commented Aug 6, 2021

Hi!

> I believe that is just so we know which two metrics we are computing the Pareto frontier for

I think I didn't fully explain myself: I meant that I am unsure why compute_posterior_pareto_frontier only accepts two dimensions/objectives. In retrospect, though, the fact that the function lives in ax.plot should have given me a hint that it is an auxiliary function for plotting.

lena-kashtelyan commented:

@josalhor, that's right, it is for plotting, I believe. We'll add improving on this to our feature requests!

@lena-kashtelyan lena-kashtelyan added the wishlist Long-term wishlist feature requests label Aug 11, 2021
lena-kashtelyan commented:

Merging this into the master wishlist issue, but this will likely be taken care of sooner rather than later.


josalhor commented Aug 19, 2021

Hi again!

In the meantime, I'm sharing how I calculated the points on the frontier, in case anyone else is in the same situation or the maintainers want to take a look.

Here is the source code: https://gist.github.com/josalhor/8ca386f0902f1523e0ee4cd17e337b37

I provide a couple of implementations: a basic one with nothing fancy, and one that introduces the concept of virtual points. The latter should reduce the cost of computing the frontier when the frontier gets big but remains proportionally small relative to the number of points probed.

I'll take the opportunity to explain it. Let's define the Virtual Best Point (VBP) as the least restrictive point that would dominate each point on the frontier. Any point that Pareto-dominates the VBP automatically dominates the whole frontier, so we don't need to compare against each frontier point individually. Symmetrically, we can define a Virtual Worst Point (VWP): if the VWP dominates a new point, then the new point cannot be on the frontier.
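
To make the idea concrete, here is a minimal sketch of the two virtual points in action (not the gist code itself; it assumes every objective is maximized and represents points as numeric arrays):

import numpy as np

def dominates(a, b):
    # True if a Pareto-dominates b (assuming maximization everywhere).
    return bool(np.all(a >= b) and np.any(a > b))

def pareto_frontier(points):
    frontier = []
    vbp = None  # Virtual Best Point: component-wise best of the frontier
    vwp = None  # Virtual Worst Point: component-wise worst of the frontier
    for p in map(np.asarray, points):
        if frontier:
            # Fast reject: if even the VWP dominates p, every frontier
            # point does too, so p can be discarded without any scan.
            if dominates(vwp, p):
                continue
            # Fast accept: if p dominates the VBP, it dominates the whole
            # frontier, which therefore collapses to just p.
            if dominates(p, vbp):
                frontier, vbp, vwp = [p], p, p
                continue
        # General case: compare against each frontier point.
        if any(dominates(q, p) for q in frontier):
            continue
        frontier = [q for q in frontier if not dominates(p, q)]
        frontier.append(p)
        stacked = np.stack(frontier)
        vbp, vwp = stacked.max(axis=0), stacked.min(axis=0)
    return frontier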

If necessary, one could generalize this even further, defining virtual points that dominate only subsets of the points on the frontier. This would create a tradeoff between the overhead of maintaining the virtual points and the savings from not comparing against each frontier point unless required.

I didn't spend too much time on this, but I can't find a trivial way to reduce the complexity below O(n * k * d), where n is the number of points, k is the average number of points on the frontier, and d is the number of dimensions. Edit: I've found a way, but the solution requires thorough knowledge of the relationships between the points of the frontier.

This code is clearly not up to the standard of the library (it ignores the status quo, absolute_metrics, type hinting, etc.), but it was a fun exercise anyway.
