[WIP] Integrate YAHPO Gym #301

pfistfl · 2022-08-05T08:36:56Z

Description of changes:

First draft for including YAHPO Gym as a BlackBoxRecipe.
This is not entirely straightforward and I might need some input from @geoalgo on how to progress.

Currently, yahpo has a nested structure: /scenario/instance where scenario is a problem set and all instances within a scenario share the same search space. (A scenario is e.g. xgboost and the instances are different datasets)
In the current design, the user would call

bb = load_blackbox("YAHPO")[<scenario>]
bb.set_instance(<instance>)

If we unnest this, this would (in total) be around 850 instances.

@geoalgo Could you perhaps do a pass / help me think about how to integrate the different designs?
I guess we might want to have one Recipe per scenario as you do in the icml_2020 recipe?
Would this bloat the Recipes?

I will list a few open to-do's:

Check what needs to be serialized / How to distribute the .onnx neural networks.
Check where the YAHPO setup (pointer to data dir etc.) needs to happen.
Add an example script.

geoalgo · 2022-08-05T09:53:56Z

Thanks @pfistfl for this PR, its really exciting to have access to all the benchmarks of yahpo!

Regarding the nesting of instances, I think it would be best to have one blackbox per family rather than have all instances into one. This is because we can then benchmark all transfer-learning methods more easily inside one family (such as iaml_xgboost for instance).

To be clear, I am suggesting the following (plus/minus task naming conventions):

bb_dict = load_blackbox("yahpo-lcbench")  # download yahpo_data/lcbench, it would also be OK to download everything if more convenient as the whole dir seems to be 1.3GB 
bb = bb_dict["Fashion-MNIST"]  # access a given dataset for a scenario, not sure how its named in yahpo
print(bb(configuration={...}))

Or to take your naming, to have load_blackbox("yahpo-{scenario}") returns a dictionary where the keys are {instance}. I do not think it would bloat the recipes as we can programatically add them (since they all have the same structure).

Does this sounds reasonable to you? If so, let me know what part you want to work on. We could also merge your code and adapt it ourselves if you think this is reasonable but don't have the time for this, up to you :-)

PS1: A small thing, could you run ./dev_setup.sh before your future pushes (this will automatically format your code at each commit, see our contribution guideline for more info)

PS2: Could you add the LICENSE header (present in any file such as this one, this is something we did not get to automate yet

pfistfl · 2022-08-05T11:07:11Z

Hi @geoalgo

One blackbox per family -> makes sense, let us do that!
Should be trivial to programmatically add those, I have started.

It might be more sensible if you take over for now since you know the in's and out's of syne-tune better.
I hope the existing code provides you with everything required to make things run.

The current setup does not allow me to specify yahpo as a github repository, so I will work on making yahpo pip-installable in the mean time.

geoalgo · 2022-08-05T12:06:35Z

It might be more sensible if you take over for now since you know the in's and out's of syne-tune better.
I hope the existing code provides you with everything required to make things run.

Ok thanks it works for me, one thing though is that I may not be able to finish before summer break (next week), I will finish the integration once I get back (end of August) if I dont get it done by then.

The current setup does not allow me to specify yahpo as a github repository, so I will work on making yahpo pip-installable in the mean time.

As you prefer, it could be nice indeed to be able to install specific versions directly from requirements. That being said the current approach should work for now.

Again thanks a lot for this work, it is very exciting to be able to benchmark syne-tune schedulers with all this new data!

pfistfl · 2022-08-06T08:11:12Z

Just a small update: yahpo is pip-installable now

mseeger · 2022-08-15T08:28:32Z

Hello, is there anything I can do to unblock while David is PTO?

pfistfl · 2022-08-19T07:37:00Z

Hi,

We planned on David taking over the PR given that he is more familiar with the design intents and internal structure of syne-tune. I guess the current task is to marry minor differences in the design of yahpo-gym and syne-tune and add tests/ examples once everything works.

mseeger · 2022-08-19T07:47:25Z

Sounds good. David wrote the benchmark repository and will be the best person to help here.

geoalgo · 2022-08-31T14:00:05Z

Just a quick update on this.
I came back from holidays and will resume this work and add an example that allows to run ASHA and others with a Yahpo benchmark.

geoalgo · 2022-09-06T09:24:29Z

Hi Florian, I sent a PR based on your initial work (#337) to integrate Yahpo to syne-tune, could you take a look? (I opened a new one as there were a few conflicts, I hope you dont mind)

pfistfl and others added 4 commits August 5, 2022 10:17

Initial draft

602055a

Merge branch 'awslabs:main' into main

fd3f89c

add requirements

45bb858

Merge branch 'main' of github.com:pfistfl/syne-tune

74bd851

pfistfl added 2 commits August 5, 2022 13:04

update

d11e7ab

formater

74c3dfe

update requirements

a592e8e

This was referenced Aug 31, 2022

Make Yahpo available in blackbox-repository and add example to run one of its benchmark with ASHA #326

Closed

Integrate Yahpo to blackbox repository, add example to show how to ru… #337

Merged

geoalgo closed this Sep 6, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] Integrate YAHPO Gym #301

[WIP] Integrate YAHPO Gym #301

pfistfl commented Aug 5, 2022 •

edited

geoalgo commented Aug 5, 2022

pfistfl commented Aug 5, 2022

geoalgo commented Aug 5, 2022

pfistfl commented Aug 6, 2022

mseeger commented Aug 15, 2022

pfistfl commented Aug 19, 2022

mseeger commented Aug 19, 2022

geoalgo commented Aug 31, 2022

geoalgo commented Sep 6, 2022

[WIP] Integrate YAHPO Gym #301

[WIP] Integrate YAHPO Gym #301

Conversation

pfistfl commented Aug 5, 2022 • edited

geoalgo commented Aug 5, 2022

pfistfl commented Aug 5, 2022

geoalgo commented Aug 5, 2022

pfistfl commented Aug 6, 2022

mseeger commented Aug 15, 2022

pfistfl commented Aug 19, 2022

mseeger commented Aug 19, 2022

geoalgo commented Aug 31, 2022

geoalgo commented Sep 6, 2022

pfistfl commented Aug 5, 2022 •

edited