[WIP] Integrate YAHPO Gym #301
Thanks @pfistfl for this PR, it's really exciting to have access to all the benchmarks of yahpo! Regarding the nesting of instances, I think it would be best to have one blackbox per family rather than putting all instances into one. That way we can benchmark all transfer-learning methods more easily inside one family (such as iaml_xgboost, for instance). To be clear, I am suggesting the following (plus/minus task naming conventions):
bb_dict = load_blackbox("yahpo-lcbench")  # download yahpo_data/lcbench; it would also be OK to download everything if more convenient, as the whole dir seems to be 1.3GB
bb = bb_dict["Fashion-MNIST"]  # access a given dataset for a scenario, not sure how it's named in yahpo
print(bb(configuration={...}))
Or to take your naming, to have
Does this sound reasonable to you? If so, let me know what part you want to work on. We could also merge your code and adapt it ourselves if you think this is reasonable but don't have the time for this, up to you :-)
PS1: A small thing, could you run
PS2: Could you add the LICENSE header (present in any file such as this one)? This is something we did not get to automate yet.
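The access pattern proposed above (one dictionary per family, one callable blackbox per instance, all sharing a search space) can be sketched in plain Python. This is only an illustration under stated assumptions: `ToyBlackbox`, `load_toy_family`, the instance names, and the quadratic objective are hypothetical stand-ins, not the syne-tune or yahpo API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Config = Dict[str, float]


@dataclass
class ToyBlackbox:
    """Hypothetical stand-in for one instance (dataset) of a family."""

    name: str
    search_space: Dict[str, Tuple[float, float]]  # shared across the family
    objective: Callable[[Config], float]

    def __call__(self, configuration: Config) -> float:
        # Reject configurations that do not cover the shared search space.
        missing = set(self.search_space) - set(configuration)
        if missing:
            raise KeyError(f"missing hyperparameters: {sorted(missing)}")
        return self.objective(configuration)


def load_toy_family(family: str) -> Dict[str, ToyBlackbox]:
    """Hypothetical loader: returns one blackbox per instance of `family`."""
    space = {"learning_rate": (1e-4, 1.0)}
    # Made-up instances; each has its optimum at a different learning rate.
    optima = {"dataset-a": 0.1, "dataset-b": 0.3}
    return {
        name: ToyBlackbox(
            name, space, lambda c, o=opt: (c["learning_rate"] - o) ** 2
        )
        for name, opt in optima.items()
    }


bb_dict = load_toy_family("toy-family")
bb = bb_dict["dataset-a"]
print(bb(configuration={"learning_rate": 0.1}))
```

Because every entry in the dictionary shares the same search space object, a transfer-learning method can iterate over `bb_dict.items()` and reuse evaluations from one instance when tuning another.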
Hi @geoalgo
One blackbox per family -> makes sense, let us do that! It might be more sensible if you take over for now since you know the ins and outs of
The current setup does not allow me to specify yahpo as a GitHub repository, so I will work on making yahpo pip-installable in the meantime.
OK thanks, that works for me. One thing though: I may not be able to finish before summer break (next week). If I don't get it done by then, I will finish the integration once I get back (end of August).
As you prefer, it could indeed be nice to be able to install specific versions directly from requirements. That being said, the current approach should work for now. Again, thanks a lot for this work, it is very exciting to be able to benchmark syne-tune schedulers with all this new data!
Just a small update: yahpo is pip-installable now.
Hello, is there anything I can do to unblock while David is on PTO?
Hi, We planned on David taking over the PR given that he is more familiar with the design intents and internal structure of
Sounds good. David wrote the benchmark repository and will be the best person to help here.
Just a quick update on this.
Hi Florian, I sent a PR based on your initial work (#337) to integrate Yahpo into syne-tune, could you take a look? (I opened a new one as there were a few conflicts, I hope you don't mind.)
Description of changes:
First draft for including YAHPO Gym as a BlackBoxRecipe. This is not entirely straightforward and I might need some input from @geoalgo on how to progress.
Currently, yahpo has a nested structure: /scenario/instance, where scenario is a problem set and all instances within a scenario share the same search space. (A scenario is e.g. xgboost and the instances are different datasets.)
In the current design, the user would call
If we unnest this, this would (in total) be around 850 instances.
@geoalgo Could you perhaps do a pass / help me think about how to integrate the different designs?
I guess we might want to have one Recipe per scenario as you do in the icml_2020 recipe? Would this bloat the Recipes?
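To make the trade-off concrete, here is a minimal sketch of collapsing flat /scenario/instance keys into one entry per scenario. The helper `group_by_scenario` and the instance names (`task-1`, `task-2`) are placeholders for illustration only. They are not real yahpo identifiers, and this is not how syne-tune registers recipes.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_scenario(keys: List[str]) -> Dict[str, List[str]]:
    """Group flat '/scenario/instance' keys into scenario -> instances."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for key in keys:
        # Split only on the first separator so instance names may contain '/'.
        scenario, instance = key.strip("/").split("/", 1)
        grouped[scenario].append(instance)
    return dict(grouped)


# Placeholder keys; real instance names would come from yahpo itself.
flat_keys = ["/lcbench/task-1", "/lcbench/task-2", "/iaml_xgboost/task-1"]
recipes = group_by_scenario(flat_keys)

for scenario, instances in recipes.items():
    # One recipe per scenario, holding all instances of that scenario.
    print(f"yahpo-{scenario}: {len(instances)} instance(s)")
```

With this grouping, the number of recipes grows with the number of scenarios rather than with the total number of instances.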
I will list a few open to-dos:
.onnx neural networks.