Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gather CV Results as Completed #433

Merged
merged 88 commits into from May 20, 2019

Conversation

@vecchp
Copy link
Contributor

commented Dec 1, 2018

In progress PR to address #415.

Paul Vecchio
@TomAugspurger

This comment has been minimized.

Copy link
Member

commented Dec 1, 2018

The CI failures can probably be ignored. I'll try to get them fixed next week.

@vecchp can you think of a way to test this? The best I've come up with is a custom estimator class, that when given a specific hyper parameter combination does something like

get_worker inside the task and closes the worker, to simulate it going away. https://distributed.dask.org/en/latest/api.html#distributed.get_worker. Haven't tested that out yet.

@jcrist will you have time next week to give feedback on this approach?

Paul Vecchio
@vecchp

This comment has been minimized.

Copy link
Contributor Author

commented Dec 2, 2018

@TomAugspurger I'll think a little about the tests, though your current suggestion seems reasonable.

dask_ml/model_selection/_search.py Outdated Show resolved Hide resolved
dask_ml/model_selection/_search.py Outdated Show resolved Hide resolved
dask_ml/model_selection/_search.py Outdated Show resolved Hide resolved
dask_ml/model_selection/_search.py Outdated Show resolved Hide resolved
dask_ml/model_selection/_search.py Outdated Show resolved Hide resolved

Paul Vecchio added some commits Dec 5, 2018

Paul Vecchio
Paul Vecchio
Paul Vecchio
Paul Vecchio
@vecchp

This comment has been minimized.

Copy link
Contributor Author

commented Dec 5, 2018

@jcrist @TomAugspurger made some changes w.r.t the PR comments. On my end all tests pass except for test_cache_cv. I believe the failure is due to the cache being hit more than the previous expected values given the current changes.

build_graph is now broken into build_cv_graph & build_result_graph, I had to return main_token, X_name, y_name, weights, fit_params and then pass it to build_result_graph which seems a bit messy, but I haven't thought of a better way. What are your thoughts on these changes now?

I also haven't got around to confirming if the futures no longer need to be canceled or writing robust tests for the new as_completed feature.

vecchp added some commits Feb 15, 2019

@vecchp

This comment has been minimized.

Copy link
Contributor Author

commented Apr 18, 2019

@TomAugspurger checking in again to see what else should be done. All tests pass.

@TomAugspurger

This comment has been minimized.

Copy link
Member

commented Apr 18, 2019

@TomAugspurger

This comment has been minimized.

Copy link
Member

commented Apr 18, 2019

All tests pass.

It seems like coverage is decreasing. Does that make sense, or is this a false positive? https://codecov.io/gh/dask/dask-ml/compare/cfa57e5789fc29bdcee5d846b5116c07f5fbe292...9da5b8f0cfa5960a7ee70f216a684bff96a450b2/changes

@vecchp

This comment has been minimized.

Copy link
Contributor Author

commented May 4, 2019

All tests pass.

It seems like coverage is decreasing. Does that make sense, or is this a false positive? https://codecov.io/gh/dask/dask-ml/compare/cfa57e5789fc29bdcee5d846b5116c07f5fbe292...9da5b8f0cfa5960a7ee70f216a684bff96a450b2/changes

Makes sense, looks like it's dead code given the as_completed implementation no longer uses it. I'll update.

vecchp added some commits May 4, 2019

@vecchp

This comment has been minimized.

Copy link
Contributor Author

commented May 4, 2019

@TomAugspurger looks like all stages pass. I think codecov was removed but that should be back to normal as well given the dead code is now gone.

try:
from cytoolz import get, pluck
except ImportError: # pragma: no cover
from toolz import get, pluck

try:
from dask.distributed import as_completed

This comment has been minimized.

Copy link
@stsievert

stsievert May 4, 2019

Member

Is this try/except needed? distributed is a dependency now (#466).

This comment has been minimized.

Copy link
@vecchp

vecchp May 4, 2019

Author Contributor

I'm fine with removing the optional check. Previous guidance was to make it option as it wasn't a dependency at the time.

@vecchp

This comment has been minimized.

Copy link
Contributor Author

commented May 7, 2019

@stsievert all optional distributed checks in this PR should be removed now.

@TomAugspurger any guidance left to get this merged?

@TomAugspurger
Copy link
Member

left a comment

Thanks @vecchp, and apologies once again for the delay on reviewing.

This looks quite good. My only request is a small compatibility layer for build_graph.

return fit_params


def build_cv_graph(

This comment has been minimized.

Copy link
@TomAugspurger

TomAugspurger May 20, 2019

Member

Can we provide a compatibility shim with the old name? I believe that TPOT is using build_graph https://github.com/EpistasisLab/tpot/blob/b626271e6b5896a73fb9d7d29bebc7aa9100772e/tpot/gp_deap.py#L429

This comment has been minimized.

Copy link
@TomAugspurger

TomAugspurger May 20, 2019

Member

I think we can just add back build_graph with a comment that it's going to be removed (not even a deprecation warning). This is a private module, but I was lazy when adding dask support to TPOT. Once this is in, I can cut a release and make a PR to tpot updating their call.

@TomAugspurger

This comment has been minimized.

Copy link
Member

commented May 20, 2019

I'm looking into the compatibility for tpot now. Will push an update.

@vecchp

This comment has been minimized.

Copy link
Contributor Author

commented May 20, 2019

Sounds good!

TomAugspurger added some commits May 20, 2019

@TomAugspurger TomAugspurger merged commit 954d41f into dask:master May 20, 2019

6 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
dask.dask-ml Build #20190520.10 succeeded
Details
dask.dask-ml (linux earliest) linux earliest succeeded
Details
dask.dask-ml (linux linux) linux linux succeeded
Details
dask.dask-ml (linux sklearnDev) linux sklearnDev succeeded
Details
dask.dask-ml (win64 win-64) win64 win-64 succeeded
Details
@TomAugspurger

This comment has been minimized.

Copy link
Member

commented May 20, 2019

Thanks @vecchp!

@vecchp

This comment has been minimized.

Copy link
Contributor Author

commented May 21, 2019

@TomAugspurger thanks so much for getting this in!

@vecchp vecchp deleted the vecchp:gather_results_as_completed branch May 21, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.