Support dask dataframe in train_test_split #351

Closed
TomAugspurger opened this issue Aug 31, 2018 · 3 comments

Comments

@TomAugspurger TomAugspurger commented Aug 31, 2018

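Easy way: call to_dask_array(lengths=True) on the dataframe before passing it to train_test_split. This will take some computation, since the partition lengths have to be computed.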

The harder (maybe not too hard) way to do this would be to directly support dask dataframes.

@TomAugspurger TomAugspurger commented Aug 31, 2018

Yeah, I think so. It will just take a bit of work to ensure that we split multiple dataframes the same way.

@mrocklin mrocklin commented Aug 31, 2018

TomAugspurger added a commit to TomAugspurger/dask-ml that referenced this issue Aug 31, 2018
TomAugspurger added a commit that referenced this issue Sep 4, 2018
* ENH: Support dask dataframe in train_test_split

Closes #351