Support dask dataframe in train_test_split #351
Perhaps http://dask.pydata.org/en/latest/dataframe-api.html#dask.dataframe.DataFrame.random_split can be useful here?
You can provide a seed to the method (its `random_state` argument) and it will split multiple dataframes the same way (assuming that their chunk structure is the same)…
On Fri, Aug 31, 2018 at 1:15 PM, Tom Augspurger wrote: Yeah, I think so. Will just take a bit of work to ensure that we split multiple dataframes the same.