Ray MLDataset is a distributed dataset built on Ray's ParallelIterator. The data in an MLDataset can be consumed by ML libraries on Ray, such as xgboost_on_ray or RaySGD, for distributed training.
It would be great if mars could convert a mars dataframe to a Ray MLDataset, so that mars can easily use Ray's ML libraries for distributed training.
And since both the records in a Ray MLDataset and the chunks in mars are pandas.DataFrame objects, there should be no conversion cost between a mars dataframe and a Ray MLDataset.
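To illustrate the "no conversion cost" point, here is a minimal, dependency-free sketch of the idea: if each chunk is already the record type the consumer expects, building shards is just a matter of handing out references. The function name `assign_chunks_to_shards` and the round-robin policy are assumptions made for illustration, not mars or Ray APIs, and plain dicts stand in for the pandas.DataFrame chunks.

```python
from typing import Any, List, Sequence


def assign_chunks_to_shards(chunks: Sequence[Any], num_shards: int) -> List[List[Any]]:
    """Distribute chunk objects round-robin over shards without copying them.

    Hypothetical helper: each chunk is appended to a shard as-is, so the
    shard holds the very same object and no per-record conversion happens.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    shards: List[List[Any]] = [[] for _ in range(num_shards)]
    for i, chunk in enumerate(chunks):
        shards[i % num_shards].append(chunk)  # same object, only a reference
    return shards


# Stand-ins for the pandas.DataFrame chunks of a mars dataframe.
chunks = [{"x": [1, 2]}, {"x": [3, 4]}, {"x": [5, 6]}]
shards = assign_chunks_to_shards(chunks, num_shards=2)
```

In a real cluster the chunks would live in distributed storage and be passed around by reference rather than held in a local list; that part is not shown here.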
FYI, MLDataset is planned for deprecation in Ray; we're in the process of replacing it with just Dataset (once it leaves beta).
Thanks for the information. After some offline discussion, we decided to support MLDataset too, because most of the work for supporting Ray Dataset and MLDataset is the same, and xgboost_ray/lightgbm_ray don't support Ray Dataset yet. And for older versions of Ray that don't have Dataset support, MLDataset is still useful.