Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate Mars DataFrame with Ray MLDataset #2294

Merged
merged 20 commits into from
Aug 26, 2021

Conversation

vcfgv
Copy link
Contributor

@vcfgv vcfgv commented Aug 5, 2021

What do these changes do?

In order that Mars can support converting Mars DataFrame to Ray MLDataset, the following changes were made.

  • Introduced a new class named RayMLDataset to take over the entire process where Ray MLDataset is created from ParallelIterator.
  • Introduced a fetch-like universal procedure name fetch_infos so that ray.ObjectRef is now reachable.

Furthermore, xgboost_ray is initially tested working with Mars DataFrame. Integration with Ray Dataset will come in a future PR.

Related issue number

Fixes #2230

Initially implemented mldataset api

commit before deep dive

Initially tested mars with xgboost

Code refactoring

Initially implemented ray MLDataset integration
@qinxuye qinxuye added this to In progress in DataFrame via automation Aug 5, 2021
@qinxuye qinxuye added this to PR-In progress in v0.8 Release via automation Aug 5, 2021
@qinxuye qinxuye added this to the v0.8.0a2 milestone Aug 5, 2021
@chaokunyang
Copy link
Contributor

Is mars/dataframe a better to place dataset.py?

Code refactoring and bug fixes
@vcfgv vcfgv changed the title [WIP] Integrate Mars DataFrame with Ray MLDataset [WIP] Integrate Mars DataFrame with Ray Dataset & MLDataset Aug 18, 2021
mars/dataframe/dataset.py Outdated Show resolved Hide resolved
mars/dataframe/dataset.py Outdated Show resolved Hide resolved
mars/dataframe/dataset.py Outdated Show resolved Hide resolved
mars/dataframe/dataset.py Outdated Show resolved Hide resolved
mars/dataframe/dataset.py Outdated Show resolved Hide resolved
mars/dataframe/dataset.py Outdated Show resolved Hide resolved
@vcfgv vcfgv changed the title [WIP] Integrate Mars DataFrame with Ray Dataset & MLDataset [WIP] Integrate Mars DataFrame with Ray MLDataset Aug 21, 2021
chaokunyang
chaokunyang previously approved these changes Aug 24, 2021
Copy link
Contributor

@chaokunyang chaokunyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall, left some comments.

mars/deploy/oscar/session.py Outdated Show resolved Hide resolved
mars/dataframe/tests/test_mldataset.py Outdated Show resolved Hide resolved
mars/dataframe/tests/test_mldataset.py Outdated Show resolved Hide resolved
mars/dataframe/dataset.py Outdated Show resolved Hide resolved
mars/dataframe/dataset.py Outdated Show resolved Hide resolved
mars/dataframe/dataset.py Outdated Show resolved Hide resolved
DataFrame automation moved this from In progress to Needs review Aug 24, 2021
v0.8 Release automation moved this from PR-In progress to PR-Needs review Aug 24, 2021
@vcfgv vcfgv changed the title [WIP] Integrate Mars DataFrame with Ray MLDataset Integrate Mars DataFrame with Ray MLDataset Aug 24, 2021
mars/dataframe/core.py Outdated Show resolved Hide resolved
mars/dataframe/dataset.py Outdated Show resolved Hide resolved
mars/dataframe/dataset.py Outdated Show resolved Hide resolved
mars/deploy/oscar/session.py Outdated Show resolved Hide resolved
@vcfgv vcfgv force-pushed the feature/mldataset branch 3 times, most recently from db74157 to 71e9ac6 Compare August 25, 2021 15:53
qinxuye
qinxuye previously approved these changes Aug 26, 2021
Copy link
Collaborator

@qinxuye qinxuye left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@wjsi wjsi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename all filters as field as filters imply query conditions.

Copy link
Member

@wjsi wjsi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@wjsi wjsi requested a review from qinxuye August 26, 2021 11:26
Copy link
Collaborator

@qinxuye qinxuye left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

DataFrame automation moved this from Needs review to Reviewer approved Aug 26, 2021
@qinxuye qinxuye merged commit 54f8cf9 into mars-project:master Aug 26, 2021
DataFrame automation moved this from Reviewer approved to Done Aug 26, 2021
v0.8 Release automation moved this from PR-Needs review to PR-Done Aug 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
DataFrame
  
Done
Mars on Ray
Awaiting triage
Development

Successfully merging this pull request may close these issues.

Ray MLDataset Integration
4 participants