Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FEAT-#6970: Implement to/from_ray_dataset functions #6971

Merged
merged 14 commits into from Mar 5, 2024

Conversation

Retribution98
Copy link
Collaborator

What do these changes do?

  • first commit message and PR title follow format outlined here

    NOTE: If you edit the PR title to match this format, you need to add another commit (even if it's empty) or amend your last commit for the CI job that checks the PR title to pick up the new PR title.

  • passes flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py
  • passes black --check modin/ asv_bench/benchmarks scripts/doc_checker.py
  • signed commit with git commit -s
  • Resolves Implement to/from_ray_dataset functions #6970
  • tests added and passing
  • module layout described at docs/development/architecture.rst is up-to-date

Signed-off-by: Kirill Suvorov <kirill.suvorov@intel.com>
modin/pandas/dataframe.py Fixed Show fixed Hide fixed
modin/pandas/test/test_io.py Fixed Show fixed Hide fixed
modin/pandas/test/test_io.py Fixed Show fixed Hide fixed
modin/pandas/test/test_io.py Outdated Show resolved Hide resolved
modin/pandas/test/test_io.py Outdated Show resolved Hide resolved
Co-authored-by: Anatoly Myachev <anatoliimyachev@mail.com>
modin/pandas/dataframe.py Fixed Show fixed Hide fixed
@YarShev YarShev changed the title FEAT-#6970: Implement to/from_ray functions FEAT-#6970: Implement to/from_ray_dataset functions Feb 28, 2024
docs/ecosystem.rst Outdated Show resolved Hide resolved
docs/ecosystem.rst Outdated Show resolved Hide resolved
docs/ecosystem.rst Outdated Show resolved Hide resolved
docs/ecosystem.rst Outdated Show resolved Hide resolved
modin/core/execution/ray/generic/io/io.py Outdated Show resolved Hide resolved
modin/core/execution/ray/generic/io/io.py Outdated Show resolved Hide resolved
modin/pandas/test/test_io.py Outdated Show resolved Hide resolved
modin/pandas/dataframe.py Show resolved Hide resolved
Copy link
Collaborator

@YarShev YarShev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a comment. Otherwise, looks good!

modin/pandas/io.py Outdated Show resolved Hide resolved
@YarShev
Copy link
Collaborator

YarShev commented Mar 1, 2024

@Retribution98, CI fails for the new functions. Can you take a look?

@YarShev
Copy link
Collaborator

YarShev commented Mar 1, 2024

It looks like a Ray Dataset issue. We can submit this bug to them. To workaround the issue for now, I guess we could use a simpler data without a Datetime index for testing.

@YarShev
Copy link
Collaborator

YarShev commented Mar 1, 2024

@anmyachev, any other comments?

@Retribution98
Copy link
Collaborator Author

It looks like a Ray Dataset issue. We can submit this bug to them. To workaround the issue for now, I guess we could use a simpler data without a Datetime index for testing.

Yes, I know about this problem because I caught it when I was testing this feature locally. This problem is not related to the use of the index, but to the use of an old version of Python from Ray, so I'm very surprised that the previous CI running passed.
In any case, this problem has already been fixed in the master branch of the Ray repository, so we can try installing ray from the nightly releases. I checked it works.

@YarShev
Copy link
Collaborator

YarShev commented Mar 1, 2024

Could you point me out to the issue they resolved? I am kind of confused that it relates to the python version Ray uses.

@Retribution98
Copy link
Collaborator Author

Could you point me out to the issue they resolved? I am kind of confused that it relates to the python version Ray uses.

This problem was resolved here.

@YarShev
Copy link
Collaborator

YarShev commented Mar 1, 2024

I see, it was related to a new pandas but not python. Thanks!

@YarShev
Copy link
Collaborator

YarShev commented Mar 1, 2024

It seems we should xfail the test until a new release of Ray is off and then enable it.

@Retribution98
Copy link
Collaborator Author

I see, it was related to a new pandas but not python. Thanks!

The pandas version, of course, sorry for this misprint.

@@ -32,7 +32,7 @@

from modin.config import PersistentPickle
from modin.logging import disable_logging
from modin.pandas.io import from_pandas, to_pandas
from modin.pandas.io import from_pandas, to_pandas, to_ray_dataset

Check notice

Code scanning / CodeQL

Cyclic import Note

Import of module
modin.pandas.io
begins an import cycle.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@YarShev
It seems that change still doesn’t work.
It looks like a false positive, CodeQL have the same PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see. Then we can revert imports back to the top.

modin/pandas/io.py Outdated Show resolved Hide resolved
modin/pandas/test/test_io.py Outdated Show resolved Hide resolved
modin/pandas/test/test_io.py Show resolved Hide resolved
modin/core/execution/ray/generic/io/io.py Outdated Show resolved Hide resolved
modin/core/io/io.py Outdated Show resolved Hide resolved
@YarShev YarShev merged commit 07a5e3b into modin-project:master Mar 5, 2024
53 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement to/from_ray_dataset functions
4 participants