Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FEAT-#3044: Create Extentions Module in Modin #6961

Merged
merged 15 commits into from Mar 1, 2024

Conversation

devin-petersohn
Copy link
Collaborator

@devin-petersohn devin-petersohn commented Feb 22, 2024

What do these changes do?

Adds extensions capability to Modin.

  • first commit message and PR title follow format outlined here

    NOTE: If you edit the PR title to match this format, you need to add another commit (even if it's empty) or amend your last commit for the CI job that checks the PR title to pick up the new PR title.

  • passes flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py
  • passes black --check modin/ asv_bench/benchmarks scripts/doc_checker.py
  • signed commit with git commit -s
  • Resolves Add support for registering a custom accessor to allow user to extend DataFrames/Series #3044
  • tests added and passing
  • module layout described at docs/development/architecture.rst is up-to-date

Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com>
Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com>
modin/extension/extensions.py Fixed Show resolved Hide resolved
@devin-petersohn devin-petersohn changed the title FEAT-#6960: Create Exentions Module in Modin FEAT-#3044: Create Exentions Module in Modin Feb 22, 2024
@YarShev YarShev changed the title FEAT-#3044: Create Exentions Module in Modin FEAT-#3044: Create Extentions Module in Modin Feb 23, 2024
@@ -0,0 +1,11 @@
from .extensions import (
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we move modin/extension to modin/pandas/api/extensions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am imagining eventually adding API extensions for numpy and even others.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sfc-gh-dpetersohn Users will most likely look for these functions in the same import path as in pandas. I think this is a good enough reason to make from modin.pandas.api.extensions import register_* work.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we wanted to add other API extentions, say for numpy, we would put them in modin.numpy.api.extensions. @sfc-gh-dpetersohn, what do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we can move it.

modin/extension/extensions.py Outdated Show resolved Hide resolved
modin/extension/extensions.py Outdated Show resolved Hide resolved
modin/extension/extensions.py Outdated Show resolved Hide resolved
modin/extension/extensions.py Outdated Show resolved Hide resolved
modin/pandas/dataframe.py Show resolved Hide resolved
sfc-gh-dpetersohn and others added 4 commits February 23, 2024 09:44
Co-authored-by: Iaroslav Igoshev <Poolliver868@mail.ru>
Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com>
Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com>
Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com>
modin/extension/__init__.py Outdated Show resolved Hide resolved
.github/workflows/ci.yml Outdated Show resolved Hide resolved
modin/extension/__init__.py Outdated Show resolved Hide resolved
modin/extension/extensions.py Outdated Show resolved Hide resolved
modin/extension/test/test_extension_series.py Outdated Show resolved Hide resolved
modin/extension/test/test_extension_pd.py Outdated Show resolved Hide resolved
modin/extension/test/test_extension_dataframe.py Outdated Show resolved Hide resolved
sfc-gh-dpetersohn and others added 4 commits February 26, 2024 09:25
Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com>
Co-authored-by: Iaroslav Igoshev <Poolliver868@mail.ru>
Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com>
Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com>
@YarShev
Copy link
Collaborator

YarShev commented Feb 26, 2024

Just in case, CI is failing for now and we are working on that in #6969.

Copy link
Collaborator

@YarShev YarShev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@devin-petersohn, LGTM, but let's wait for #6969 to get in first. Do you want to add a doc page about the extentions in a separate PR?

@devin-petersohn
Copy link
Collaborator Author

Sounds great @YarShev. I think we can leave this undocumented for the moment since I put some extensive docstrings in the PR. Eventually those should be the docs, but we can enable it in a separate PR.

@YarShev
Copy link
Collaborator

YarShev commented Feb 26, 2024

@devin-petersohn, could you rebase on current master to rerun CI? It should pass after that.

@sfc-gh-dpetersohn sfc-gh-dpetersohn enabled auto-merge (squash) February 27, 2024 17:00
Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com>
@@ -650,6 +650,7 @@ jobs:
mkdir -p "${CONDA_PKGS_DIR}_do_not_cache" && \
find "${CONDA_PKGS_DIR}" -mindepth 1 -maxdepth 1 -type d -exec mv {} "${CONDA_PKGS_DIR}_do_not_cache" \;
if: matrix.os == 'windows'
- run: ${{ matrix.execution.shell-ex }} modin/extensions/test
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@devin-petersohn you added a test to the place where ray was already stopped. You need to add running tests a couple of lines higher.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thank you!

Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com>
@sfc-gh-dpetersohn
Copy link
Contributor

This is finally passing and ready for final review!

Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com>
Signed-off-by: Igoshev, Iaroslav <iaroslav.igoshev@intel.com>
@YarShev
Copy link
Collaborator

YarShev commented Mar 1, 2024

@sfc-gh-dpetersohn, I had some more questions on the changes but not to waste time I went ahead and pushed some changes. Hope, it is fine to you.

@sfc-gh-dpetersohn sfc-gh-dpetersohn merged commit cee9b98 into modin-project:master Mar 1, 2024
35 checks passed
@sfc-gh-dpetersohn
Copy link
Contributor

Yes, of course this is fine with me @YarShev 😄. I am always forgetting the license headers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for registering a custom accessor to allow user to extend DataFrames/Series
4 participants