GH-36730: [Python] Add support for Cython 3.0.0 #37097

Merged
AlenkaF merged 41 commits into apache:main on Sep 21, 2023

danepitkin
Contributor

@danepitkin danepitkin commented Aug 9, 2023

Rationale for this change

Cython 3.0.0 is the latest release. PyArrow should work with Cython 3.0.0. Cython 3 is not enabled in this diff.

What changes are included in this PR?

Note:

Are these changes tested?

Yes.

Are there any user-facing changes?

Yes.

@github-actions

github-actions bot commented Aug 9, 2023

Thanks for opening a pull request!

If this is not a minor PR, could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

In the case of PARQUET issues on JIRA the title also supports:

PARQUET-${JIRA_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

See also:

@danepitkin danepitkin changed the title from "WIP: [Python] Add support for Cython 3.0.0" to "Work In Progress: [Python] Add support for Cython 3.0.0" Aug 9, 2023
@github-actions github-actions bot added the awaiting committer review label and removed the awaiting review label Aug 11, 2023
@danepitkin
Contributor Author

danepitkin commented Aug 11, 2023

Potentially related cloudpickle issue? cloudpipe/cloudpickle#506

Edit: Nope. Cython 3.0.0 changed the default of the compiler directive "binding" from False to True. Setting it back to False fixes the cloudpickle test.
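
For readers unfamiliar with the directive: `binding` can be set per module with a `# cython: binding=False` header comment, or for a whole build through the `compiler_directives` argument of `cythonize`. Below is a minimal sketch of the latter, assuming a setuptools-style build. It is an illustration only, not the change made in this PR, and `COMPILER_DIRECTIVES` and `build_extensions` are hypothetical names.

```python
# Illustrative only: one way to pin Cython's "binding" directive back to the
# pre-3.0 default. The names below are hypothetical, not PyArrow build code.
COMPILER_DIRECTIVES = {"binding": False}


def build_extensions(sources):
    """Cythonize `sources` with binding disabled (requires Cython installed)."""
    # Imported lazily so that merely defining this helper does not need Cython.
    from Cython.Build import cythonize

    return cythonize(sources, compiler_directives=COMPILER_DIRECTIVES)
```

The helper would typically be called from a `setup.py`, passing the list of `.pyx` files as `sources`.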

@kou kou changed the title from "Work In Progress: [Python] Add support for Cython 3.0.0" to "GH-36730: Work In Progress: [Python] Add support for Cython 3.0.0" Aug 11, 2023
@kou
Member

kou commented Aug 11, 2023

I've copied the PR description from #36745.
We should update the PR description before we merge this PR.

@danepitkin
Contributor Author

@github-actions crossbow submit python

@github-actions

Revision: c8e6a19

Submitted crossbow builds: ursacomputing/crossbow @ actions-ba5c455e80

Task Status
example-python-minimal-build-fedora-conda Github Actions
example-python-minimal-build-ubuntu-venv Github Actions
python-sdist Github Actions
test-conda-python-3.10 Github Actions
test-conda-python-3.10-hdfs-2.9.2 Github Actions
test-conda-python-3.10-hdfs-3.2.1 Github Actions
test-conda-python-3.10-pandas-latest Github Actions
test-conda-python-3.10-pandas-nightly Github Actions
test-conda-python-3.10-spark-v3.4.1 Github Actions
test-conda-python-3.10-substrait Github Actions
test-conda-python-3.11 Github Actions
test-conda-python-3.11-dask-latest Github Actions
test-conda-python-3.11-dask-upstream_devel Github Actions
test-conda-python-3.11-hypothesis Github Actions
test-conda-python-3.11-pandas-upstream_devel Github Actions
test-conda-python-3.11-spark-master Github Actions
test-conda-python-3.8 Github Actions
test-conda-python-3.8-pandas-1.0 Github Actions
test-conda-python-3.8-spark-v3.4.1 Github Actions
test-conda-python-3.9 Github Actions
test-conda-python-3.9-pandas-latest Github Actions
test-cuda-python Github Actions
test-debian-11-python-3 Azure
test-fedora-35-python-3 Azure
test-ubuntu-20.04-python-3 Azure
test-ubuntu-22.04-python-3 Github Actions
verify-rc-source-python-linux-almalinux-8-amd64 Github Actions
verify-rc-source-python-linux-conda-latest-amd64 Github Actions
verify-rc-source-python-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-python-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-python-macos-amd64 Github Actions
verify-rc-source-python-macos-arm64 Github Actions
verify-rc-source-python-macos-conda-amd64 Github Actions

@danepitkin
Contributor Author

test-cuda-python and test-ubuntu-20.04-python-3 are failing due to this Cython issue: cython/cython#5552

@danepitkin danepitkin changed the title from "GH-36730: Work In Progress: [Python] Add support for Cython 3.0.0" to "GH-36730: [Python] Add support for Cython 3.0.0" Aug 15, 2023
@github-actions

Revision: 6f32fcf

Submitted crossbow builds: ursacomputing/crossbow @ actions-165015566a

Task Status
test-conda-python-3.10 Github Actions
test-conda-python-3.10-cython2 Github Actions
test-conda-python-3.10-hdfs-2.9.2 Github Actions
test-conda-python-3.10-hdfs-3.2.1 Github Actions
test-conda-python-3.10-pandas-latest Github Actions
test-conda-python-3.10-pandas-nightly Github Actions
test-conda-python-3.10-spark-v3.4.1 Github Actions
test-conda-python-3.10-substrait Github Actions
test-conda-python-3.11 Github Actions
test-conda-python-3.11-dask-latest Github Actions
test-conda-python-3.11-dask-upstream_devel Github Actions
test-conda-python-3.11-hypothesis Github Actions
test-conda-python-3.11-pandas-upstream_devel Github Actions
test-conda-python-3.11-spark-master Github Actions
test-conda-python-3.8 Github Actions
test-conda-python-3.8-pandas-1.0 Github Actions
test-conda-python-3.8-spark-v3.4.1 Github Actions
test-conda-python-3.9 Github Actions
test-conda-python-3.9-pandas-latest Github Actions
test-cuda-python Github Actions
test-debian-11-python-3 Azure
test-fedora-35-python-3 Azure
test-ubuntu-20.04-python-3 Azure
test-ubuntu-22.04-python-3 Github Actions

@danepitkin
Contributor Author

I've moved the enablement of Cython 3 into a separate diff: #37743

Now, we can merge Cython 3 support and then enable Cython 3 once Cython 3.0.3 is released.

@danepitkin
Contributor Author

IMO this can be merged now!

"""A Dataset created from a list of paths on a particular filesystem.
@staticmethod
def from_paths(paths, schema=None, format=None, filesystem=None,
               partitions=None, root_partition=None):
Member

Hmm... classmethod doesn't work correctly on Cython 3 anymore? Is it a known issue? Or is this change actually not needed?

Contributor Author

classmethod does work, I can change it back. I tried switching to staticmethod when the numpydoc checks were failing, but it turns out it was the comment that was causing the numpydoc parsing errors. I thought staticmethod was a slight improvement since cls wasn't actually used in the classmethod.
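
The point about `cls` being unused can be sketched in plain Python. This is a minimal illustration with a hypothetical stand-in class and method names, not the actual `pyarrow.dataset` code: when the constructor never touches `cls`, the two decorators give the same result.

```python
class FileSystemDataset:
    """Hypothetical stand-in, not pyarrow.dataset.FileSystemDataset."""

    def __init__(self, paths):
        self.paths = list(paths)

    @classmethod
    def from_paths_classmethod(cls, paths):
        # `cls` is received but never used, so nothing depends on the caller's class.
        return FileSystemDataset(paths)

    @staticmethod
    def from_paths_staticmethod(paths):
        # Same behavior without the unused implicit argument.
        return FileSystemDataset(paths)


a = FileSystemDataset.from_paths_classmethod(["a.parquet", "b.parquet"])
b = FileSystemDataset.from_paths_staticmethod(["a.parquet", "b.parquet"])
```

The two would differ only if the classmethod built its result with `cls(...)`, in which case subclasses calling it would get instances of themselves.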

Member

@AlenkaF AlenkaF left a comment

Thank you for working on this Dane!! Great work 👍

@danepitkin danepitkin added the Priority: Blocker label (marks a blocker for the release) Sep 19, 2023
@AlenkaF
Member

AlenkaF commented Sep 21, 2023

@github-actions crossbow submit -g python

@github-actions

parse() missing 1 required positional argument: 'config'
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/6259391389

@AlenkaF
Member

AlenkaF commented Sep 21, 2023

@github-actions crossbow submit python

@github-actions

parse() missing 1 required positional argument: 'config'
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/6259489413

@kou
Member

kou commented Sep 21, 2023

We need to fix #37803 to use @github-actions crossbow submit .... :<

@AlenkaF
Member

AlenkaF commented Sep 21, 2023

> We need to fix #37803 to use @github-actions crossbow submit .... :<

Thanks! Got it now =)

@AlenkaF
Member

AlenkaF commented Sep 21, 2023

@github-actions crossbow submit -g python

@github-actions

Revision: 3c4f581

Submitted crossbow builds: ursacomputing/crossbow @ actions-d63bfc88b3

Task Status
test-conda-python-3.10 Github Actions
test-conda-python-3.10-cython2 Github Actions
test-conda-python-3.10-hdfs-2.9.2 Github Actions
test-conda-python-3.10-hdfs-3.2.1 Github Actions
test-conda-python-3.10-pandas-latest Github Actions
test-conda-python-3.10-pandas-nightly Github Actions
test-conda-python-3.10-spark-v3.4.1 Github Actions
test-conda-python-3.10-substrait Github Actions
test-conda-python-3.11 Github Actions
test-conda-python-3.11-dask-latest Github Actions
test-conda-python-3.11-dask-upstream_devel Github Actions
test-conda-python-3.11-hypothesis Github Actions
test-conda-python-3.11-pandas-upstream_devel Github Actions
test-conda-python-3.11-spark-master Github Actions
test-conda-python-3.8 Github Actions
test-conda-python-3.8-pandas-1.0 Github Actions
test-conda-python-3.8-spark-v3.4.1 Github Actions
test-conda-python-3.9 Github Actions
test-conda-python-3.9-pandas-latest Github Actions
test-cuda-python Github Actions
test-debian-11-python-3 Azure
test-fedora-35-python-3 Azure
test-ubuntu-20.04-python-3 Azure
test-ubuntu-22.04-python-3 Github Actions

@AlenkaF
Member

AlenkaF commented Sep 21, 2023

None of the failures look related, will merge. Thanks again Dane!

@AlenkaF AlenkaF merged commit e83c23b into apache:main Sep 21, 2023
53 of 58 checks passed
@AlenkaF AlenkaF removed the awaiting change review label Sep 21, 2023
@conbench-apache-arrow

After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit e83c23b.

There was 1 benchmark result indicating a performance regression:

The full Conbench report has more details. It also includes information about possible false positives for unstable benchmarks that are known to sometimes produce them.

etseidl pushed a commit to etseidl/arrow that referenced this pull request Sep 28, 2023
### Rationale for this change

Cython 3.0.0 is the latest release. PyArrow should work with Cython 3.0.0. **Cython 3 is not enabled in this diff.**

### What changes are included in this PR?

* Don't use `vector[XXX]&&`
* Add a declaration for `postincrement`
  * See also: https://cython.readthedocs.io/en/latest/src/userguide/migrating_to_cy30.html#c-postincrement-postdecrement-operator
* Ignore `C4551` warning (function call missing argument list) with MSVC
  * See also: cython/cython#4445
* Add missing `const` to `CLocation`'s static methods.
* Don't use `StopIteration` to stop generator
  * See also: https://cython.readthedocs.io/en/latest/src/userguide/migrating_to_cy30.html#python-3-syntax-semantics
* Non-extern `cdef` functions now propagate Python exceptions automatically unless explicitly labeled `noexcept`
* Function binding in Cython is now enabled by default. Class methods that are used as wrappers for pickling should be converted to static methods.
* numpydoc now validates more objects under Cython 3 than under Cython <3
  * Enum types are now validated, and some unhelpful validation checks on enums are now ignored
* Added a Cython <3 nightly CI job
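
To illustrate the `StopIteration` item above in plain Python: under PEP 479 (the default behavior since Python 3.7; the linked migration guide section covers the Cython side), a `StopIteration` raised inside a generator body is converted to `RuntimeError`, so a generator should end with `return` instead. A minimal sketch follows; the function names are hypothetical and not taken from PyArrow.

```python
def take_until_negative_bad(values):
    """Old style: tries to end the generator by raising StopIteration."""
    for v in values:
        if v < 0:
            raise StopIteration  # converted to RuntimeError under PEP 479
        yield v


def take_until_negative(values):
    """Preferred style: a plain `return` ends the generator cleanly."""
    for v in values:
        if v < 0:
            return
        yield v


result = list(take_until_negative([1, 2, -1, 3]))

try:
    list(take_until_negative_bad([1, 2, -1, 3]))
    bad_outcome = "no error"
except RuntimeError as exc:
    # Record which exception type the old style produced.
    bad_outcome = type(exc).__name__
```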

Note:
* Cython 3.0.0, 3.0.1, and 3.0.2 have an issue when compiling in debug mode: cython/cython#5552

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Yes.

* Closes: apache#36730

Lead-authored-by: Dane Pitkin <dane@voltrondata.com>
Co-authored-by: Sutou Kouhei <kou@clear-code.com>
Co-authored-by: Dane Pitkin <48041712+danepitkin@users.noreply.github.com>
Signed-off-by: AlenkaF <frim.alenka@gmail.com>
JerAguilon pushed a commit to JerAguilon/arrow that referenced this pull request Oct 23, 2023
loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request Nov 13, 2023
### Rationale for this change

The Cython 3.0.0 upgrade apache#37097 is triggering numpydoc errors for these missing docstrings.

### What changes are included in this PR?

* Docstrings added to Cython functions that omitted them

### Are these changes tested?

Yes, locally.

### Are there any user-facing changes?

User-facing documentation is added.
* Closes: apache#37217

Lead-authored-by: Dane Pitkin <dane@voltrondata.com>
Co-authored-by: Dane Pitkin <48041712+danepitkin@users.noreply.github.com>
Co-authored-by: Alenka Frim <AlenkaF@users.noreply.github.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request Nov 13, 2023
dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
Development

Successfully merging this pull request may close these issues.

[CI][Python] Cython 3.0 seems to make our build of pyarrow fail
5 participants