Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-32758][python] Remove PyFlink dependencies' upper bounds #23141

Closed
wants to merge 11 commits into from

Conversation

deepyaman
Copy link
Contributor

@deepyaman deepyaman commented Aug 4, 2023

What is the purpose of the change

This pull request loosens PyFlink dependencies to make the library compatible with a wider range of other Python data libraries, including more recent releases of major libraries like pandas, Ibis, etc.

Brief change log

  • Relax or remove the upper bounds for a lot of PyFlink dependencies
  • Replace deprecated pd.DataFrame.iteritems() with pd.DataFrame.items()

Verifying this change

Please make sure both new and modified tests in this PR follows the conventions defined in our code quality guide: https://flink.apache.org/contributing/code-style-and-quality-common.html#testing

This change is a trivial rework / code cleanup without any test coverage.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): yes
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: don't know
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? not applicable

@flinkbot
Copy link
Collaborator

flinkbot commented Aug 4, 2023

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

@deepyaman deepyaman changed the title [python] Remove PyFlink dependencies' upper bounds [FLINK-32758][python] Remove PyFlink dependencies' upper bounds Aug 5, 2023
@deepyaman deepyaman marked this pull request as ready for review August 7, 2023 19:34
Copy link
Contributor

@HuangXingBo HuangXingBo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deepyaman Thanks a lot for the feature. Looks good overall. I only have one comment. And maybe we need to trigger pipeline manually more times to validate these changes, as we met some problem in some azure machines previously due to some packages upgrade to new version.

flink-python/dev/dev-requirements.txt Outdated Show resolved Hide resolved
@HuangXingBo
Copy link
Contributor

@flinkbot run azure

1 similar comment
@dianfu
Copy link
Contributor

dianfu commented Aug 17, 2023

@flinkbot run azure

@deepyaman
Copy link
Contributor Author

@flinkbot run azure

@dianfu
Copy link
Contributor

dianfu commented Aug 24, 2023

@deepyaman Thanks a lot for the PR. Big +1 for this improvement. The PR LGTM overall.

I found that the change of Cython may break the tests, see https://dev.azure.com/dianfu/Flink/_build/results?buildId=729&view=logs&j=fba17979-6d2e-591d-72f1-97cf42797c11&t=727942b6-6137-54f7-1ef9-e66e706ea068 (I have triggered nightly tests which will run more tests. Cython tests are skipped in the PR CI) for more details.

It seems related to recently released cython 3.0.0. After changing cython>=0.29.24 to cython>=0.29.24,<3, all the tests passed:
https://dev.azure.com/dianfu/Flink/_build/results?buildId=733&view=logs&j=fba17979-6d2e-591d-72f1-97cf42797c11&t=727942b6-6137-54f7-1ef9-e66e706ea068

@deepyaman
Copy link
Contributor Author

@deepyaman Thanks a lot for the PR. Big +1 for this improvement. The PR LGTM overall.

I found that the change of Cython may break the tests, see https://dev.azure.com/dianfu/Flink/_build/results?buildId=729&view=logs&j=fba17979-6d2e-591d-72f1-97cf42797c11&t=727942b6-6137-54f7-1ef9-e66e706ea068 (I have triggered nightly tests which will run more tests. Cython tests are skipped in the PR CI) for more details.

It seems related to recently released cython 3.0.0. After changing cython>=0.29.24 to cython>=0.29.24,<3, all the tests passed: https://dev.azure.com/dianfu/Flink/_build/results?buildId=733&view=logs&j=fba17979-6d2e-591d-72f1-97cf42797c11&t=727942b6-6137-54f7-1ef9-e66e706ea068

@dianfu Sounds great! Let me know if the nightly job raises any other issues. In the meantime, I've gone ahead and updated cython>=0.29.24 to cython>=0.29.24,<3 in this PR, too, as you found necessary.

@dianfu
Copy link
Contributor

dianfu commented Aug 24, 2023

@deepyaman All the tests passed after making the changes of cython.

BTW, it seems that there is one minor problem for the latest update. It should be cython>=0.29.24,<3 instead of cython>=0.29.24,3

@deepyaman
Copy link
Contributor Author

BTW, it seems that there is one minor problem for the latest update. It should be cython>=0.29.24,<3 instead of cython>=0.29.24,3

Sorry, made a typo in one place; fixed!

@dianfu dianfu closed this in 0dd6f97 Aug 25, 2023
@deepyaman deepyaman deleted the python/relax-upper-bounds branch August 25, 2023 02:40
dianfu pushed a commit that referenced this pull request Aug 30, 2023
dianfu pushed a commit that referenced this pull request Aug 30, 2023
@deepyaman
Copy link
Contributor Author

@dianfu @HuangXingBo Do you all have an estimated timeline for when an update PyFlink 1.17 (and possibly 1.18) will be available, now that this has been merged for a while, and the corresponding Jira is closed?

RocMarshal pushed a commit to RocMarshal/flink that referenced this pull request May 9, 2024
nji302 pushed a commit to lyft/flink that referenced this pull request May 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants