[SPARK-50647][INFRA] Add a daily build for PySpark with old dependencies #49267
pyarrow 10.0 fails the whole pyspark https://github.com/zhengruifeng/spark/actions/runs/12464102622/job/34787749014
pyarrow 11.0 fails PS
pyarrow 12 also fails PS
pyarrow 13 also fails PS
pyarrow 16 also fails PS
Will send a separate PR to upgrade the minimum requirement of pyarrow to 11.0.0
### What changes were proposed in this pull request?
Upgrade the minimum version of `pyarrow` to 11.0.0

### Why are the changes needed?
according to my test in #49267, pyspark with `pyarrow=10.0.0` has already been broken
- pyspark-sql failed
- pyspark-connect failed
- pyspark-pandas failed

see https://github.com/zhengruifeng/spark/actions/runs/12464102622/job/34787749014

### Does this PR introduce _any_ user-facing change?
doc changes

### How was this patch tested?
ci

### Was this patch authored or co-authored using generative AI tooling?
no

Closes #49282 from zhengruifeng/mini_arrow_11.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
thanks, merged to master
What changes were proposed in this pull request?
Add a daily build for PySpark with old dependencies
Why are the changes needed?
To guard the installation described in https://apache.github.io/spark/api/python/getting_started/install.html
The installation guide is outdated:
pyspark-sql/connect requires
-- pyarrow>=11.0
-- numpy>=1.21
-- pandas>=2.0.0
pyspark-pandas requires even newer versions of pandas/pyarrow/numpy
-- pyarrow>=11.0
-- numpy>=1.22.4
-- pandas>=2.2.0
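The minimums listed above can be checked against an environment with a small script. The following is only a sketch: the helper names (`parse_version`, `meets_minimum`, `unmet_requirements`, `installed_versions`) are hypothetical and are not part of this PR or of Spark, and the version comparison is a simplified numeric one that ignores pre-release suffixes.

```python
import re
from importlib import metadata

# Minimum versions copied from the lists above.
SQL_CONNECT_MINIMUMS = {"pyarrow": "11.0", "numpy": "1.21", "pandas": "2.0.0"}
PANDAS_API_MINIMUMS = {"pyarrow": "11.0", "numpy": "1.22.4", "pandas": "2.2.0"}


def parse_version(version):
    """Turn '1.22.4' into (1, 22, 4).

    Keeps the leading digits of each dotted component and stops at the
    first component without any (for example 'dev0').
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum` (numeric parts only)."""
    have, need = parse_version(installed), parse_version(minimum)
    width = max(len(have), len(need))
    # Pad with zeros so that '11.0' and '11.0.0' compare as equal.
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def unmet_requirements(installed, minimums):
    """Map each package below its minimum to (installed_or_None, minimum)."""
    unmet = {}
    for name, minimum in minimums.items():
        version = installed.get(name)
        if version is None or not meets_minimum(version, minimum):
            unmet[name] = (version, minimum)
    return unmet


def installed_versions(names):
    """Look up installed versions; packages that are not installed are omitted."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass
    return found
```

For example, `unmet_requirements(installed_versions(SQL_CONNECT_MINIMUMS), SQL_CONNECT_MINIMUMS)` returns an empty dict when the current environment satisfies the pyspark-sql/connect minimums.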
This PR excludes PS: we can either
Does this PR introduce any user-facing change?
no, infra-only
How was this patch tested?
PR build with
https://github.com/zhengruifeng/spark/runs/34827211339
Was this patch authored or co-authored using generative AI tooling?
no