[SPARK-43348][PYTHON] Support Python 3.8 in PyPy3 #41024
Thank you, @HyukjinKwon.
Thank you, @Yikun.
        from pickle import _Pickler as Pickler  # noqa: F401
else:
    import pickle  # noqa: F401
    from _pickle import Pickler  # noqa: F401
From PyPy3.8 on, `_pickle` is removed.
-    import pickle  # noqa: F401
-    from _pickle import Pickler  # noqa: F401
+    import pickle  # noqa: F401
+    from pickle import Pickler  # noqa: F401
This is the same as upstream.
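A minimal sketch of why the replacement import is portable, assuming only the standard library: `pickle.Pickler` is the public name, which CPython binds to its C accelerator when that is available, and which resolves to whatever implementation the interpreter ships otherwise. The helper `dumps_with_pickler` is a hypothetical name used for illustration; it is not part of cloudpickle or PySpark.

```python
import io
import pickle
import platform
from pickle import Pickler  # public name; does not depend on the private _pickle module


def dumps_with_pickler(obj, protocol=pickle.HIGHEST_PROTOCOL):
    """Serialize obj through the public Pickler class (hypothetical helper)."""
    buffer = io.BytesIO()
    Pickler(buffer, protocol).dump(obj)
    return buffer.getvalue()


if __name__ == "__main__":
    payload = dumps_with_pickler({"answer": 42})
    # Round-trip to confirm the bytes are a valid pickle on this interpreter.
    print(platform.python_implementation(), pickle.loads(payload))
```

Importing from `pickle` rather than `_pickle` leaves the choice of backend to the interpreter, so the same line works whether or not the private accelerator module exists.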
Could you review once more, @HyukjinKwon and @Yikun?
Thank you, @HyukjinKwon!
Merged to master. Thank you all!
…st only with PyPy 3.8

### What changes were proposed in this pull request?

This PR is a followup of #41024 that skips the test only with PyPy 3.8.

### Why are the changes needed?

To narrow the scope of the skipped test.

### Does this PR introduce _any_ user-facing change?

No, test-only.

### How was this patch tested?

CI in this PR should verify the change.

Closes #41085 from HyukjinKwon/SPARK-43354-followup.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
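One way to skip a test only on PyPy 3.8 can be sketched with `unittest.skipIf`. This is an illustration under stated assumptions, not the condition used in the Spark followup: the `IS_PYPY_38` predicate, the `DayTimeIntervalExampleTest` class, and the `run_example` helper are all hypothetical names.

```python
import platform
import sys
import unittest

# Assumption: an illustrative predicate, not the exact check used in Spark.
IS_PYPY_38 = (
    platform.python_implementation() == "PyPy"
    and sys.version_info[:2] == (3, 8)
)


class DayTimeIntervalExampleTest(unittest.TestCase):  # hypothetical test case
    @unittest.skipIf(IS_PYPY_38, "Fails on PyPy 3.8; tracked in SPARK-43354")
    def test_runs_everywhere_else(self):
        # Stand-in body; the real test exercises DataFrame creation from pandas.
        self.assertEqual(1 + 1, 2)


def run_example():
    """Run the example test case and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(DayTimeIntervalExampleTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_example()
```

Scoping the skip to one interpreter and version keeps the test running on CPython and on other PyPy versions, which matches the stated aim of narrowing what is skipped.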
### What changes were proposed in this pull request?

This PR has two goals.

1. Make PySpark support Python 3.8+ with PyPy3
2. Upgrade PyPy3 to Python 3.8 in our GitHub Action Infra Image to enable test coverage

Note that there was one failure in the `test_create_dataframe_from_pandas_with_day_time_interval` test case. This PR skips the test case and SPARK-43354 will recover it after further investigation.

### Why are the changes needed?

Previously, PySpark failed in the PyPy3 `Python 3.8` environment.

```
pypy3 version is: Python 3.8.16 (a9dbdca6fc3286b0addd2240f11d97d8e8de187a, Dec 29 2022, 11:45:13)
[PyPy 7.3.11 with GCC 10.2.1 20210130 (Red Hat 10.2.1-11)]
Starting test(pypy3): pyspark.sql.tests.pandas.test_pandas_cogrouped_map (temp output: /__w/spark/spark/python/target/f1cacde7-d369-48cf-a8ea-724c42872020/pypy3__pyspark.sql.tests.pandas.test_pandas_cogrouped_map__rxih6dqu.log)
Traceback (most recent call last):
  File "/usr/local/pypy/pypy3.8/lib/pypy3.8/runpy.py", line 188, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/local/pypy/pypy3.8/lib/pypy3.8/runpy.py", line 111, in _get_module_details
    __import__(pkg_name)
  File "/__w/spark/spark/python/pyspark/__init__.py", line 59, in <module>
    from pyspark.rdd import RDD, RDDBarrier
  File "/__w/spark/spark/python/pyspark/rdd.py", line 54, in <module>
    from pyspark.java_gateway import local_connect_and_auth
  File "/__w/spark/spark/python/pyspark/java_gateway.py", line 32, in <module>
    from pyspark.serializers import read_int, write_with_length, UTF8Deserializer
  File "/__w/spark/spark/python/pyspark/serializers.py", line 69, in <module>
    from pyspark import cloudpickle
  File "/__w/spark/spark/python/pyspark/cloudpickle/__init__.py", line 1, in <module>
    from pyspark.cloudpickle.cloudpickle import *  # noqa
  File "/__w/spark/spark/python/pyspark/cloudpickle/cloudpickle.py", line 56, in <module>
    from .compat import pickle
  File "/__w/spark/spark/python/pyspark/cloudpickle/compat.py", line 13, in <module>
    from _pickle import Pickler  # noqa: F401
ModuleNotFoundError: No module named '_pickle'
```

To support Python 3.8 in PyPy3.

- From PyPy3.8, `_pickle` is removed.
  - cloudpipe/cloudpickle#458
- We need this change.
  - cloudpipe/cloudpickle#469

### Does this PR introduce _any_ user-facing change?

This is additional support.

### How was this patch tested?

Pass the CIs.

Closes apache#41024 from dongjoon-hyun/SPARK-43348.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
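To check ahead of time whether an interpreter would hit the `ModuleNotFoundError` shown in the traceback, a small diagnostic along these lines may help. It is a sketch that assumes only the standard library; `describe_pickle_backend` is a hypothetical helper name, not something provided by PySpark or cloudpickle.

```python
import importlib.util
import platform
import sys


def describe_pickle_backend():
    """Report the interpreter and whether the private _pickle module can be found."""
    has_private_module = importlib.util.find_spec("_pickle") is not None
    return {
        "implementation": platform.python_implementation(),
        "version": "%d.%d" % sys.version_info[:2],
        "has__pickle": has_private_module,
    }


if __name__ == "__main__":
    # A False value for has__pickle means `from _pickle import Pickler` would fail here.
    print(describe_pickle_backend())
```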