ARROW-15847: [Python] Building with Parquet but without Parquet encryption fails #12565
I first want to test whether I can reproduce it on CI (I think almost all the builds we run have parquet and parquet encryption either both enabled or both disabled).
The conda builds are failing with a different error (cmake version issues), but the "Python / AMD64 MacOS 10.15 Python 3" build is now failing with the error I see locally (https://github.com/apache/arrow/runs/5423674812?check_suite_focus=true).
@kszucs @pitrou any idea what would be a good way or place to have one of the nightly builds run without parquet encryption? (In the first commit here I just crudely disabled it for all Python GitHub Actions builds in CI, but reverted that again.) We do have a Python minimal build example (which has parquet but without encryption), but it seems we don't run that in CI like the cpp minimal build example.
We could have one of the nightly builds with that option turned off.
@jorisvandenbossche You should be able to rebase to get the conda issues fixed. |
ebbdb0d to fd43be2
2c72e12 to 4fef0d5
@github-actions crossbow submit test-conda-python-3.8-pandas-latest |
Revision: 4fef0d5 Submitted crossbow builds: ursacomputing/crossbow @ actions-1725
@jorisvandenbossche You need to rebase in order to fix conflicts.
@@ -1392,6 +1392,7 @@ tasks:
PYTHON: "{{ python_version }}"
PANDAS: "{{ pandas_version }}"
NUMPY: "{{ numpy_version }}"
PARQUET_REQUIRE_ENCRYPTION: "OFF"
Can we do this just for one of these builds, not all of them?
dev/tasks/tasks.yml
{% if cache_leaf %}
# use the latest pandas release, so prevent reusing any cached layers
flags: --no-leaf-cache
Just for the record, isn't this condition inverted? In the parameters, cache_leaf is true for the fixed pandas version (0.24)...
It is. I suggest removing the cache_leaf jinja variable and just setting flags: --no-leaf-cache for all cases.
I just inverted the check, but can also remove it altogether. I assume the cache is still useful for the one build with fixed pandas version?
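To make the inverted condition concrete, here is a minimal Python sketch of the corrected logic. The helper name `extra_flags` and the build labels are hypothetical; the only facts taken from this thread are that cache_leaf is true for the fixed pandas version (0.24) and that --no-leaf-cache is meant for the builds that should not reuse cached layers.

```python
from typing import Dict, List


def extra_flags(cache_leaf: bool) -> List[str]:
    """Hypothetical helper mirroring the corrected template condition."""
    # Pass --no-leaf-cache only when the leaf image should NOT be cached,
    # i.e. the equivalent of `{% if not cache_leaf %}` in the template.
    return [] if cache_leaf else ["--no-leaf-cache"]


# Assumed parameter sets: the pinned pandas build keeps its cache,
# the "latest" build always rebuilds the leaf layer.
builds: Dict[str, bool] = {"pandas-0.24": True, "pandas-latest": False}
flags_by_build = {name: extra_flags(cached) for name, cached in builds.items()}

for name, flags in flags_by_build.items():
    print(name, flags)
```

With the original `{% if cache_leaf %}` check the two results would be swapped, so the pinned build would lose its cache while the "latest" build could reuse stale layers.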
@github-actions crossbow submit test-conda-python-3.8-pandas-latest test-conda-python-3.8-pandas-nightly |
For the record, you can use wildcards in crossbow submissions (e.g. |
Yes, I just wanted to limit it to the minimum I needed to check the change (although given we already run so many builds, maybe that doesn't matter that much :))
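As a rough illustration of what a wildcard submission would select, the sketch below applies shell-style glob matching to task names. It assumes crossbow patterns behave like Python's `fnmatch` globs, which is an assumption rather than something stated in this thread; the third task name is made up purely for contrast.

```python
from fnmatch import fnmatchcase

task_names = [
    "test-conda-python-3.8-pandas-latest",
    "test-conda-python-3.8-pandas-nightly",
    "test-conda-python-3.8-hypothetical-other",  # made-up name, not a real task
]

# A single glob pattern covering both pandas builds submitted above.
pattern = "test-conda-python-3.8-pandas-*"

selected = [name for name in task_names if fnmatchcase(name, pattern)]
print(selected)
```

Under that assumption, one pattern would stand in for the two task names listed explicitly in the submit command.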
Revision: 54f648f Submitted crossbow builds: ursacomputing/crossbow @ actions-1727
Hmm, so the latest test doesn't work anymore (the parquet encryption tests are not skipped but passing, so encryption was built nonetheless), so either my change to limit it to "pandas == latest" or one of the last changes on master interfered. Maybe #12577? (although that's for the packaging builds, which use a different build script for C++)
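A quick way to check from Python whether a given build actually includes the encryption bindings is to try importing the module, as in this sketch. It assumes that `pyarrow.parquet.encryption` fails to import when PyArrow was built without Parquet encryption; `can_import` is a hypothetical helper name, and the check also reports False if pyarrow is not installed at all.

```python
import importlib


def can_import(module_name: str) -> bool:
    """Return True if the module imports cleanly, False on ImportError."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


# Assumption: this import only succeeds when the encryption extension was built.
encryption_built = can_import("pyarrow.parquet.encryption")
print("Parquet encryption bindings importable:", encryption_built)
```

If this prints True on a build that was supposed to have encryption disabled, the option did not take effect, which would match the tests passing instead of being skipped.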
Ouch, sorry. Perhaps |
Benchmark runs are scheduled for baseline = c70426f and contender = 7aecc83. 7aecc83 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
…parquet encryption disabled Follow up on #12565 (comment) Closes #12587 from jorisvandenbossche/ARROW-15847 Lead-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com> Co-authored-by: Antoine Pitrou <antoine@python.org> Signed-off-by: Antoine Pitrou <antoine@python.org>