ARROW-15847: [Python] Building with Parquet but without Parquet encryption fails #12565

Closed


jorisvandenbossche
Member

No description provided.

@github-actions

github-actions bot commented Mar 4, 2022

@jorisvandenbossche
Member Author

I first want to test if I can reproduce it on CI (I think almost all builds we run have parquet and parquet encryption either both enabled or both disabled)

@jorisvandenbossche
Member Author

jorisvandenbossche commented Mar 4, 2022

The conda builds are failing with a different error (cmake version issues), but the "Python / AMD64 MacOS 10.15 Python 3" build is now failing with the error I see locally (https://github.com/apache/arrow/runs/5423674812?check_suite_focus=true).

@jorisvandenbossche
Member Author

@kszucs @pitrou any idea what would be a good way or place to have one of the nightly builds without parquet encryption? (in the first commit here I just crudely disabled it for all python github actions builds in CI, but reverted that again)

We do have a python minimal build example (which has parquet but without encryption), but it seems we don't run that in CI like the cpp minimal build example.

@pitrou
Member

pitrou commented Mar 4, 2022

We could have one of the nightly builds with that option turned off.

@pitrou
Member

pitrou commented Mar 7, 2022

@jorisvandenbossche You should be able to rebase to get the conda issues fixed.

@jorisvandenbossche
Member Author

@github-actions crossbow submit test-conda-python-3.8-pandas-latest

@github-actions

github-actions bot commented Mar 8, 2022

Revision: 4fef0d5

Submitted crossbow builds: ursacomputing/crossbow @ actions-1725

Task Status
test-conda-python-3.8-pandas-latest Github Actions

Member

@pitrou pitrou left a comment


@jorisvandenbossche You need to rebase in order to fix conflicts.

@@ -1392,6 +1392,7 @@ tasks:
PYTHON: "{{ python_version }}"
PANDAS: "{{ pandas_version }}"
NUMPY: "{{ numpy_version }}"
PARQUET_REQUIRE_ENCRYPTION: "OFF"
Member


Can we do this just for one of these builds, not all of them?

Member


It is. I suggest removing the cache_leaf Jinja variable and just setting flags: --no-leaf-cache for all cases.

Comment on lines 1396 to 1398
{% if cache_leaf %}
# use the latest pandas release, so prevent reusing any cached layers
flags: --no-leaf-cache
Member


Just for the record, isn't this condition inverted? In the parameters, cache_leaf is true for the fixed pandas version (0.24)...

@kszucs

Member Author


> It is. I suggest removing the cache_leaf Jinja variable and just setting flags: --no-leaf-cache for all cases.

I just inverted the check, but can also remove it altogether. I assume the cache is still useful for the one build with fixed pandas version?
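
As a rough illustration of the corrected condition, here is a minimal Python sketch of the intended logic. The function name `extra_run_flags` is hypothetical and only mirrors the Jinja check; it is not code from the Arrow repository. The `cache_leaf` value for pandas 0.24 comes from the discussion above, while the value for "latest" is an assumption.

```python
def extra_run_flags(cache_leaf):
    """Return extra flags for the build, mirroring the (fixed) Jinja check.

    Hypothetical helper: it only illustrates the direction of the condition.
    """
    if cache_leaf:
        # A pinned pandas version can reuse the cached leaf image layer.
        return []
    # An unpinned pandas (e.g. the latest release) must not reuse cached layers.
    return ["--no-leaf-cache"]


# True for 0.24 is stated in the discussion; False for "latest" is assumed.
for pandas_version, cache_leaf in {"0.24": True, "latest": False}.items():
    print(pandas_version, extra_run_flags(cache_leaf))
```

The inverted version of this check would pass `--no-leaf-cache` only for the pinned build, which is the opposite of what the template comment describes.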

@jorisvandenbossche
Member Author

@github-actions crossbow submit test-conda-python-3.8-pandas-latest test-conda-python-3.8-pandas-nightly

@pitrou
Member

pitrou commented Mar 8, 2022

> @github-actions crossbow submit test-conda-python-3.8-pandas-latest test-conda-python-3.8-pandas-nightly

For the record, you can use wildcards in crossbow submissions (e.g. crossbow submit test-conda-*)

@jorisvandenbossche
Member Author

> For the record, you can use wildcards in crossbow submissions (e.g. crossbow submit test-conda-*)

Yes, I just wanted to limit it to the minimum I needed to check the change (although given that we already run so many builds, maybe that doesn't matter that much :))
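
To make the trade-off between a wildcard and explicit task names concrete, here is a small Python sketch. It assumes the wildcard behaves like a shell-style glob over task names; that is an assumption for illustration, not a statement about crossbow's implementation. The helper `select_tasks` and the third task name are hypothetical.

```python
from fnmatch import fnmatchcase


def select_tasks(task_names, patterns):
    """Return the task names matching at least one glob pattern (hypothetical helper)."""
    return [
        name
        for name in task_names
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    ]


tasks = [
    "test-conda-python-3.8-pandas-latest",   # task name from this PR
    "test-conda-python-3.8-pandas-nightly",  # task name from this PR
    "example-unrelated-task",                # hypothetical, for contrast
]

# A broad wildcard selects every conda test task in the list.
print(select_tasks(tasks, ["test-conda-*"]))
# An explicit name limits the run to the single build under investigation.
print(select_tasks(tasks, ["test-conda-python-3.8-pandas-latest"]))
```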

@github-actions

github-actions bot commented Mar 8, 2022

Revision: 54f648f

Submitted crossbow builds: ursacomputing/crossbow @ actions-1727

Task Status
test-conda-python-3.8-pandas-latest Github Actions
test-conda-python-3.8-pandas-nightly Github Actions

@pitrou pitrou closed this in 7aecc83 Mar 8, 2022
@jorisvandenbossche
Member Author

jorisvandenbossche commented Mar 8, 2022

Hmm, so the latest test doesn't work anymore (the Parquet encryption tests are passing rather than being skipped, so encryption was built nonetheless). So either my change to limit it to "pandas == latest" or one of the last changes on master interfered. Maybe #12577? (Although that's for the packaging builds, which use a different build script for C++.)

@pitrou
Member

pitrou commented Mar 8, 2022

Ouch, sorry. Perhaps PARQUET_REQUIRE_ENCRYPTION doesn't get propagated to the container?
(perhaps easier to debug this locally than to wait for crossbow-submitted build?)
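
One way to check this locally is sketched below, to be run inside the container. It assumes that the optional bindings are importable as `pyarrow.parquet.encryption` (as in released pyarrow versions) and that the import fails with `ImportError` when Parquet encryption was not built. The helper name `parquet_encryption_available` is hypothetical.

```python
import os


def parquet_encryption_available():
    """Return True if the optional Parquet encryption bindings can be imported.

    Assumption: the module path is pyarrow.parquet.encryption and a build
    without encryption support raises ImportError on import.
    """
    try:
        import pyarrow.parquet.encryption  # noqa: F401
    except ImportError:
        return False
    return True


# Shows whether the variable reached this environment at all.
requested = os.environ.get("PARQUET_REQUIRE_ENCRYPTION", "<unset>")
print("PARQUET_REQUIRE_ENCRYPTION =", requested)
print("encryption bindings importable:", parquet_encryption_available())
```

If the variable prints as `<unset>` while the bindings import fine, that would be consistent with the option not being propagated to the container.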

@ursabot

ursabot commented Mar 8, 2022

Benchmark runs are scheduled for baseline = c70426f and contender = 7aecc83. 7aecc83 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️0.5% ⬆️0.13%] test-mac-arm
[Finished ⬇️0.0% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.3% ⬆️0.0%] ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

pitrou added a commit that referenced this pull request Mar 10, 2022
…parquet encryption disabled

Follow up on #12565 (comment)

Closes #12587 from jorisvandenbossche/ARROW-15847

Lead-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Antoine Pitrou <antoine@python.org>