Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-16548: [Python] Add pytest.mark.parquet to all tests under tests/parquet package #13147

Closed
wants to merge 2 commits into from

Conversation

raulcd
Copy link
Member

@raulcd raulcd commented May 13, 2022

If we built arrow and pyarrow without PARQUET and tried to run the PARQUET tests. The first test being executed was not correctly marked.

> /home/raulcd/open_source/arrow/python/pyarrow/tests/conftest.py(244)pytest_runtest_setup()
-> for mark in item.iter_markers():
(Pdb) item
<Function test_parquet_invalid_version>
(Pdb) [x for x in item.iter_markers()]
[]
(Pdb)

Meaning the test was not correctly skipped. This was found on the minimal builds PR: #13113 on this job failures: https://github.com/ursacomputing/crossbow/runs/6407176338?check_suite_focus=true

All tests under the parquet package are on the structure:

  • parquet
    ├── test_basic.py
    ├── test_compliant_nested_type.py
    ├── test_dataset.py
    ├── test_data_types.py
    ├── test_datetime.py
    ├── test_encryption.py
    ├── test_metadata.py
    ├── test_pandas.py
    ├── test_parquet_file.py
    └── test_parquet_writer.py

The implementation marks all the individual tests that are on this structure with the parquet dataset mark correctly.

@github-actions
Copy link

@github-actions
Copy link

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

@raulcd
Copy link
Member Author

raulcd commented May 13, 2022

@jorisvandenbossche other solutions that fix this issue could be:

  • Add the pytestmark = pytest.mark.parquet to each test file individually instead of to the common file
  • Add the pytestmark import (even when we are not using it) on each individual test file (from pyarrow.tests.parquet.common import pytestmark)

I was going in a rabbit hole on importlib and load_module but I don't think is worth the effort to spend more time on this having three possible solutions.
One nice thing is I had to go back to a talk I gave at PyCon Spain on how import works :)

@jorisvandenbossche
Copy link
Member

Hmm, I would actually have expected that the pytestmark = .. in the tests/parquet/__init__.py would ensure this applies that mark to all the tests in that directory (and not the one in common.py).
But if that seems to not work correctly in practice, I think adding pytestmark = pytest.mark.parquet to every file might be the most "low tech" solution?

@raulcd
Copy link
Member Author

raulcd commented May 13, 2022

Thanks @jorisvandenbossche . I was able to reproduce the issue on each one of the test files individually. This solution solves the test failures when PARQUET is not enabled.

Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 from me assuming that it does solve the issue.

@ursabot
Copy link

ursabot commented May 17, 2022

Benchmark runs are scheduled for baseline = 52a051b and contender = c032290. c032290 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️0.08% ⬆️0.0%] test-mac-arm
[Finished ⬇️0.36% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.67% ⬆️0.0%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] c032290b ec2-t3-xlarge-us-east-2
[Finished] c032290b test-mac-arm
[Finished] c032290b ursa-i9-9960x
[Finished] c032290b ursa-thinkcentre-m75q
[Finished] 52a051b1 ec2-t3-xlarge-us-east-2
[Finished] 52a051b1 test-mac-arm
[Finished] 52a051b1 ursa-i9-9960x
[Finished] 52a051b1 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

pytestmark = pytest.mark.parquet_encryption
pytestmark = pytest.mark.parquet
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, we can't assign two marks this way (it is redefining the same variable), it should be something like pytestmark = [..., ...] I think

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants