Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-15438: [Python] Flaky test test_write_dataset_max_open_files #12263

Conversation

westonpace
Copy link
Member

The test could fail when writing due to a race condition. If the batches were delivered AAAAABBBBBCCCCC... then by the time we need to close a file to make space we can close an already completed file (and so we won't have to open up a new one later) and we end up with 5 files for 5 partitions.

Adding use_threads=False to the write_dataset call was not sufficient. The arrow::dataset::FileSystemDataset::Write method was always using the CPU executor for the exec plan. In other scanner methods we base the CPU executor on the scan options (nullptr if scan_options->use_threads is false). Making both of these changes together seems to make the test reliably pass.

@github-actions
Copy link

@kszucs
Copy link
Member

kszucs commented Jan 25, 2022

Thanks Weston!

@lidavidm could you please verify this locally?

Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for digging into this, Weston!

I ran it locally and it seems to be reliable now (at least, before it would fail within a few runs, now it runs for a while and I eventually just killed it)

@westonpace
Copy link
Member Author

@kszucs Feel free to merge this if you want. This should not block RC6 as it is mostly a flaky test (the threading thing has some practical implications but they are minor)

@vibhatha
Copy link
Collaborator

Thanks for looking into this @westonpace 👍

@pitrou
Copy link
Member

pitrou commented Jan 27, 2022

This fixes the failure for me while it could be reproduced quite reliably on master.

@pitrou pitrou closed this in 5a51c6d Jan 27, 2022
@ursabot
Copy link

ursabot commented Jan 27, 2022

Benchmark runs are scheduled for baseline = 79800d4 and contender = 5a51c6d. 5a51c6d is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️2.5% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.22% ⬆️0.04%] ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants