[Python] Flaky test test_write_dataset_max_open_files #30918
Comments
Vibhatha Lakmal Abeykoon / @vibhatha: @westonpace could this be happening due to the small data size?
Weston Pace / @westonpace: We'd get more than 5 files if we get something like: B1P1, B1P2, B1P3, B1P4, B1P5, B2P1, ..., B4P5 (where BxPy is batch x going to partition y). However, we'd only get 5 files if we get something like: B1P1, B2P1, B3P1, B4P1, B1P2, ..., B4P5. Since we scan the source in parallel, both orders are possible.
There is no easy way in the dataset writer to disable parallel writing (the CPU path is completely serial, but it submits I/O tasks for each batch, so you would need to shrink the I/O thread pool to size 1).
This will disable parallel scanning, which should be enough to prevent the flakiness (unless I am misunderstanding how the error is generated). I'll try to set up a reproduction.
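To make the ordering argument above concrete, here is a minimal pure-Python sketch of LRU-style file eviction under a `max_open_files` cap. It is not the actual C++ dataset writer; `count_files` and the two orderings are hypothetical illustrations. With 4 batches and 5 partitions, batch-grouped arrival keeps evicting and reopening partition files, while partition-grouped arrival yields exactly one file per partition:

```python
from collections import OrderedDict

def count_files(order, max_open_files):
    open_files = OrderedDict()   # partition -> open file stand-in, in LRU order
    total_files = 0
    for _batch, part in order:
        if part not in open_files:
            if len(open_files) >= max_open_files:
                open_files.popitem(last=False)   # close the least-recently-used file
            open_files[part] = object()          # stand-in for a real file handle
            total_files += 1                     # every (re)open creates a new file
        open_files.move_to_end(part)             # mark this partition as recently used
    return total_files

# 4 batches (B1..B4) x 5 partitions (P1..P5), as in the comment above
by_batch = [(b, p) for b in range(4) for p in range(5)]   # B1P1, B1P2, ..., B4P5
by_part  = [(b, p) for p in range(5) for b in range(4)]   # B1P1, B2P1, ..., B4P5

print(count_files(by_batch, max_open_files=4))  # 20 files: partitions keep getting evicted
print(count_files(by_part,  max_open_files=4))  # 5 files: one per partition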
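The "this" in "this will disable parallel scanning" presumably refers to turning off threaded scanning, which in pyarrow is the `use_threads` flag. A hedged sketch of both workarounds mentioned above, assuming a `pyarrow.Table` named `table` with a string column `part` (both names are placeholders, not from the original issue):

```python
import pyarrow as pa
import pyarrow.dataset as ds

# Shrink the I/O thread pool to one thread to serialize the write side,
# as suggested above.
pa.set_io_thread_count(1)

# use_threads=False disables parallel scanning, so batches reach the
# writer in a deterministic order.
ds.write_dataset(
    table,                          # hypothetical pyarrow.Table
    "/tmp/dataset_out",             # hypothetical output directory
    format="parquet",
    partitioning=ds.partitioning(
        pa.schema([("part", pa.string())]), flavor="hive"),
    max_open_files=4,               # the cap the flaky test exercises
    use_threads=False,
)
```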
Weston Pace / @westonpace: I don't know if this bug should block the RC, but if we cut another RC it would probably be nice to include the fix.
Found during 7.0.0 verification
Reporter: David Li / @lidavidm
Assignee: Vibhatha Lakmal Abeykoon / @vibhatha
Note: This issue was originally created as ARROW-15438. Please see the migration documentation for further details.