Fix #9360, fix #9466: grab a lock before creating directories to fix race condition on Windows in partitioned write #9473
The errors @l1t1 reported look like a different problem, since they occur when writing the .parquet files, not when creating the folders. In my tests with the build artifact, all of the folder-specific errors are gone. One possible reason for the file write failures, which I have run into before: if the total path length exceeds ~260 characters, most programs on Windows cannot read or write the file. Maybe @l1t1 could check whether that is the case? To use paths longer than 260 characters on Windows, a registry setting must be enabled and the application must also opt in (mainly, its application manifest must include a longPathAware setting): https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation?tabs=registry#enable-long-paths-in-windows-10-version-1607-and-later
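To check whether the path length limit is involved, something like the following Python sketch could be run against the output directory. It is only an illustration: the helper name `find_long_paths` is made up here, and the threshold of 260 is the classic `MAX_PATH` value, which counts the terminating NUL character (directories are subject to a slightly lower limit, which this sketch ignores).

```python
import os

# Classic Windows MAX_PATH value; it includes the terminating NUL,
# so a path of 260 or more characters is already too long.
MAX_PATH = 260

def find_long_paths(root, limit=MAX_PATH):
    """Return absolute paths under `root` whose length is at or above `limit`.

    Hypothetical helper for diagnosis only; not part of DuckDB.
    """
    long_paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.abspath(os.path.join(dirpath, name))
            if len(full) >= limit:
                long_paths.append(full)
    return long_paths
```

An empty result suggests the write errors have a different cause than the path length limit.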
Thanks for having a look. I will merge this as-is then, as it seems to at least fix the directory issue.
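For readers unfamiliar with the directory issue, the approach named in this PR's title (grab a lock before creating directories) can be illustrated with a minimal Python sketch. This is not DuckDB's C++ implementation; `_dir_lock` and `ensure_partition_dir` are hypothetical names, and the sketch only shows why serializing the check-then-create step removes the race between writer threads.

```python
import os
import threading

# Hypothetical lock guarding directory creation; not a DuckDB identifier.
_dir_lock = threading.Lock()

def ensure_partition_dir(base, partition):
    """Create the partition directory under `base` if it does not exist yet.

    Without the lock, two threads could both see the directory as missing
    and both try to create it, and the slower one would fail.
    """
    path = os.path.join(base, partition)
    with _dir_lock:
        if not os.path.isdir(path):
            os.makedirs(path)
    return path
```

With the lock held, only one thread at a time performs the existence check and the creation, so several threads writing to the same partition folder no longer collide.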
Coming from #9360. I ran my example script from #9360, which writes CSV files. Output shown below. I agree with what @killerfurbel mentioned above: the reported error is file-related. Running with threads=1 works fine, and 24 folders are created: h=0 to h=23.
Thanks @l1t1 and all. I have a reproduction, and this secondary problem will be addressed as well.
Fixes duckdblabs/duckdb-internal#588, improving on duckdb#9473. The idea is to iterate over all global partitions instead of iterating over the local ones.
The fix should be in #9535.
Fixes #9360
Fixes #9404
Fixes #9466