Describe the bug
The local filesystem implementation of list_with_delimiter is a two-step process:
First, walkdir is used to get the filenames
Second, convert_entry is called to get the metadata for each file
If a file is deleted between the first and second steps, list_with_delimiter returns an error:
Generic LocalFileSystem error: Unable to access metadata for tmp/pytest-of-pace/pytest-3/test_compact_with_write_82_1000/dataset/_versions/.tmp_7.manifest_9c100374-3298-4537-afc6-f5ee7913666d
Note: I suspect that walkdir itself may fail if an entire directory is deleted while it is iterating. This may not be cleanly solvable without writing a custom directory walker that catches and swallows "file not found" errors.
To Reproduce
Create a dataset with a lot of files
In one thread call list_with_delimiter
In another thread start deleting the files in the dataset
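The threaded reproduction above depends on timing. As a rough sketch, the same failure can be forced deterministically by deleting a file between the two steps. The functions `collect_paths` and `stat_all` below are hypothetical stand-ins for the walkdir pass and the convert_entry pass, written with only the standard library; they are not the actual object_store code.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Step 1 (hypothetical stand-in for the walkdir pass):
/// collect the paths of all entries directly under `dir`.
fn collect_paths(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut paths = Vec::new();
    for entry in fs::read_dir(dir)? {
        paths.push(entry?.path());
    }
    Ok(paths)
}

/// Step 2 (hypothetical stand-in for the convert_entry pass):
/// fetch the size of every path, propagating the first error.
/// A path deleted after step 1 surfaces here as `ErrorKind::NotFound`.
fn stat_all(paths: &[PathBuf]) -> io::Result<Vec<u64>> {
    paths
        .iter()
        .map(|p| fs::metadata(p).map(|m| m.len()))
        .collect()
}
```

Calling `collect_paths`, removing one of the returned files, and then calling `stat_all` on the stale list yields the same class of error as the one reported above, without needing a second thread.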
Expected behavior
There is no error. Either the file is returned with metadata, or no file is returned.
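One possible way to get this behavior in the metadata step is to treat "not found" as "skip this entry" while still propagating every other error. The sketch below uses only the standard library; `metadata_if_present` and `stat_present` are hypothetical names, not the actual object_store functions, and the sketch does not address walkdir itself failing when a whole directory disappears.

```rust
use std::fs::{self, Metadata};
use std::io::{self, ErrorKind};
use std::path::{Path, PathBuf};

/// Hypothetical helper: return the metadata for `path`, or `None`
/// if the file no longer exists. Any other error is propagated.
fn metadata_if_present(path: &Path) -> io::Result<Option<Metadata>> {
    match fs::metadata(path) {
        Ok(meta) => Ok(Some(meta)),
        Err(e) if e.kind() == ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}

/// Hypothetical second step: pair each previously listed path with
/// its size, silently dropping paths that were deleted in the meantime.
fn stat_present(paths: &[PathBuf]) -> io::Result<Vec<(PathBuf, u64)>> {
    let mut out = Vec::new();
    for path in paths {
        if let Some(meta) = metadata_if_present(path)? {
            out.push((path.clone(), meta.len()));
        }
    }
    Ok(out)
}
```

With this shape, each file is either returned with its metadata or omitted, and errors such as permission failures still reach the caller.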
Additional context
N/A
I had a go at this by simply ignoring all not-found errors, as suggested by @tustvold. However, I did not add a test for this; such a test would create two threads, calling list_with_delimiter in one and deleting files in the other.
…n Windows (#5830)
* Fix issue #5800: Handle missing files in list_with_delimiter
* draft
* cargo fmt
* Handle leading colon
* Add windows CI
* Fix CI job
* Only run local tests and set target family for failing tests
* Run all tests without my changes and removed target os
* Restore changes again
* Add back newline (removed by mistake)
* Fix test after merge with master