[object_store] Potential race condition in list_with_delimiter on Local #5800

Closed
westonpace opened this issue May 24, 2024 · 2 comments · Fixed by #5803
Labels
bug good first issue Good for newcomers help wanted object-store Object Store Interface

Comments

@westonpace
Member

westonpace commented May 24, 2024

Describe the bug
The local filesystem implementation of list_with_delimiter is a two-step process:

  • First, walkdir is used to get the filenames
  • Second, convert_entry is called to get the metadata for each file

If a file is deleted between the first and second steps, list_with_delimiter returns an error:

Generic LocalFileSystem error: Unable to access metadata for tmp/pytest-of-pace/pytest-3/test_compact_with_write_82_1000/dataset/_versions/.tmp_7.manifest_9c100374-3298-4537-afc6-f5ee7913666d

Note: I suspect that walkdir itself may fail if an entire directory is deleted while walkdir is iterating. This may not be cleanly solvable without writing a custom walkdir implementation that catches and swallows "file not found" errors.

To Reproduce

  1. Create a dataset with a lot of files.
  2. In one thread, call list_with_delimiter.
  3. In another thread, start deleting the files in the dataset.
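
The steps above can be approximated with the standard library alone. This is only a sketch of the race, not the object_store code: `naive_list` and `list_while_deleting` are hypothetical names, and `std::fs::read_dir` plus `std::fs::metadata` stand in for the walkdir pass and convert_entry. Because the outcome depends on thread timing, a given run may or may not hit the error.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::thread;

/// Naive two-step listing (hypothetical stand-in for the real code path):
/// first collect the paths, then stat each one and propagate any failure.
fn naive_list(dir: &Path) -> io::Result<usize> {
    let paths: Vec<_> = fs::read_dir(dir)?
        .map(|entry| entry.map(|e| e.path()))
        .collect::<io::Result<_>>()?;
    let mut count = 0;
    for path in &paths {
        // Fails with NotFound if the file vanished after step one.
        fs::metadata(path)?;
        count += 1;
    }
    Ok(count)
}

/// Creates `n_files` files in `dir`, then lists the directory on the calling
/// thread while a second thread deletes the files.
fn list_while_deleting(dir: &Path, n_files: usize) -> io::Result<usize> {
    fs::create_dir_all(dir)?;
    for i in 0..n_files {
        fs::write(dir.join(format!("file_{i}.bin")), b"x")?;
    }
    let target = dir.to_path_buf();
    let deleter = thread::spawn(move || {
        for i in 0..n_files {
            // Ignore failures here; only the listing result is of interest.
            let _ = fs::remove_file(target.join(format!("file_{i}.bin")));
        }
    });
    let result = naive_list(dir);
    deleter.join().expect("deleter thread panicked");
    result
}
```

Calling `list_while_deleting` in a loop with a few thousand files makes a `NotFound` error from the second step more likely to show up, though it is never guaranteed.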

Expected behavior
There is no error. Either the file is returned with metadata, or no file is returned.

Additional context
N/A

@tustvold
Contributor

I think this should be a relatively straightforward case of suppressing not-found errors when listing.
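
For illustration only, the idea of suppressing not-found errors could look roughly like the following. This is not the object_store implementation or the merged fix: it uses only the standard library, and `collect_paths` and `sizes_skipping_missing` are hypothetical names standing in for the walkdir pass and the metadata step.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Step 1 (hypothetical stand-in for the walkdir pass): collect the paths.
fn collect_paths(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut paths = Vec::new();
    for entry in fs::read_dir(dir)? {
        paths.push(entry?.path());
    }
    paths.sort();
    Ok(paths)
}

/// Step 2: fetch metadata for each path. A file that disappeared since
/// step 1 is treated as "not listed" instead of failing the whole listing.
fn sizes_skipping_missing(paths: &[PathBuf]) -> io::Result<Vec<(PathBuf, u64)>> {
    let mut out = Vec::with_capacity(paths.len());
    for path in paths {
        match fs::metadata(path) {
            Ok(meta) => out.push((path.clone(), meta.len())),
            // The file was deleted between the two steps: skip it.
            Err(e) if e.kind() == io::ErrorKind::NotFound => continue,
            // Any other error (permissions, I/O) is still reported.
            Err(e) => return Err(e),
        }
    }
    Ok(out)
}
```

Only `NotFound` is swallowed, so the result matches the expected behavior described in the issue: each file is either returned with its metadata or not returned at all, while other failures still surface to the caller.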

@hesampakdaman
Contributor

I had a go at this by simply ignoring all not-found errors, as suggested by @tustvold. I did not, however, add a test that uses two threads, calling list_with_delimiter in one and deleting files in the other.

@tustvold tustvold added the object-store Object Store Interface label Jun 3, 2024
alamb pushed a commit that referenced this issue Jul 13, 2024
…n Windows (#5830)

* Fix issue #5800: Handle missing files in list_with_delimiter

* draft

* cargo fmt

* Handle leading colon

* Add windows CI

* Fix CI job

* Only run local tests and set target family for failing tests

* Run all tests without my changes and removed target os

* Restore changes again

* Add back newline (removed by mistake)

* Fix test after merge with master