bpo-28041: Inconsistent behavior: Get st_nlink from os.stat() and os.scandir() #795
thebecwar commented Mar 23, 2017
- Unify stat approach between POSIX and Windows hosts. The data that comes out of the WIN32_FIND_DATAW is not the same information that is available in BY_HANDLE_FILE_INFORMATION.
- Add unit test to verify fix
- Update deleted file/directory tests to reflect new consistent behavior.
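For readers who want to see the inconsistency named in the title, here is a minimal sketch that compares `st_nlink` as reported through `os.scandir()` with the value from a direct stat call. The helper name `compare_nlink` is made up for illustration and is not part of this PR. On POSIX the two values are expected to match; on Windows, Python versions without a fix are documented to return 0 for `st_nlink` (as well as `st_ino` and `st_dev`) from `DirEntry.stat()`.

```python
import os
import tempfile


def compare_nlink(directory):
    """Hypothetical helper: map each entry name to a pair
    (st_nlink from DirEntry.stat(), st_nlink from os.lstat())."""
    result = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            # Value cached from the directory listing where the platform allows it.
            from_scandir = entry.stat(follow_symlinks=False).st_nlink
            # os.lstat() always makes a fresh system call and does not follow symlinks.
            from_stat = os.lstat(entry.path).st_nlink
            result[entry.name] = (from_scandir, from_stat)
    return result


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "a.txt"), "w") as f:
            f.write("x")
        for name, (via_scandir, via_stat) in compare_nlink(tmp).items():
            print(name, via_scandir, via_stat)
```

If the two numbers in a row differ, that entry shows the behavior this PR is trying to unify.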
This explicitly undoes the performance boosts available on Windows by requiring each stat to cache on access, instead of using the free information provided by FindFirstFile/FindNextFile.
The free information is incomplete. The missing information means that calling
And those differences are documented. Making scandir + stat take significantly longer (much worse if it's a high-latency network share, as the round-trip time ends up being paid for each file, a situation I've personally encountered) for fields that are comparatively unlikely to be needed is not a reasonable compromise. If it's possible, you might consider implementing the three problematic fields as properties that lazily stat on first access to those specific fields, to normalize behavior without causing performance regressions. Not sure if StructSequences are flexible enough to support that though...
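To make the lazy-stat suggestion concrete, here is a rough Python-level sketch. It is not how CPython's C implementation or its StructSequence type works; `LazyStatResult` and `full_stat_calls` are hypothetical names, and the assumption that `st_ino`, `st_dev`, and `st_nlink` are the three problematic fields comes from the documented Windows behavior of `DirEntry.stat()`.

```python
import os

# Assumed set of fields that the directory listing cannot supply on Windows.
_LAZY_FIELDS = frozenset({"st_ino", "st_dev", "st_nlink"})


class LazyStatResult:
    """Hypothetical wrapper around a DirEntry.

    Cheap fields come from DirEntry.stat(); the fields in _LAZY_FIELDS
    trigger a single full os.lstat() the first time one of them is read.
    """

    def __init__(self, entry):
        self._entry = entry
        self._cheap = entry.stat(follow_symlinks=False)
        self._full = None          # filled in on first access to a lazy field
        self.full_stat_calls = 0   # counter kept only for illustration

    def __getattr__(self, name):
        # __getattr__ runs only when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        if name in _LAZY_FIELDS:
            if self._full is None:
                self._full = os.lstat(self._entry.path)
                self.full_stat_calls += 1
            return getattr(self._full, name)
        return getattr(self._cheap, name)
```

With this shape, code that only reads `st_size` or `st_mtime` never pays for the extra round trip, while code that reads `st_nlink` pays for it once per entry.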
Here are the options as I see them. I'm new to the codebase here, so I'm not really the best one to decide which way to go. I'm happy to hear opinions. I'd also be happy to POC/demo any of the options.
- Do nothing
- Merge this PR as is.
  Benefits:
  Drawbacks:
- Add a method to DirEntry (something like
@@ -11569,7 +11565,7 @@ DirEntry_from_find_data(path_t *path, WIN32_FIND_DATAW *dataW)
        if (!entry->path)
            goto error;
    }
Extraneous whitespace.
You misunderstood the design of scandir(). See the issue.