Skip to content

Remove readdir_r entirely and handle filenames > 255 bytes #116639

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jun 16, 2025

Conversation

jozkee
Copy link
Member

@jozkee jozkee commented Jun 13, 2025

Alternative approach described in #116619 suitable for main but requires validation for multiple platforms, especially the ones where historical fixes are being removed.

@NattyNarwhal, as per the git history, you've been submitting multiple fixes for AIX, could you please help validating this one?
Tagging IllumOS folks and kindly asking the same: @AustinWise @gwr @am11.

@jozkee jozkee added this to the 10.0.0 milestone Jun 13, 2025
@jozkee jozkee requested review from GrabYourPitchforks, jkotas and a team June 13, 2025 16:54
@jozkee jozkee self-assigned this Jun 13, 2025
@Copilot Copilot AI review requested due to automatic review settings June 13, 2025 16:54
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.

Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR removes the legacy readdir_r support across the native libraries and switches to using readdir exclusively, simplifying the buffer management and unifying the directory enumeration path. It also adds a manual test for NTFS filename length handling and extends the existing parallel file enumeration tests.

  • Remove HAVE_READDIR_R checks and associated APIs in both CMake config and headers.
  • Replace SystemNative_ReadDirR/buffer-size API with SystemNative_ReadDir in the PAL and interop layers.
  • Update FileSystemEnumerator.Unix and related code to drop manual buffer management.
  • Add manual NTFS-on-Linux mount setup and a new threaded enumeration test.

Reviewed Changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated no comments.

Show a summary per file
File Description
src/native/libs/configure.cmake Dropped check for readdir_r availability
src/native/libs/System.Native/pal_io.h Removed R-buffer APIs; added SystemNative_ReadDir
src/native/libs/System.Native/pal_io.c Deleted readdir_r path; streamlined SystemNative_ReadDir
src/native/libs/System.Native/entrypoints.c Updated entrypoints to use SystemNative_ReadDir
src/native/libs/Common/pal_config.h.in Removed HAVE_READDIR_R define
src/libraries/System.Runtime/tests/.../ManualTests.csproj Included new NTFS manual test source files
src/libraries/System.Runtime/tests/.../NtfsOnLinuxTests.cs Added manual NTFS filename-length test
src/libraries/System.Runtime/tests/.../NtfsOnLinuxSetup.cs Added loopback NTFS mount/unmount fixture for manual tests
src/libraries/System.Runtime/tests/.../EnumerableTests.cs Added a Parallel.ForEach enumeration test
src/libraries/System.Private.CoreLib/.../NonAndroid.cs Removed unused GetDirectoryEntryFullPath helper
src/libraries/System.Private.CoreLib/.../FileSystemEnumerator.Unix.cs Deleted entry-buffer logic; call into new ReadDir
src/libraries/System.Private.CoreLib/.../FileSystemEntry.Unix.cs Updated fixed buffer size constant and span usage
src/libraries/Common/.../Interop.ReadDir.cs Updated interop to only export ReadDir; improved GetName

Comment on lines 102 to 103
Span<char> buffer = MemoryMarshal.CreateSpan(ref _fileNameBuffer._buffer[0], DecodedNameBufferLength);
_fileName = _directoryEntry.GetName(buffer);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Span<char> buffer = MemoryMarshal.CreateSpan(ref _fileNameBuffer._buffer[0], DecodedNameBufferLength);
_fileName = _directoryEntry.GetName(buffer);
_fileName = _directoryEntry.GetName(_buffer);

If you accept the suggestion above, this should be able to use default conversion from InlineArray to Span

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This failed for me locally:

error CS9084: Struct member returns 'this' or other instance members by reference

I pushed anyways to share the CI error, it does compile fine in sharplab.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok, I have not noticed that the Span escapes here and so it is not possible to create it in safe way (without large refactoring that would likely hurt overall readability).

@jkotas
Copy link
Member

jkotas commented Jun 13, 2025

LGTM otherwise.

@AustinWise
Copy link
Contributor

AustinWise commented Jun 13, 2025

In my survey of open source libc's, readdir_r ensures thread safety by taking a lock on the DIR*. Since FileSystemEnumerator takes a lock when calling readdir, so I think this PR ensures an equivalent level of thread safety.

References to libc implementations

@jozkee
Copy link
Member Author

jozkee commented Jun 13, 2025

@AustinWise the question is more about readdir using a global static buffer, see #116619 (comment).

@jozkee
Copy link
Member Author

jozkee commented Jun 13, 2025

I think POSIX.2024 also alludes to the buffer being global, and not just stored in each DIR:

Historically, readdir() returned a pointer to an internal static buffer that was overwritten by each call

@AustinWise
Copy link
Contributor

I don't see the readdir implementations cited above making use of some state that the readdir_r variants don't use. So my reasoning is readdir_r is safe to use with its lock, readdir is just as safe to use with a lock in C#. It's hard to prove the absence of a problem of course.

I tried to find some example of a libc returning an "internal static buffer". I don't see it in the earliest available commits for illumos from 2005 or GLibC from 1995. BSD stopped using a static buffer in 1982.

Comment on lines 102 to 103
Span<char> buffer = MemoryMarshal.CreateSpan(ref _fileNameBuffer._buffer[0], DecodedNameBufferLength);
_fileName = _directoryEntry.GetName(buffer);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok, I have not noticed that the Span escapes here and so it is not possible to create it in safe way (without large refactoring that would likely hurt overall readability).

Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@jozkee
Copy link
Member Author

jozkee commented Jun 16, 2025

/ba-g #116697.

@hamarb123
Copy link
Contributor

Won't this regress #47584? macOS used to be using readdir_r with EINTR handling, but the EINTR handling hasn't been added to readdir, despite being used instead now.

@jkotas
Copy link
Member

jkotas commented Jul 7, 2025

Yes, it looks like a problem - even on Linux. @hamarb123 Would you like to submit a PR with the fix?

@hamarb123
Copy link
Contributor

@jkotas should I just fix the one in pal_io.c or should I also adjust all the other instances of readdir, opendir, etc. that I find?

@jkotas
Copy link
Member

jkotas commented Jul 7, 2025

It would be great if you can fix all instances that you can find.

@hamarb123
Copy link
Contributor

hamarb123 commented Jul 7, 2025

@jkotas I've had a look at some other functions which have EINTR handling in some places too, such as write - we do not seem to handle its result consistently - in some places we don't loop the call, in some places, we loop the call, but don't handle partial writes, and in some places we do handle partial writes. I think it would be easiest to add wrappers for APIs like this & just call into those - does that sound like a good idea? If so, where would the best spot to place it be? It seems that src/native/libs/Common/pal_io_common.h has something like this pre-existing with its Common_Write, but it is not used consistently - should I just be including that header in places that don't have it & using that & adding new ones into there, or would it be better to extract this into a lower-level implementation in a similar fashion? I was thinking that making some file like libc_wrappers.h (or some other similar name) & just adding originalname_wrapped variants of the functions with EINTR handling would be useful, as it can centralise the logic we use to wrap them & ensure it's consistent everywhere that we want it to be.

Or would you rather I just do the change for opendir / readdir & leave something more extensive for the future?

@jkotas
Copy link
Member

jkotas commented Jul 7, 2025

I think it would be easiest to add wrappers for APIs like this & just call into those - does that sound like a good idea?

I think there should not be that many places that call the Linux syscalls directly in our shipping code. I am not sure whether we need a shared wrapper around the syscalls. Do you have a list of places that have the potential problems?

@hamarb123
Copy link
Contributor

I will have a proper look tomorrow, can you remind me which path/s I can limit my search to for just the actual shipping code?

@jkotas
Copy link
Member

jkotas commented Jul 7, 2025

If the path contains "test", it is non-shipping code. Also, the code under src\native\external is shipping code that should be fixed in the upstream copy (it can be done in parallel with fixing it here).

@gwr
Copy link
Contributor

gwr commented Jul 7, 2025

A bit late, but my testing on illumos with this included seems fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants