Fix flaky MountVolume tests: replace fixed delays with polling loops #125914
… loops
- Delete_MountVolume.cs: Replace 7 fixed Task.Delay(300).Wait() calls with a WaitForDirectoryGone() polling helper (10 s timeout, 100 ms intervals, using Environment.TickCount64) before the Directory.Exists assertions. Also replace Task.Delay in the DeleteDir retry loop with Thread.Sleep.
- ReparsePoints_MountVolume.cs: Add retry logic to DeleteDir with IOException handling and Thread.Sleep backoff to handle mount teardown timing.
Fixes #125295, #125624
Co-authored-by: danmoseley <6385855+danmoseley@users.noreply.github.com>
Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/90f5ab9b-bd66-4286-9b2c-72778993ada3
Tagging subscribers to this area: @dotnet/area-system-io
        Directory.Delete(fileName, sub);
        return;
    }
    catch (IOException) when (attempts > 1)
Add a comment noting that we have occasionally seen this IOException (and, if possible, why it might occur).
Added in 7348e64. The comment explains that IOException can occur transiently when the volume is still being unmounted — the reparse point directory can be momentarily locked by the kernel while the mount is being torn down.
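For readers following the thread, here is a minimal sketch of the kind of retry loop being discussed. It is not the code from the PR: the class name, the parameter names, and the defaults (10 attempts, 200 ms delay, taken from the PR description) are assumptions for illustration. Only the `catch (IOException) when (attempts > 1)` filter comes from the diff above.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical stand-in for the test's DeleteDir helper; names and defaults are assumptions.
public static class RetryingCleanup
{
    public static void DeleteDir(string path, bool recursive, int maxAttempts = 10, int delayMs = 200)
    {
        for (int attempts = maxAttempts; attempts > 0; attempts--)
        {
            if (!Directory.Exists(path))
            {
                return; // Nothing left to delete.
            }

            try
            {
                Directory.Delete(path, recursive);
                return;
            }
            // IOException has been seen transiently while a volume is still being
            // unmounted. The filter is false on the last attempt, so the final
            // failure propagates to the caller instead of being swallowed.
            catch (IOException) when (attempts > 1)
            {
                Thread.Sleep(delayMs);
            }
        }
    }
}
```

Using an exception filter rather than a rethrow inside the catch block keeps the original stack trace intact when the last attempt fails.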
@copilot address feedback
Co-authored-by: danmoseley <6385855+danmoseley@users.noreply.github.com>
Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/1cc6fac5-ff28-4293-b1dc-b35cdb75dd67
Pull request overview
Reduces flakiness in Windows-only NTFS MountVolume filesystem tests by replacing fixed sleeps with polling/retry logic around directory deletion and cleanup.
Changes:
- Delete_MountVolume.cs: replaces multiple fixed 300 ms delays with a WaitForDirectoryGone polling helper (up to 10 s), and switches a retry-loop delay to Thread.Sleep.
- ReparsePoints_MountVolume.cs: hardens cleanup by retrying Directory.Delete on transient IOException during unmount teardown.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/Directory/Delete_MountVolume.cs | Introduces a polling helper and removes fixed-delay assumptions after deletes via mount points. |
| src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/Directory/ReparsePoints_MountVolume.cs | Adds targeted retry logic for transient unmount-related IOException during directory cleanup. |
Description
Directory_Delete_MountVolume and Directory_ReparsePoints_MountVolume are flaky on loaded CI machines because fixed-duration waits are insufficient for NTFS mount point operations to propagate.
Delete_MountVolume.cs
7 locations used Task.Delay(300).Wait() before asserting !Directory.Exists() after deleting through a mount point. 300 ms is not enough under load.
- Added WaitForDirectoryGone(string path): polls Directory.Exists every 100 ms for up to 10 s, using Environment.TickCount64 to track elapsed time
- Replaced all 7 fixed delays with WaitForDirectoryGone(<path>)
- Replaced Task.Delay(300).Wait() in the DeleteDir retry loop with Thread.Sleep(300); removed the unused System.Threading.Tasks import
ReparsePoints_MountVolume.cs
DeleteDir (called in finally blocks after MountHelper.Unmount) had no retry logic, so a transient IOException during volume teardown would make cleanup fail silently or throw.
- Wrapped Directory.Delete with a retry loop: catches IOException specifically (which can occur transiently when the volume is still being unmounted, because the reparse point directory may be momentarily locked by the kernel while the mount is being torn down) and retries up to 10× with 200 ms back-off
- Added a comment on the catch (IOException) block documenting the observed transient failure mode
- Added using System.Threading
Changes
- Directory/Delete_MountVolume.cs: polling helper + 7 delay replacements
- Directory/ReparsePoints_MountVolume.cs: robust DeleteDir with IOException retry and explanatory comment
Testing
Tests are Windows-only ([PlatformSpecific(TestPlatforms.Windows)], requires NTFS) and require an elevated environment with mount point access. Validation requires observing reduced flakiness on CI.
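The polling approach described for Delete_MountVolume.cs can be sketched roughly as follows. This is an illustration under stated assumptions, not the helper from the PR: the description only gives the name `WaitForDirectoryGone(string path)`, the 100 ms interval, the 10 s timeout, and the use of `Environment.TickCount64`. The class name, the `bool` return type, and the optional parameters are hypothetical.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical container class; only the method name and timings come from the PR description.
public static class MountVolumeTestHelpers
{
    // Polls until the directory no longer exists or the timeout elapses.
    // Returns true if the directory disappeared, false on timeout (assumed contract).
    public static bool WaitForDirectoryGone(string path, int timeoutMs = 10_000, int pollMs = 100)
    {
        // TickCount64 is a 64-bit millisecond counter, so it does not wrap
        // after ~49.7 days the way the unsigned 32-bit tick count does.
        long start = Environment.TickCount64;

        while (Directory.Exists(path))
        {
            if (Environment.TickCount64 - start >= timeoutMs)
            {
                return false;
            }

            Thread.Sleep(pollMs);
        }

        return true;
    }
}
```

A test would then call something like `Assert.True(MountVolumeTestHelpers.WaitForDirectoryGone(dirName))` in place of a fixed delay followed by a `Directory.Exists` check. The common case returns as soon as the directory is gone, and only a genuinely stuck deletion pays the full timeout.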