Slow file access times when accessing NFS server #48757
Comments
I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label. |
Area label is probably area-System.IO. |
Removing "needs further triage" since we've marked this as "up-for-grabs". |
I can still reproduce this on .NET 6. I tested a few different server distros to see if there are differences:
The control applications in Python and Bash still experience none of these issues. The client is always Ubuntu 21.10. If the Python or Bash control application accesses the NFS mount on Alpine or Fedora first, the .NET application has no further issues afterwards.
Hi @DomiStyle I am sorry but we have not investigated this issue yet. I suspect that
runtime/src/libraries/System.Private.CoreLib/src/System/IO/FileSystem.Unix.cs Lines 225 to 226 in 71393d0
Is there any chance you could modify the code to use the overload that allows for overwriting?
- File.Move(nfsFile, localFile);
+ File.Move(nfsFile, localFile, true);
This one uses just one extra sys-call beside
We could definitely try to remove some of these sys-calls, similarly to what I am trying to do in #63675 for
Hey @adamsitnik
Sure, I tried with the overloaded method in both directions but unfortunately it is still slow. I added some
Works normally: Stuck for ~6 seconds: |
@DomiStyle is there any chance you could capture a trace file using
Here you can find the instructions: https://github.com/dotnet/performance/blob/main/docs/profiling-workflow-dotnet-runtime.md#perfcollect
They refer to
If you prefer a video, I've demoed it here: https://www.youtube.com/watch?v=y4-h3qyDpJo&t=1309s&ab_channel=DotNext
@adamsitnik Sure, here's the trace:
I simplified the program and just let it move a file to NFS and back 3 times. I got
edit: Just as a side note, there is no crossgen in my .nuget folder, so I'm not sure where to find it. I did a self-contained build before checking.
@DomiStyle big thanks for sharing the trace file! I was able to open it and identify the problem. So the code performs 3
and falls back to the slow path: runtime/src/libraries/System.Private.CoreLib/src/System/IO/FileSystem.Unix.cs Lines 235 to 236 in 71393d0
Which performs multiple sys-calls instead of a single
And spends most of the time in
@tmds I can see that rename fails with
You can see the trace file here; please go directly to the "Left Heavy" tab, as there is only 20ms of data in the trace file and "Time Order" is empty.
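The fast-path/slow-path behaviour described in this comment can be sketched outside .NET. This is a minimal Python illustration, not the runtime's actual code: `move_file` is a hypothetical helper, and it assumes the rename fails with a cross-device error (`EXDEV`), which is what a move between a local filesystem and an NFS mount normally produces. It takes no locks, unlike the .NET slow path discussed in this thread.

```python
import errno
import os
import shutil


def move_file(src, dst):
    """Move src to dst and report which path was taken.

    Returns "rename" when a single rename sys-call was enough, or
    "copy" when the cross-device fallback (copy, then delete) ran.
    """
    try:
        os.rename(src, dst)  # fast path: one sys-call
        return "rename"
    except OSError as exc:
        # Assumption: only a cross-device failure triggers the fallback.
        if exc.errno != errno.EXDEV:
            raise
    # Slow path: several sys-calls (open, read, write, close, unlink).
    shutil.copyfile(src, dst)
    os.unlink(src)
    return "copy"
```

Calling `move_file("non-nfs-mount/testfile", "nfs-mount/testfile")` with the paths from the strace commands in this thread should take the "copy" branch, since the two directories are on different filesystems.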
I don't expect
Does the timing change if you avoid the file locking by setting
Can you share the output of
Yes, that drastically reduces the access time from 6s to <100ms, which seems to be in line with what bash can do.
strace -c cp non-nfs-mount/testfile nfs-mount/testfile
strace -c cp nfs-mount/testfile non-nfs-mount/testfile
strace cp non-nfs-mount/testfile nfs-mount/testfile
strace cp nfs-mount/testfile non-nfs-mount/testfile
strace -c mv non-nfs-mount/testfile nfs-mount/testfile
The locking is what is causing the long delay. To improve it, you should look at your NFS client/server configuration. The NFS locking implementation changed significantly between NFSv3 and NFSv4. With NFSv3, you can disable locking for a mount using the
I think you're right. I wrote a test application for locking and unlocking in bash and I get the same ~6 second delay with a
Turns out that Ubuntu and Debian don't enable the service required for locks to work over NFSv3 anymore by default while Fedora does. Still doesn't explain why the first lock is delayed by 40s on Fedora but it's unlikely that a .NET application accesses my NFS server first so not an issue for me.
Enabling the service on Ubuntu resolves the issue, as does adding the
That just leaves the question: why does .NET require locks to move a file while bash and Python do not?
The locks are used to emulate Windows FileShare behavior. In this case, they prevent other .NET processes from writing to the source file and from reading from the destination file.
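To check whether the lock itself is the slow step, independent of .NET, the lock and unlock cycle can be timed with a short script. The sketch below is an assumption-laden illustration: `time_lock_cycle` is a hypothetical helper, and it assumes an advisory `flock`-style shared lock is representative of what the runtime takes (the thread does not name the exact call).

```python
import fcntl
import time


def time_lock_cycle(path):
    """Return the seconds spent taking and releasing a shared advisory lock."""
    with open(path, "rb") as f:
        start = time.monotonic()
        # Assumption: a non-blocking shared flock approximates the lock
        # used to emulate FileShare semantics.
        fcntl.flock(f.fileno(), fcntl.LOCK_SH | fcntl.LOCK_NB)
        fcntl.flock(f.fileno(), fcntl.LOCK_UN)
        return time.monotonic() - start
```

Running it once against a file on the NFS mount and once against a local file gives a direct comparison; a multi-second result on the NFS path only would point at the lock service rather than at file I/O.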
Interesting, does DOTNET_SYSTEM_IO_DISABLEFILELOCKING work in every release .NET application or only in debug mode? |
It was introduced in .NET 6. |
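For anyone who wants to try the switch without changing their shell profile, it can be set just for one child process. This is a hedged sketch: both helper names are hypothetical, the value `"1"` is an assumption about the accepted format, and per the comment above the variable only has an effect on .NET 6 or later.

```python
import os
import subprocess


def env_without_file_locking(base=None):
    """Return a copy of the environment with .NET file locking disabled."""
    env = dict(os.environ if base is None else base)
    # Assumption: "1" is an accepted value for this switch.
    env["DOTNET_SYSTEM_IO_DISABLEFILELOCKING"] = "1"
    return env


def run_without_file_locking(command):
    """Run command (a list of arguments) with the switch set; return its exit code."""
    return subprocess.run(command, env=env_without_file_locking()).returncode
```

For example, `run_without_file_locking(["./MyApp"])` would start a hypothetical self-contained .NET application named `MyApp` with locking disabled for that process only.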
I'm going to close the issue. The root cause for the delay is .NET locking over NFSv3 without the lock service being available. The options are:
Description
NFS performance is awful in .NET, while Python and Bash have no problem accessing files at expected speeds.
As the tests below show, .NET needs around 6 seconds to access a file on an NFS share, regardless of whether it is reading or writing and regardless of file size. Creating folders, deleting folders, and deleting files work normally.
Configuration
.NET 5.0.103
Ubuntu 20.04.2 LTS x64 (with 5.11 kernel)
Issue happens on all my .NET applications across various Linux distros
Regression?
This has not worked since at least the introduction of .NET Core 3.0.
Other information
Test files are available in this repository.
All times in milliseconds
Network speed according to iperf3: 9.13 Gbits/sec
Files on NFS server are on HDD
1GB file size
Shell:
.NET:
Python:
10MB file size
Shell:
.NET:
Python:
10MB file size, local control test (SSD)
Shell:
.NET:
Python: