Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FileSystemWatcher on Linux uses an excessive amount of resources #62869

Open
neilmayhew opened this issue Dec 15, 2021 · 11 comments
Open

FileSystemWatcher on Linux uses an excessive amount of resources #62869

neilmayhew opened this issue Dec 15, 2021 · 11 comments

Comments

@neilmayhew
Copy link
Contributor

neilmayhew commented Dec 15, 2021

Description

There are many reports on the web of running into this message:

System.IO.IOException : The configured user limit (128) on the number of inotify instances has been reached, or the per-process limit on the number of open file descriptors has been reached.

The Linux implementation of FileSystemWatcher creates a new thread with its own inotify instance for every directory that's watched. This is very expensive in kernel resources, and often hits system limits as can be seen from the message. It's particularly problematic when running in a container because the per-UID quota isn't namespaced and is therefore shared among all containers on the same host. To make things worse, many ASP.net applications running in a container are incorrectly configured and use more FileSystemWatchers than they need to. Such a container consumes much more than its fair share of the available inotify instances, which then causes other containers in the pod (eg with k8s) to fail unexpectedly.

Instead, the implementation should have a single thread with a single inotify instance that's shared by all the FileSystemWatchers in the process. Instead of adding new instances (which are expensive) the implementation would add new watches (which are cheap). It would then be possible to accommodate even the most egregious of ASP.net apps with ease..

Configuration

Any Linux system.

Regression?

No. This has been going on for a long time and there are a lot of puzzled blog posts about it.

Data

Blog posts:

Related issues:

Analysis

The inotify system was never intended to be used like this.

However, rather than changing the implementation to use a single shared thread, it might be a better use of resources to write a new implementation using fanotify as suggested in:

@neilmayhew neilmayhew added the tenet-performance Performance related issue label Dec 15, 2021
@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Dec 15, 2021
@ghost
Copy link

ghost commented Dec 16, 2021

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.

Issue Details

Description

There are many reports on the web of running into this message:

System.IO.IOException : The configured user limit (128) on the number of inotify instances has been reached, or the per-process limit on the number of open file descriptors has been reached.

The Linux implementation of FileSystemWatcher creates a new thread with its own inotify instance for every directory that's watched. This is very expensive in kernel resources, and often hits system limits as can be seen from the message. It's particularly problematic when running in a container because the per-UID quota isn't namespaced and is therefore shared among all containers on the same host. To make things worse, many ASP.net applications running in a container are incorrectly configured and use more FileSystemWatchers than they need to. Such a container consumes much more than its fair share of the available inotify instances, which then causes other containers in the pod (eg with k8s) to fail unexpectedly.

Instead, the implementation should have a single thread with a single inotify instance that's shared by all the FileSystemWatchers in the process. Instead of adding new instances (which are expensive) the implementation would add new watches (which are cheap). It would then be possible to accommodate even the most egregious of ASP.net apps with ease..

Configuration

Any Linux system.

Regression?

No. This has been going on for a long time and there are a lot of puzzled blog posts about it.

Data

Blog posts:

Related issues:

Analysis

The inotify system was never intended to be used like this. However, rather than changing the implementation to use a single shared thread, it might be a better use of resources to write a new implementation using fanotify as suggested in:

Author: neilmayhew
Assignees: -
Labels:

area-System.IO, tenet-performance, untriaged

Milestone: -

@adamsitnik adamsitnik removed the untriaged New issue has not been triaged by the area owner label Dec 16, 2021
@adamsitnik adamsitnik added this to To do in System.IO - FileSystemWatcher via automation Dec 16, 2021
@adamsitnik
Copy link
Member

@neilmayhew I agree that the current implementation is not perfect.

@tmds do you have any recommendations?

@tmds
Copy link
Member

tmds commented Dec 16, 2021

We can look at improving the implementation. Maybe some simpler changes can be made related to:

To make things worse, many ASP.net applications running in a container are incorrectly configured and use more FileSystemWatchers than they need to.

@neilmayhew can you give some more information about this?

@neilmayhew
Copy link
Contributor Author

neilmayhew commented Dec 22, 2021

@tmds Sorry for the slow reply.

Both of the blogs I linked to in the description mention ASP.net, and refer to a solution that looks like this:

configuration = new ConfigurationBuilder()
                     .SetBasePath(Directory.GetCurrentDirectory())    
                     .AddJsonFile("appsettings.json", optional: true, reloadOnChange: false);

However, one of the comments on the first blog says that wasn't enough and links to a Stack Overflow question about WebApplicationFactory that has a recursive solution. That in turn links to another question, which doesn't seem to be about ASP.net specifically. So I don't think this is exactly a problem with ASP.net, more about the default behavior of ConfigurationBuilder, but it seems that the problem crops up more often with, or is made worse by, ASP.net.

@neilmayhew
Copy link
Contributor Author

A web search for the error message in question turns up a zillion hits, so a lot of people are encountering this, but previously no-one seems to have figured out the root of the problem and everyone is just finding ways to work around it.

It should be possible to watch thousands of files and directories on Linux without much overhead, but that's not possible with the way it's implemented currently.

The solution I'm thinking of is to use class variables for a common inotify instance and background thread, instead of instance variables and an instance and thread per FileSystemWatcher object. The common thread would send events just like the single thread does, but it would need to keep a dictionary mapping from inotify watch descriptor to FileSystemWatcher so it can obtain the .Net event it needs to send when it reads an inotify event from the inotify instance. This would be only slightly more complex than the current implementation, and the complexity would be concerned only with mutexing and keeping the dictionary updated when FileSystemWatcher instances are created and destroyed.

@neilmayhew
Copy link
Contributor Author

BTW, I don't think it would be appropriate to switch the implementation to use fanotify instead of inotify. Although fanotify is newer, it's designed for a different use case and isn't a replacement for inotify. I'll update the description.

@neilmayhew
Copy link
Contributor Author

neilmayhew commented Dec 22, 2021

I see that the implementation already uses a mutexed dictionary for mapping watch descriptors to absolute paths (_wdToPathMap) so a lot of the infrastructure I discussed is already in place and could be repurposed quite easily.

@suchoss
Copy link

suchoss commented Feb 19, 2023

referenced pull request is not a proper fix, but a workaround... :(

@wasabii
Copy link

wasabii commented Nov 7, 2023

My view on this is there is a fundamental limitation in the API of FileSystemWatcher that causes this issue: it can only watch a single path. And because of that, user's are drawn to create multiple FileSystemWatchers, to watch multiple directories. And since a single FileSystemWatcher holds a single IOCP port, or inotify queue, this is artifically inflating the number of such queues that the system needs to allocate.

In IKVM we had been using FileSystemWatcher as the backing support for our java.nio.file.WatchService support. But, we ran into this issue. WatchService in Java starts out pretty much the same: user's create an instance, and that instance owns a queue or port. However, user's then register and unregister multiple potentially unrelated paths to listen to with the service. So, in that ecosystem, user's create WatchService instance less frequently. Usually one per problem domain. But then register large directory hierarchies, etc, with that single one. But they're in control of how many ports they allocate, and how many directories they watch with each.

So, as an API suggestion that might alleviate this: support multiple Paths. Yes, FileSystemWatcher.Path is a single property. And the class takes a single value in it's ctor. But maybe there's some clean way to extend this a bit.

@neilmayhew
Copy link
Contributor Author

@wasabii Your understanding matches what I was saying in the description. However, I'm suggesting a solution that doesn't require any changes to the API. (I'm pretty sure changing the API isn't an option.)

In the solution I'm suggesting, there would be a single, global inotify queue for all FileSystemWatchers (ie a class variable) instead of having a separate inotify queue per FileSystemWatcher (ie an instance variable) as there is at present.

As I've outlined, I think this would be relatively simple to implement. I wish I had the time to do it myself but the job I'm in now doesn't use .Net and so I can't justify it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-System.IO tenet-performance Performance related issue
Development

No branches or pull requests

7 participants