FileSystemWatcher may cause problems in containers - inotify limits and incorrect error message #27272

Closed
shanselman opened this issue Aug 29, 2018 · 31 comments
Labels
area-Extensions-FileSystem bug untriaged New issue has not been triaged by the area owner

Comments

@shanselman

MOVED FROM dotnet/aspnetcore#3475

Looking around the web I'm seeing years of issues with FileSystemWatcher saying "The configured user limit (n) on the number of inotify instances has been reached."

UPDATE: Looks like https://github.com/dotnet/corefx/blob/a10890f4ffe0fadf090c922578ba0e606ebdd16c/src/System.IO.FileSystem.Watcher/src/System/IO/FileSystemWatcher.Linux.cs#L371 assumes that when inotify_add_watch fails with ENOSPC, it must be an issue with the number of inotify instances being out of range. In fact, ENOSPC can also mean "the kernel failed to allocate a needed resource." We had no way to know it was anything other than "too many files open." The error message is misleading.

From the Man Page - The user limit on the total number of inotify watches was reached or the kernel failed to allocate a needed resource

Phrased differently: there are two error cases, and we throw a message that implies there's just one.
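To make the two cases concrete, here is a minimal sketch of how the failing syscall and its errno could be mapped to distinct messages. The class name InotifyErrors and the message wording are hypothetical and are not the corefx implementation; the errno numbers are the usual Linux values, and the meanings follow the inotify_init(2) and inotify_add_watch(2) man pages.

```csharp
using System;

// Hypothetical helper for illustration only; not the actual FileSystemWatcher code.
public static class InotifyErrors
{
    // Linux errno values.
    private const int ENOMEM = 12;
    private const int ENFILE = 23;
    private const int EMFILE = 24;
    private const int ENOSPC = 28;

    // Returns a message that names the limit that was actually hit,
    // based on which syscall failed and with which errno.
    public static string Describe(string syscall, int errno)
    {
        if (syscall == "inotify_init")
        {
            switch (errno)
            {
                case EMFILE:
                    return "The user limit on inotify instances (fs.inotify.max_user_instances) " +
                           "or the per-process limit on open file descriptors has been reached.";
                case ENFILE:
                    return "The system-wide limit on open files has been reached.";
                case ENOMEM:
                    return "The kernel did not have enough memory to create an inotify instance.";
            }
        }
        else if (syscall == "inotify_add_watch" && errno == ENOSPC)
        {
            return "The user limit on inotify watches (fs.inotify.max_user_watches) has been " +
                   "reached, or the kernel failed to allocate a needed resource.";
        }

        // Anything else is reported generically rather than guessed at.
        return $"{syscall} failed with errno {errno}.";
    }
}
```

The point of the sketch is only that the instance limit and the watch limit come from different syscalls and different sysctl settings, so a single message cannot describe both.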

This is becoming more prevalent in container situations in constrained sandboxes. I'm trying to deploy https://github.com/shanselman/superzeit (just clone and "now --public" or run locally with docker) to Zeit.co and I'm hitting this regularly. I don't think I'm hitting a limit. I think Zeit (and others) are blocking the syscall.

I think there are two issues here:

1. We should return a different error message if inotify_add_watch fails, and then circuit break so that FileSystemWatcher doesn't prevent the app from starting. If we CAN start up without a watch successfully, we should.

2. It seems DOTNET_USE_POLLING_FILE_WATCHER=1 is used in dotnet-watch and the aspnet file providers, but the base System.IO FileSystemWatcher class doesn't support DOTNET_USE_POLLING_FILE_WATCHER? We should probably be consistent.
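The "circuit break" idea in item 1 could look roughly like the following from the caller's side. This is a sketch under assumptions: SafeWatch and TryStart are hypothetical names, and the only behavior relied on is what the stack trace in this issue shows, namely that the IOException surfaces when EnableRaisingEvents is set to true.

```csharp
using System;
using System.IO;

// Hypothetical wrapper: start a watcher if the OS allows it, otherwise carry on without one.
public static class SafeWatch
{
    // Returns a started watcher, or null if the watcher could not be started.
    public static FileSystemWatcher TryStart(string directory, FileSystemEventHandler onChanged)
    {
        var watcher = new FileSystemWatcher(directory);
        try
        {
            watcher.Changed += onChanged;
            // Setting this property is where the inotify failure is thrown.
            watcher.EnableRaisingEvents = true;
            return watcher;
        }
        catch (IOException ex)
        {
            // Release whatever was allocated and let the application start anyway.
            watcher.Dispose();
            Console.Error.WriteLine($"File watching disabled for '{directory}': {ex.Message}");
            return null;
        }
    }
}
```

A caller that gets null back can skip change notifications entirely or fall back to polling the file's last-write time on a timer.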

If I set reloadOnChange: false in Program.cs to bypass the first watch that is set on AppSettings.json, I end up hitting it later when Razor/MVC sets up its FileWatchers.
We need, at a minimum, to have DOTNET_USE_POLLING_FILE_WATCHER respected everywhere. Another idea would be a way to have FileSystemWatcher "fail gracefully." We need to test on systems with

Unhandled Exception: System.IO.IOException: The configured user limit (8192) on the number of inotify instances has been reached.
> [0]    at System.IO.FileSystemWatcher.StartRaisingEvents()
> [0]    at System.IO.FileSystemWatcher.StartRaisingEventsIfNotDisposed()
> [0]    at System.IO.FileSystemWatcher.set_EnableRaisingEvents(Boolean value)
> [0]    at Microsoft.Extensions.FileProviders.Physical.PhysicalFilesWatcher.TryEnableFileSystemWatcher()
> [0]    at Microsoft.Extensions.FileProviders.Physical.PhysicalFilesWatcher.CreateFileChangeToken(String filter)
> [0]    at Microsoft.Extensions.FileProviders.PhysicalFileProvider.Watch(String filter)
> [0]    at Microsoft.Extensions.Configuration.FileConfigurationProvider.<.ctor>b__0_0()
> [0]    at Microsoft.Extensions.Primitives.ChangeToken.OnChange(Func`1 changeTokenProducer, Action changeTokenConsumer)
> [0]    at Microsoft.Extensions.Configuration.FileConfigurationProvider..ctor(FileConfigurationSource source)
> [0]    at Microsoft.Extensions.Configuration.Json.JsonConfigurationSource.Build(IConfigurationBuilder builder)
> [0]    at Microsoft.Extensions.Configuration.ConfigurationBuilder.Build()
> [0]    at Microsoft.AspNetCore.Hosting.WebHostBuilder.BuildCommonServices(AggregateException& hostingStartupErrors)
> [0]    at Microsoft.AspNetCore.Hosting.WebHostBuilder.Build()
> [0]    at superzeit.Program.Main(String[] args) in /app/superzeit/Program.cs:line 17

Related issues?

@stephentoub @natemcmaster @muratg @pranavkm

dotnet --info

.NET Core SDK (reflecting any global.json):
 Version:   2.1.401
 Commit:    91b1c13032

Runtime Environment:
 OS Name:     Windows
 OS Version:  10.0.17134
 OS Platform: Windows
 RID:         win10-x64
 Base Path:   C:\Program Files\dotnet\sdk\2.1.401\

Host (useful for support):
  Version: 2.1.3-servicing-26724-03
  Commit:  124038c13e

.NET Core SDKs installed:
  2.1.400 [C:\Program Files\dotnet\sdk]
  2.1.401 [C:\Program Files\dotnet\sdk]

.NET Core runtimes installed:
  Microsoft.AspNetCore.All 2.1.2 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.App 2.1.2 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 2.1.2 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.3-servicing-26724-03 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]

@shanselman
Author

Just an FYI, our friends at Zeit figured out their server-side misconfiguration here https://github.com/zeit/now-examples/pull/61 but it would have been easier with a better error message.

@stephentoub
Member

It looks like this issue as it applies to corefx is just to improve the error message? That seems quite reasonable, and easy to fix.

@shaulbehr

Hello, I've just opened a code branch to upgrade to EF Core 2.2. My new code branch is failing all CI builds with the infamous "inotify" error:

System.IO.IOException : The configured user limit (1024) on the number of inotify instances has been reached.

We have about 3,500 unit tests, and it only fails at some point really far along in the tests. Build bots are on Ubuntu 18.04.

I've read the discussion above, and all I can say is that it sounds relevant to my problem, but I have no idea how to fix it. Help, please?

@stephentoub
Member

"We have about 3,500 unit tests, and it only fails at some point really far along in the tests."

It sounds like either you're not disposing of all of the FileSystemWatchers you're creating, or something is causing tons of tests that create FileSystemWatchers to run concurrently. If your tests aren't themselves creating FileSystemWatchers, then it sounds like something in the environment is creating them, maybe something in EF Core 2.2, and it'd likely be worth an issue in the EF Core repo, assuming that's where they're coming from.

@shaulbehr

@stephentoub I'm not explicitly creating any FileSystemWatchers. And my tests are running sequentially, not concurrently.
Here's a sample stack trace of a failing test:

---> System.IO.IOException: The configured user limit (1024) on the number of inotify instances has been reached.
  at System.IO.FileSystemWatcher.StartRaisingEvents()
  at System.IO.FileSystemWatcher.StartRaisingEventsIfNotDisposed()
  at System.IO.FileSystemWatcher.set_EnableRaisingEvents(Boolean value)
  at Microsoft.Extensions.FileProviders.Physical.PhysicalFilesWatcher.TryEnableFileSystemWatcher()
  at Microsoft.Extensions.FileProviders.Physical.PhysicalFilesWatcher.CreateFileChangeToken(String filter)
  at Microsoft.AspNetCore.Mvc.RazorPages.Internal.PageActionDescriptorChangeProvider.GetChangeToken()
  at Microsoft.Extensions.Primitives.ChangeToken.OnChange(Func`1 changeTokenProducer, Action changeTokenConsumer)
  at Microsoft.AspNetCore.Mvc.Infrastructure.DefaultActionDescriptorCollectionProvider..ctor(IEnumerable`1 actionDescriptorProviders, IEnumerable`1 actionDescriptorChangeProviders)
  at lambda_method(Closure , IBuildSession , IContext )
  --- End of inner exception stack trace ---
  at lambda_method(Closure , IBuildSession , IContext )
  at StructureMap.Building.BuildPlan.Build(IBuildSession session, IContext context)
  at StructureMap.Pipeline.LazyLifecycleObject`1.CreateValue()
  at StructureMap.SessionCache.GetObject(Type pluginType, Instance instance, ILifecycle lifecycle)
  at StructureMap.SessionCache.GetDefault(Type pluginType, IPipelineGraph pipelineGraph)
  at StructureMap.Container.GetInstance(Type pluginType)
  at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService[T](IServiceProvider provider)
  at Microsoft.AspNetCore.Builder.MvcApplicationBuilderExtensions.UseMvc(IApplicationBuilder app, Action`1 configureRoutes)
  at TestingUtilities.Web.AspNetTestConfiguration.Configuration(IApplicationBuilder app) in <snip path>/src/TestingUtilities.Web/AspNetTestConfiguration.cs:line 48

Line 48 of my AspNetTestConfiguration.cs is:

public void Configuration(IApplicationBuilder app)
{
    ....
    // line 48:
    app.UseMvc(routes =>
               {
                   routes.MapRoute("Default", "{controller}/{action}/{id?}", new { controller = "Home", action = "Index" });
               });
}

This, in turn, is being called by my integration test's base class, which creates a new TestServer for each test fixture. I am calling .Dispose() on the TestServer in the TearDown() method of the test fixture.

@shaulbehr

Also, to clarify, it wasn't just EF Core I upgraded; I meant to say that I upgraded to .NET Core 2.2.

@stephentoub
Member

@shaulbehr, are you able to attach a debugger to the process when it's in one of these states? e.g. if you could attach lldb and use sos, you could use dumpheap -type FileSystemWatcher to see what FSWs are hanging around, whether they're disposed or not, and hopefully if not why not (e.g. what's keeping them alive). If you're able to try .NET Core 3.0, you could also try out the new dotnet-dump tool (https://devblogs.microsoft.com/dotnet/introducing-diagnostics-improvements-in-net-core-3-0/), which should make it easy to collect a dump of the process that can then be analyzed similarly with the sos commands.

@natemcmaster, @rynowak, I'm not sure who's responsible for this support used from MVC, but have you seen any issues related to PhysicalFilesWatcher instances not being disposed of in a timely manner?

@rynowak
Member

rynowak commented May 22, 2019

I don't think we've seen issues with disposal, but our file watchers tend to live for the lifetime of the app. I think it was the case for a while that we didn't dispose file watchers, so that could cause bugs if the app was stopped and started repeatedly in the same process.

Note that as of 2.2 we shouldn't be creating the file watcher from that call stack anymore when the environment is set to Production. This was a mitigation on our part for this scenario because of how many times we heard about these problems in containers.

/cc @pranavkm

@rynowak
Member

rynowak commented May 22, 2019

Looking at that call stack again, this makes a little more sense now if these are integration tests.

@shaulbehr are your tests creating a TestServer for each test?

@shaulbehr

@rynowak Yes, each test fixture creates a new TestServer. In addition, I have some test fixtures that have tests running in parallel, in which case I create a new TestServer for each test. I did add code in the TearDown methods to ensure that the TestServers are disposed, but this doesn't appear to have helped.

@shaulbehr

Oho, here's something I just noticed. I added some code to ensure that my IContainer objects (from StructureMap) are disposed, and now the stack trace going through Microsoft.AspNetCore.Builder.MvcApplicationBuilderExtensions.UseMvc() doesn't appear in my logs. Now I have a bunch of other inotify errors, with the following stack trace:

 System.IO.IOException : The configured user limit (1024) on the number of inotify instances has been reached.
Stack Trace:
   at System.IO.FileSystemWatcher.StartRaisingEvents()
   at System.IO.FileSystemWatcher.StartRaisingEventsIfNotDisposed()
   at System.IO.FileSystemWatcher.set_EnableRaisingEvents(Boolean value)
   at Microsoft.Extensions.FileProviders.Physical.PhysicalFilesWatcher.TryEnableFileSystemWatcher()
   at Microsoft.Extensions.FileProviders.Physical.PhysicalFilesWatcher.CreateFileChangeToken(String filter)
   at Microsoft.Extensions.Primitives.ChangeToken.OnChange(Func`1 changeTokenProducer, Action changeTokenConsumer)
   at Microsoft.Extensions.Configuration.FileConfigurationProvider..ctor(FileConfigurationSource source)
   at Microsoft.Extensions.Configuration.Json.JsonConfigurationSource.Build(IConfigurationBuilder builder)
   at Microsoft.Extensions.Configuration.ConfigurationBuilder.Build()
   at Microsoft.AspNetCore.Hosting.WebHostBuilder.BuildCommonServices(AggregateException& hostingStartupErrors)
   at Microsoft.AspNetCore.Hosting.WebHostBuilder.Build()
   at Microsoft.AspNetCore.TestHost.TestServer..ctor(IWebHostBuilder builder, IFeatureCollection featureCollection)

@rynowak here's your smoking gun pointing at TestServer.

@shaulbehr

shaulbehr commented May 23, 2019

@stephentoub I'm really a rookie at Linux. If you can give me step-by-step instructions how to attach lldb and use sos and dumpheap, I'm probably up to that.

@stephentoub
Member

@rynowak, based on your question "are your tests creating a TestServer for each test?" and the answer of "Yes", it seems like you may have some insights here?

@rynowak
Member

rynowak commented May 23, 2019

One option would be to try and limit the number of TestServer instances you create. That might or might not be feasible given your requirements. If it's possible, I would expect creating fewer servers to speed up your test execution as well.

Another thing you could try would be to change how configuration is wired up and remove the file watching. The actual problem reported by that call stack is one that we've already fixed in 3.0 (dotnet/extensions#928).

@shaulbehr

@rynowak going through your suggestions:

  1. Limiting the number of TestServer instances.
    Can't cut down on the number of TestFixtures. Unthinkable. So I started going down the path of making a cache of TestServers, since the vast majority of my TestFixtures use the same config...then realized that the TestServer gets injected with a Container, and I need a new Container per TestFixture. Bam.

  2. Upgrade to 3.0.
    Not happening. This is a production system, and we can't risk a dependency on a prerelease library.

  3. Change how configuration is wired up and remove file watching.
    Sounds like an excellent idea. How?

@rynowak
Member

rynowak commented May 30, 2019

Sorry for the delay on this, I've been out of the office. You should be able to call ConfigureAppConfiguration and then, inside there, call IConfigurationBuilder.Sources.Clear(). https://github.com/aspnet/AspNetCore/blob/release/2.2/src/Hosting/Hosting/src/WebHostBuilderExtensions.cs#L121

If you can show me how you're setting up TestServer I can probably give you some more specific advice.
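As a rough illustration of that suggestion, the sketch below clears the default configuration sources and re-adds the JSON files with reloadOnChange: false, so the settings are still loaded but no file watcher is created for them. NoWatchWebHost and CreateBuilder are hypothetical names; the sketch assumes an ASP.NET Core 2.x project that starts from WebHost.CreateDefaultBuilder and uses the conventional appsettings file names.

```csharp
using Microsoft.AspNetCore;
using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.Configuration;

// Hypothetical helper: same starting point as CreateDefaultBuilder,
// but with the configuration sources rebuilt without change tracking.
public static class NoWatchWebHost
{
    public static IWebHostBuilder CreateBuilder(string[] args) =>
        WebHost.CreateDefaultBuilder(args)
            .ConfigureAppConfiguration((context, config) =>
            {
                string env = context.HostingEnvironment.EnvironmentName;

                // Drop the default sources, which register the JSON files with reloadOnChange: true.
                config.Sources.Clear();

                // Re-add the same files so their values are still available in Startup,
                // this time without a change token (and so without a FileSystemWatcher).
                config.AddJsonFile("appsettings.json", optional: true, reloadOnChange: false)
                      .AddJsonFile($"appsettings.{env}.json", optional: true, reloadOnChange: false)
                      .AddEnvironmentVariables();

                if (args != null)
                {
                    config.AddCommandLine(args);
                }
            });
}
```

Clearing removes every source registered before this callback runs, so anything else the application depends on (user secrets, custom providers, and so on) would need to be re-added in the same place.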

@ajamrozek

I'm running into this with my client's on-prem K8S clusters. I've set reloadOnChange to false for my appsettings and the USE_POLLING flag to true. Even using a vanilla, boilerplate WebAPI project I get this with enough consistency that we now have a script to "refresh" the pods that throw the IOException.

Where/when would Sources.Clear() be practical? The place where I have an IConfigurationBuilder is CreateWebHostBuilder; I'm going to need those configs later in Startup, so I can't just clear them before I've gotten their data.

@shaulbehr

shaulbehr commented Jun 11, 2019

@rynowak There are a lot of moving parts in this machine.
Here's the code in the TestFixture base class that creates the TestServer:

protected virtual TestServer CreateTestServer(bool useAuth)
{
	var testConfiguration = new AspNetTestConfiguration(useAuth);
	var config = ConfigHelper.GetConfiguration(ConfigFileName); // ConfigFileName is a virtual member; config is of type IConfiguration

	// Create the WebHostBuilder used by the test server
	var webHostBuilder = WebHost.CreateDefaultBuilder()
		.ConfigureServices(svc =>
		   {
			   svc.AddSingleton(Container); // Container is a StructureMap.IContainer
			   svc.AddSingleton(testConfiguration);
		   })
		.UseConfiguration(config)
		.UseStartup<TestStartup>();

	return new TestServer(webHostBuilder);
}

I could dump a bunch of other accompanying code here, but I don't want to overload you or anyone else with noise; better you should ask me more specific questions about which code snippets you'd like to see. Alternatively, if you'd like, you can message me privately and I can temporarily give you rights to our Git repo so you can see for yourself what the whole setup looks like.

@ajamrozek

My issue seemed related to using the same user across all pods in the cluster. Pro-tip: running as root isn't a good idea for more than just security concerns. Now that each pod has its own user, my issue appears to be resolved.

@stevef51

I have this issue with the following setup ..

  • OSX Host
  • VSCode 1.36.1
  • Docker 2.0.0.3
  • .NET Core 2.2.300

I am using VSCode's Dev-Containers; however, the issue occurs if I F5 Debug from VSCode OR if I run the app directly from the container terminal. It of course does not always happen, but it happens easily enough to be a right pain in the bum. The only solution I have found so far is to quit VSCode.

My microservices are in early stages of development, I do run MVC but only with like 1 TestController, and I am using the standard

WebHost.CreateDefaultBuilder()

method, which I believe performs a watch on appsettings.json and appsettings.development.json, so I am stumped as to why every hour or so I have this issue occur. I actually get the feeling it's somehow VSCode related (as quitting VSCode resolves the issue even with the Dev-Container still running in the background).

My only other suspicion is that my SPA microservice, when running Debug, is using Webpack hot module replacement, so it must be watching source files. However, this issue also occurs on my Auth microservice (IdentityServer4), which is of course not using Webpack.

Hah - as it turns out it just happened again, the following is my stack trace

[16:11:26 FTL] Application startup exception
System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> System.IO.IOException: The configured user limit (128) on the number of inotify instances has been reached.
   at System.IO.FileSystemWatcher.StartRaisingEvents()
   at System.IO.FileSystemWatcher.StartRaisingEventsIfNotDisposed()
   at System.IO.FileSystemWatcher.set_EnableRaisingEvents(Boolean value)
   at Microsoft.Extensions.FileProviders.Physical.PhysicalFilesWatcher.TryEnableFileSystemWatcher()
   at Microsoft.Extensions.FileProviders.Physical.PhysicalFilesWatcher.CreateFileChangeToken(String filter)
   at Microsoft.Extensions.FileProviders.PhysicalFileProvider.Watch(String filter)
   at Microsoft.AspNetCore.Mvc.RazorPages.Internal.PageActionDescriptorChangeProvider.GetChangeToken()
   at Microsoft.AspNetCore.Mvc.Infrastructure.DefaultActionDescriptorCollectionProvider.GetCompositeChangeToken()
   at Microsoft.Extensions.Primitives.ChangeToken.OnChange(Func`1 changeTokenProducer, Action changeTokenConsumer)
   at Microsoft.AspNetCore.Mvc.Infrastructure.DefaultActionDescriptorCollectionProvider..ctor(IEnumerable`1 actionDescriptorProviders, IEnumerable`1 actionDescriptorChangeProviders)
   --- End of inner exception stack trace ---
at System.RuntimeMethodHandle.InvokeMethod(Object target, Object[] arguments, Signature sig, Boolean constructor, Boolean wrapExceptions)
   at System.Reflection.RuntimeConstructorInfo.Invoke(BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
   at System.RuntimeType.CreateInstanceImpl(BindingFlags bindingAttr, Binder binder, Object[] args, CultureInfo culture, Object[] activationAttributes)
   at Lamar.IoC.Instances.ConstructorInstance.quickResolve(Scope scope)
   at Lamar.IoC.Instances.ConstructorInstance.QuickResolve(Scope scope)
   at System.Linq.Enumerable.SelectArrayIterator`2.ToArray()
   at System.Linq.Enumerable.ToArray[TSource](IEnumerable`1 source)
   at Lamar.IoC.Instances.ConstructorInstance.quickResolve(Scope scope)
   at Lamar.IoC.Instances.ConstructorInstance.QuickResolve(Scope scope)
   at Lamar.IoC.Frames.InjectedServiceField.ToVariableExpression(LambdaDefinition definition)
   at LamarCodeGeneration.Expressions.LambdaDefinition.ExpressionFor(Variable variable)
   at Lamar.IoC.Frames.ListAssignmentFrame`1.WriteExpressions(LambdaDefinition definition)
   at Lamar.IoC.Instances.FuncResolverDefinition.BuildResolver()
   at Lamar.IoC.Instances.GeneratedInstance.BuildFuncResolver(Scope scope)
   at Lamar.IoC.Instances.GeneratedInstance.buildResolver(Scope scope)
   at Lamar.IoC.Instances.GeneratedInstance.ToResolver(Scope topScope)
   at Lamar.ServiceGraph.FindResolver(Type serviceType)
   at Lamar.IoC.Scope.GetInstance(Type serviceType)
   at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService(IServiceProvider provider, Type serviceType)
   at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService[T](IServiceProvider provider)
   at Microsoft.AspNetCore.Builder.MvcApplicationBuilderExtensions.UseMvc(IApplicationBuilder app, Action`1 configureRoutes)
   at VirtualMgr.SPA.Startup.Configure(IApplicationBuilder app, IHostingEnvironment env, IApplicationLifetime lifetime, IDistributedCache cache) in /src/VirtualMgr.SPA/Startup.cs:line 162
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.AspNetCore.Hosting.ConventionBasedStartup.Configure(IApplicationBuilder app)
   at Microsoft.AspNetCore.Mvc.Internal.MiddlewareFilterBuilderStartupFilter.<>c__DisplayClass0_0.<Configure>g__MiddlewareFilterBuilder|0(IApplicationBuilder builder)
   at Microsoft.AspNetCore.HostFilteringStartupFilter.<>c__DisplayClass0_0.<Configure>b__0(IApplicationBuilder app)
   at Microsoft.AspNetCore.Hosting.Internal.AutoRequestServicesStartupFilter.<>c__DisplayClass0_0.<Configure>b__0(IApplicationBuilder builder)
   at Microsoft.AspNetCore.Hosting.Internal.WebHost.BuildApplication()
[16:11:26 DBG] Hosting shutdown

Any help is much appreciated

@danmoseley danmoseley reopened this Aug 9, 2019
@danmoseley
Member

There's plenty of discussion here, reopening so it's visible.

@kylef000

kylef000 commented Aug 26, 2019

Hey all, specific to aspnetcore instances that are experiencing this exception: I have isolated a cause and steps to mitigate until a better solution comes along. Adding the fixes described below in a brand new project reduced the watchers used from around 93 to around 26. This only pertains to aspnetcore, and I won't rehash the other ways to mitigate the issue, like reloadOnChange: false etc.

tl;dr version: Tag helpers that use asp-append-version="true" add watchers; removing this or setting it to false prevents the watcher creation. You can also dump watchers on startup to keep the count low, by clearing the file providers and preventing view file recompiling (only affects the Development environment), like this:

services.AddMvc()
    .SetCompatibilityVersion(CompatibilityVersion.Version_2_2)
    .AddRazorOptions(ro =>
    {
        ro.FileProviders.Clear();
        ro.FileProviders.Add(new CompositeFileProvider(new[] { new NullFileProvider() }));
        ro.AllowRecompilingViewsOnFileChange = false;
    });

Without the above snippet, I noticed that there were about 54-58 watchers being used at startup.

To get the number of watchers being used (+ ~4 watchers that are not used by dotnet), you can add the following to your Dockerfile in the aspnetcore-runtime build step:
RUN apt-get update -y && apt-get install -y procps lsof

Following that, we can either check via the terminal for the docker container by running lsof | grep inotify | wc -l or we can add a ShellHelper class.

If going the ShellHelper route, we can add the following JsonResult action to a controller and verify in the browser:

[Route("/getfilewatches")]
public JsonResult GetFileWatches()
{
    string maxUserWatches = System.IO.File.ReadAllText("/proc/sys/fs/inotify/max_user_watches").Trim();
    string currentUsedInotifyWatches = "lsof | grep inotify | wc -l".Bash();
            
    return new JsonResult(new {maxUserWatches, currentUsedInotifyWatches });
}
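The Bash() call above is not part of the base class library, and the ShellHelper class itself is not shown in this thread. A minimal sketch of what such an extension method might look like, assuming /bin/bash is present in the container image:

```csharp
using System.Diagnostics;

// Hypothetical ShellHelper: runs a command line through bash and returns its standard output.
public static class ShellHelper
{
    public static string Bash(this string command)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = "/bin/bash",
            RedirectStandardOutput = true,
            UseShellExecute = false,
            CreateNoWindow = true,
        };
        // ArgumentList takes care of quoting, so the command is passed as a single argument to -c.
        startInfo.ArgumentList.Add("-c");
        startInfo.ArgumentList.Add(command);

        using (var process = Process.Start(startInfo))
        {
            // Read everything before waiting, so a full pipe cannot block the child process.
            string output = process.StandardOutput.ReadToEnd();
            process.WaitForExit();
            return output.Trim();
        }
    }
}
```

Each inotify line reported by lsof corresponds to an open inotify descriptor, i.e. an instance, so the limit relevant to the exception in this issue is /proc/sys/fs/inotify/max_user_instances rather than max_user_watches.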

With the asp-append-version properties, you'll see the watches jumping by about 40 on a page request that uses the <link ..> or <script ..> assets. After removing the properties, the value should not jump.

YMMV but I hope that this will help anyone else that has been banging their head on their desk for months trying to solve for this issue. I will add a full repro if desired.

@stevef51

@kylef000 thanks for your detailed insight. I have managed a work around for this which once setup works nicely, note this is not something I came up with but something I found somewhere else (I really should track where I found this) .. I put a shell script together with

#!/bin/bash
docker run -ti --privileged centos sysctl fs.inotify.max_user_instances=8192

and run it after Docker fires up. This alters the Docker host's inotify limit (the host is a Linux VM on a real Mac host in my case). I don't know whether this limit will realistically be breached at any point, but it has worked for many weeks now, given how easy the initial issue was to reproduce.

Cheers

@kylef000

@stevef51 Thanks for the response. I've done the same locally. Unfortunately running the container in privileged mode is not advisable in a live environment (especially a multi-container docker environment), because it allows the container to interact with the host and other devices connected to the host in a way that may be abused. The docker daemon runs as root.

By default, Docker containers are “unprivileged” and cannot, for example, run a Docker daemon inside a Docker container. This is because by default a container is not allowed to access any devices, but a “privileged” container is given access to all devices (see the documentation on cgroups devices).

When the operator executes docker run --privileged, Docker will enable access to all devices on the host as well as set some configuration in AppArmor or SELinux to allow the container nearly all the same access to the host as processes running outside containers on the host.

https://docs.docker.com/engine/reference/run/#runtime-privilege-and-linux-capabilities

@stevef51

@kylef000 Thanks again -

By running the script, that specific container does have privileged access to the host (in my case a Linux VM), which is needed to be able to alter the host's inotify limits. However, the container immediately stops and is discarded (actually, I just noticed I am missing the --rm flag to make this auto-discard); the change to the host remains, though.

My service containers, however, run with normal access (non-privileged), but they inherit the inotify limit of the host, which is now increased. They don't run in privileged mode.

Granted, it is definitely a workaround, and I think either the Linux VM that Docker fires up should by default have a higher inotify limit and/or the ASP.NET Core FileWatchers need some attention to see why they are using so many of them.

Cheers

@carlossanlop
Member

carlossanlop commented Jan 23, 2020

Triage: As this was fixed in dotnet/corefx#32462 (in 3.0), and the discussion is dying down, we are re-closing.

If more discussion is needed, please open a new issue.

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@JeremyKuhne
Member

The fix for the other part of this is dotnet/extensions#928.

@ackginger

I'm not convinced this is fixed.

Isn't the issue actually here in PhysicalFilesWatcher:
https://github.com/dotnet/runtime/blob/master/src/libraries/Microsoft.Extensions.FileProviders.Physical/src/PhysicalFilesWatcher.cs#L134

This class attempts to respect DOTNET_USE_POLLING_FILE_WATCHER as it is constructed with pollForChanges=true in that case (by PhysicalFileProvider), and proceeds to register PollingFileChangeTokens for use instead of watching the filesystem ... however TryEnableFileSystemWatcher is called regardless of the value of PollForChanges.

For context, I'm still getting the error:

System.IO.IOException : The configured user limit (128) on the number of inotify instances has been reached, or the per-process limit on the number of open file descriptors has been reached.

albeit due to a mistake in my test parallelisation, but it still shouldn't be creating any file system watchers when running on the SDK Linux Docker image.

@stephentoub
Member

Re-opening for the Microsoft.Extensions issue...

@stephentoub stephentoub reopened this Jun 4, 2020
@stephentoub stephentoub removed this from the 3.0 milestone Jun 4, 2020
@stephentoub stephentoub added the untriaged New issue has not been triaged by the area owner label Jun 4, 2020
@stephentoub
Member

Will close again. #37664 can be used to track the Microsoft.Extensions issue.

@rventuri76

I'm having a very similar issue to this when debugging a Docker Compose project with VS 2019.
https://stackoverflow.com/questions/63493884/system-io-ioexception-function-not-implemented-in-createhostbuilderargs-buil
The call stack is very similar, but the error message is "Function not implemented".
It happens every time I recompile the React JS, which changes files in ClientApp/build.

@ghost ghost locked as resolved and limited conversation to collaborators Dec 15, 2020