1024 hard limit of open file descriptors #15931

Closed
mrward opened this issue Jul 31, 2019 · 8 comments

Comments

@mrward
Member

@mrward mrward commented Jul 31, 2019

Steps to Reproduce

I have seen some reports of this with VS Mac, so I created a console project that triggers the HttpClient error by creating a lot of FileSystemWatchers and then disposing them. The HttpClient does not recover after the watchers are disposed and keeps throwing the same error.

HttpClient request failed. The SSL connection could not be established, see inner exception.
Unable to write data to the transport connection: The socket is not connected.

https://github.com/mrward/TestManyFileWatchers

Output:

mono TestManyFileWatchers/bin/Debug/TestManyFileWatchers.exe
10 file watchers created
20 file watchers created
30 file watchers created
40 file watchers created
50 file watchers created
60 file watchers created
70 file watchers created
80 file watchers created
90 file watchers created
100 file watchers created
110 file watchers created
120 file watchers created
130 file watchers created
140 file watchers created
150 file watchers created
160 file watchers created
170 file watchers created
180 file watchers created
190 file watchers created
200 file watchers created
210 file watchers created
220 file watchers created
230 file watchers created
240 file watchers created
250 file watchers created
HttpClient request failed. The SSL connection could not be established, see inner exception.
Unable to write data to the transport connection: The socket is not connected.
Watchers created: 254
WorkerThreadsMax: 800 CompletionPortThreadsMax 200
WorkerThreadsAvailable: 799 CompletionPortThreadsAvailable 200

Disposing all file watchers.
HttpClient request failed. The SSL connection could not be established, see inner exception.
Unable to write data to the transport connection: The socket is not connected.
Watchers created: 0
WorkerThreadsMax: 800 CompletionPortThreadsMax 200
WorkerThreadsAvailable: 799 CompletionPortThreadsAvailable 200

I suspect there are other ways to trigger this problem, but creating too many FileSystemWatchers is one way.

Current Behavior

HttpClient gets stuck in a state where it cannot make any HTTP requests:

HttpClient request failed. The SSL connection could not be established, see inner exception.
Unable to write data to the transport connection: The socket is not connected.

Expected Behavior

HttpClient can recover once the exhausted resources have been released.

On which platforms did you notice this

[x] macOS
[ ] Linux
[ ] Windows

Version Used:

Mono JIT compiler version 6.0.0.311 (2019-02/494641b300c Mon Jul 1 20:30:26 EDT 2019)
Mono JIT compiler version 6.4.0.110 (2019-06/edc9e70e04a Mon Jul 15 19:04:16 EDT 2019)

Stacktrace

@marek-safar

Member

@marek-safar marek-safar commented Aug 16, 2019

@steveisok any thoughts on this issue?

@steveisok

Contributor

@steveisok steveisok commented Aug 16, 2019

I looked at this earlier and had questions for @baulig.

@baulig

Member

@baulig baulig commented Aug 21, 2019

Unfortunately, it looks like there's nothing we can do about this because you're hitting an OS limitation.

Quoting from the select(2) man page:

BUGS
     Although the provision of getdtablesize(2) was intended to allow user programs to be written independent of the kernel limit on the number of open files, the dimension of a
     sufficiently large bit field for select remains a problem.  The default size FD_SETSIZE (currently 1024) is somewhat smaller than the current kernel limit to the number of open
     files.  However, in order to accommodate programs which might potentially use a larger number of open files with select, it is possible to increase this size within a program
     by providing a larger definition of FD_SETSIZE before the inclusion of <sys/types.h>.

I debugged this and what's happening is that as soon as the requests start failing, we're hitting this line:

if (fd >= FD_SETSIZE) {

You can also run this with

MONO_LOG_LEVEL=debug MONO_LOG_MASK=io-selector,io-layer,io-layer-socket mono --debug ./bin/Debug/TestManyFileWatchers.exe 

and you will see that once the file descriptor for the next socket reaches 1024, the errors start happening.

After that, our I/O layer does not recover from this situation, even once descriptors are freed again.
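
To illustrate the limit described above, here is a minimal C sketch of guarding a `select` call against descriptors that do not fit in an `fd_set`. The helper names `fd_fits_select_set` and `wait_readable` are hypothetical and are not taken from Mono's sources; only the `fd >= FD_SETSIZE` comparison comes from the line quoted above. Returning `EINVAL` is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: returns 1 if fd can be stored in an fd_set, else 0. */
static int fd_fits_select_set(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}

/* Hypothetical helper: wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error (errno is set).
 * A descriptor that does not fit in an fd_set is rejected with EINVAL
 * (an assumption for this sketch) instead of being passed to FD_SET,
 * which would be undefined behavior. */
static int wait_readable(int fd, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    int rc;

    assert(timeout_ms >= 0);

    if (!fd_fits_select_set(fd)) {
        errno = EINVAL;
        return -1;
    }

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return (rc > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}
```

Note that the check depends on the numeric value of the descriptor, not on how many descriptors the selector itself is watching. That is why a process holding many unrelated descriptors (for example, file watchers) can push a newly opened socket past the limit.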

@baulig baulig changed the title from "HttpClient socket is not connected error - unable to recover" to "1024 hard limit of open file descriptors" Aug 21, 2019
@baulig baulig self-assigned this Aug 21, 2019
@baulig

Member

@baulig baulig commented Aug 21, 2019

@mrward

Member Author

@mrward mrward commented Aug 21, 2019

I am OK with the error occurring when the limit is reached - VS Mac is using too many resources in this case. I mainly filed this because Mono is not able to recover after those resources are released.

This particular problem appears in VS Mac when using NuGet, since some parts of NuGet do not use the NSUrlSessionHandler we pass in and fall back to Mono's HttpClientHandler. Other parts of VS Mac that use the NSUrlSessionHandler still work when in this state.

@lambdageek lambdageek self-assigned this Aug 21, 2019
@lambdageek

Member

@lambdageek lambdageek commented Aug 21, 2019

Going to try to do some error checking before we add the fd to the IOSelector and throw a managed error.
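
The idea of checking at registration time can be sketched in C as follows. This is a simplified illustration, not Mono's actual IOSelector code: `struct selector`, `selector_init`, and `selector_add` are hypothetical names, and using `ENOTSUP` as the error code is an assumption chosen only to mirror the idea of reporting "not supported" to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical, simplified registry of descriptors to be polled later
 * by a separate selector thread. */
struct selector {
    fd_set watched; /* descriptors registered so far */
    int max_fd;     /* highest registered descriptor, or -1 if none */
};

static void selector_init(struct selector *sel)
{
    assert(sel != NULL);
    FD_ZERO(&sel->watched);
    sel->max_fd = -1;
}

/* Register fd with the selector.
 * Returns 0 on success, or -1 with errno set:
 *   EBADF   if fd is negative,
 *   ENOTSUP if fd cannot be represented in an fd_set.
 * The failure is reported to the caller immediately, on the calling
 * thread, and the selector state is left unchanged. */
static int selector_add(struct selector *sel, int fd)
{
    assert(sel != NULL);

    if (fd < 0) {
        errno = EBADF;
        return -1;
    }
    if (fd >= FD_SETSIZE) {
        errno = ENOTSUP;
        return -1;
    }

    FD_SET(fd, &sel->watched);
    if (fd > sel->max_fd)
        sel->max_fd = fd;
    return 0;
}
```

Because a rejected descriptor never enters the set, descriptors registered earlier remain usable, and the caller can retry after other descriptors have been closed.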

lambdageek added a commit to lambdageek/mono that referenced this issue Aug 21, 2019
The poll+select i/o selector backend can't handle file descriptor ids greater
than FD_SETSIZE.  This can happen if too many files are open and we want to
wait on it.

Previously, mono would fail in the i/o selector thread by which point it was
too late to do anything.

With this change we will fail eagerly on the thread that calls IOSelector.Add
by throwing a NotSupportedException.

Addresses mono#15931
lambdageek added a commit to lambdageek/mono that referenced this issue Aug 22, 2019
lambdageek added a commit that referenced this issue Aug 26, 2019
monojenkins added a commit to monojenkins/mono that referenced this issue Aug 26, 2019
lambdageek added a commit that referenced this issue Aug 27, 2019
@lambdageek

Member

@lambdageek lambdageek commented Aug 29, 2019

This is now fixed on Mono master and Mono 2019-08, insofar as Mono will no longer crash; it will throw a managed exception instead.

@mrward Do we need to backport this further back?

@mrward

Member Author

@mrward mrward commented Aug 29, 2019

@lambdageek Fine by me. Right now I do not think there is a need to backport this.

@lambdageek lambdageek closed this Aug 29, 2019
ManickaP pushed a commit to ManickaP/runtime that referenced this issue Jan 20, 2020
…/mono#16396)

Commit migrated from mono/mono@78edafd