Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FD limit exceed on windows #61

Closed
mikhail-barg opened this issue May 29, 2017 · 11 comments
Closed

FD limit exceed on windows #61

mikhail-barg opened this issue May 29, 2017 · 11 comments
Labels

Comments

@mikhail-barg
Copy link
Contributor

mikhail-barg commented May 29, 2017

Hi, we are using baresip/rem/re combo on Windows, and we are facing the following issue.
If the client left in registered state for some time, the following warning starts to appear:
main: fd_listen: fd=2056 flags=0x01 - Max 2048 fds

Here's the callstack

ourlib.dll!fd_listen(int fd, int flags, void(*)(int, void *) fh, void * arg) Line 584	C
ourlib.dll!udp_thread_attach(udp_sock * us) Line 696	C
ourlib.dll!udp_listen(udp_sock * * usp, const sa * local, void(*)(const sa *, mbuf *, void *) rh, void * arg) Line 376	C
ourlib.dll!udp_range_listen(rtp_sock * rs, const sa * ip, unsigned short min_port, unsigned short max_port) Line 241	C
ourlib.dll!rtp_listen(rtp_sock * * rsp, int proto, const sa * ip, unsigned short min_port, unsigned short max_port, char enable_rtcp, void(*)(const sa *, const rtp_header *, mbuf *, void *) recvh, void(*)(const sa *, rtcp_msg *, void *) rtcph, void * arg) Line 330	C
ourlib.dll!stream_sock_alloc(stream * s, int af) Line 254	C
ourlib.dll!stream_alloc(stream * * sp, const config_avt * cfg, call * call, sdp_session * sdp_sess, const char * name, int label, const mnat * mnat, mnat_sess * mnat_sess, const menc * menc, menc_sess * menc_sess, const char * cname, void(*)(const rtp_header *, mbuf *, void *) rtph, void(*)(rtcp_msg *, void *) rtcph, void * arg) Line 302	C
ourlib.dll!audio_alloc(audio * * ap, const config * cfg, call * call, sdp_session * sdp_sess, int label, const mnat * mnat, mnat_sess * mnat_sess, const menc * menc, menc_sess * menc_sess, unsigned int ptime, const list * aucodecl, void(*)(int, char, void *) eventh, void(*)(int, const char *, void *) errh, void * arg) Line 703	C
ourlib.dll!call_alloc(call * * callp, const config * cfg, list * lst, const char * local_name, const char * local_uri, account * acc, ua * ua, const call_prm * prm, const sip_msg * msg, call * xcall, dnsc * dnsc, void(*)(call *, call_event, const char *, void *) eh, void * arg) Line 593	C
ourlib.dll!ua_call_alloc(call * * callp, ua * ua, vidmode vidmode, const sip_msg * msg, call * xcall, const char * local_uri) Line 447	C
ourlib.dll!handle_options(ua * ua, const sip_msg * msg) Line 473	C
ourlib.dll!request_handler(const sip_msg * msg, void * arg) Line 553	C
ourlib.dll!sip_recv(sip * sip, const sip_msg * msg) Line 266	C
ourlib.dll!udp_recv_handler(const sa * src, mbuf * mb, void * arg) Line 339	C
ourlib.dll!udp_read(udp_sock * us, int fd) Line 219	C
ourlib.dll!udp_read_handler(int flags, void * arg) Line 232	C
ourlib.dll!fd_handler(re * re, int fd, int flags) Line 240	C
ourlib.dll!fd_poll(re * re) Line 843	C
ourlib.dll!re_main(void(*)(int) signalh) Line 979	C
ourlib.dll!ReMainThreadFunc(void * lpThreadParameter) Line 237	C

While investigating the issue, we have discovered the following two things.

First, minor thing, is that this problem happens when handling OPTIONS message, and to handle it it opens a new call, which in case calls audio_alloc() to handle RTP sessions which seem to be completely redundant.

Second, major issue, an actual cause of the problem we are facing is the way libre works with sockets. See

re\src\udp\udp.c, line 319, int udp_listen():

		fd = SOK_CAST socket(r->ai_family, SOCK_DGRAM, IPPROTO_UDP);
		if (fd < 0) {
			err = errno;
			continue;
		}

and following
re\src\main\main.c, line 582, int fd_listen():

	if (fd >= re->maxfds) {
		if (flags) {
			DEBUG_WARNING("fd_listen: fd=%d flags=0x%02x"
				      " - Max %d fds\n",
				      fd, flags, re->maxfds);
		}
		return EMFILE;
	}

	/* Update fh set */
	if (re->fhs) {
		re->fhs[fd].flags = flags;
		re->fhs[fd].fh    = fh;
		re->fhs[fd].arg   = arg;
	}

	re->nfds = max(re->nfds, fd+1);

The problem with this code is that under Windows result of socket() function is not a fd, and is not bound to be between 0 and FD_SETSIZE. See here:

In Winsock applications, a socket descriptor is not a file descriptor

Windows Sockets handles have no restrictions, other than that the value INVALID_SOCKET is not a valid socket. Socket handles may take any value in the range 0 to INVALID_SOCKET–1.

So it seems that using sockets as ints to index some reasonable-sized array would not work under Windows at all, and some other approach should be used, like keeping list of active sockets.

@GGGO
Copy link

GGGO commented Jun 2, 2017

Hi!

I got same problem and I agree with you, we need to keep active socket in a list for Windows.

@alfredh
Copy link
Contributor

alfredh commented Jun 3, 2017

regarding the OPTIONS handling in baresip, could you please create a separate ticket
for that in the baresip project ?

have you tried to increase FD_SETSIZE to a larger value ?

I was thinking that, instead of having a fixed size table we could let the size of the table
grow by doubling the capacity. For example:

fd=500 -> size=512
fd=1000 -> size = 1024
fd=6000 -> size = 8192

a drawback here is that it would leave many unused holes in the table.

@mikhail-barg
Copy link
Contributor Author

regarding the OPTIONS handling in baresip, could you please create a separate ticket
for that in the baresip project ?

Done (baresip/baresip#265)

have you tried to increase FD_SETSIZE to a larger value ?

Well, I have it set to 2048 which is already kind of too much. Yes' it's possible to make the limit larger, but it is useless in general for Windows does not have any limit on the return value of the socket() function. And it's even not guaranteed to return increasing numbers here.

So there's no reasonable limit to specify. And growing the size of re->fhs by powers of two would just eventally conusme huge RAM blocks just to keep few sockets. Imagine first socket() call returning 2^32-1.

@alfredh
Copy link
Contributor

alfredh commented Jul 2, 2017

I now agree that doubling the capacity is not a good solution :)

the int fd is used to lookup the socket/handler with the active event.
for Unix we would like to keep this feature, as it is very fast.

do you know about any other Async I/O projects that supports Windows
and sockets, and how this problem is solved there ?

/Alfred

@GGGO
Copy link

GGGO commented Jul 3, 2017

The best solution for Windows seems to be I/O Completion Ports but the design is quite different from what we have in libre.

With IOCP, we don't need to maintain fd to lookup because it's possible to setup a callback when a task has been completed.

https://msdn.microsoft.com/en-us/library/windows/desktop/aa365198(v=vs.85).aspx

@alfredh
Copy link
Contributor

alfredh commented Jul 9, 2017

Hi @mikhail-barg and @GGGO

would it be possible for you to try to create a patch to solve this issue?
I dont have access to Windows, but I would be happy to review and help.

looks like there are two solutions:

  1. change the fd mapping to handle very large fd's (fd > MAXFDS)

  2. add support for IOCP (I/O Completion Ports) for Windows. this will most likely
    require large architectural changes to libre.

/Alfred

@lgrahl
Copy link
Contributor

lgrahl commented Jul 9, 2017

Another idea would be to add one layer of indirection for sockets on Windows. re would use fake fds which will be mapped to whatever fd Windows is actually using for that socket. For example:

4 -> 2147483647
5 -> 52358
...

However, if this requires similar amount of changes compared to IOCP, it's probably not worth the effort.

@mikhail-barg
Copy link
Contributor Author

@alfredh Yes, I will have to do this as we plan to use baresip stack in our production environment, but it would take some time before I could switch back to this.

alfredh added a commit that referenced this issue Oct 7, 2017
Experimental code in order to try to solve
sockets for Windows. On windows the "int fd" is actually
of type "SOCKET fd" and the range goes from 0 to very large.

ref #61
@alfredh
Copy link
Contributor

alfredh commented Oct 7, 2017

hi @mikhail-barg

I have made a patch that uses a list to hold the mapping between
fd/SOCKET and fd_handlers. Only select is supported for now.

Please test it if you wish :)

NOTE: the current code is very robust and mature, but with the limitation
for Windows (as mentioned). Currently I am just playing around with different
solutions.

@mikhail-barg
Copy link
Contributor Author

@alfredh thanks! We are hoping to get back to baresip-based project in near future, so that's good news for us! Still, I'm worried on the possible performance hit. Anyway, we will be looking at this soon, thanks again!

@alfredh
Copy link
Contributor

alfredh commented Oct 13, 2017

I will reopen when we start working on this actively.

/Alfred

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants