ValueError: too many file descriptors in select() #26

asicilia opened this Issue Jan 13, 2016 · 2 comments

asicilia commented Jan 13, 2016

Hi, we are using Ztreamy (v0.2) on Windows Server 2012 R2. The server has 296 streams; there are 3 publishers, which send RDF data every 3 seconds. There is one subscriber that reads data from all the streams and processes it. The server and the subscriber are deployed as a Windows Service.

Sometimes (I didn't find any pattern) this error is raised in the Ztreamy server:

The instance's SvcRun() method failed
Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\win32\lib\win32serviceutil.py", line 835, in SvcRun
    self.SvcDoRun()
  File "C:\DSS\SemanticFramework\Zaanstad\server.py", line 29, in SvcDoRun
    (self._svc_name_,''))
  File "C:\DSS\SemanticFramework\Zaanstad\server.py", line 56, in main
  File "C:\Python27\lib\site-packages\ztreamy\server.py", line 133, in start
    self.ioloop.start()
  File "C:\Python27\lib\site-packages\tornado\ioloop.py", line 808, in start
    event_pairs = self._impl.poll(poll_timeout)
  File "C:\Python27\lib\site-packages\tornado\platform\select.py", line 63, in poll
    self.read_fds, self.write_fds, self.error_fds, timeout)
ValueError: too many file descriptors in select()

It seems that 64 is the default limit for select() on Windows. In Python's select module it is raised to 512. I took this from here: https://groups.google.com/forum/#!topic/python-tornado/oSbxI9X28MM

Any clue on this?

jfisteus (Owner) commented Jan 13, 2016
I'm sorry, but I haven't run into this problem before, since I've always run Ztreamy on Linux servers.

I've gone through the Tornado source code to check that it uses epoll on Linux, kqueue on BSD/OS X, but select on all other platforms, including Windows. The epoll and kqueue interfaces largely improve performance when waiting on a large number of file descriptors. On Linux I'm able to run a Ztreamy server with several tens of thousands of simultaneous subscribers. The only caveat is increasing the default limit on open file descriptors that Linux imposes on each user, as described here: http://www.ztreamy.org/doc/experiments/#increasing-the-limits-of-open-file-descriptors
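As a quick sanity check, you can list which of these polling mechanisms your Python build exposes from the standard select module (a minimal sketch; on Windows only select() itself will show up, which is why Tornado falls back to it there):

```python
import select

# Tornado picks the best polling mechanism the platform exposes:
# epoll on Linux, kqueue on BSD/OS X, and plain select() elsewhere,
# including Windows, where the descriptor-count limit applies.
for name in ("epoll", "kqueue", "select"):
    available = hasattr(select, name)
    print("%s: %s" % (name, "available" if available else "not available"))
```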

Looking at the link you've sent, it seems that the way to raise the 512 limit on Windows would be to recompile the Python interpreter. You would need to take the CPython 2.7 sources and set a larger value for FD_SETSIZE as a compilation option, as explained here: https://hg.python.org/cpython/file/9a712ad593bb/Modules/selectmodule.c#l25 Then you would run your server with the CPython binaries you compiled yourself. I haven't tested this alternative and don't know how complex it would be or whether it would actually work. I've found on Google issues filed against other network-related projects, such as ZeroMQ, that ran into the same problem on Windows, and they also had to set larger values for FD_SETSIZE.
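The reason this needs a rebuild rather than a runtime setting is that FD_SETSIZE fixes the size of the fd_set structure when the socket headers are compiled. A minimal illustration of the idea (a configuration fragment, not a recommendation; 4096 is an arbitrary example value):

```c
/* FD_SETSIZE must be defined before the socket headers are first
 * included; defining it afterwards has no effect, which is why the
 * limit cannot be raised at runtime and selectmodule.c has to be
 * rebuilt with the larger value as a compiler option. */
#define FD_SETSIZE 4096   /* example value; the Windows default is 64 */

#ifdef _WIN32
#include <winsock2.h>     /* picks up the enlarged fd_set definition */
#endif
```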

If moving your server to a Linux machine is out of the question, a workaround in your case would perhaps be adding a new relay stream that repeats the data from all the other streams, so that the client connects just to that stream. You would have just 1 file descriptor associated with your client instead of the 296 you have now.
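Sketched in code, that topology might look like the following. This is a pseudocode-style sketch, not a runnable example: the Stream constructor arguments and the `add_stream` registration call are hypothetical stand-ins for however streams are actually installed in your Server object in ztreamy 0.2.

```python
# Non-runnable sketch.  `Stream('/streamN')` and `server.add_stream(...)`
# are hypothetical names illustrating the topology, not the exact API.
from ztreamy.server import Server, Stream, RelayStream

server = Server(9000)
streams = [Stream('/stream%d' % i) for i in range(296)]
for stream in streams:
    server.add_stream(stream)  # hypothetical registration call

# One relay that repeats events from all 296 streams; the consumer
# subscribes only to this stream, using a single file descriptor.
relay = RelayStream('/all', streams=streams)
server.add_stream(relay)
```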

In order to do that, create a new instance of ztreamy.server.RelayStream (https://github.com/jfisteus/ztreamy/blob/ztreamy-0.2/ztreamy/server.py#L383). For the streams parameter of its constructor, use a list with all the streams (i.e. the instances of the Stream class) you want to repeat. Then install that instance of RelayStream in your Server object the same way you do with the rest of the streams. Make your consumer connect just to the relay stream and leave your producers as they are now. This way, the events that get published through the 296 streams are also published through your relay stream, and therefore your consumer gets all of them through a single network connection. Since the client no longer knows which stream each event comes from, if you need to separate the events upon reception, use the Event-Type header or define your own extension header.
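Once everything arrives over the single relay connection, the consumer can no longer tell the streams apart by URL, so dispatching on the Event-Type header restores that separation. A minimal sketch; the dict-based event shape here is a simplification for illustration (real ztreamy events are Event instances with header accessors):

```python
from collections import defaultdict

def group_by_event_type(events):
    """Bucket relayed events by their Event-Type header."""
    buckets = defaultdict(list)
    for event in events:
        buckets[event.get("Event-Type", "unknown")].append(event)
    return dict(buckets)

# Simplified stand-ins for events received through the relay stream.
relayed = [
    {"Event-Type": "temperature", "Body": "21.5"},
    {"Event-Type": "humidity", "Body": "48"},
    {"Event-Type": "temperature", "Body": "21.7"},
]
groups = group_by_event_type(relayed)
print(sorted(groups))              # ['humidity', 'temperature']
print(len(groups["temperature"]))  # 2
```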

Please, let me know if I can be of any further help regarding this issue.


asicilia commented Jan 19, 2016
Thank you for the answer. It is very strange behaviour, because this error only happens when one particular publisher sends data; the other two work fine.

I took a look at the configuration of the publisher that had the error and I saw that it was sending data over streams that did not exist in the server. It was sending data over 10 streams, but only 6 of them were configured in the server.

I changed the configuration of the server and now it seems to work fine. I will let you know so the issue can be closed.

I do not know if this information can help make the server more robust.

