Add support for kqueue() and epoll() to event loop #203

Closed
giampaolo opened this Issue May 28, 2014 · 11 comments

@giampaolo
Owner

From g.rodola on January 25, 2012 18:35:26

Right now the internal poller depends on the asyncore module; as such it can 
only use the select() and poll() system calls, which don't scale or perform 
well with thousands of concurrent clients.
This is a benchmark using poll():

pyftpdlib 0.7.0:

2000 concurrent clients (connect, login)      36.63 secs
2000 concurrent clients (RETR 10M file)      128.07 secs
2000 concurrent clients (STOR 10M file)      189.73 secs
2000 concurrent clients (quit)                 0.39 secs


proftpd 1.3.4rc2:

2000 concurrent clients (connect, login)      44.59 secs
2000 concurrent clients (RETR 10M file)       33.90 secs
2000 concurrent clients (STOR 10M file)      138.94 secs
2000 concurrent clients (quit)                 2.28 secs


2000 clients here actually means 4000 concurrent connections (control + data).
As the numbers show, poll() clearly suffers a serious performance degradation.
select(), on the other hand, wouldn't have worked at all, as it has a limit of 
1024 fds.

epoll() (Linux) and kqueue() (BSD / OSX) are supposed to fix these problems 
altogether.
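
To see the idea in miniature, here is a minimal sketch. It assumes a modern Python 3, where the stdlib `selectors` module (which did not exist when this issue was filed) picks epoll(), kqueue(), poll() or select() for you. The function name `wait_for_one_message` is made up for illustration and is not part of pyftpdlib.

```python
import selectors
import socket


def wait_for_one_message(timeout=1.0):
    """Register one socket for reads and wait until it becomes readable."""
    # DefaultSelector chooses the most efficient mechanism the platform offers.
    sel = selectors.DefaultSelector()
    reader, writer = socket.socketpair()
    try:
        reader.setblocking(False)
        sel.register(reader, selectors.EVENT_READ)
        writer.sendall(b"hello")

        received = b""
        # select() returns only the registered sockets that are ready.
        for key, events in sel.select(timeout):
            if events & selectors.EVENT_READ:
                received = key.fileobj.recv(1024)
        return type(sel).__name__, received
    finally:
        sel.close()
        reader.close()
        writer.close()


if __name__ == "__main__":
    print(wait_for_one_message())
```

The first element of the returned tuple is the selector class that was chosen, so running it on Linux and on a BSD shows which system call ends up being used.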

What I have in mind (for version 1.0.0) is to add a "lib" package containing a 
modified version of asyncore.dispatcher and an asyncore.loop supporting 
kqueue()/epoll().
A partial patch I wrote some time ago is here: http://bugs.python.org/issue6692 
Also, tornado ( http://www.tornadoweb.org/ ) can be used as an example for the 
epoll() implementation.

Original issue: http://code.google.com/p/pyftpdlib/issues/detail?id=203

@giampaolo giampaolo self-assigned this May 28, 2014
@giampaolo
Owner

From g.rodola on January 28, 2012 08:23:42

A preliminary patch is in attachment.

=== before patch (poll()) ===

giampaolo@ubuntu:~/svn/pyftpdlib$ python test/bench.py -u giampaolo -p XXX -b concurrence -s 1K -n 2000
2000 concurrent clients (connect, login)      34.98 secs
2000 concurrent clients (RETR 1K file)        61.02 secs
2000 concurrent clients (STOR 1K file)       169.42 secs
2000 concurrent clients (quit)                 0.11 secs


=== after patch (epoll()) ===

giampaolo@ubuntu:~/svn/pyftpdlib$ python test/bench.py -u giampaolo -p XXX -b concurrence -s 1K -n 2000
2000 concurrent clients (connect, login)      19.46 secs
2000 concurrent clients (RETR 1K file)        24.29 secs
2000 concurrent clients (STOR 1K file)       122.09 secs
2000 concurrent clients (quit)                 0.10 secs

Attachment: ioloop.patch

@giampaolo
Owner

From g.rodola on February 17, 2012 11:45:58

Patch in attachment adds kqueue() support (BSD and OSX systems).

Attachment: kqueue.patch

@giampaolo
Owner

From g.rodola on February 18, 2012 08:06:36

Updated patch.

Attachment: ioloop.patch

@giampaolo
Owner

From g.rodola on February 28, 2012 08:50:49

Updated patch in attachment.

CHANGES:
- got rid of serve_forever()'s "use_poll" and "count" arguments; replaced with 
a new "blocking" argument defaulting to True

TODO:
- kqueue() uses a hack for accepting sockets
- epoll()/poll() currently check for error fds in order to detect closed 
connections, but this might not be necessary (twisted doesn't do that)
- on the other hand, select() on Windows might need to do that

Attachment: ioloop.patch

@giampaolo
Owner

From g.rodola on March 02, 2012 14:23:39

Ok, I think this is done.
Here's a summary to clarify what I've done.

Before the patch
================

- The IO loop was based on asyncore stdlib module which only supports select() 
and poll().

- These are known to scale/perform reasonably well under a thousand concurrent 
connections; beyond that they start to show performance degradation (poll()) 
or don't work at all (select()).

- asyncore's IO poller is also particularly naive in that every registered file 
descriptor is checked for both read and write operations, even for idle 
connections.

- That means that with 200 connected clients we iterate over a list of 400 (200 
* 2) elements on every loop.
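
The arithmetic above can be sketched as follows. This is only an illustration of the behaviour described in the bullets, not asyncore's actual code; the function names are hypothetical.

```python
def naive_interest_sets(fds):
    """Mimic the described behaviour: every fd is checked for both reads and writes."""
    readers = list(fds)  # every connection is polled for readability
    writers = list(fds)  # ...and again for writability, even when idle
    return readers, writers


def entries_scanned(fds):
    """Number of entries the loop walks through on a single iteration."""
    readers, writers = naive_interest_sets(fds)
    return len(readers) + len(writers)


if __name__ == "__main__":
    print(entries_scanned(range(200)))  # prints 400
```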


After the patch
===============

- The IO loop has been rewritten from scratch and now supports epoll() and 
kqueue() on Linux and OSX/BSD.

- epoll() and kqueue() scale/perform better with thousands of connections.

- asyncore's original select() and poll() implementations were rewritten.

- The poller is smarter in that it only iterates on fds which are actually 
interested in either reading or writing.

- That means that with 200 clients, all idle except one, we will iterate over 
a list of 1 element instead of 400.

- This is valid for all pollers, including select().

- By default we use the best poller for the given platform:
    - Linux: epoll()
    - OSX/BSD: kqueue()
    - all other POSIX: poll()
    - Windows: select()

- FTPServer.serve_forever() signature has changed.
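
A rough sketch of the two ideas in the list above: choosing a poller by what the platform offers, and iterating only over sockets that are actually ready. This is not the code from the patch. `InterestPoller`, `best_poller_name` and `demo` are hypothetical names, the demo uses 20 idle clients rather than 200 to keep it light, and it reads "idle" as "registered for reading but with nothing to read".

```python
import select
import socket


def best_poller_name():
    """Preference order from the list above, based on what the select module exposes."""
    for name in ("epoll", "kqueue", "poll"):
        if hasattr(select, name):
            return name
    return "select"


class InterestPoller:
    """Toy select()-based poller that only watches sockets with declared interest."""

    def __init__(self):
        self._readers = set()
        self._writers = set()

    def want_read(self, sock, enabled=True):
        if enabled:
            self._readers.add(sock)
        else:
            self._readers.discard(sock)

    def want_write(self, sock, enabled=True):
        if enabled:
            self._writers.add(sock)
        else:
            self._writers.discard(sock)

    def poll(self, timeout=0.0):
        # select() on Windows rejects three empty lists, so bail out early.
        if not self._readers and not self._writers:
            return [], []
        readable, writable, _ = select.select(
            list(self._readers), list(self._writers), [], timeout
        )
        return readable, writable


def demo(idle_clients=20):
    """Return how many ready entries one loop iteration has to walk through."""
    poller = InterestPoller()
    pairs = [socket.socketpair() for _ in range(idle_clients + 1)]
    try:
        for server_side, _client_side in pairs:
            poller.want_read(server_side)  # every client may send a command
        _active_server, active_client = pairs[0]
        active_client.sendall(b"NOOP\r\n")  # only one client is active
        readable, writable = poller.poll(timeout=1.0)
        return len(readable) + len(writable)
    finally:
        for a, b in pairs:
            a.close()
            b.close()


if __name__ == "__main__":
    print(best_poller_name(), demo())
```

Nobody registers write interest here, so the writable list stays empty and the loop only has to handle the single socket that received data.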


Final benchmark
===============

=== old select() implementation ===

200 concurrent clients (connect, login)                0.96 secs
STOR (1 file with 200 idle clients)                   81.94 MB/sec
RETR (1 file with 200 idle clients)                   89.01 MB/sec
200 concurrent clients (RETR 10M file)                 2.80 secs
200 concurrent clients (STOR 10M file)                 6.65 secs
200 concurrent clients (QUIT)                          0.02 secs


=== new select() implementation ===

200 concurrent clients (connect, login)                0.78 secs
STOR (1 file with 200 idle clients)                  399.46 MB/sec
RETR (1 file with 200 idle clients)                  761.53 MB/sec
200 concurrent clients (RETR 10M file)                 2.22 secs
200 concurrent clients (STOR 10M file)                 5.79 secs
200 concurrent clients (QUIT)                          0.01 secs


=== epoll() implementation ===

200 concurrent clients (connect, login)                0.77 secs
STOR (1 file with 200 idle clients)                  535.83 MB/sec
RETR (1 file with 200 idle clients)                 1632.50 MB/sec
200 concurrent clients (RETR 10M file)                 2.24 secs
200 concurrent clients (STOR 10M file)                 5.82 secs
200 concurrent clients (QUIT)                          0.02 secs
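
For a quick comparison, the single-transfer rows can be turned into ratios against the old select() implementation. The snippet below only does the division on the MB/sec figures copied from the three runs above; the `speedup` helper is a hypothetical name.

```python
# MB/sec figures copied from the "1 file with 200 idle clients" rows above.
THROUGHPUT = {
    "old select()": {"STOR": 81.94, "RETR": 89.01},
    "new select()": {"STOR": 399.46, "RETR": 761.53},
    "epoll()": {"STOR": 535.83, "RETR": 1632.50},
}


def speedup(impl, command, baseline="old select()"):
    """Throughput of `impl` relative to `baseline` for the given FTP command."""
    return THROUGHPUT[impl][command] / THROUGHPUT[baseline][command]


if __name__ == "__main__":
    for impl in ("new select()", "epoll()"):
        for command in ("STOR", "RETR"):
            print(f"{impl:13} {command}: {speedup(impl, command):.1f}x")
```

The concurrent-transfer rows (connect, RETR/STOR 10M, QUIT) change far less between implementations, so the gain in these runs is concentrated in the case with many idle connections.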


Further note
============

A patch which can be applied to the current 0.7.0 version is in attachment.

Attachment: ioloop.patch

@giampaolo
Owner

From g.rodola on May 11, 2012 08:31:44

Patch including updated docstrings.

Attachment: ioloop.patch

@giampaolo
Owner

From g.rodola on May 23, 2012 08:18:13

This is now committed in r1049.

Status: FixedInSVN
Labels: Milestone-1.0.0

@giampaolo
Owner

From g.rodola on May 23, 2012 08:30:58

Final patch attached.

Attachment: ioloop.patch

@giampaolo
Owner

From nagy.att...@gmail.com on July 16, 2012 11:08:42

Thank you very much for this! I've just begun to port my SMTP server from 
Python's default asyncore to your lib, and using exactly the same code shows a 
substantial speedup.
Previously a dummy SMTP sink could do around 70 MiBps (std asyncore with poll); 
with your io loop (FreeBSD, kqueue) it does around 110.
The same logic in twisted can do about 20...
@giampaolo
Owner

From g.rodola on February 19, 2013 04:49:26

Releasing 1.0.0 just now. Closing.

Status: Fixed
Labels: Version-0.7.0

@giampaolo
Owner

From g.rodola on February 19, 2013 04:58:50

Final benchmarks: https://code.google.com/p/pyftpdlib/wiki/Benchmarks
@giampaolo giampaolo closed this May 28, 2014