The problem is the following: even though libcurl's documentation says there is no need to run .perform() again after E_OK is returned, the truth is more subtle. E_OK is returned in two different cases: 1) there is really nothing left to do, 2) CURL_MAX_WRITE_SIZE bytes of data were passed to write_function. There is no way to check whether more data is available right now, and even epoll'ing does not help when the data has already been read into libcurl's internal buffers. This behavior significantly degrades performance when fetching HTTP resources larger than 16 KB. It is especially noticeable when some synchronous work has to be done between perform() calls (50-100 ms XSL transformations in my case).
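The behavior can be sketched with a toy simulation (pure Python, not real pycurl; the FakeMulti class and all names besides CURL_MAX_WRITE_SIZE/E_OK are illustrative). It shows why stopping at the first E_OK leaves buffered data undelivered, and why the workaround keeps calling perform() until no new bytes arrive:

```python
CURL_MAX_WRITE_SIZE = 16 * 1024  # libcurl's default write-chunk size
E_OK = 0

class FakeMulti:
    """Stand-in for pycurl.CurlMulti with data already buffered internally."""
    def __init__(self, buffered, write_cb):
        self._buffered = buffered
        self._write_cb = write_cb

    def perform(self):
        # Deliver at most one CURL_MAX_WRITE_SIZE chunk per call, then
        # report E_OK -- exactly the subtlety described above.
        if self._buffered:
            chunk = self._buffered[:CURL_MAX_WRITE_SIZE]
            self._buffered = self._buffered[CURL_MAX_WRITE_SIZE:]
            self._write_cb(chunk)
        num_handles = 1 if self._buffered else 0
        return E_OK, num_handles

def drain(multi, sink):
    """Workaround: keep calling perform() while the write callback makes
    progress, instead of returning to the event loop at the first E_OK."""
    calls = 0
    while True:
        before = len(sink)
        multi.perform()
        calls += 1
        if len(sink) == before:  # no new data delivered -> really idle
            break
    return calls

received = bytearray()
multi = FakeMulti(b"x" * (100 * 1024), received.extend)
calls = drain(multi, received)
print(calls, len(received))  # -> 8 102400; naive code would stop after 1 call
```

With a 100 KB buffer, the naive "stop at E_OK" code delivers only 16 KB per event-loop iteration, paying the full latency of any synchronous work between iterations; the drain loop empties the buffer in one go.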
The problem can be illustrated with this snippet of code: http://gist.github.com/444860/ — run it before and after the patch to see the difference in download speed.
I've developed a workaround for this problem, which I propose for merging into Tornado: http://github.com/elephantum/tornado/commit/384ce35b9c4b5de9cb247ac4bd5c810ca632daa0
Proof graph from hh.ru production: http://skitch.com/elephantum/de6j4/monik
The same problem exists in the recently introduced AsyncHTTPClient2. Patch: http://github.com/elephantum/tornado/commit/947d2d0124ed8a9512fc440435c0de2732874a81
Some more data from hh.ru production:
distribution of the number of consecutive multi.perform() calls: http://skitch.com/elephantum/djh23/figure-1
distribution of the duration of a multi.perform() chain: http://skitch.com/elephantum/djh3f/figure-2
Closing since I'm not sure if this is still an issue and the proposed patch has been deleted. If it's still a problem feel free to reopen.