Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

replace http parser? #858

Closed
fafhrd91 opened this issue May 3, 2016 · 38 comments
Closed

replace http parser? #858

fafhrd91 opened this issue May 3, 2016 · 38 comments
Labels

Comments

@fafhrd91
Copy link
Member

fafhrd91 commented May 3, 2016

@asvetlov should we replace http parser with https://github.com/MagicStack/httptools?
looks good http://magic.io/blog/uvloop-make-python-networking-great-again/

@fafhrd91
Copy link
Member Author

fafhrd91 commented May 3, 2016

@danielnelson ^^^

@digitaldavenyc
Copy link

aiohttp is the only viable option for using the new uvtools library but aiohttp's http parser is a serious bottleneck at the moment. How can we help fixing this or possibly swapping out the http parser with httptools?

@fafhrd91
Copy link
Member Author

fafhrd91 commented May 6, 2016

i will take a look this weekend, should be relative easy to use httptools instead of parser

@fafhrd91
Copy link
Member Author

fafhrd91 commented May 6, 2016

added basic integration in @6dae1a28c863e2d0e7ce7b8cda7d3e1c2f3eee0d

passes most of the tests from test_client_functional_oldstyle.py

@1st1
Copy link
Contributor

1st1 commented May 6, 2016

Do you see any improvement in performance?

@fafhrd91
Copy link
Member Author

fafhrd91 commented May 6, 2016

i am not at that stage yet. need to fix more tests.

@1st1
Copy link
Contributor

1st1 commented May 7, 2016

@fafhrd91 I've also added support for HTTP upgrade in v0.0.7.

@fafhrd91
Copy link
Member Author

fafhrd91 commented May 7, 2016

@1st1 don't like HttpParserUpgrade exception for upgrade, i think it should be should_upgrade() method.

@fafhrd91
Copy link
Member Author

fafhrd91 commented May 8, 2016

all aiohttp tests pass with new parser

@1st1
Copy link
Contributor

1st1 commented May 8, 2016

The code is in the httptools-integration branch, right?

@fafhrd91
Copy link
Member Author

fafhrd91 commented May 8, 2016

yes

@fafhrd91
Copy link
Member Author

fafhrd91 commented May 8, 2016

i didn't work on any optimization though

@1st1
Copy link
Contributor

1st1 commented May 8, 2016

Interestingly, I don't see any difference in performance ;(

@1st1
Copy link
Contributor

1st1 commented May 8, 2016

That means that there is another bottleneck, perhaps even worse than the parser.

@fafhrd91
Copy link
Member Author

fafhrd91 commented May 8, 2016

I test it on 3.4.3, I see lower performance with http parser. Maybe http tools should do all headers manipulation and return dictionary.

Sent from my iPhone

On May 8, 2016, at 4:13 PM, Yury Selivanov notifications@github.com wrote:

That means that there is another bottleneck, perhaps even worse than the parser.


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub

@1st1
Copy link
Contributor

1st1 commented May 8, 2016

On Python 3.5.1 @ Mac OS X I see roughly equal performance. Doing headers manipulation in httptools will give it a little performance boost, but nothing significant.

A rudimentary asyncio HTTP protocol server does 50,000 requests/s on my machine, and aiohttp only 3,300. While aiohttp does more work (supports timeouts, deflate etc) it shouldn't be 15 times (!) slower.

httptools parser should be faster than any parser implemented in pure-Python, that's just C vs Python story. If the main bottleneck of aiohttp isn't parsing, it should be something else, I suspect it can be the buffering code.

@fafhrd91
Copy link
Member Author

fafhrd91 commented May 9, 2016

i just tried to find bottleneck. replacing aiohttp.web with aiohttp.server increases performance about 70%,
but what is interesting, everything else gave at max 10% increase. i removed parser, parser buffer, simplified headers processing, removed generators on write path, remove a lot of check during write operation. and still it gave only 10% increase. at this point i think python just slow and in any case speed of web server is useful only for benchmarks.

@1st1
Copy link
Contributor

1st1 commented May 9, 2016

replacing aiohttp.web with aiohttp.server increases performance about 70%

Do you see any difference of httptools vs pure-Python parser for bare aiohttp.server?

@fafhrd91
Copy link
Member Author

fafhrd91 commented May 9, 2016

no

@1st1
Copy link
Contributor

1st1 commented May 9, 2016

Could you please push your code to some experimental branch?

@1st1
Copy link
Contributor

1st1 commented May 9, 2016

BTW, did you try to disable aiohttp flow control (CORK)?

@fafhrd91
Copy link
Member Author

fafhrd91 commented May 9, 2016

added to experimental-pertest

replacing FlowControlDataQueue with simple DataQueue gives ~30% more

@fafhrd91
Copy link
Member Author

fafhrd91 commented May 9, 2016

i also added simple bench.py script

@1st1
Copy link
Contributor

1st1 commented May 9, 2016

Yeah, still, there is a ton of Python code in streams/parsers/etc. While aiohttp architecture is super high-level, flexible and modular, it executes a lot of Python code, and yeah, Python is slow :(. For example, DataQueue uses futures and coroutines and it's only a bridge between parser and the protocol -- it can be removed, or replaced with a simpler code. ServerHttpProtocol.start coroutine can be implemented in ServerHttpProtocol.data_received. I think this complexity can be reduced significantly, albeit that will require a serious rewrite. Anyways, my 2 cents. I'll take a more serious look tomorrow.

@nhumrich
Copy link

Any updates on this, very interesting read?

replacing aiohttp.web with aiohttp.server increases performance about 70%

Does this mean I should try to write all my code just using .server and not using any .web code?

@asvetlov
Copy link
Member

Basically it means you should profile everything before making any performance assumptions

@allan-simon
Copy link

@fafhrd91 did you use some flamegraph technique to find the possible bottleneck ?

@fafhrd91
Copy link
Member Author

@allan-simon no, i didn't. but i don't see reason for doing that, python just slow. any increase of performance in http server code would be neglectable compared to app code.

@allan-simon
Copy link

@fafhrd91 ok as I was going with the impression you were looking for a bottleneck.

Regarding your comment, sure at the end most of the time will be spent in the application code itself, and as a user of the library you should look in your code before trying to switch library. But you also a whole set of microservice that would benefits from high performant http frameworks (for example right now i'm writing one that is streaming a .csv file downloaded form amazon S3 and transforming it on the fly (with some filtering) on a json array (that I can stream too by, one element by line) to assure retro-compatibility, in such case my application code is itself very small and the time spent in the library start to become important.

@fafhrd91
Copy link
Member Author

we at Keepsafe have plenty on streaming and transforming work, we stream terabytes of data in/out os s3 and aiohttp works pretty well on this load. i just say aiohttp on itself is fast enough but if you need to go faster you need to search other options and those options are outside of python.

ps i remember i did a lot of profiling around 0.15 version, and could get 30-50% performance increase in synthetic tests, in there was no any change in performance in real system. (turning off logging gave more increase than any optimizations)

@allan-simon
Copy link

I see, thanks for the feedback, so your point of view would be that replacing the io loop by uvloop or the http parser is actually not worth the burden ?

@fafhrd91
Copy link
Member Author

uvloop is worse if you want to write some very low level and simple network protocols, otherwise you just won't see any performance benefits. for high level protocol and complex application it doesn't matter. and again, aiohttp is fast enough, so just use it and write network services :)

@allan-simon
Copy link

great, thanks for answering all my questions.

@allan-simon
Copy link

oh just one last question, are you using your applications behind gunicorn or directly behind nginx (with a supervisor or similar taking care of the service life cycle)?

@fafhrd91
Copy link
Member Author

we use gunicorn and nginx

@fafhrd91
Copy link
Member Author

@1st1 I am working on parser refactoring. You mentioned idea of removing start() and replace with different implementation, could you elaborate on that?

@fafhrd91
Copy link
Member Author

All work is in this pr #1626

@lock
Copy link

lock bot commented Oct 29, 2019

This thread has been automatically locked since there has not been
any recent activity after it was closed. Please open a new issue for
related bugs.

If you feel like there's important points made in this discussion,
please include those exceprts into that new issue.

@lock lock bot added the outdated label Oct 29, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Oct 29, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

6 participants