
Rationalise Python platforms benchmarks #8055

Closed

Conversation


@gi0baro gi0baro commented Mar 19, 2023

The main rationale behind this is to avoid mixing framework tests and platform ones in Python.

We can equate platforms with servers, and so testing several different servers on each framework makes little sense to me, because:

  • if we define a framework's response time as RT, and we summarise tests with avg(RT), we can decompose RT = PT + FT, where PT is the platform (server) time and FT is the actual framework time
  • since the final RT is always a composition of the server used and the framework, such a combined benchmark adds no useful information once we also have dedicated platform benchmarks
  • put another way, if Uvicorn is faster than Hypercorn, any framework test run on both servers will show the same ∂T already produced by the original benchmarks of the two servers (see the sketch after this list)
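
To make this concrete, here is a tiny sketch with made-up numbers (the frameworks and timings below are purely illustrative, not measured data):

```python
# Illustration of the RT = PT + FT argument; all numbers are hypothetical.
platform_time = {"uvicorn": 0.10, "hypercorn": 0.15}   # PT, ms per request
framework_time = {"starlette": 0.05, "quart": 0.20}    # FT, ms per request

for fw, ft in framework_time.items():
    rt_uvicorn = platform_time["uvicorn"] + ft
    rt_hypercorn = platform_time["hypercorn"] + ft
    # The ∂T between the two servers is the same for every framework,
    # so framework-on-server combinations repeat what the platform
    # benchmarks already show.
    print(fw, round(rt_hypercorn - rt_uvicorn, 2))      # always 0.05
```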

Also, since "Composite scores" are grouped by framework, those values become very hard to interpret, as they can come from different implementations.

Skipping these benchmarks will:

  • save CPU time, energy and thus the planet
  • make the entire benchmarks suite faster

Details of changes:

  • Drop socketify tests from all frameworks
  • Drop fastwsgi tests from all frameworks
  • Display socketify results with PyPy as explicit and CPython as implicit, to align with common usage
  • Add plain gunicorn test
  • Add plain hypercorn test (a minimal sketch of such "plain" apps follows this list)
  • Review "Framework", "Platform" and "Webserver" labels for all tests
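
For reference, a "plain" server test here just means benchmarking the server with the thinnest possible application, roughly along these lines (a sketch, not the exact handlers added in this PR):

```python
# Plain WSGI app for the gunicorn test (illustrative sketch).
def wsgi_app(environ, start_response):
    body = b"Hello, World!"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


# Plain ASGI app for the hypercorn test (illustrative sketch).
async def asgi_app(scope, receive, send):
    assert scope["type"] == "http"
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"Hello, World!"})
```

They would be run with something like `gunicorn app:wsgi_app` and `hypercorn app:asgi_app` (module and app names here are only examples).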

This will stay a draft until I have checked all the involved points.
In the meantime, a discussion can be started; I would like opinions from @cirospaciari, @remittor and @nbrady-techempower.

@gi0baro force-pushed the rationalise-python-platforms branch from f43f870 to 095a86a on March 19, 2023 23:53

cirospaciari commented Mar 20, 2023

I like this rationale, but not every web framework drives WSGI and ASGI the same way; for example, Quart performs worse on socketify than on uvicorn, even though pure socketify ASGI is way faster than uvicorn (almost double).

So not every framework gets the same percentage/average uplift from socketify.
socketify.py is the only one using PyPy as the implicit runtime, because it was originally a PyPy-first framework (and still is), but I agree and will change that.

I can remove all the tests and only keep pure ASGI, pure WSGI, and socketify itself, but people should be able to compare different servers on popular Python web frameworks. As I said, not every framework uses ASGI/WSGI in the same way, and may not show the same average difference when leveraging the faster server.

PyPy is another matter: some web frameworks have their overhead reduced and can perform much better than on CPython, while most servers do not run well (or at all) on PyPy. Take Django, which reaches 288,565 on PyPy vs 92,043 on CPython with socketify and about 70k with meinheld, while meinheld on Falcon is equal to or faster than socketify on CPython. Raw socketify WSGI is 1,561,530 on PyPy and 697,312 on CPython; meinheld should be close to or better than socketify on CPython, and it is not compatible with PyPy.

We can limit the number of benchmarks for each web framework (only keep the fastest); I think this is fine, but people should be able to know which server it is running on and why.

And composite scores should only be grouped when the entries use the same server + runtime.
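
Concretely, that means keying the grouping on the server/runtime pair rather than on the framework alone; a rough sketch (the records and field names are hypothetical, not the real TFB results schema):

```python
from collections import defaultdict

# Hypothetical result records; the actual TFB results format differs.
results = [
    {"framework": "flask", "server": "gunicorn", "runtime": "CPython", "score": 100},
    {"framework": "flask", "server": "socketify", "runtime": "PyPy", "score": 290},
    {"framework": "falcon", "server": "meinheld", "runtime": "CPython", "score": 310},
]

composite = defaultdict(list)
for r in results:
    # Group by (server, runtime) so scores from different stacks are not mixed.
    composite[(r["server"], r["runtime"])].append(r["score"])

for key, scores in composite.items():
    print(key, sum(scores) / len(scores))
```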

@gi0baro force-pushed the rationalise-python-platforms branch from 095a86a to f21cbe6 on March 20, 2023 00:35

remittor commented Mar 20, 2023

We can equate platforms with servers, and so testing several different servers on each framework makes little sense to me, because: ....

It's not so simple. Some WSGI/ASGI servers have tweaks that let them work more efficiently with a particular framework.

For example, look at my tweak for Flask: https://github.com/jamesroberts/fastwsgi/blob/796e5b70bbb20d5411df4b7fa1b19fdb17ef10d0/fastwsgi/request.c#L852-L871
By default, Flask returns data from file objects in very small chunks as PyBytes. That is, to transfer a file, the Python interpreter (CPython) has to create a lot of PyBytes objects and copy the data from the file stream into them.
Therefore, it was decided to force Flask to read from the file in chunks of the desired size, which greatly reduces the number of PyBytes objects created (a generic sketch of this idea follows below).
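
The general idea can be expressed through the standard WSGI `wsgi.file_wrapper` hook; the following is only a Python sketch of that mechanism, not the fastwsgi C code linked above:

```python
# Sketch: a server-provided file wrapper that reads file responses in
# large blocks instead of the framework's small default chunks.
class FileWrapper:
    def __init__(self, filelike, block_size=256 * 1024):  # block size is an example value
        self.filelike = filelike
        self.block_size = block_size

    def __iter__(self):
        return self

    def __next__(self):
        chunk = self.filelike.read(self.block_size)
        if not chunk:
            raise StopIteration
        return chunk

# A WSGI server exposes it to the application before each request:
#     environ["wsgi.file_wrapper"] = FileWrapper
# so frameworks that honour wsgi.file_wrapper stream files in big chunks.
```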

A similar tweak in the bjoern server, which reads data directly from the file descriptor: https://github.com/jonashaag/bjoern/blob/25b14e5042f51eb869d6bfb67fe7e6213c9747ee/bjoern/server.c#L306-L328

All these tweaks give a significant speed boost in some use cases.


cirospaciari commented Mar 22, 2023

As I promised, the naming issue was addressed here: https://github.com/TechEmpower/FrameworkBenchmarks/pull/8058/files

My opinion about rationalizing the benchmarks is that, unfortunately, it is not possible: web frameworks diverge a lot in overhead and implementation, and if you include PyPy it becomes even harder to rationalize.
We can take all the rounds and try to normalize over uvicorn or meinheld or anything else, and we will never get precise results.

Adding different frameworks helps me a lot in finding bugs in WSGI and ASGI implementations (I even opened issues on granian using this information, but I never posted granian on TFB with other frameworks because I know you do not approve of this, and I respect your decision).

I also want to state here that I disagree with keeping old/dead projects like vibora and japronto in the benchmarks. Meinheld is not maintained either, but at least it is used by a lot of people.

The only prize we get from doing better over time in TFB is a better understanding of the behavior and scaling of our applications, and being able to compare other implementations on the same hardware. So keeping dead projects only hurts the benchmark run time.

I still want to create a cloud environment (12 vcores or more) to run the different benchmarks, tracking CPU, memory, and IO usage in each benchmark to identify bottlenecks, and also to add more payload types (different sizes).

For payloads, my idea was:

'Hello, World!' (13 bytes)
HTML (1KB)
HTML (8KB)
HTML (16KB)
HTML (64KB)
Streaming (1MB)
Streaming (10MB)
Streaming (100MB)
Streaming (1GB)

I would avoid HTTP/1.1 pipelining, add POST data benchmarks, and in the future add WebSockets and others.

In this case, I would not test JSON performance or databases, but instead create separate benchmarks for the different JSON serializers/deserializers and database connectors.

I would also only include Python benchmarks, with a few other languages as references: Express and Fastify on Node.js, Go's gnet, fiber, and gin, ASP.NET Core, and Rust's ntex.

But for this, I need more planning and time.


gi0baro commented Mar 22, 2023

Given the comments, I am going to close this.

@nbrady-techempower feel free to continue the discussion, re-open this, or extract parts from it.
