
FastAPI+Uvicorn is running slower than Flask+uWSGI #2690

Closed
Arrow-Li opened this issue Jan 22, 2021 · 20 comments
Labels
question Question or problem question-migrate

Comments

@Arrow-Li

I'm new to FastAPI and I'm trying to compare the speed of FastAPI and Flask, but I didn't get a better result with FastAPI. Please tell me if I'm doing anything wrong.

Example

  1. fastapi
from fastapi import FastAPI

app = FastAPI(debug=False)

@app.get("/")
async def run():
    return {"message": "hello"}
  • run command: uvicorn --log-level error --workers 4 fastapi_test:app > /dev/null 2>&1
  2. flask
import flask

app = flask.Flask(__name__)

@app.route("/")
def run():
    return {"message": "hello"}
  • run command: uwsgi --wsgi-file flask_test.py --process 4 --callable app --http :8000 > /dev/null 2>&1

Result

  • use ab -n 10000 -c 500 http://127.0.0.1:8000/ to test speed
  1. FastAPI
Requests per second:    1533.91 [#/sec] (mean)
Time per request:       325.965 [ms] (mean)
Time per request:       0.652 [ms] (mean, across all concurrent requests)
Transfer rate:          244.17 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   46 208.1      0    1000
Processing:     1  268 171.1    245     950
Waiting:        0  201 146.1    174     909
Total:          1  314 296.7    246    1918
  2. Flask
Requests per second:    1829.40 [#/sec] (mean)
Time per request:       273.313 [ms] (mean)
Time per request:       0.547 [ms] (mean, across all concurrent requests)
Transfer rate:          162.57 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   18 131.3      0    1000
Processing:    12  192 556.3     36    4302
Waiting:        0  191 556.3     35    4301
Total:         17  210 612.7     36    5300

Environment

  • OS: CentOS 7
  • Python Version: 3.9.1
  • FastAPI Version: 0.63.0

Additional context

@Arrow-Li Arrow-Li added the question Question or problem label Jan 22, 2021
@dstlny

dstlny commented Jan 22, 2021

Well... you're using "async def" in the FastAPI example when you're doing zero asynchronous operations in the endpoint. Try making it a normal function, then re-run the benchmarks.

@ycd
Contributor

ycd commented Jan 22, 2021

For a more detailed benchmark, check TechEmpower's benchmarks:

Framework   JSON      1-query   20-query   Fortunes   Updates   Plaintext
fastapi     171,055   66,185    13,022     52,080     5,926     159,445
flask       63,026    34,217    6,647      23,136     1,327     83,398

@FalseDev

FalseDev commented Jan 22, 2021

I believe using Uvicorn's Gunicorn worker class along with gunicorn offers more performance than the uvicorn workers

@Kludex
Member

Kludex commented Jan 22, 2021

Well... you're using "async def" in the FastAPI example when you're doing zero asynchronous operations in the endpoint. Try making it a normal function, then re-run the benchmarks.

@dstlny It will be even more costly if you do that.

@FalseDev

FalseDev commented Jan 22, 2021

Well... you're using "async def" in the FastAPI example when you're doing zero asynchronous operations in the endpoint. Try making it a normal function, then re-run the benchmarks.

@dstlny It will be even more costly if you do that.

Because of the overhead of using a threadpool executor for something that takes an instant to run anyways?

@ycd
Contributor

ycd commented Jan 23, 2021

Because of the overhead of using a threadpool executor for something that takes an instant to run anyways?

@FalseDev, I didn't get what you're trying to say here; are you saying that ThreadPoolExecutor is cheaper?

@FalseDev

Because of the overhead of using a threadpool executor for something that takes an instant to run anyways?

@FalseDev, I didn't get what you're trying to say here; are you saying that ThreadPoolExecutor is cheaper?

If I understood correctly this comment says declaring a very simple sync function (with no waiting) as async might decrease response time.

I'm just asking if it's because the overhead of submitting the function to a ThreadPoolExecutor is stripped off
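
For anyone following along, here is a rough sketch of the hand-off being discussed. FastAPI/Starlette run plain def endpoints in a worker thread so they don't block the event loop; the snippet below imitates that with asyncio's default executor to show where the extra overhead comes from (this is only an illustration of the mechanism, not FastAPI's actual internal code):

import asyncio
import time

def sync_endpoint():
    # a trivial handler: the work itself is nearly instant
    return {"message": "hello"}

async def handle_directly():
    # what an async def endpoint amounts to: awaited right on the event loop
    return sync_endpoint()

async def handle_via_threadpool():
    # what a def endpoint amounts to: handed to a thread pool; the
    # submit/context-switch cost is the overhead discussed above
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, sync_endpoint)

async def main():
    start = time.perf_counter()
    for _ in range(10_000):
        await handle_directly()
    direct = time.perf_counter() - start

    start = time.perf_counter()
    for _ in range(10_000):
        await handle_via_threadpool()
    offloaded = time.perf_counter() - start

    print(f"direct: {direct:.3f}s, via thread pool: {offloaded:.3f}s")

asyncio.run(main())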

@falkben
Contributor

falkben commented Jan 24, 2021

I doubt this will make a big impact on such a small response, but try wrapping the dict you are returning in FastAPI in a JSONResponse.

Did you install uvicorn with uvloop?
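
For reference, a minimal sketch of both suggestions (the route mirrors the example at the top of the thread; the "standard" extra is the usual way to install uvicorn together with uvloop and httptools):

# pip install "uvicorn[standard]"   # pulls in uvloop on supported platforms
from fastapi import FastAPI
from fastapi.responses import JSONResponse

app = FastAPI()

@app.get("/")
async def run():
    # Returning a Response object directly skips FastAPI's extra
    # serialization step for the return value.
    return JSONResponse({"message": "hello"})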

@zihaooo

zihaooo commented Jan 25, 2021

I'm very interested in this question. Since the comments show different opinions, I decided to test it myself; here are the results:

  1. FastAPI + async def + uvicorn
from fastapi import FastAPI

app = FastAPI(debug=False)

@app.get("/")
async def run():
    return {"message": "hello"}
  • run command: uvicorn --log-level error --workers 4 fastapi_test:app > /dev/null 2>&1

Requests per second: 12160.04 [#/sec] (mean)
Time per request: 41.118 [ms] (mean)
Time per request: 0.082 [ms] (mean, across all concurrent requests)
Transfer rate: 1935.63 [Kbytes/sec] received

  2. Flask + gunicorn
import flask

app = flask.Flask(__name__)

@app.route("/")
def run():
    return {"message": "hello"}
  • run command: gunicorn --log-level error -w 4 flask_test:app > /dev/null 2>&1

Requests per second: 15726.21 [#/sec] (mean)
Time per request: 31.794 [ms] (mean)
Time per request: 0.064 [ms] (mean, across all concurrent requests)
Transfer rate: 2641.51 [Kbytes/sec] received

These first two tests show the same result as @Arrow-Li reported at the beginning.

  3. FastAPI + async def + gunicorn with uvicorn workers
from fastapi import FastAPI

app = FastAPI(debug=False)

@app.get("/")
async def run():
    return {"message": "hello"}
  • run command: gunicorn --log-level error -w 4 -k uvicorn.workers.UvicornWorker fastapi_test:app > /dev/null 2>&1

Requests per second: 34781.40 [#/sec] (mean)
Time per request: 14.376 [ms] (mean)
Time per request: 0.029 [ms] (mean, across all concurrent requests)
Transfer rate: 4891.13 [Kbytes/sec] received

This is nearly 3x the performance of test 1.

  4. FastAPI + def + uvicorn
from fastapi import FastAPI

app = FastAPI(debug=False)

@app.get("/")
def run():
    return {"message": "hello"}
  • run command: uvicorn --log-level error --workers 4 fastapi_test:app > /dev/null 2>&1

Requests per second: 19752.03 [#/sec] (mean)
Time per request: 25.314 [ms] (mean)
Time per request: 0.051 [ms] (mean, across all concurrent requests)
Transfer rate: 2777.63 [Kbytes/sec] received

Changing async def to def makes FastAPI faster than Flask.

  5. FastAPI + def + gunicorn with uvicorn workers
from fastapi import FastAPI

app = FastAPI(debug=False)

@app.get("/")
def run():
    return {"message": "hello"}
  • run command: gunicorn --log-level error -w 4 -k uvicorn.workers.UvicornWorker fastapi_test:app > /dev/null 2>&1

Requests per second: 20315.62 [#/sec] (mean)
Time per request: 24.612 [ms] (mean)
Time per request: 0.049 [ms] (mean, across all concurrent requests)
Transfer rate: 2856.88 [Kbytes/sec] received

So, in conclusion, for a function that can be defined as both async and sync, the performance rank is:

  1. FastAPI + async def + gunicorn with uvicorn workers
  2. FastAPI + def + gunicorn with uvicorn workers
  3. FastAPI + def + uvicorn
  4. Flask + gunicorn
  5. FastAPI + async def + uvicorn

@Arrow-Li
Author

Wonderful test! So to reach max performance I should use async def + gunicorn?

@ycd
Contributor

ycd commented Jan 26, 2021

So to reach max performance I should use async def + gunicorn?

The answer is no. You should pick the one that fits your case. If you run your ML/DL model in a coroutine (async def endpoint), congrats, you will have a blocking endpoint and that endpoint will block your entire event loop.

async def endpoints do not automatically make your code faster; that is not the point of asynchronous I/O.
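
To make the blocking point concrete, here is a small sketch; slow_cpu_model is a hypothetical stand-in for an ML/DL inference call. In the async def endpoint the call runs on the event loop itself, so every other request waits; in the def endpoint FastAPI pushes the call to its thread pool and the loop stays free.

import time
from fastapi import FastAPI

app = FastAPI()

def slow_cpu_model() -> dict:
    # hypothetical stand-in for a CPU-bound ML/DL inference call
    time.sleep(2)
    return {"prediction": 1}

@app.get("/blocking")
async def blocking():
    # runs on the event loop: all other requests wait these 2 seconds
    return slow_cpu_model()

@app.get("/threadpool")
def in_threadpool():
    # plain def endpoints are executed in a worker thread,
    # so the event loop keeps serving other requests
    return slow_cpu_model()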

I think understanding asynchronous I/O a little bit deeper could help, so I'm copying this from one of my answers on Stack Overflow.


It completely depends on what your function does and how it does it.

Okay, but I need to understand asyncio better.

Then assume you have the following code

async def x():
    a = await service.start()
    return a
  1. This will allocate the stack space for the yielding variable of service().start()
  2. The event loop will execute this and jump to the next statement
    1. once start() gets executed it will push the value onto the calling stack
    2. This will store the stack and the instruction pointer.
    3. Then it will store the yielded variable from service().start() to a, then it will restore the stack and the instruction pointer.
  3. When it comes to return a this will push the value of a onto the calling stack.
  4. After all of this it will clear the stack and the instruction pointer.

Note that we were able to do all this because service().start() is a coroutine; it yields instead of returning.

This may not be clear to you at first glance but as I mentioned async and await are just fancy syntax for declaring and managing coroutines.

import asyncio

@asyncio.coroutine
def decorated(x):
    yield from x 

async def native(x):
    await x 

But these two functions are identical and do the exact same thing. You can think of yield from as chaining one or more functions together.

But to understand asynchronous I/O deeply we need to understand what it does and how it works underneath.

In most operating systems, a basic API is available with select() or poll() system calls.

These interfaces enable the user of the API to check whether there is any incoming I/O that should be attended to.

For example, your HTTP server wants to check whether any network packets have arrived in order to service them. With the help of these system calls you are able to check this.

When we check the manual page of select() we will see this description.

select() and pselect() allow a program to monitor multiple file
descriptors, waiting until one or more of the file descriptors become
"ready" for some class of I/O operation (e.g., input possible). A file
descriptor is considered ready if it is possible to perform a
corresponding I/O operation.

This gives you a pretty basic idea, and this explains the nature of what asynchronous I/O does.

It lets you check whether descriptors can be read and can be written.

It makes your code more scalable, by not blocking other things. Your code becomes faster as a bonus, but it is not the actual purpose of asynchronous I/O.

So to tidy up.

The event loop just keeps yielding until something is ready. By doing that it does not block.
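
As an illustration of the readiness checking described above, here is a minimal sketch built on Python's selectors module (a thin wrapper over select()/poll()/epoll). It is not how Uvicorn is implemented, just the underlying idea of an event loop that only acts on descriptors that are already ready:

import selectors
import socket

sel = selectors.DefaultSelector()

def accept(server_sock):
    conn, _ = server_sock.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, echo)

def echo(conn):
    data = conn.recv(1024)
    if data:
        conn.sendall(data)          # echo the bytes back
    else:
        sel.unregister(conn)        # peer closed the connection
        conn.close()

server = socket.socket()
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("127.0.0.1", 9000))    # arbitrary local port for the sketch
server.listen()
server.setblocking(False)
sel.register(server, selectors.EVENT_READ, accept)

while True:
    # The readiness check: wait until one or more registered descriptors
    # can be handled, then run the callback stored at registration time.
    for key, _ in sel.select():
        callback = key.data
        callback(key.fileobj)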

@Arrow-Li
Author

Based on @zihaooo's test, I'm posting my own results below.

  • no code or ab changes
  1. fastapi with gunicorn 4 workers
    gunicorn --log-level error -w 4 -k uvicorn.workers.UvicornWorker fastapi_test:app > /dev/null 2>&1
Requests per second:    3955.81 [#/sec] (mean)
Time per request:       126.396 [ms] (mean)
Time per request:       0.253 [ms] (mean, across all concurrent requests)
Transfer rate:          556.29 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   18 129.9      0    1000
Processing:     0   96  56.3     83     349
Waiting:        0   83  49.7     71     332
Total:          0  114 143.8     84    1193
  2. flask with uwsgi 4 workers
    uwsgi --wsgi-file flask_test.py --process 4 --callable app --http :8000 > /dev/null 2>&1
Requests per second:    2183.14 [#/sec] (mean)
Time per request:       229.028 [ms] (mean)
Time per request:       0.458 [ms] (mean, across all concurrent requests)
Transfer rate:          194.01 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   59 235.7      0    3002
Processing:     7  121 408.9     40    3703
Waiting:        0  119 409.0     39    3703
Total:          7  180 567.3     41    4297

So, FastAPI is about 2x faster than Flask; for me that's good, but not enough to move away from Flask.

@Arrow-Li
Author

I know async doesn't fit every case; what I mean is that FastAPI's advantage over Flask is ASGI (I don't know if I'm right), and for a purely sync case I'd rather use Flask.

@Mause
Contributor

Mause commented Jan 26, 2021

You can't realistically compare FastAPI to Flask anyway, as they are intended to do different things. Flask is designed for general websites with no real specialisation, whereas FastAPI has many built-in features to specifically aid in the construction of REST-ish APIs.

@ycd
Contributor

ycd commented Jan 26, 2021

for a purely sync case I'd rather use Flask.

That's actually the main point of the comment I wrote above: you can mix both async def and def endpoints in one router. Select the one that fits your case; you do not have to make all the endpoints the same. You don't have that option in Flask.
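
A minimal sketch of that mix (httpx here is just an example async HTTP client, and compute_report is a hypothetical CPU-bound helper):

from fastapi import FastAPI
import httpx

app = FastAPI()

@app.get("/proxy")
async def proxy():
    # I/O-bound work: await it so the event loop can serve other requests meanwhile
    async with httpx.AsyncClient() as client:
        resp = await client.get("https://example.com")
    return {"status": resp.status_code}

def compute_report() -> dict:
    # hypothetical CPU-bound helper
    return {"total": sum(range(1_000_000))}

@app.get("/report")
def report():
    # plain def: FastAPI runs this in a worker thread, keeping the loop free
    return compute_report()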

@cloud11665

Thanks for the benchmarks; I had the same problem.

@DavidKimDY
Contributor

DavidKimDY commented Jul 22, 2021

This is another test. Uvicorn Async vs Gunicorn Async with Uvicorn workers.
The test uses 4 sizes of data: 17B, 277B, 416KB, 1.49MB.
The test structure is: MongoDB (local 1) <-> FastAPI (AWS EC2) <-> Jupyter notebook (local 2).
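
The timings below look like IPython %timeit output; purely as an assumption about how they might have been produced from the notebook side (the URL is a placeholder, not from the original post):

# in a Jupyter cell; "10 runs, 50 loops each" corresponds to -r 10 -n 50
import requests

url = "http://<ec2-host>:8000/items"   # placeholder endpoint
%timeit -r 10 -n 50 requests.get(url)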

Uvicorn Async

1. 17B data

1st : 39.5 ms ± 1.44 ms per loop (mean ± std. dev. of 10 runs, 50 loops each)
2nd : 40.9 ms ± 1.48 ms per loop (mean ± std. dev. of 10 runs, 50 loops each)
3rd : 41.2 ms ± 869 µs per loop (mean ± std. dev. of 10 runs, 50 loops each)

2. 277B data

1st : 62.9 ms ± 1.03 ms per loop (mean ± std. dev. of 10 runs, 50 loops each)
2nd : 65 ms ± 1.73 ms per loop (mean ± std. dev. of 10 runs, 50 loops each)
3rd : 65.1 ms ± 742 µs per loop (mean ± std. dev. of 10 runs, 50 loops each)

3. 416KB data

1st : 269 ms ± 2.35 ms per loop (mean ± std. dev. of 10 runs, 50 loops each)
2nd : 267 ms ± 1.17 ms per loop (mean ± std. dev. of 10 runs, 50 loops each)
3rd : 268 ms ± 2.98 ms per loop (mean ± std. dev. of 10 runs, 50 loops each)

4. 1.49MB data

1st : 624 ms ± 6.88 ms per loop (mean ± std. dev. of 10 runs, 50 loops each)
2nd : 619 ms ± 10.2 ms per loop (mean ± std. dev. of 10 runs, 50 loops each)
3rd : 615 ms ± 8.5 ms per loop (mean ± std. dev. of 10 runs, 50 loops each)

Gunicorn Async with Uvicorn workers

1. 17B data

1st : 38.6 ms ± 921 µs per loop (mean ± std. dev. of 10 runs, 50 loops each)
2nd : 38.9 ms ± 806 µs per loop (mean ± std. dev. of 10 runs, 50 loops each)
3rd : 40 ms ± 867 µs per loop (mean ± std. dev. of 10 runs, 50 loops each)

2. 277B data

1st : 62.3 ms ± 1.33 ms per loop (mean ± std. dev. of 10 runs, 50 loops each)
2nd : 62.9 ms ± 1.24 ms per loop (mean ± std. dev. of 10 runs, 50 loops each)
3rd : 63.3 ms ± 1.27 ms per loop (mean ± std. dev. of 10 runs, 50 loops each)

3. 416KB data

1st : 285 ms ± 3.33 ms per loop (mean ± std. dev. of 10 runs, 50 loops each)
2nd : 289 ms ± 2.53 ms per loop (mean ± std. dev. of 10 runs, 50 loops each)
3rd : 285 ms ± 3.05 ms per loop (mean ± std. dev. of 10 runs, 50 loops each)

4. 1.49MB data

1st : 681 ms ± 6.68 ms per loop (mean ± std. dev. of 10 runs, 50 loops each)
2nd : 663 ms ± 14.5 ms per loop (mean ± std. dev. of 10 runs, 50 loops each)
3rd : 673 ms ± 10 ms per loop (mean ± std. dev. of 10 runs, 50 loops each)

The result is interesting. For me, using uvicorn is better. Does anyone know what causes this result?
I guess I need to test again with Apache Bench (ab).

@DavidKimDY
Contributor

Uvicorn

Concurrency Level:      10
Time taken for tests:   1.982 seconds
Complete requests:      500
Failed requests:        0
Total transferred:      78500 bytes
HTML transferred:       6500 bytes
Requests per second:    252.21 [#/sec] (mean)
Time per request:       39.649 [ms] (mean)
Time per request:       3.965 [ms] (mean, across all concurrent requests)
Transfer rate:          38.67 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       13   20  24.0     16     284
Processing:    14   19   4.4     18      62
Waiting:       14   18   4.4     18      62
Total:         27   39  25.0     34     300

Concurrency Level:      10
Time taken for tests:   10.315 seconds
Complete requests:      500
Failed requests:        0
Total transferred:      227000 bytes
HTML transferred:       154500 bytes
Requests per second:    48.47 [#/sec] (mean)
Time per request:       206.298 [ms] (mean)
Time per request:       20.630 [ms] (mean, across all concurrent requests)
Transfer rate:          21.49 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       13   21  28.8     17     248
Processing:    34  183  21.4    187     255
Waiting:       34  157  28.4    159     255
Total:         49  204  36.6    205     458

Concurrency Level:      10
Time taken for tests:   67.452 seconds
Complete requests:      500
Failed requests:        0
Total transferred:      269621000 bytes
HTML transferred:       269547000 bytes
Requests per second:    7.41 [#/sec] (mean)
Time per request:       1349.044 [ms] (mean)
Time per request:       134.904 [ms] (mean, across all concurrent requests)
Transfer rate:          3903.53 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       13   46  25.6     40     164
Processing:   911 1291 210.1   1275    2309
Waiting:      134  550 270.0    542    1170
Total:        945 1337 212.9   1305    2339

Concurrency Level:      10
Time taken for tests:   190.623 seconds
Complete requests:      500
Failed requests:        0
Total transferred:      951847500 bytes
HTML transferred:       951773000 bytes
Requests per second:    2.62 [#/sec] (mean)
Time per request:       3812.455 [ms] (mean)
Time per request:       381.245 [ms] (mean, across all concurrent requests)
Transfer rate:          4876.33 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       13   32  53.1     23    1032
Processing:  1099 3757 1042.8   3732    8124
Waiting:      329 1014 527.3    935    2973
Total:       1128 3789 1043.5   3762    8141

Gunicorn

Concurrency Level:      10
Time taken for tests:   2.011 seconds
Complete requests:      500
Failed requests:        0
Total transferred:      78500 bytes
HTML transferred:       6500 bytes
Requests per second:    248.61 [#/sec] (mean)
Time per request:       40.224 [ms] (mean)
Time per request:       4.022 [ms] (mean, across all concurrent requests)
Transfer rate:          38.12 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       12   20  23.6     16     220
Processing:    13   19   8.6     17      87
Waiting:       13   19   8.6     17      87
Total:         26   39  26.4     33     242

Concurrency Level:      10
Time taken for tests:   3.710 seconds
Complete requests:      500
Failed requests:        0
Total transferred:      227000 bytes
HTML transferred:       154500 bytes
Requests per second:    134.79 [#/sec] (mean)
Time per request:       74.191 [ms] (mean)
Time per request:       7.419 [ms] (mean, across all concurrent requests)
Transfer rate:          59.76 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       12   18  17.8     15     203
Processing:    30   54  26.8     42     206
Waiting:       30   48  21.5     39     206
Total:         43   71  31.5     59     255

Concurrency Level:      10
Time taken for tests:   53.380 seconds
Complete requests:      500
Failed requests:        0
Total transferred:      269621000 bytes
HTML transferred:       269547000 bytes
Requests per second:    9.37 [#/sec] (mean)
Time per request:       1067.594 [ms] (mean)
Time per request:       106.759 [ms] (mean, across all concurrent requests)
Transfer rate:          4932.62 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       12   37  54.0     27    1095
Processing:   325 1021 397.7    936    2650
Waiting:      160  485 208.6    410    1536
Total:        389 1058 400.9    971    2664

Concurrency Level:      10
Time taken for tests:   165.735 seconds
Complete requests:      500
Failed requests:        0
Total transferred:      951847500 bytes
HTML transferred:       951773000 bytes
Requests per second:    3.02 [#/sec] (mean)
Time per request:       3314.692 [ms] (mean)
Time per request:       331.469 [ms] (mean, across all concurrent requests)
Transfer rate:          5608.60 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       13   34  25.8     26     308
Processing:   790 3258 1977.6   2687   11721
Waiting:      380 1091 466.6   1071    4038
Total:        826 3291 1977.9   2712   11736

@waketzheng
Contributor

Why not test with Framework + Redis + Database? Such a simple case cannot tell the whole truth.

@cirospaciari

@Arrow-Li if you are looking for raw throughput, both Flask and FastAPI are limited by their WSGI/ASGI servers and, of course, by framework overhead. PyPy can reduce the overhead of Python if used with a server that is made for it.

Take a look:
https://www.techempower.com/benchmarks/#section=test&runid=adce24e2-9277-45b2-845c-3dbce439d727&test=plaintext&l=hra0hr-35r

Socketify is a web framework and also provides a WSGI and ASGI server:
https://github.com/cirospaciari/socketify.py

Granian is a WSGI and ASGI server
https://github.com/emmett-framework/granian

If you want to give a performance boost to existing code, go with socketify or granian; socketify performs better for now.
If you want to move to an entirely new thing, do not use WSGI or ASGI; go for pure socketify.
If Flask is enough for you, you will find that socketify should be enough too.

FastAPI is more feature-complete than pure socketify, so if performance is not the only critical point, you can use FastAPI + the socketify ASGI server.

@fastapi fastapi locked and limited conversation to collaborators Feb 28, 2023
@tiangolo tiangolo converted this issue into discussion #8999 Feb 28, 2023
