
Scale FastAPI with sync endpoints #5759

@dapollak


First Check

  • I added a very descriptive title to this issue.
  • I used the GitHub search to find a similar issue and didn't find it.
  • I searched the FastAPI documentation, with the integrated search.
  • I already searched in Google "How to X in FastAPI" and didn't find any information.
  • I already read and followed all the tutorial in the docs and didn't find an answer.
  • I already checked if it is not related to FastAPI but to Pydantic.
  • I already checked if it is not related to FastAPI but to Swagger UI.
  • I already checked if it is not related to FastAPI but to ReDoc.

Commit to Help

  • I commit to help with one of those options 👆

Example Code

import logging
import os
import time

from fastapi import FastAPI

app = FastAPI()
logger = logging.getLogger()

@app.get("/")
def root():
    logger.info(f"Running on {os.getpid()}")
    time.sleep(3600)
    return {"message": "Hello World"}

Description

I've noticed lately that we have some latency problems when the servers are busy. I dug into it and found that if, for example, I have four uvicorn workers, one worker can be very busy while the other three are significantly less loaded. That causes two problems -

  1. We're not taking advantage of all our parallel processing capacity.
  2. We suffer more from Python's GIL contention within the single busy worker.
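To make problem 2 concrete, here is a stdlib-only sketch (not part of the issue) that runs the same CPU-bound function in a thread pool and in a process pool; on CPython the thread version cannot execute the work in parallel because of the GIL, while processes can. The function names are illustrative:

```python
import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def burn(n: int) -> int:
    # Pure-Python CPU-bound loop that holds the GIL while it runs
    total = 0
    for i in range(n):
        total += i
    return total

def timed(executor_cls, n: int = 2_000_000, jobs: int = 4) -> float:
    # Run `jobs` copies of the work concurrently and time the wall clock
    start = time.perf_counter()
    with executor_cls(max_workers=jobs) as ex:
        list(ex.map(burn, [n] * jobs))
    return time.perf_counter() - start

if __name__ == "__main__":
    # On a multi-core machine, threads are typically slower here because
    # only one thread can execute Python bytecode at a time
    print(f"threads:   {timed(ThreadPoolExecutor):.2f}s")
    print(f"processes: {timed(ProcessPoolExecutor):.2f}s")
```

The exact timings depend on the machine, but the gap between the two is the GIL cost the issue describes: when many sync-endpoint requests land on one worker, they all contend for that worker's single interpreter lock.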

In the example code, if I run it with gunicorn main:app --workers 4 --worker-class uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000, the first five requests produce the following output -

INFO:root:Running on 643
INFO:root:Running on 643
INFO:root:Running on 643
INFO:root:Running on 643
INFO:root:Running on 642

I investigated the uvicorn workers and found that they use the asyncio server interface, which accepts new connections on Python's event loop. This means that as long as we're using non-async endpoints (which run on AnyIO worker threads), there is no limit on the number of new connections a single worker accepts, which brings us back to the GIL problem.

To sum it all up, I'm a little confused about what my best option is for using FastAPI with non-async endpoints while getting better performance. Should I use more workers? Should I use more container instances?
Or maybe we should implement a new worker that can better control multithreading in such cases? I have some implementation ideas for this, but I'd like to have your advice here.
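One more option worth weighing (a sketch, not something FastAPI prescribes): keep the endpoint async and offload the CPU-bound part to a ProcessPoolExecutor, so heavy requests don't pile up behind one worker's GIL. The names here (POOL, heavy, handler) are illustrative:

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

# Hypothetical module-level pool; in FastAPI you would create and shut it
# down in startup/shutdown events and call run_in_executor from an async
# endpoint instead of a plain function
POOL = ProcessPoolExecutor(max_workers=2)

def heavy(n: int) -> int:
    # Stand-in for CPU-bound work that would otherwise hold the GIL
    return sum(range(n))

async def handler() -> dict:
    # Await the result without blocking the event loop or a pool thread
    loop = asyncio.get_running_loop()
    result = await loop.run_in_executor(POOL, heavy, 1_000_000)
    return {"result": result}

if __name__ == "__main__":
    print(asyncio.run(handler()))
```

The trade-off is inter-process serialization overhead per request, so this only pays off when the per-request work is genuinely CPU-heavy.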

Operating System

Linux

Operating System Details

FastAPI Version

0.88.0

Python Version

3.9.4

Additional Context
