Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable TCP_NODELAY for all synchronous sockets. #651

Merged
merged 3 commits into from
May 22, 2023

Conversation

plietar
Copy link
Contributor

@plietar plietar commented Jan 12, 2023

The widely documented poor interaction between the Nagle algorithm and TCP's delayed ACK feature, when making short successive writes, leads to unnecessary delays (around 50ms on Linux). This happens in httpcore whenever a POST request is made, since the headers and body are sent as two separate calls.

The TCP_NODELAY can be enabled to disable Nagle's algorithm, and therefore avoid this delay.

The option is already enabled by default for all asyncio and Trio sockets. It is also enabled by CPython's http.client module (which urllib and requests use) and by many high-level HTTP libraries found in other languages, including libcurl, Java's Netty, Rust's reqwest and Go's standard library, to name a few.

@plietar plietar force-pushed the plietar/no-delay branch 2 times, most recently from 3d7fad4 to 635db1e Compare January 12, 2023 16:34
The widely documented poor interaction between the Nagle algorithm and
TCP's delayed ACK feature, when making short successive writes, leads to
unnecessary delays (around 50ms on Linux). This happens in httpcore
whenever a POST request is made, since the headers and body are sent as
two separate calls.

The TCP_NODELAY option can be enabled to disable Nagle's algorithm, and
therefore avoid this delay.

The option is already enabled by default for all asyncio and Trio
sockets. It is also enabled by CPython's http.client module (which
urllib and requests use) and by many high-level HTTP libraries found in
other languages, including libcurl, Java's Netty, Rust's reqwest and
Go's standard library, to name a few.
@tomchristie
Copy link
Member

Thanks, seems like this might be worth review.

I wasn't aware that urllib3 uses TCP_NODELAY by default. Are you able to point me to how I can verify this / where this is documented / where in the code stack this gets set?

@zanieb
Copy link
Contributor

zanieb commented Jan 12, 2023

There was some discussion about this recently in Go; golang/go#57530 and https://withinboredom.info/blog/2022/12/29/golang-is-evil-on-shitty-networks/

@plietar
Copy link
Contributor Author

plietar commented Jan 12, 2023

@tomchristie
Copy link
Member

tomchristie commented Jan 17, 2023

One option here would be to start by adding a socket_options config to the ConnectionPool and proxy classes without changing the behaviour. This could be an incremental step that wouldn't initially introduce any behaviour change, and that would still leave us with the option of updating the defaults.

https://www.encode.io/httpcore/connection-pools/#other-options

I'm also curious if we'd be able to demonstrate a difference in performance with TCP_NODELAY enabled or not.

@plietar
Copy link
Contributor Author

plietar commented Jan 17, 2023

Here's a sample code to measure the performance:

import httpx
import requests
import asyncio
from time import perf_counter

N = 100
URL = "http://127.0.0.1:8000"

def sync_version():
    with httpx.Client() as client:
        before = perf_counter()
        for i in range(N):
            client.post(URL, content=b"Hello")
        after = perf_counter()
    return (after-before) / N

async def async_version():
    async with httpx.AsyncClient() as client:
        before = perf_counter()
        for i in range(N):
            await client.post(URL, content=b"Hello")
        after = perf_counter()
    return (after-before) / N

def requests_version():
    with requests.Session() as session:
        before = perf_counter()
        for i in range(N):
            session.post(URL, data=b"Hello")
        after = perf_counter()
    return (after-before) / N

print("Sync:", sync_version())
print("Async:", asyncio.run(async_version()))
print("Requests:", requests_version())

Without this change, and running against a local server, I get average request durations on the order of 2-3ms for the async and requests version, and 43ms for the sync one. After applying the change from the PR, the request duration for the sync version is about 2ms.

Here's the server code I used. I also get similar results with a Go server. Note that I've set TCP_NODELAY here, as otherwise it adds another 40ms across the board.

from http.server import HTTPServer, BaseHTTPRequestHandler
import socket

PORT = 8000

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"
    def do_POST(self):
        length = int(self.headers['Content-Length'])
        self.rfile.read(length)
        self.send_response(200)
        self.send_header('Content-Length', '14')
        self.end_headers()
        self.wfile.write(b"Hello, World!\n")

with HTTPServer(("", PORT), Handler) as httpd:
    print("serving at port", PORT)
    httpd.socket.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    httpd.serve_forever()

@zanieb
Copy link
Contributor

zanieb commented May 19, 2023

@plietar it looks like #662 is the next step before considering changing the default.

@tomchristie
Copy link
Member

The motivation in #651 (comment) is compelling here.

Would we prefer to accept this PR as-is, or should we make the socket options configurable first? (#662, resolved via #668)

rephrased: Is having configurability actually desirable, or do we really just want to make this one change?

Copy link
Member

@tomchristie tomchristie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we oughta okay this, yup.

needs a note in the changelog then it should be good to go.

Copy link
Member

@tomchristie tomchristie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually can be firmer on this one.
Yes, urllib3 do this and we should too.

Thanks for nudging us on it, @plietar.

@tomchristie tomchristie merged commit 834000d into encode:master May 22, 2023
5 checks passed
@karpetrosyan karpetrosyan mentioned this pull request May 23, 2023
@tomchristie
Copy link
Member

Link to asyncio...

In asyncio TCP sockets on Linux are now created with TCP_NODELAY flag set by default.

In the 3.7 release notes.

Link to trio...

By default for TCP sockets, :class:SocketStream enables TCP_NODELAY,
and (on platforms where it's supported) enables TCP_NOTSENT_LOWAT with
a reasonable buffer size (currently 16 KiB) – see issue #72 <https://github.com/python-trio/trio/issues/72>__ for discussion. You can
of course override these defaults by calling :meth:setsockopt.

See the docs, and the implementation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants