# curl ngtcp2 performance regression on fast links #15267
This sounds like recv/sendmmsg does not work well on your system. So, just to be sure that we are not dealing with another effect, assuming you build with automake, could you apply the following patch and check whether the good performance returns?

```diff
diff --git a/configure.ac b/configure.ac
index 4a9a7fc5f8..73d086decd 100644
--- a/configure.ac
+++ b/configure.ac
@@ -575,11 +575,11 @@ esac
 # In order to detect support of sendmmsg(), we need to escape the POSIX
 # jail by defining _GNU_SOURCE or <sys/socket.h> will not expose it.
-case $host_os in
-  linux*)
-    CPPFLAGS="$CPPFLAGS -D_GNU_SOURCE"
-    ;;
-esac
+#case $host_os in
+#  linux*)
+#    CPPFLAGS="$CPPFLAGS -D_GNU_SOURCE"
+#    ;;
+#esac
 dnl Build unit tests when option --enable-debug is given.
 if test "x$want_debug" = "xyes" &&
```
The source code with the patch fails to compile with the following error:
OK, let's try disabling mmsg a simpler way via:

```diff
diff --git a/lib/vquic/vquic.c b/lib/vquic/vquic.c
index 16bfe4ccd..4926b4e1a 100644
--- a/lib/vquic/vquic.c
+++ b/lib/vquic/vquic.c
@@ -52,6 +52,8 @@
 #ifdef USE_HTTP3
+#undef HAVE_SENDMMSG
+
 #ifdef O_BINARY
 #define QLOGMODE O_WRONLY|O_CREAT|O_BINARY
 #else
```
While investigating this regression I noticed that the traffic started to hit flow control limits. From that I suspect it is not related to GSO. Also, enabling/disabling offloads on the interfaces doesn't make the SDB frames disappear.
FTR: I am unable to reproduce. Using the setup described in #15415 on my local network with 1GB/s ethernet, the scorecard download runs QUIC/HTTP/3 at 84 MB/s. If I understood the issue description correctly, this is not what the reporter observes in their setup.
Tried curl w/o MMSG as suggested above. Also w/ recent versions of the components.
So the loss is definitely induced by the use of recvmmsg. I instrumented the code (recvmmsg at lib/vquic/vquic.c and recv_pkt at curl/lib/vquic/curl_ngtcp2.c) to see how the data are passed around. I'm not familiar with curl internals, but I assume the recv_pkt callback should be invoked once per complete received packet. From the debug prints I see cases like below.
I'll dig further to see whether the problem is in our system/Linux, or in the lib/vquic/vquic.c handling of the recvmmsg call.
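For reference, a minimal sketch of the kind of instrumentation meant here; names, counts, and buffer sizes are illustrative assumptions, not curl's actual code:

```c
/* Sketch (assumed, not curl's code): log what the kernel reports for
 * each message returned by recvmmsg(), before the data would be handed
 * to a per-packet callback like recv_pkt. */
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

#define MMSG_NUM 16
#define MMSG_BUFSIZE 2048   /* the per-buffer size under suspicion */

static void debug_recvmmsg(int sockfd)
{
  static char bufs[MMSG_NUM][MMSG_BUFSIZE];
  struct mmsghdr mmsg[MMSG_NUM];
  struct iovec iov[MMSG_NUM];
  int i, n;

  memset(mmsg, 0, sizeof(mmsg));
  for(i = 0; i < MMSG_NUM; i++) {
    iov[i].iov_base = bufs[i];
    iov[i].iov_len = MMSG_BUFSIZE;
    mmsg[i].msg_hdr.msg_iov = &iov[i];
    mmsg[i].msg_hdr.msg_iovlen = 1;
  }

  n = recvmmsg(sockfd, mmsg, MMSG_NUM, MSG_DONTWAIT, NULL);
  for(i = 0; i < n; i++) {
    /* msg_len is the kernel's length for this datagram; MSG_TRUNC in
     * msg_flags means the 2048-byte buffer was too small for it */
    fprintf(stderr, "pkt[%d]: len=%u%s\n", i, mmsg[i].msg_len,
            (mmsg[i].msg_hdr.msg_flags & MSG_TRUNC) ? " (truncated)" : "");
  }
}
```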
So, either the kernel or our vquic.c code does not assess the individual packet lengths correctly, and therefore packets get dropped by ngtcp2, which effectively means packet loss for the QUIC connection? /cc @tatsuhiro-t
Yes, this is how I understand the problem. It also explains why the regression gets bigger the faster the link gets: the faster the link, the more packets get read by a single recvmmsg call.
In recvmmsg_packets, each buffer has 2048 bytes, which is too small for GRO, and I think the input is truncated.
@tatsuhiro-t I see. Maybe we need to move those off the stack then. I'll make a PR.
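To make that direction concrete, a hedged sketch of what moving the buffers off the stack could look like: GRO can coalesce a UDP datagram up to 64KB, so with e.g. 16 messages the buffers no longer fit comfortably on the stack. All names and counts here are assumptions, not the PR's actual code:

```c
/* Sketch (assumed): heap-allocate one 64KB receive buffer per message
 * so a GRO-coalesced datagram is never truncated. */
#include <stdlib.h>
#include <sys/socket.h>

#define MMSG_NUM 16
#define GRO_BUFSIZE (64 * 1024)  /* max size of a coalesced UDP datagram */

struct mmsg_bufs {
  unsigned char *storage;        /* MMSG_NUM * GRO_BUFSIZE bytes */
  struct iovec iov[MMSG_NUM];
};

static int mmsg_bufs_init(struct mmsg_bufs *b)
{
  size_t i;
  b->storage = malloc((size_t)MMSG_NUM * GRO_BUFSIZE);
  if(!b->storage)
    return -1;                   /* out of memory */
  for(i = 0; i < MMSG_NUM; i++) {
    b->iov[i].iov_base = b->storage + i * GRO_BUFSIZE;
    b->iov[i].iov_len = GRO_BUFSIZE;
  }
  return 0;
}
```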
One more experiment:
And the data gets truncated the same way... So it seems the issue is in the application's handling of recvmmsg rather than in the system or recvmmsg itself.
It performs much better, though still not 100%. However, this one looks wrong:
There should be no single 13200-byte packet; it should have been split.
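For illustration, this is roughly what the expected splitting looks like, as I understand GRO: the kernel may hand back one large buffer plus a UDP_GRO cmsg carrying the original segment size, and the receiver must slice the buffer back into wire-sized packets itself. A sketch under those assumptions (the callback and names are hypothetical):

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/udp.h>

#ifndef SOL_UDP
#define SOL_UDP 17    /* from linux/udp.h, if the libc headers lack it */
#endif
#ifndef UDP_GRO
#define UDP_GRO 104   /* from linux/udp.h, if the libc headers lack it */
#endif

/* Return the GRO segment size announced via cmsg, or 0 if absent. */
static int gro_segment_size(struct msghdr *hdr)
{
  struct cmsghdr *cmsg;
  for(cmsg = CMSG_FIRSTHDR(hdr); cmsg; cmsg = CMSG_NXTHDR(hdr, cmsg)) {
    if(cmsg->cmsg_level == SOL_UDP && cmsg->cmsg_type == UDP_GRO) {
      int segsize;
      memcpy(&segsize, CMSG_DATA(cmsg), sizeof(segsize));
      return segsize;  /* e.g. 1320 for a 13200-byte coalesced buffer */
    }
  }
  return 0;
}

/* Feed the QUIC stack one wire-sized packet at a time. */
static void split_and_process(const unsigned char *buf, size_t len,
                              int segsize,
                              void (*recv_pkt)(const unsigned char *, size_t))
{
  size_t pktlen;
  if(segsize <= 0)
    segsize = (int)len;          /* no GRO cmsg: a single datagram */
  while(len) {
    pktlen = ((size_t)segsize < len) ? (size_t)segsize : len;
    recv_pkt(buf, pktlen);
    buf += pktlen;
    len -= pktlen;
  }
}
```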
There is an issue in the message header buffer size:

```c
mmsg[i].msg_hdr.msg_control = &msg_ctrl[i * CMSG_SPACE(sizeof(int))];
mmsg[i].msg_hdr.msg_controllen = CMSG_SPACE(sizeof(int));
```
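Spelling out the layout this points at, as I read the thread: each message gets its own properly sized and aligned slice of the control buffer, just large enough for a single int payload (the UDP_GRO segment size). A sketch with assumed array names and message count:

```c
/* Sketch (assumed layout): each msg_hdr gets its own
 * CMSG_SPACE(sizeof(int)) slice of one contiguous control buffer, so
 * cmsg parsing for message i never reads into the slice of message i+1. */
#define _GNU_SOURCE
#include <string.h>
#include <sys/socket.h>

#define MMSG_NUM 16

static struct mmsghdr mmsg[MMSG_NUM];
static unsigned char msg_ctrl[MMSG_NUM * CMSG_SPACE(sizeof(int))];

static void setup_control_buffers(void)
{
  size_t i;
  memset(msg_ctrl, 0, sizeof(msg_ctrl));
  for(i = 0; i < MMSG_NUM; i++) {
    mmsg[i].msg_hdr.msg_control = &msg_ctrl[i * CMSG_SPACE(sizeof(int))];
    mmsg[i].msg_hdr.msg_controllen = CMSG_SPACE(sizeof(int));
  }
}
```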
Do you want to create a PR? I think others will need to be updated as well, such as
I wanted to confirm replacing all CMSG_SPACE occurrences w/
I'll update #15454 with this.
Thanks everyone! #15454 now uses the correct CMSG_SPACE(int) and also a larger set of 64k buffers.
Thank you @icing. I was a bit busy today.
np, @tatsuhiro-t, an opportunity to learn more about it.
## I did this
Starting from this change: 97c0f89bd, I observe a slight performance improvement on slower links, but also a significant performance regression on fast links when downloading an http/3 resource.
Steps to reproduce:
1. Build curl with http/3 support from these component versions:
   - OpenSSL: OpenSSL_1_1_1w-37-g612d8e44d6
   - nghttp3: v1.6.0-1-g7f40779
   - ngtcp2: v1.8.0-1-g8d5ab787
   - nghttp2: v1.63.0-15-g4202608e
   - curl: curl-8_10_1-146-g97c0f89bd
2. Build a second binary from the same components, but with the previous curl git commit.
3. Set up an http3 server and a bandwidth limit using netem with the following parameters: rate=1Gbit, rtt=2ms, loss=0, jitter=0 (a sketch of such a netem setup follows this list).
4. Download a 100MB file and measure the time.
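For concreteness, one possible way to emulate such a link; the interface name and server URL are placeholders, and where the delay is applied (which interface, which direction) depends on your topology:

```sh
# shape the link: 1Gbit rate, 2ms delay, no loss/jitter
tc qdisc add dev eth0 root netem rate 1gbit delay 2ms loss 0%

# time the download over HTTP/3 with each binary
time ./curl-new --http3-only -o /dev/null https://server.example/100MB.bin
time ./curl-old --http3-only -o /dev/null https://server.example/100MB.bin
```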
## I expected the following
The download duration with the older binary is approximately 3s, but 100s with the newer one.
The faster the emulated link, the bigger the regression; for example, on a 1.5GBit line the regression is 1000%!
## curl/libcurl version
curl 8.10 and newer (tested with the main branch today)

## operating system
Ubuntu 22.04