Random high cpu usage with QUIC/H3 #1903
Comments
I experienced pretty similar behaviour and had to disable QUIC advertisement as a result, but didn't know exactly what data to collect and share for an issue about it. Happy to share any relevant command output if it helps too.
Thanks for the report. Can you confirm that the issue is not present in 2.7-dev6?
It is present in ae1e14d, which is part of 2.7-dev6:
That was my "golden" 2.7-dev commit for trialling QUIC in a wider production deployment.
@a-denoyelle I have no experience with previous versions on this particular server (or its standby instance), just dev7 and dev8. I have another instance that has run 2.7 dev3, dev4, dev7 and dev8 with no CPU usage problems, but those balancers' use case is very different.
I don't know if the exported Prometheus metrics may help, but here you can see that haproxy active tasks dropped drastically. I can add more graphs if you think they may help.
I switched one of the servers to run the older 2.7-dev4 binary I had left lying around, and created a symlink to 2.7-dev8 in case it restarted.
Upon restart after the crash, systemd started dev8, which got stuck after a couple of hours, and I had to restart it manually. Examining the logs, I discovered that 2.7-dev3 also crashed periodically back in August, but I didn't notice because I wasn't monitoring the systemd restart counter for haproxy.
My guess is that the fix for the crash has some kind of side effect causing haproxy to get stuck. I hope this helps your investigation.
This may not be so simple, as the QUIC implementation still evolves frequently in haproxy. Also, we have noticed recently that QUIC traffic seems to be increasing on haproxy.org: maybe recent browsers are more willing to use it, and this may expose our QUIC stack to new traffic levels. Can you please activate and report haproxy task profiling? This will help us detect whether a task is being woken up too many times:
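Task profiling can be toggled and read back over the runtime API. A sketch, assuming the stats socket lives at /var/run/haproxy.sock (adjust the path to your setup):

```
# enable per-task profiling at runtime (socket path is an assumption)
echo "set profiling tasks on" | socat stdio /var/run/haproxy.sock

# dump the per-task profiling counters collected so far
echo "show profiling" | socat stdio /var/run/haproxy.sock
```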
Also, it might be useful to record QUIC traces during a few minutes when you notice a CPU spike. Thanks for your help.
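QUIC traces can be routed into a ring buffer over the runtime API and dumped later; a sketch under the same socket-path assumption:

```
# direct QUIC traces to the buf0 ring and start them (socket path is an assumption)
echo "trace quic sink buf0; trace quic level developer; trace quic start now" | socat stdio /var/run/haproxy.sock

# later, dump the collected events
echo "show events buf0" | socat stdio /var/run/haproxy.sock
```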
OK, so as soon as I enabled profiling it crashed, but I managed to query profiling twice a few seconds before that.
The thing is that haproxy doesn't dump a core file; our apps do dump cores on that machine.
The systemd unit has LimitCORE=infinity, and /var/crash and /var/lib/haproxy/var/crash (the haproxy chroot) both exist and are writable by the haproxy user.
@gabrieltz https://docs.haproxy.org/2.6/configuration.html#set-dumpable might help. Also worth checking the permissions of the chroot folder if you use it, to ensure it will allow the user the haproxy workers run as to write the file out.
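For reference, a minimal global section with that directive enabled might look like this (a sketch; all other settings omitted):

```
global
    # allow core dumps even after setuid/chroot; see set-dumpable in the docs
    set-dumpable
```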
@Tristan971 permissions were the first thing I checked, thank you for the heads-up. I enabled set-dumpable in the global section.
Here is a gdb backtrace for the latest crash.
Remove the ABORT_NOW() statement on unhandled sendto errors. Instead, use a dedicated counter, sendto_err_unknown, to report these cases. If we detect an increment of this counter, strace can be used to detect the errno value: $ strace -p $(pidof haproxy) -f -e trace=sendto -Z This should be backported up to 2.6. It should help to debug github issue #1903.
This crash happens because sendto returned an errno value that we did not anticipate. I just pushed a patch to suppress the ABORT_NOW(). Can you please update your haproxy to the latest master tip and run strace in parallel to report the errors you encounter?
OK, deployed and running strace. What should I look for?
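One way to narrow the capture, based on the command given in the commit message above (the -Z form shows only failing syscalls; the grep filter for ECONNRESET/EPIPE, the errors most commonly seen in this thread, is an assumption about what counts as noise here):

```
# attach to haproxy, trace only failing sendto() calls, hide the common errors
strace -p "$(pidof haproxy)" -f -e trace=sendto -Z 2>&1 | grep -vE 'ECONNRESET|EPIPE'
```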
This patch completes the previous incomplete commit. The new counter sendto_err_unknown is now displayed on the stats page/CLI show stats. This is related to github issue #1903. This should be backported up to 2.6.
Whoops, I made a typo on the strace command. Besides, my previous commit was incomplete; I pushed a new patch. You can now see a new counter, sendto_err_unknown.
Please note that if you use the HTML stats page, you need
The following are from the stats page.
And the compressed strace output; I already filtered out the SIGALRM signals. I hope those help. (I'm keeping strace running in case it captures something else apart from ECONNRESET and EPIPE.)
This is strange as both
This is indeed strange
It crashed again; this is the last strace output.
I'm adding set-dumpable and rerunning it.
I think I may have found what you are looking for: it's EINVAL. Stats:
strace output:
File descriptor 9 is (lsof output):
It seems dst_port is 0, which is the reason for the EINVAL from sendto. Good catch, thanks!
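This matches Linux behaviour: the kernel rejects a UDP send whose destination port is 0 before anything goes on the wire. A self-contained sketch (a short Python helper, not haproxy code) that reproduces the same errno seen in the strace output:

```shell
# Send a UDP datagram to port 0; on Linux the kernel refuses with EINVAL.
python3 - <<'PY'
import errno, socket

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
try:
    s.sendto(b"x", ("127.0.0.1", 0))
    print("sendto succeeded")
except OSError as e:
    print(errno.errorcode[e.errno])  # expected on Linux: EINVAL
PY
```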
You're welcome :)
@gabrieltz I have inspected the code but did not see a good explanation for a sendto invocation with a null port. One possibility, though, would be that we previously received a datagram with a null port. To confirm this hypothesis, can you please try my branch? I fixed some issues I noticed in datagram reception. Its last commit will simply ignore such datagrams on reception. If this scenario is confirmed, we can then move on from this problem and refocus our attention on the CPU consumption you first mentioned. Thanks.
@a-denoyelle it fails to compile:
Never mind the above comment, it just needed
I have the latest core dump backtrace, compiled from branch ade-quic-g1903. I was polling ctr0 with
Hmmm, can you please share the output of "show pools"?
I didn't record the show pools output this morning when I reported the issue, but I remember it being 9 GB due to the buffer pool leak I mentioned in #1964. I have restarted two instances to the latest version just now; I'll keep the third running with the old leaky version until tomorrow and post the output. It's currently at 3.5 GB.
Yes, the problem is only in the Prometheus stats. It's helpful to set alerts on those metrics before they get too high on memory-constrained installations.
The latest 2.8-dev0-46bea1-99 crashes frequently (about 4 times per hour). From syslog I get:
Short backtrace:
Full backtrace:
I got another crash with BUG_ON() on another server with the same haproxy version, but the core dump was unusable.
About the last crash, I'm clueless for the moment. If you encounter it frequently, can you activate
In the meantime we're looking at the previous crash you reported. You mention that it occurs 4 times per hour. Do you always have the same backtrace (with
Indeed, the Prometheus exporter and the stats applets both use a 32-bit integer.
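The practical effect of a 32-bit field is truncation modulo 2^32: anything above 4 GiB wraps. A quick illustration, using the 9 GB figure from the earlier comment:

```shell
# A 64-bit byte count stored in a 32-bit integer keeps only the low 32 bits.
python3 - <<'PY'
full = 9 * 1024**3             # ~9 GiB, as reported via show pools
truncated = full & 0xFFFFFFFF  # what a 32-bit counter would show
print(full, truncated)         # the wrapped value is only 1 GiB
PY
```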
I had rolled back to a previous version; I'll set it back to 46bea1 to get a few backtraces and compare.
Syslog holds all kinds of errors from when I was running 46bea1. They look kind of random, unless someone was playing.
Here is the first crash. In this case there were no syslog entries like the ones above. This was the most frequent crash. Short backtrace:
Full backtrace:
All these kinds of errors must indicate there is memory corruption. Can you run with
So, it did crash while running with this argument. Should I see something special? There was the following in syslog. Short backtrace:
Full backtrace:
Then it means it cannot be a double free of an element allocated from a pool, but that most likely some memory got corrupted after being freed. This caused the damage to the allocation chains that free() detected. We do have special detection for this using
This will consume slightly more CPU (due to integrity checks) but should remain pretty much OK. However, it will increase the chances that an overwritten area is detected before being freed to the OS, and the dumps of the memory area contents may even tell us which line of code freed it last and help figure out what is invalid (hence which field of the structure was modified). We may then have to guide you through some commands if this triggers.
I found a memory issue related to the mux-quic code. I don't know if this will solve your crashes. Can you try the attached patch and report back if you see any improvement, please? 0001-BUG-MEDIUM-mux-quic-fix-double-delete-from-qcc.openi.patch.txt
Should I remove the extra arguments from the haproxy command line before restarting with the patch?
It doesn't really matter; these arguments will trigger if the bug is elsewhere. They're only here to perform self-checks, they will not hide a bug. So if it works fine with the patch and the args, we know the patch is the fix. And if it doesn't work better, there is a good chance the core will be usable to figure out where the problem is.
Reached 1 hour of uptime. In most cases, with this specific version, it wouldn't have exceeded 27 minutes of uptime without the patch. Thank you.
Thanks for the feedback. I will merge the patch soon.
It's still very stable: uptime 17 hours and no crash.
qcs instances for bidirectional streams are inserted in <qcc.opening_list>. A qcs is removed from the list once a full HTTP request has been parsed. This is required to implement the http-request timeout.

In case a stream is deleted before receiving a full HTTP request, it also must be removed from <qcc.opening_list>. This was not the case in the first implementation, but has been fixed by the following patch:

641a65f BUG/MINOR: mux-quic: remove qcs from opening-list on free

This means that a stream can now be deleted from the list in two different functions. Sadly, as LIST_DELETE was used in both cases, nothing prevented a double-deletion from the list, even though LIST_INLIST was used. Both calls are replaced with LIST_DEL_INIT, which is idempotent.

This bug causes memory corruption, which results in most cases in a segfault, most of the time outside of mux-quic code itself. It was first found by gabrieltz, who reported it in github issue #1903. Big thanks to him for his testing.

This bug also causes failures on several 'M' transfer testcases of the QUIC interop-runner. The s2n-quic client is particularly useful in this case, as the segfault triggers were most of the time on the LIST_DELETE operation itself. This is probably due to its encapsulation of the HEADERS frame with the fin bit delayed in a following empty STREAM frame.

This must be backported wherever the above patch is, up to 2.6.
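The failure mode can be sketched outside haproxy with a toy circular intrusive list. The Python below mimics the semantics of LIST_DELETE (unlink, leaving the node's own pointers stale) versus LIST_DEL_INIT (unlink, then self-link); it is an illustration, not haproxy code:

```shell
python3 - <<'PY'
class Node:
    def __init__(self):
        self.n = self.p = self   # self-linked, as after LIST_INIT

def append(head, el):            # insert el at the tail, before head
    el.p, el.n = head.p, head
    head.p.n = el
    head.p = el

def list_delete(el):             # LIST_DELETE: unlink, pointers left stale
    el.n.p = el.p
    el.p.n = el.n

def list_del_init(el):           # LIST_DEL_INIT: unlink, then self-link
    n, p = el.n, el.p
    n.p = p
    p.n = n
    el.n = el.p = el

# Double LIST_DELETE: the second call writes through a's stale pointers
# and splices the already-removed node b back in after head.
head, a, b = Node(), Node(), Node()
append(head, a); append(head, b)
list_delete(a)
list_delete(b)                   # list is now empty
list_delete(a)                   # corruption: relinks b after head
print("corrupted:", head.n is b)

# Same sequence with LIST_DEL_INIT: the second delete is a harmless no-op.
head, a, b = Node(), Node(), Node()
append(head, a); append(head, b)
list_del_init(a)
list_del_init(b)
list_del_init(a)
print("still empty:", head.n is head)
PY
```

In haproxy the stale writes land on pool memory that may already have been freed and reused, which is why the crashes showed up far from the mux-quic code itself.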
FYI, it should be fixed in 2.8-DEV.
This one was backported to 2.6.8 but we forgot to close it.
Detailed Description of the Problem
2.7-dev7 and dev8 both appear to consume all available CPU resources on the system, going from 30-40% CPU to 1400% for hours until we restart it.
While haproxy is in that state, all CPU cores are in the C0 state 100% of the time, whereas normal usage is 5-10% of time in C0.
This is the output of show info using socat, and the output of show threads.
But when I used show tasks, haproxy hung and never returned anything; it also stopped servicing requests and I had to restart it. I also captured a perf record -g.
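For reference, these runtime dumps were taken over the stats socket; a sketch of the invocations, with the socket path as an assumption:

```
# query the runtime API (socket path depends on your configuration)
echo "show info"    | socat stdio /var/run/haproxy.sock
echo "show threads" | socat stdio /var/run/haproxy.sock
```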
Expected Behavior
Normal CPU usage.
Steps to Reproduce the Behavior
None, sorry.
Do you have any idea what may have caused this?
I suspect it has to do with QUIC / H3.
Do you have an idea how to solve the issue?
No response
What is your configuration?
Additional Information