I wrote code that created a WebSocket connection and used it do download data for multiple days.
I expected the following
I expected the WebSocket connection to be stable for extended periods of time. (Yes, I do know this is experimental.)
What actually happened
The connection remained stable for a while, but then started ignoring data and using 100% cpu. When this happened, curl_ws_recv was returning CURLE_OK, the recv parameter was set to 0, and the curl_ws_frame contained bytesleft of 0. My code repeatedly processed these phantom frames and called curl_ws_recv again for the next one.
The verbose output leading up to the problem was:
WS: received 160 bytes payload (0 left, buflen was 332)
WS: 168 bytes left to decode
WS:235 received FIN bit 1
WS: received OPCODE TEXT
WS: received 163 bytes payload (0 left, buflen was 168)
WS: 1 bytes left to decode
WS: plen == 1, EAGAIN
Following this, the last two lines were repeated indefinitely.
I dug into this behavior and believe I have found the source of the problem in ws.c. When curl_ws_recv (https://github.com/curl/curl/blob/master/lib/ws.c#L409) is called at the end of a frame (wsp->frame.bytesleft == 0) with one byte in the buffer (wsp->stillblen == 1):
curl_easy_recv on line 431 is not called due to the condition on line 428.
ws_decode on line 449 is called and returns CURLE_AGAIN since the buffer does not include enough bytes for a frame.
The break on line 453 results in return value of CURLE_OK with a buffer of length zero.
thanks @simplerobot, your remarks make perfect sense but I think your change will lose a byte? When the loop happens due to result == CURLE_AGAIN it shouldn't store contents at index 0 in the data->state.buffer since in your example case there is already one byte there.
Let me show you my take at a slightly adjusted fix: #10856 - what do you think?
I did this
I wrote code that created a WebSocket connection and used it do download data for multiple days.
I expected the following
I expected the WebSocket connection to be stable for extended periods of time. (Yes, I do know this is experimental.)
What actually happened
The connection remained stable for a while, but then started ignoring data and using 100% cpu. When this happened, curl_ws_recv was returning CURLE_OK, the recv parameter was set to 0, and the curl_ws_frame contained bytesleft of 0. My code repeatedly processed these phantom frames and called curl_ws_recv again for the next one.
The verbose output leading up to the problem was:
Following this, the last two lines were repeated indefinitely.
I dug into this behavior and believe I have found the source of the problem in ws.c. When curl_ws_recv (https://github.com/curl/curl/blob/master/lib/ws.c#L409) is called at the end of a frame (wsp->frame.bytesleft == 0) with one byte in the buffer (wsp->stillblen == 1):
I fixed this locally by updating:
https://github.com/curl/curl/blob/master/lib/ws.c#L453
< break;
> continue;
https://github.com/curl/curl/blob/master/lib/ws.c#L428
< if(!wsp->stillblen) {
> if(!wsp->stillblen || result == CURLE_AGAIN) {
So far, this has seemed to work, but I'm sure you can come up with something better.
curl/libcurl version
curl 7.88.1 (x86_64-pc-linux-gnu) libcurl/7.88.1 OpenSSL/3.0.2 zlib/1.2.11
Release-Date: 2023-02-20
Protocols: dict file ftp ftps gopher gophers http https imap imaps mqtt pop3 pop3s rtsp smb smbs smtp smtps telnet tftp ws wss
Features: alt-svc AsynchDNS HSTS HTTPS-proxy IPv6 Largefile libz NTLM NTLM_WB SSL threadsafe TLS-SRP UnixSockets
operating system
Linux virt-linux 5.19.0-35-generic #36~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri Feb 17 15:17:25 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
The text was updated successfully, but these errors were encountered: