Skip to content

Websockets - 150 seconds to connect to the endpoint #10157

@mahisnghrwt

Description

@mahisnghrwt

I did this

I am using the multi_socket interface with libev to connect to a WebSocket endpoint. I have observed an intermittent bug where curl gets stuck CURLMstate::MSTATE_CONNECTING stage for around 150 seconds because the underlying socket wasn't writable(is_connected done out parameter is false) when curl checked. And, nearest timeout set is for EXPIRE_DNS_PER_NAME which defaults to 200 seconds. But, every single time, eventually, curl connects to the endpoint.

I expected the following

I'd expect curl to routinely check the underlying socket until it becomes writable.

This might help in reproducing the bug

  • Websocket endpoint - wss://fstream.binance.com/stream?streams=btcusdt@depth5
  • Set CURLOPT_HAPPY_EYEBALLS_TIMEOUT_MS to 100(ms) (Not sure how this is related.)
  • Only add a single easy handle to the multi interface.
  • Log
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
    [Detaching after vfork from child process 7672]
    OpenSSL 1.1.1s  1 Nov 2022
    libcurl version: 7.87.0
    * STATE: INIT => CONNECT handle 0x1346808; line 1909 (connection #-5000)
    * Added connection 0. The cache now contains 1 members
    [New Thread 0x7ffff76a1700 (LWP 7674)]
    * STATE: CONNECT => RESOLVING handle 0x1346808; line 1955 (connection #0)
    * [CONN-0-0] setup, init filter chain
    * [CONN-0-0] cf_add(filter=SOCKET)
    * [CONN-0-0] cf_add(filter=SSL)
    * [CONN-0-0][CF-SOCKET] setup(remotehost=fstream.binance.com)
    * STATE: RESOLVING => CONNECTING handle 0x1346808; line 2029 (connection #0)
    * family0 == v4, family1 == v6
    *   Trying 3.113.111.236:443...
    [Thread 0x7ffff76a1700 (LWP 7674) exited]
    * [CONN-0-0] connect(block=0)-> 0, done=0
    * [CONN-0-0] connect(block=0)-> 0, done=0
    * [CONN-0-0] connect(block=0)-> 0, done=0
    * [CONN-0-0] connect(block=0)-> 0, done=0
    [New Thread 0x7ffff76a1700 (LWP 7677)]
    // Stuck for 150 seconds
    
  • Log from gdb
    [Detaching after vfork from child process 13557]
    OpenSSL 1.1.1s  1 Nov 2022
    libcurl version: 7.87.0
    
    Breakpoint 1, Curl_expire (data=0x1346808, milli=0, id=EXPIRE_RUN_NOW) at multi.c:3517
    3517	in multi.c
    data = 0x1346808
    milli = 0
    id = EXPIRE_RUN_NOW
    * STATE: INIT => CONNECT handle 0x1346808; line 1909 (connection #-5000)
    * Added connection 0. The cache now contains 1 members
    [New Thread 0x7ffff76a1700 (LWP 13559)]
    
    Thread 1 "url_test.bin" hit Breakpoint 1, Curl_expire (data=0x1346808, milli=1, id=EXPIRE_ASYNC_NAME) at multi.c:3517
    3517	in multi.c
    data = 0x1346808
    milli = 1
    id = EXPIRE_ASYNC_NAME
    * STATE: CONNECT => RESOLVING handle 0x1346808; line 1955 (connection #0)
    
    Thread 1 "url_test.bin" hit Breakpoint 1, Curl_expire (data=0x1346808, milli=2, id=EXPIRE_ASYNC_NAME) at multi.c:3517
    3517	in multi.c
    data = 0x1346808
    milli = 2
    id = EXPIRE_ASYNC_NAME
    
    Thread 1 "url_test.bin" hit Breakpoint 1, Curl_expire (data=0x1346808, milli=4, id=EXPIRE_ASYNC_NAME) at multi.c:3517
    3517	in multi.c
    data = 0x1346808
    milli = 4
    id = EXPIRE_ASYNC_NAME
    
    Thread 1 "url_test.bin" hit Breakpoint 1, Curl_expire (data=0x1346808, milli=8, id=EXPIRE_ASYNC_NAME) at multi.c:3517
    3517	in multi.c
    data = 0x1346808
    milli = 8
    id = EXPIRE_ASYNC_NAME
    
    Thread 1 "url_test.bin" hit Breakpoint 1, Curl_expire (data=0x1346808, milli=16, id=EXPIRE_ASYNC_NAME) at multi.c:3517
    3517	in multi.c
    data = 0x1346808
    milli = 16
    id = EXPIRE_ASYNC_NAME
    [Thread 0x7ffff76a1700 (LWP 13559) exited]
    
    Thread 1 "url_test.bin" hit Breakpoint 1, Curl_expire (data=0x1346808, milli=0, id=EXPIRE_RUN_NOW) at multi.c:3517
    3517	in multi.c
    data = 0x1346808
    milli = 0
    id = EXPIRE_RUN_NOW
    
    Thread 1 "url_test.bin" hit Breakpoint 1, Curl_expire (data=0x1346808, milli=7, id=EXPIRE_ASYNC_NAME) at multi.c:3517
    3517	in multi.c
    data = 0x1346808
    milli = 7
    id = EXPIRE_ASYNC_NAME
    * [CONN-0-0] setup, init filter chain
    * [CONN-0-0] cf_add(filter=SOCKET)
    * [CONN-0-0] cf_add(filter=SSL)
    * [CONN-0-0][CF-SOCKET] setup(remotehost=fstream.binance.com)
    * STATE: RESOLVING => CONNECTING handle 0x1346808; line 2029 (connection #0)
    * family0 == v4, family1 == v6
    *   Trying 52.197.50.89:443...
    
    Thread 1 "url_test.bin" hit Breakpoint 1, Curl_expire (data=0x1346808, milli=149988, id=EXPIRE_DNS_PER_NAME) at multi.c:3517
    3517	in multi.c
    data = 0x1346808
    milli = 149988
    id = EXPIRE_DNS_PER_NAME
    
    Thread 1 "url_test.bin" hit Breakpoint 1, Curl_expire (data=0x1346808, milli=149988, id=EXPIRE_DNS_PER_NAME2) at multi.c:3517
    3517	in multi.c
    data = 0x1346808
    milli = 149988
    id = EXPIRE_DNS_PER_NAME2
    
    Thread 1 "url_test.bin" hit Breakpoint 1, Curl_expire (data=0x1346808, milli=100, id=EXPIRE_HAPPY_EYEBALLS) at multi.c:3517
    3517	in multi.c
    data = 0x1346808
    milli = 100
    id = EXPIRE_HAPPY_EYEBALLS
    * [CONN-0-0] connect(block=0)-> 0, done=0
    * [CONN-0-0] connect(block=0)-> 0, done=0
    * [CONN-0-0] connect(block=0)-> 0, done=0
    [New Thread 0x7ffff76a1700 (LWP 13560)]
    
    

curl/libcurl version

curl 7.87.0 (x86_64-pc-linux-gnu) libcurl/7.87.0 OpenSSL/1.1.1s zlib/1.2.11
Release-Date: 2022-12-21
Protocols: dict file ftp ftps gopher gophers http https imap imaps mqtt pop3 pop3s rtsp smb smbs smtp smtps telnet tftp ws wss
Features: alt-svc AsynchDNS Debug HSTS HTTPS-proxy IPv6 Largefile libz NTLM NTLM_WB SSL threadsafe TLS-SRP TrackMemory UnixSockets

operating system

Linux mahi 5.13.0-30-generic #33~20.04.1-Ubuntu SMP Mon Feb 7 14:25:10 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions