Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IdTCPClient ConnectTimeout broken on linux >6.1.32 (potentially affects higher-ups like TIdHTTP* ? ) #491

Closed
kralo opened this issue Jul 18, 2023 · 4 comments
Labels
Element: Operating System Issues related to specific Operating Systems Element: Socket Stacks Issues related to OS socket APIs, TIdStack and TIdSocketList descedants, etc Element: TCP Issues related to TCP handling, TIdTCPClient and TIdTCPServer descendants, etc Status: Won't Fix Issue will not be fixed, no further work needed Type: Bug Issue is a bug in existing code

Comments

@kralo
Copy link

kralo commented Jul 18, 2023

When trying to connect to a port that silently drops the SYN packets (i.e. doesn't answer), this program used to work

program conecttimeout;

{$mode objfpc}{$H+}

uses
  {$IFDEF UNIX}{$IFDEF UseCThreads}
  cthreads,
  {$ENDIF}{$ENDIF}
  Classes,
  IdBaseComponent,
  IdTCPClient;

var
  aIdClient:TIdTcpClient;
begin

  aIdClient := TIdTCPClient.Create;
  try
    aIdClient.ConnectTimeout:=3000; // ms
    aIdClient.ReadTimeout:=200; //ms
    // for testing do:   iptables -I INPUT -p tcp --dport 8083 -j DROP
    aIdClient.Connect('127.0.0.1',8083);
    aIdClient.Disconnect;
  finally
  end;
  aIdClient.free;
end.

On Linux up to (including) commit torvalds/linux@6ffc57e it works as expected, waiting 3 seconds and then stopping:

$ strace -f -e trace=open,socket,connect,bind -t ./connecttimeout
09:46:57 socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
09:46:57 bind(3, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
strace: Process 1163 attached
[pid  1163] 09:46:57 connect(3, {sa_family=AF_INET, sin_port=htons(8083), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ECONNRESET (Die Verbindung wurde vom Kommunikationspartner zurückgesetzt)
[pid  1163] 09:47:00 +++ exited with 0 +++
An unhandled exception occurred at $00000000004E2A1C:
EIdConnectTimeout: Connect timed out.
  $00000000004E2A1C  DOCONNECTTIMEOUT,  line 328 of ../lib/indy10/Core/IdIOHandlerStack.pas
  $00000000004E1D20  OPEN,  line 369 of ../lib/indy10/Core/IdIOHandlerSocket.pas
  $000000000045DA98  CONNECT,  line 328 of ../lib/indy10/Core/IdTCPClient.pas
  $000000000045E3A4  CONNECT,  line 514 of ../lib/indy10/Core/IdTCPClient.pas
  $00000000004009F0  main,  line 20 of connecttimeout.lpr
  $0000007F80782E18
  $0000000000400928

09:47:02 +++ exited with 217 +++

However, with commit torvalds/linux@4faeee0 the timeout mechanism is broken; it now waits around 2 minutes, which is roughly the same that wget would.
net.ipv4.tcp_syn_retries = 6 so I assume this is expected from retries after 2+4+8+16+32+64=126 seconds.

$ strace -f -e trace=open,socket,connect,bind -t ./connecttimeout
09:36:50 socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
09:36:50 bind(3, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
strace: Process 1107 attached
[pid  1107] 09:36:50 connect(3, {sa_family=AF_INET, sin_port=htons(8083), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ETIMEDOUT (Die Wartezeit für die Verbindung ist abgelaufen)
[pid  1107] 09:38:59 +++ exited with 0 +++
An unhandled exception occurred at $00000000004E2A1C:
EIdConnectTimeout: Connect timed out.
  $00000000004E2A1C  DOCONNECTTIMEOUT,  line 328 of ../lib/indy10/Core/IdIOHandlerStack.pas
  $00000000004E1D20  OPEN,  line 369 of ../lib/indy10/Core/IdIOHandlerSocket.pas
  $000000000045DA98  CONNECT,  line 328 of ../lib/indy10/Core/IdTCPClient.pas
  $000000000045E3A4  CONNECT,  line 514 of ../lib/indy10/Core/IdTCPClient.pas
  $00000000004009F0  main,  line 20 of connecttimeout.lpr
  $0000007F936D5E18
  $0000000000400928

09:39:01 +++ exited with 217 +++
~/sockettest $ strace -f -e trace=open,socket,connect,bind -t wget 127.0.0.1:8083
--2023-07-18 09:40:24--  http://127.0.0.1:8083/
Verbindungsaufbau zu 127.0.0.1:8083 … 09:40:24 socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
09:40:24 connect(3, {sa_family=AF_INET, sin_port=htons(8083), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ETIMEDOUT (Die Wartezeit für die Verbindung ist abgelaufen)
fehlgeschlagen: Die Wartezeit für die Verbindung ist abgelaufen.
Erneuter Versuch.

--2023-07-18 09:42:35--  (Versuch: 2)  http://127.0.0.1:8083/
Verbindungsaufbau zu 127.0.0.1:8083 … 09:42:35 socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
09:42:35 connect(3, {sa_family=AF_INET, sin_port=htons(8083), sin_addr=inet_addr("127.0.0.1")}, 16^C

I can only guess that the new linux code, that looks for waiting threads now interferes with Indy's timeout-counting mechanism.

What can be done?

debian linux/aarch64 fpc 2.0.10+dfsg-4+b2 , Indy Version 5bcd9ce

@kralo kralo added Status: Reported Issue has been reported for review Type: Bug Issue is a bug in existing code labels Jul 18, 2023
@rlebeau
Copy link
Member

rlebeau commented Jul 18, 2023

I'm not familiar with how the Linux kernel works, but Indy has never relied on OS-level timeouts for its ConnectTimeout handling, especially since there is no standard socket option to specify a timeout for connect(). As Indy uses blocking sockets, it runs connect() in its own worker thread, waiting up to ConnectTimeout for that thread to terminate. If the thread does not terminate in time, shutdown() is then called to abort the connection in progress, and close() to close the socket. So, the only hangup should be if connect() is not respecting shutdown()/close() in a timely manner.

That being said, there is an open ticket to re-design ConnectTimeout handling to use a non-blocking connect() with select()/(e)poll() on platforms which support that option. But that is work for a future release.

@rlebeau rlebeau added Element: Socket Stacks Issues related to OS socket APIs, TIdStack and TIdSocketList descedants, etc Element: TCP Issues related to TCP handling, TIdTCPClient and TIdTCPServer descendants, etc Element: Operating System Issues related to specific Operating Systems labels Jul 18, 2023
@kralo
Copy link
Author

kralo commented Jul 18, 2023

So, the only hangup should be if connect() is not respecting shutdown()/close() in a timely manner.

Yes, and as I understand, the kernel now prohibits shutting down the connection attempt, and Indy's mechanism is now useless.

https://github.com/torvalds/linux/blob/4faeee0cf8a5d88d63cdbc3bab124fb0e6aed08c/net/ipv4/tcp.c#L3084-L3088

shutdown() does not work... further down TIdStackUnix.WSShutdown returns -1 . So the thread keeps waiting until eventually the OS decides to give up on connect()

@rlebeau
Copy link
Member

rlebeau commented Jul 19, 2023

Yes, and as I understand, the kernel now prohibits shutting down the connection attempt, and Indy's mechanism is now useless.

Oh, well, that would certainly suck :-( That doesn't seem like a very good change for them to make in the kernel, and it certainly isn't documented behavior, AFAICS.

https://github.com/torvalds/linux/blob/4faeee0cf8a5d88d63cdbc3bab124fb0e6aed08c/net/ipv4/tcp.c#L3084-L3088

shutdown() does not work... further down TIdStackUnix.WSShutdown returns -1 . So the thread keeps waiting until eventually the OS decides to give up on connect()

Then I guess I need to bump up the priority on #3.

@kralo
Copy link
Author

kralo commented Nov 24, 2023

I have verified it working again. Without changing anything in freepascal, it works in linux 6.6.2 and 6.1.63. (timeout in the program is 3000ms, see above).

Most probably fixed by torvalds/linux@419ce13 , as mentioned in #493 .

$ uname -a
Linux 6.6.2-v8+ #1 SMP PREEMPT Tue Nov 21 22:28:24 UTC 2023 
aarch64 GNU/Linux

and 

$ uname -a
Linux 6.1.63-v8+ #1700 SMP PREEMPT Thu Nov 23 14:10:01 GMT 2023 
aarch64 GNU/Linux

/dev/shm $  strace -r -f -e trace=open,socket,connect,bind 
 ./conecttimeout
      0.000000 socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
     0.000889 bind(3, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
strace: Process 1162 attached
[pid  1162]      0.002929 connect(3, {sa_family=AF_INET, sin_port=htons(8090), sin_addr=inet_addr("192.168.1.184")}, 16) = -1 ECONNRESET 
[pid  1162]      3.008621 +++ exited with 0 +++
An unhandled exception occurred at $00000000004E40A4:
EIdConnectTimeout: Connect timed out.
  $00000000004E40A4  DOCONNECTTIMEOUT,  line 328 of ../../lib/indy10/Core/IdIOHandlerStack.pas
  $00000000004E32CC  OPEN,  line 369 of ../../lib/indy10/Core/IdIOHandlerSocket.pas
  $000000000045DFC8  CONNECT,  line 328 of ../../lib/indy10/Core/IdTCPClient.pas
  $000000000045E90C  CONNECT,  line 514 of ../../lib/indy10/Core/IdTCPClient.pas
  $00000000004009F0  main,  line 22 of ../../../../../dev/shm/conecttimeout
  $0000007F9AF1EE18
  $0000000000400928

     1.503324 +++ exited with 217 +++ 

@kralo kralo closed this as completed Nov 24, 2023
@rlebeau rlebeau added Status: Won't Fix Issue will not be fixed, no further work needed and removed Status: Reported Issue has been reported for review labels Dec 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Element: Operating System Issues related to specific Operating Systems Element: Socket Stacks Issues related to OS socket APIs, TIdStack and TIdSocketList descedants, etc Element: TCP Issues related to TCP handling, TIdTCPClient and TIdTCPServer descendants, etc Status: Won't Fix Issue will not be fixed, no further work needed Type: Bug Issue is a bug in existing code
Projects
None yet
Development

No branches or pull requests

2 participants