Fix timeout when underlying socket is changed in a MultiConnection (#7377)

When there are multiple localhost entries in /etc/hosts, like the following:
```
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
127.0.0.1   localhost
```

the multi_cluster_management check will fail:
```

@@ -857,20 +857,21 @@
 ERROR:  group 14 already has a primary node
 -- check that you can add secondaries and unavailable nodes to a group
 SELECT groupid AS worker_2_group FROM pg_dist_node WHERE nodeport = :worker_2_port \gset
 SELECT 1 FROM master_add_node('localhost', 9998, groupid => :worker_1_group, noderole => 'secondary');
  ?column?
 ----------
         1
 (1 row)

 SELECT 1 FROM master_add_node('localhost', 9997, groupid => :worker_1_group, noderole => 'unavailable');
+WARNING:  could not establish connection after 5000 ms
  ?column?
 ----------
         1
 (1 row)
```

This isn't just a problem in test environments. It can also occur
during actual usage when a hostname in pg_dist_node resolves to
multiple IPs and one of those IPs is unreachable. Postgres will then
automatically continue with the next IP, which changes the underlying
socket. Citus must then listen for events on the new socket, not on
the old one.
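
The effect described above can be reproduced in isolation. The following is a minimal sketch, not Citus code: two pipes stand in for the sockets of the first and second connection attempt, and `WaitReadable` and `DemoStaleSocketWait` are hypothetical names introduced only for this illustration. It assumes a POSIX system with `poll()`.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of Citus or libpq): wait up to timeoutMs
 * for fd to become readable. Returns 1 if readable, 0 on timeout, -1 on error.
 */
int
WaitReadable(int fd, int timeoutMs)
{
	struct pollfd pfd;

	pfd.fd = fd;
	pfd.events = POLLIN;
	pfd.revents = 0;

	int rc = poll(&pfd, 1, timeoutMs);
	if (rc < 0)
	{
		return -1;
	}

	return (rc > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/*
 * Simulates a connection that moved from a first socket to a second one.
 * *staleResult receives the wait result on the old descriptor,
 * *freshResult the result after switching to the current descriptor.
 * Returns 0 on success, -1 if the simulation could not be set up.
 */
int
DemoStaleSocketWait(int *staleResult, int *freshResult)
{
	int firstAttempt[2];
	int secondAttempt[2];

	assert(staleResult != NULL && freshResult != NULL);

	if (pipe(firstAttempt) != 0)
	{
		return -1;
	}
	if (pipe(secondAttempt) != 0)
	{
		close(firstAttempt[0]);
		close(firstAttempt[1]);
		return -1;
	}

	int watchedFd = firstAttempt[0];   /* descriptor the wait set was built with */
	int currentFd = secondAttempt[0];  /* descriptor the connection uses now */

	/* the peer only ever answers on the second descriptor */
	int ok = (write(secondAttempt[1], "x", 1) == 1) ? 0 : -1;

	/* waiting on the old descriptor sees nothing and runs into the timeout */
	*staleResult = WaitReadable(watchedFd, 50);

	/* "rebuild the wait set": switch to the descriptor that is in use */
	if (watchedFd != currentFd)
	{
		watchedFd = currentFd;
	}
	*freshResult = WaitReadable(watchedFd, 50);

	close(firstAttempt[0]);
	close(firstAttempt[1]);
	close(secondAttempt[0]);
	close(secondAttempt[1]);

	return ok;
}
```

In this sketch the wait on the old descriptor times out even though data is already available on the new one, which mirrors the "could not establish connection" warning in the test output.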

Co-authored-by: chuhx43211 <chuhx43211@hundsun.com>
(cherry picked from commit 9a91136)
hslightdb authored and JelteF committed Apr 17, 2024
1 parent db391c0 commit 2a6164d
Showing 1 changed file with 8 additions and 1 deletion.
9 changes: 8 additions & 1 deletion src/backend/distributed/connection/connection_management.c
@@ -1046,8 +1046,15 @@ FinishConnectionListEstablishment(List *multiConnectionList)

		continue;
	}

	bool beforePollSocket = PQsocket(connectionState->connection->pgConn);
	bool connectionStateChanged = MultiConnectionStatePoll(connectionState);

	if (beforePollSocket != PQsocket(connectionState->connection->pgConn))
	{
		/* rebuild the wait events if MultiConnectionStatePoll() changed the socket */
		waitEventSetRebuild = true;
	}

	if (connectionStateChanged)
	{
		if (connectionState->phase != MULTI_CONNECTION_PHASE_CONNECTING)
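
The pattern in the hunk above (remember the socket, poll, compare) can be sketched on its own. This is an illustration under stated assumptions rather than the Citus implementation: `FakeConn`, `FakeSocket`, `FakePoll` and `PollAndDetectSocketChange` are hypothetical stand-ins for the libpq connection, `PQsocket()` and `MultiConnectionStatePoll()`. Since `PQsocket()` returns an `int` descriptor, the sketch keeps the remembered value in an `int`. With a C99 `bool`, every nonzero descriptor would collapse to `true`, and the comparison would report a change on almost every poll.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a connection that may try several addresses. */
typedef struct FakeConn
{
	int sockets[2];   /* one descriptor per address to try */
	int socketCount;  /* number of valid entries in sockets */
	int current;      /* index of the attempt currently in progress */
} FakeConn;

/* Stand-in for PQsocket(): descriptor of the attempt in progress. */
int
FakeSocket(const FakeConn *conn)
{
	assert(conn != NULL);
	return conn->sockets[conn->current];
}

/*
 * Stand-in for the poll step: pretends the current address failed and
 * moves on to the next one if there is any. Returns true if it moved on.
 */
bool
FakePoll(FakeConn *conn)
{
	assert(conn != NULL);
	if (conn->current + 1 < conn->socketCount)
	{
		conn->current++;
		return true;
	}
	return false;
}

/* Returns true when the poll step swapped the underlying descriptor. */
bool
PollAndDetectSocketChange(FakeConn *conn)
{
	int beforePollSocket = FakeSocket(conn);  /* int, not bool */

	(void) FakePoll(conn);

	/* a caller would set its "rebuild the wait events" flag on true */
	return beforePollSocket != FakeSocket(conn);
}
```

A caller would invoke `PollAndDetectSocketChange()` once per loop iteration and rebuild its wait set only when it returns `true`.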
