Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing connection cleanup in case of mass connection breaking. #2614

Conversation

upadhyay-prashant
Copy link
Contributor

CAUSE:
In some cases when credentials expire, or servers encounters a
blip and closes all connections. The driver gets close message on all
connections. While processing those close messages, the driver was
getting into race conditions, where in multiple threads were trying to
close connections and trying to update the connections object i.e. list
of connections in the pool. This was leading to uncaught exceptions and
stale connections in the pool. These connections are never cleanedup
post this.

FIX:
Iterate the connections list while creating the connectionPool Info.
Since the list used is copyOnWrite, the iterator API creates a clone
and uses that clone for referring the element. Thus providing thread
safe interface.

However the information provided by this iteration is a bit stale, but
this doesn't matter.

 CAUSE:
 In some cases when credentials expire, or servers encounters a
 blip and closes all connections. The driver gets close message on all
 connections. While processing those close messages, the driver was
 getting into race conditions, where in multiple threads were trying to
 close connections and trying to update the connections object i.e. list
 of connections in the pool. This was leading to uncaught exceptions and
 stale connections in the pool. These connections are never cleanedup
 post this.

 FIX:
 Iterate the connections list while creating the connectionPool Info.
 Since the list used is copyOnWrite, the iterator API creates a clone
 and uses that clone for referring the element. Thus providing thread
 safe interface.

 However the information provided by this iteration is a bit stale, but
 this doesn't matter.
@kenhuuu
Copy link
Contributor

kenhuuu commented May 22, 2024

Could you rebase and retarget this PR to point to 3.6-dev? This change most likely applies to version 3.6.x and above rather than 4.0.x (master).

for (int ix = 0; ix < connectionCount; ix++) {
final Connection c = connections.get(ix);
if (c.equals(connectionToCallout))
final Iterator<Connection> it = connections.iterator();
Copy link
Contributor

@kenhuuu kenhuuu May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit: There aren't any tests for this PR, but this might be ok since appendConnections is only used for logging. But seeing as how we are now dependent on the CopyOnWriteArrayList's iterator behavior, we might want to add a small comment and maybe even explicitly declare connections as a CopyOnWriteArrayList<Connection> rather than just a List<Connection>.

@kenhuuu
Copy link
Contributor

kenhuuu commented May 22, 2024

VOTE +1 pending resolution of some nits.

@upadhyay-prashant upadhyay-prashant deleted the stale_connections_v1 branch May 24, 2024 21:14
@upadhyay-prashant
Copy link
Contributor Author

@kenhuuu can you please take a look. I raised another pull request with the fix and your nits addressed.
#2618

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants