Skip to content

Fix for the race condition being created when two threads were simultaneously doing refresh#8

Merged
Sfurti-yb merged 2 commits into
9.0.2.xfrom
9.0.2.x-bug-fix
Dec 11, 2025
Merged

Fix for the race condition being created when two threads were simultaneously doing refresh#8
Sfurti-yb merged 2 commits into
9.0.2.xfrom
9.0.2.x-bug-fix

Conversation

@Sfurti-yb
Copy link
Copy Markdown
Collaborator

@Sfurti-yb Sfurti-yb commented Nov 25, 2025

In the Refresh() function initialHosts list is being read by one thread while the other thread might be removing an unreachable host from it, mutating the list.

 internal override bool Refresh()
    {
       
        initialHosts = initialHosts.Distinct().ToList();
        foreach (var host in initialHosts.ToList())
        {
            try
            {
                // Setting up control connection with host and doing refresh 
            }
            catch (Exception)
            {
               // if host connection fails remove it from the initialHosts list
                _connectionLogger.LogDebug("Failed to connect to {host}", host);
                initialHosts.Remove(host);
                if (initialHosts.Count == 0)
                {
                    _connectionLogger.LogError("Failed to create control connection. No suitable host found");
                    throw;
                }

            }

        }
        unreachableHostsIndices.Clear();
        unreachableHosts.Clear();
        return true;
    }

This is creating the following race condition and throwing the error:

System.ArgumentException: Source array was not long enough. Check the source index, length, and the array's lower bounds. (Parameter 'sourceArray')
   at System.Array.CopyImpl(Array sourceArray, Int32 sourceIndex, Array destinationArray, Int32 destinationIndex, Int32 length, Boolean reliable)
   at System.Collections.Generic.List`1.set_Capacity(Int32 value)
   at System.Collections.Generic.List`1.AddRange(IEnumerable`1 collection)
   at YBNpgsql.ClusterAwareDataSource.Refresh()
   at YBNpgsql.NpgsqlConnection.SetupDataSource()

This fix sets up a lock before needsRefresh() is called so that only one thread actually does the refresh, the other threads can move past this step.

Testing

Tested it with a multithreaded application that was long running and read from the database. A shell script was used to periodically down and restart the server nodes.

@Sfurti-yb Sfurti-yb requested a review from ashetkar November 25, 2025 11:22
@Sfurti-yb Sfurti-yb self-assigned this Nov 25, 2025
Comment thread src/Npgsql/NpgsqlConnection.cs Outdated
Comment thread src/Npgsql/NpgsqlConnection.cs
Comment thread src/Npgsql/NpgsqlConnection.cs
@Sfurti-yb Sfurti-yb marked this pull request as ready for review December 4, 2025 04:16
Comment thread src/Npgsql/NpgsqlConnection.cs
@Sfurti-yb Sfurti-yb merged commit 2a3fbb9 into 9.0.2.x Dec 11, 2025
2 of 12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants