Description
Hi everyone, I have encountered a race condition in the transport reconnect logic, with a stable reproduction.
Problem details
It looks like the NATS transport occasionally loses the network connection.

Investigation showed that there is a race condition between the consumer restart process and the native NATS client callback.
- TransportCheckProcessor periodically checks whether the transport is healthy:

  _register.ReStart();
- IConsumerRegister.Default.ReStart calls the Pulse method, which cancels the task cancellation token _cts, sets _isHealthy to true, and calls Execute to run the consumer:

  if (!IsHealthy() || force)
- Meanwhile, the already running consumer thread receives the _cts cancellation and then calls dispose, which causes the native NATS client to be closed and disposed:

  client.Listening(_pollingDelay, _cts.Token);
- When the native NATS client closes, it triggers the ConnectedEventHandler event (this looks like a typo; Disconnected is presumably meant), which calls OnLogCallback, which in turn sets _isHealthy back to false:

  OnLogCallback!(logArgs);
  _isHealthy = false;
As a result of the race, _isHealthy is false almost all the time, which causes an infinite reconnect loop.
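The sequence above can be replayed deterministically with a small model. This is only a sketch: `RegisterModel`, `RaceDemo`, `StartConsumer`, `Restart`, and the `guardStaleCallbacks` flag are hypothetical names invented for illustration, not CAP or NATS client APIs, and only the ordering of the steps is taken from this report. The model is single-threaded on purpose so the interleaving is fixed. The generation counter shows one possible way to ignore callbacks from a consumer that has already been replaced; it is not a proposed patch, and real code would also need proper synchronization.

```csharp
using System;

// Hypothetical, simplified model of the consumer register described above.
// It is NOT the CAP source; it only mirrors the order of events in the report.
public sealed class RegisterModel
{
    private readonly bool _guardStaleCallbacks;
    private int _generation;          // incremented on every restart
    private bool _isHealthy = true;

    public RegisterModel(bool guardStaleCallbacks)
    {
        _guardStaleCallbacks = guardStaleCallbacks;
    }

    public bool IsHealthy() => _isHealthy;

    // Models starting a consumer. Returns the callback that the client
    // would invoke when it is closed or disconnected.
    public Action StartConsumer()
    {
        int startedIn = _generation;
        return () =>
        {
            // With the guard enabled, a callback that belongs to an
            // already replaced consumer is ignored.
            if (_guardStaleCallbacks && startedIn != _generation) return;
            _isHealthy = false;
        };
    }

    // Models ReStart/Pulse: mark the register healthy and start a new consumer.
    // The old consumer is disposed later, on its own thread.
    public Action Restart()
    {
        _generation++;
        _isHealthy = true;
        return StartConsumer();
    }
}

public static class RaceDemo
{
    // Replays the interleaving from the report in a fixed order
    // and returns the final health flag.
    public static bool Replay(bool guardStaleCallbacks)
    {
        var register = new RegisterModel(guardStaleCallbacks);
        Action oldConsumerClosed = register.StartConsumer();

        register.Restart();     // health check restarts the consumer
        oldConsumerClosed();    // old client is disposed afterwards and fires its callback

        return register.IsHealthy();
    }
}
```

In this model, `RaceDemo.Replay(false)` returns `false`: the late callback from the old consumer overwrites the `true` that the restart has just set, which matches the behaviour described above. `RaceDemo.Replay(true)` returns `true`, because the stale callback is dropped.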
Reproduction
See my repo for a reproduction. Just configure your NATS server and run Worker2:
https://github.com/truecooler/CAPTest/tree/master/Worker2
Expected: the consumer connects without problems and there are no network issues.
Actual: the consumer loses the connection and occasionally prints:
"NATS server connection error."