Description
Hi!
First of all thanks for creating this application!
In CAP 3.0.1, the transport health check can report false-positive results.
My application (a microservice) is configured to work with RabbitMQ and PostgreSQL. If I start the application before the RabbitMQ server is started, the following exception is thrown:
None of the specified endpoints were reachable
RabbitMQ.Client.Exceptions.BrokerUnreachableException: None of the specified endpoints were reachable
 ---> System.AggregateException: One or more errors occurred. (Connection failed)
 ---> RabbitMQ.Client.Exceptions.ConnectFailureException: Connection failed
 ---> System.Net.Internals.SocketExceptionFactory+ExtendedSocketException (10061): No connection could be made because the target machine actively refused it. 127.0.0.1:5672
That is expected, but the application still launches, and after 30 seconds the "Transport connection checking..." and "Transport connection healthy!" logs appear. This is not true: there is no connection to the broker and no attempt to reconnect.
When I then start the RabbitMQ server, nothing changes. The application keeps reporting the connection as healthy, so it never retries. If I restart the microservice, it connects to the RabbitMQ broker successfully and works as expected.
So, steps to reproduce the issue:
- Start microservices without starting RabbitMQ
- AFTER all microservices have started: Start RabbitMQ
- Send a message: publishing succeeds (the database entry has a succeeded status and I can see the queued message in the RabbitMQ management console)
- The message is not received by the consumer, even after waiting a couple of minutes
- Restart the microservice that consumes messages: the message is received
Please see the attached log file for more information: cap_log.txt
As you can see, this issue is similar to #329.
My guess is that in this case the DotNetCore.CAP.RabbitMQ.RabbitMQConsumerClient.Subscribe method throws an exception that originates in the RabbitMQ client and is not handled in the ConsumerRegister class: CAP retries only on BrokerConnectionException, not on any other exception.
I think it would be better if the health-check service retried whenever there is any problem starting the consumer.
I'll probably send a pull request for this, since it's quite a simple modification. In any case, thanks in advance for taking the time to look at this issue.
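To make the suggestion concrete, here is a rough sketch of the behaviour I have in mind. It is written in Python rather than C# only to keep it short, and it is not CAP's actual code: `ConsumerRegister` here is a simplified model based on my guess above, and `BrokerConnectionError`, `FakeBroker`, and the `retry_on_any_error` flag are hypothetical names I made up for the illustration.

```python
class BrokerConnectionError(Exception):
    """Hypothetical stand-in for the one exception type that triggers a retry today."""


class FakeBroker:
    """Hypothetical broker: refuses connections until `up` is set to True."""

    def __init__(self):
        self.up = False
        self.subscribed = False

    def subscribe(self):
        if not self.up:
            # A different exception type than BrokerConnectionError,
            # like the client-level failure in the log above.
            raise ConnectionRefusedError("target machine actively refused it")
        self.subscribed = True


class ConsumerRegister:
    """Simplified model of a consumer register with a periodic health check."""

    def __init__(self, subscribe, retry_on_any_error):
        self._subscribe = subscribe
        self._retry_on_any_error = retry_on_any_error
        self.healthy = True  # what the health check reports

    def start(self):
        try:
            self._subscribe()
            self.healthy = True
        except BrokerConnectionError:
            self.healthy = False  # known failure: health check will retry
        except Exception:
            # Assumed current behaviour: other errors leave the flag untouched.
            # Proposed behaviour: treat any startup failure as unhealthy.
            if self._retry_on_any_error:
                self.healthy = False

    def health_check(self):
        """Retry the subscription if unhealthy, then report the state."""
        if not self.healthy:
            self.start()
        return self.healthy


if __name__ == "__main__":
    for flag in (False, True):
        broker = FakeBroker()
        register = ConsumerRegister(broker.subscribe, retry_on_any_error=flag)
        register.start()          # broker is down at startup
        broker.up = True          # broker comes up later
        register.health_check()   # periodic check runs
        print(f"retry_on_any_error={flag}: subscribed={broker.subscribed}")
```

With `retry_on_any_error=False` the model reproduces what I see: the check reports healthy and the consumer never subscribes, even after the broker is up. With the flag set, the next health check retries and the consumer subscribes.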