Improve handling / retry strategy of failed JMS authentications #71
This is a problem that stems from ActiveMQ's JMS ConnectionFactory, combined with fairly aggressive retrying in Mats, which attempts to acquire a connection again after a failure. In a degenerate situation like this one, where the authentication is erroneous, the retry will obviously not succeed. Since there is a set of threads per stage (default 2x CPUs), times the number of stages per endpoint, times the number of endpoints, rather many threads try in vain to get a connection. While each individual stage processor does have a small sleep to avoid tight spam loops in such situations, this does not help much when a hundred or more threads all want a connection.

The pooling solution should probably hold back the threads in such a situation, instead of letting every thread attempt to actually get a connection. That is, if we have just failed to get a connection, it makes little sense to let the next thread in line immediately try again. Rather, we could hold them all back, thus getting a single retry per retry interval instead of a heap of retries per retry interval.
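The hold-back idea could look roughly like the sketch below. This is not the Mats implementation: the class name `SerializedConnectionGate`, its constructor parameters, and the use of a plain `Callable` in place of the JMS ConnectionFactory are all assumptions made for illustration. All threads queue on one monitor, and after a failure the next thread in line waits out the hold-off before trying, so there is at most one attempt per hold-off interval no matter how many threads are waiting.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical sketch (not part of Mats): serializes connection attempts and
 * makes every waiting thread share a single hold-off after a failure.
 */
public class SerializedConnectionGate<C> {
    private final Callable<C> connectionFactory; // stand-in for the real connection creation
    private final long holdOffNanos;
    private boolean lastAttemptFailed = false;   // guarded by 'this'
    private long lastFailureNanoTime;            // guarded by 'this'

    public SerializedConnectionGate(Callable<C> connectionFactory, long holdOffMillis) {
        this.connectionFactory = connectionFactory;
        this.holdOffNanos = TimeUnit.MILLISECONDS.toNanos(holdOffMillis);
    }

    /** Only one thread at a time gets in here; the rest block on the monitor. */
    public synchronized C acquire() throws Exception {
        if (lastAttemptFailed) {
            // Sleep while holding the monitor, so the threads behind us are held back too.
            long remaining = holdOffNanos - (System.nanoTime() - lastFailureNanoTime);
            if (remaining > 0) {
                TimeUnit.NANOSECONDS.sleep(remaining);
            }
        }
        try {
            C connection = connectionFactory.call();
            lastAttemptFailed = false; // success: later callers proceed without delay
            return connection;
        } catch (Exception e) {
            lastAttemptFailed = true;
            lastFailureNanoTime = System.nanoTime();
            throw e;
        }
    }
}
```

With a hundred stage processor threads calling `acquire()` against a broker that rejects the credentials, the broker would see roughly one attempt per hold-off interval rather than a hundred. The first attempt is never delayed, and a successful attempt clears the hold-off.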
The new and default |
…DeliveryCount props to interceptor and logging Made a new JmsMatsJmsSessionHandler, called 'JmsMatsJmsSessionHandler_PoolingSerial', which serializes all getting of Connections and Sessions. This is used for all testing, and all defaults - but the old 'JmsMatsJmsSessionHandler_Pooling' is still present, effectively unchanged, since this is a rather important as well as thread and sync-heavy little piece of the JmsMatsFactory machinery. Also added DeliveryCount to the interceptor API, and use this in the MatsMetricsLoggingInterceptor.
When we set up Mats against an ActiveMQ message broker that requires authentication, it goes into a fairly aggressive retry mode on authentication failure and ends up making a large number of authentication attempts over time.
As an example, we have a single service making over 4,000 attempts through Mats in 30 minutes, and for some reason this results in over 18,000 login attempts on the ActiveMQ message broker.
Below is the log message from the JmsMatsStageProcessor, along with the stack trace from the JmsMatsJmsException that is thrown when a JMS connection cannot be initialized. It is from one of the attempts in the example above, where we try to create a connection without username and password (anonymous) against an ActiveMQ broker that was configured to not allow anonymous connections.
Log statement:
Stack trace:
(The SecurityException message comes from the message broker)
Would it be an idea to improve the handling of failed authentications in Mats? At least have a more "chill" retry strategy when JmsMatsJmsException has a SecurityException cause?
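To make the suggestion concrete, here is a minimal sketch of what a more "chill" strategy could look like. It is an illustration only, not existing Mats code: the class `AuthAwareBackoff`, both delay constants, and the choice of exponential backoff are assumptions. It walks the cause chain of the thrown exception and, if it finds a SecurityException (or a class whose name ends in `JMSSecurityException`, matched by name so the sketch needs no JMS dependency), doubles the delay for each consecutive failure up to a cap.

```java
/** Hypothetical sketch: pick a longer retry delay when the failure looks like an auth problem. */
public final class AuthAwareBackoff {
    // Both values are assumptions for illustration, not Mats defaults.
    public static final long NORMAL_DELAY_MILLIS = 1_000;
    public static final long AUTH_FAILURE_MAX_DELAY_MILLIS = 300_000;

    private AuthAwareBackoff() {
    }

    /** Returns true if any exception in the cause chain indicates a security failure. */
    public static boolean isAuthenticationFailure(Throwable t) {
        // Depth is bounded in case of a cyclic cause chain.
        for (int depth = 0; t != null && depth < 20; depth++, t = t.getCause()) {
            if (t instanceof SecurityException
                    || t.getClass().getName().endsWith("JMSSecurityException")) {
                return true;
            }
        }
        return false;
    }

    /** Delay before the next attempt; consecutiveFailures is 1 for the first failure. */
    public static long delayMillis(Throwable failure, int consecutiveFailures) {
        if (!isAuthenticationFailure(failure)) {
            return NORMAL_DELAY_MILLIS;
        }
        // Exponential backoff: 1s, 2s, 4s, ... capped at the maximum.
        int shift = Math.min(Math.max(consecutiveFailures - 1, 0), 20);
        return Math.min(NORMAL_DELAY_MILLIS << shift, AUTH_FAILURE_MAX_DELAY_MILLIS);
    }
}
```

Other failures keep the normal delay, so a briefly unavailable broker is still reconnected to quickly, while wrong or missing credentials quickly settle at one attempt per cap interval.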