We are using Spring JMS with DefaultMessageListenerContainer using AQ JMS with very high volumes and we have spent quite some time on performance tuning. Finally we have found our silver (or at least copper) bullet: client-side waits.
The problem is how AQ handles receiveTimeout. If there are 100 threads waiting for a new message on a queue they consume virtually no resources as long as there is nothing to read. However, if a single message is posted all 100 threads try to get it, issuing a SQL SELECT statement. One will succeed, the others fail but still consume CPU in the database. As messages are posted this quickly adds up and the database is bogged down by pointless selects. When there is a backlog this is no issue, but when consumers keep up with producers it kills the database and eats resources that would be needed elsewhere.
We subclassed DefaultMessageListenerContainer and added a clientPollingDelay parameter. A receive call with a null message would sleep for clientPollingDelay milliseconds before proceeding. That improved our overall performance immensely. Our current implementation (subclassing and overriding a protected method) is not ideal, as it puts the delay inside a transaction. I will submit a pull request with a better option. The main point is to move the sleep from the database (JMS provider) to the application server (JMS client) as that works much better with AQ.
This is of major importance for us, but as we have a solution in place I have classified the issue as a minor one.
#18774 DefaultMessageListenerContainer doesn't shutdown gracefully if long recovery interval is set
#18786 Support non-blocking receiveTimeout in AbstractPollingMessageListenerContainer
So for the time being, you're overriding receiveAndExecute accordingly? Looks like this is even doable with the delay outside the transaction, in contrast to when overriding doReceiveAndExecute?
I'd generally like to stay away from Thread.sleep() calls in the core listener container, since #18774 just removed the last one we had there... the problem being that sleep calls do not react to notifyAll signals on shutdown. When kept short enough, such sleeps may be roughly equivalent to blocking receive calls, but receive calls can at least react to resource interruptions underneath, whereas a sleep is just generally blocking. For a short delay, a bounded wait call on the lifecycle monitor might be a worthwhile alternative to a sleep call. For a non-trivial sleep period, I'd even rather let the current task end and reschedule the invoker with a delay, returning its thread to the pool for that period. Of course, the actual benefits/effects depend a lot on the thread pool setup in any case, in interaction with the resource driver.
All in all, there is nothing wrong with your solution in your scenario. I'm just wondering whether we should turn that particular approach into a first-class feature in the core listener container implementation. There are so many variants for JMS listener setup already... DefaultMessageListenerContainer's extensibility might be the actual key feature here, not adding even more configuration properties.
A long client-side delay would be bad, it is better to ramp down if there is nothing to read for extended periods. We are using 100 ms in our setup. That works well for our workload. A bounded sleep for up to 500 ms (or whatever) on the lifecycle monitor is certainly a reasonable alternative. If you wish I can modify the pull request in that direction?
If not I can buy your reasoning. I think the messaging classes are very extensible, I had no problems fixing this with a subclass. One advantage of getting it into the core framework is that it makes it visible. We spent quite some time before we found this silver bullet and it really makes a huge difference for our performance. If we had seen the option when we looked at the API we would have tested it much earlier.
How about a subclass that is included out of the box? That would still make it visible in the JavaDocs, but it would not affect DefaultMessageListenerContainer?
The related PR has been closed and this didn't proposal didn't get a lot of traction so I suggest to close this one as well. If there is something that prevents a sub-class to do this we can adapt the core API to make that easier.