#2020 Add enableExpiry option to PolicyEntry to skip O(n) expiry scan#2019
#2020 Add enableExpiry option to PolicyEntry to skip O(n) expiry scan#2019mwashburn2 wants to merge 6 commits into
Conversation
…picSubscription
When a slow-consumer's pending-message buffer exceeds the high-water mark
(default: 1000 messages), TopicSubscription.add() calls removeExpiredMessages()
on every single message add. That method iterates every pending message and
calls isExpired(), which is an O(n) scan over the full buffer.
When messages carry no TTL (isExpired() always returns false), this scan
provides no benefit -- it iterates the entire buffer on every add with zero
messages removed. With a large pending limit (e.g. 20,000), this adds up to
millions of no-op iterations per second on busy topics.
This commit adds an enableExpiry boolean (default true) to PolicyEntry.
Setting it to false inserts a single guard in TopicSubscription.add():
if (enableExpiry && !matched.isEmpty() && matched.size() > max) {
removeExpiredMessages();
}
The flag lives on PolicyEntry -- the existing home for all subscription
policy configuration -- rather than on individual PendingMessageLimitStrategy
implementations, keeping the strategies as pure limit calculators.
Configuration in activemq.xml:
<policyEntry topic=">" enableExpiry="false">
<pendingMessageLimitStrategy>
<constantPendingMessageLimitStrategy limit="20000"/>
</pendingMessageLimitStrategy>
</policyEntry>
Also adds:
- TopicSubscriptionEnableExpiryTest: 11 correctness/propagation/integration tests
- TopicSubscriptionEnableExpiryThroughputTest: throughput comparison test
|
see also apache/activemq-website#178 |
|
I think a simpler approach here that avoids turning off the expiration check might be to just add a short circuit to the expiration loop. Right now the loop runs through all the messages which is the performance problem (as described) but I think we could add a short circuit that just aborts the loop as soon as the first message is seen that isn't expired. This approach should work in the vast majority of use cases because most of the time messages on the same topic are going to be very similar (ie the same producer is setting the same TTL) so if the current message is not expired it is pretty unlikely any of the other messages are expired so there is no reason to check them. This of course also works well for the use case of never setting a TTL (always aborts on the first message) while still working if there's a bunch of expired messages in a row. The downside is this would miss expiration on use cases where data is mixed (some messages have TTL and some don't or different TTL times) but that is not that common and there is expiration handling in other places in the broker as well such as on dispatch so I don't think this is a big downside. @mattrpav - What do you think? |
|
@cshannon Interesting! I like how your approach sort of 'auto adjusts' for most workflows. I think ultimately a ExpiryStrategy/ExpiryPolicy interface would be best to encapsulate different desired behaviors across policy entries and destination types. For workflows that specifically do not to do any expiry checking, an unequivocal flag (like @mwashburn2 has proposed) to disable entirely is probably useful at the very least. Also, for hot-spot performance testing with Virtual Threads we'll want to be able to toggle choke points that enter some sort of synchronization or 0(n) processing (think: parameterized unit tests to toggle cache, producerAudit, expiry, etc). |
When a slow-consumer's pending-message buffer exceeds the high-water mark TopicSubscription.add() calls removeExpiredMessages() on every single message add. That method iterates every pending message and calls isExpired(), which is an O(n) scan over the full buffer. When messages carry no TTL (isExpired() always returns false), this scan provides no benefit -- it iterates the entire buffer on every add with zero messages removed. With a large pending limit (e.g. 20,000), this adds up to millions of no-op iterations per second on busy topics. This commit adds an enableExpiry boolean (default true) to PolicyEntry. Setting it to false inserts a single guard in TopicSubscription.add():
if (enableExpiry && !matched.isEmpty() && matched.size() > max) {
removeExpiredMessages();
}
Configuration in activemq.xml:
Also adds: