Optimize batch index ACK performance #988

Merged

Conversation

BewareMyPower

Motivation

Currently, when `EnableBatchIndexAck` is true, the ACK performance is very poor. There are two main reasons:

  1. Acknowledgment by list is not supported. This means that even if N MessageIDs are grouped, N ACK requests are still sent.
  2. The implementation of the ACK grouping tracker is wrong. Given a batch of N messages, when batch index ACK is enabled, each MessageID is cached. However, after all N MessageIDs have arrived, the current implementation never clears them.

Modifications

  • Add a `func(id []*pb.MessageIdData)` callback to the ACK grouping tracker. When flushing individual ACKs, construct the slice and wrap it into a single `CommandAck` directly.
  • Refactor the ACK grouping tracker implementation around these two data structures (see the sketch after this list):
    • `pendingAcks`: caches the non-batched MessageIDs.
    • `pendingBatchAcks`: caches the batched MessageIDs; once all messages of a batch have been added, remove them from it and add the equivalent non-batched MessageID to `pendingAcks`.
  • Add `TestBatchIndexAckAllMessages` to verify the new behavior.
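
To make the two structures concrete, here is a minimal sketch of the refactored tracker, assuming entries are keyed by ledger id and entry id; all names and types are illustrative, not the actual fields in pulsar-client-go:

```go
package tracker

import "sync"

// position identifies an entry by ledger id and entry id (illustrative key type).
type position struct {
	ledgerID uint64
	entryID  uint64
}

// tracker is a simplified stand-in for the ACK grouping tracker.
type tracker struct {
	mu sync.Mutex
	// Non-batched MessageIDs waiting to be flushed as one ACK list.
	pendingAcks map[position]struct{}
	// For each batched entry, the batch indexes seen so far.
	pendingBatchAcks map[position]map[int32]struct{}
}

func newTracker() *tracker {
	return &tracker{
		pendingAcks:      make(map[position]struct{}),
		pendingBatchAcks: make(map[position]map[int32]struct{}),
	}
}

// addBatched records one batch index. Once every index of a batch has been
// added, the entry is removed from pendingBatchAcks and moved into
// pendingAcks as a plain non-batched MessageID, which is what clears the
// cached per-message state that the old implementation kept around.
func (t *tracker) addBatched(pos position, batchIdx, batchSize int32) {
	t.mu.Lock()
	defer t.mu.Unlock()
	seen, ok := t.pendingBatchAcks[pos]
	if !ok {
		seen = make(map[int32]struct{})
		t.pendingBatchAcks[pos] = seen
	}
	seen[batchIdx] = struct{}{}
	if int32(len(seen)) == batchSize {
		delete(t.pendingBatchAcks, pos)
		t.pendingAcks[pos] = struct{}{}
	}
}
```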

After this change, the ACK order can no longer be guaranteed, so the acknowledged MessageIDs are sorted in `ack_grouping_tracker_test.go`.
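
For reference, this is roughly what "acknowledgment by list" looks like at the protocol level: the flushed slice travels in a single `CommandAck` instead of N separate requests. The proto field names follow Pulsar's `PulsarApi.proto`, but treat the snippet as a sketch rather than the exact code in this patch. Since the pending positions come out of a map, their order is unspecified, which is why the test has to sort the acknowledged MessageIDs.

```go
// Assumed imports (paths may differ):
//   pb    "github.com/apache/pulsar-client-go/pulsar/internal/pulsar_proto"
//   proto "github.com/golang/protobuf/proto"

// wrapIntoCommandAck wraps an already-built slice of MessageIdData into one
// CommandAck for individual acknowledgment.
func wrapIntoCommandAck(consumerID uint64, ids []*pb.MessageIdData) *pb.CommandAck {
	return &pb.CommandAck{
		ConsumerId: proto.Uint64(consumerID),
		AckType:    pb.CommandAck_Individual.Enum(),
		MessageId:  ids, // the whole list is sent in a single request
	}
}
```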


BewareMyPower commented Mar 9, 2023

Performance test setup

Run a Pulsar 2.11.0 standalone locally with `acknowledgmentAtBatchIndexLevelEnabled=true`.

Build the perf binary:

cd perf
go build

Run the consumer in terminal 1:

./perf consume --profile \
    -s sub \
    --enable-batch-index-ack \
    my-topic

Run the producer in a second terminal; the target throughput is about 20 MB/s (200,000 msg/s × 105 bytes ≈ 21 MB/s):

./perf produce \
  --batching-num-messages=10 \
  --size 105 \
  --rate 200000 \
  my-topic

Test result

Before this patch

INFO[13:31:17.584] Stats - Publish rate: 150038.6 msg/s -  120.2 Mbps -
                                Latency ms: 50%   3.1 -95%  13.0 - 99%  48.9 - 99.9%  48.9 - max   48.9
INFO[13:31:27.581] Stats - Publish rate: 179909.8 msg/s -  144.1 Mbps -
                                Latency ms: 50%   2.4 -95%   8.7 - 99%  26.2 - 99.9%  26.2 - max   26.2
INFO[13:31:37.581] Stats - Publish rate: 188957.1 msg/s -  151.4 Mbps -
                                Latency ms: 50%   2.3 -95%   7.8 - 99%  15.9 - 99.9%  15.9 - max   15.9
INFO[13:31:11.592] Stats - Consume rate: 22956.0 msg/s -   18.4 Mbps
INFO[13:31:21.592] Stats - Consume rate: 46805.3 msg/s -   37.5 Mbps
INFO[13:31:31.593] Stats - Consume rate: 47494.7 msg/s -   38.0 Mbps
INFO[13:31:41.594] Stats - Consume rate: 48100.3 msg/s -   38.5 Mbps
INFO[13:31:51.592] Stats - Consume rate: 59705.8 msg/s -   47.8 Mbps
INFO[13:32:01.597] Stats - Consume rate: 61694.0 msg/s -   49.4 Mbps
INFO[13:32:11.592] Stats - Consume rate: 58100.2 msg/s -   46.5 Mbps
INFO[13:32:21.592] Stats - Consume rate: 60818.2 msg/s -   48.7 Mbps
INFO[13:32:31.596] Stats - Consume rate: 60681.5 msg/s -   48.6 Mbps

The consumer cannot catch up with the producer even though the produce rate is only about 20 MB/s, so consumption falls back to catch-up reads and the broker reads messages directly from BookKeeper.

After this patch

INFO[16:19:40.725] Stats - Publish rate: 152244.3 msg/s -  122.0 Mbps -
                                Latency ms: 50%   3.4 -95%  11.8 - 99%  63.5 - 99.9%  63.5 - max   63.5
INFO[16:19:50.725] Stats - Publish rate: 176077.8 msg/s -  141.1 Mbps -
                                Latency ms: 50%   2.6 -95%   6.3 - 99%  19.3 - 99.9%  19.3 - max   19.3
INFO[16:20:00.725] Stats - Publish rate: 134651.8 msg/s -  107.9 Mbps -
                                Latency ms: 50%   3.2 -95%   9.7 - 99%  26.6 - 99.9%  26.6 - max   26.6
INFO[16:20:10.725] Stats - Publish rate: 147599.8 msg/s -  118.2 Mbps -
                                Latency ms: 50%   2.9 -95%   6.7 - 99%  15.8 - 99.9%  15.8 - max   15.8
INFO[16:20:20.725] Stats - Publish rate: 146333.1 msg/s -  117.2 Mbps -
                                Latency ms: 50%   2.9 -95%   7.2 - 99%  22.8 - 99.9%  22.8 - max   22.8
INFO[16:20:30.725] Stats - Publish rate: 146170.7 msg/s -  117.1 Mbps -
                                Latency ms: 50%   2.9 -95%   7.3 - 99%  19.3 - 99.9%  19.3 - max   19.3
INFO[16:20:40.726] Stats - Publish rate: 145344.2 msg/s -  116.4 Mbps -
                                Latency ms: 50%   2.9 -95%   6.9 - 99%  16.8 - 99.9%  16.8 - max   16.8
INFO[16:20:50.724] Stats - Publish rate: 143197.0 msg/s -  114.7 Mbps -
                                Latency ms: 50%   3.0 -95%   7.7 - 99%  20.7 - 99.9%  20.7 - max   20.7
INFO[16:21:00.724] Stats - Publish rate: 148160.2 msg/s -  118.7 Mbps -
                                Latency ms: 50%   2.9 -95%   7.4 - 99%  17.6 - 99.9%  17.6 - max   17.6
INFO[16:19:38.663] Stats - Consume rate: 114350.6 msg/s -   91.6 Mbps
INFO[16:19:48.663] Stats - Consume rate: 183788.1 msg/s -  147.2 Mbps
INFO[16:19:58.663] Stats - Consume rate: 139771.1 msg/s -  112.0 Mbps
INFO[16:20:08.663] Stats - Consume rate: 141662.6 msg/s -  113.5 Mbps
INFO[16:20:18.663] Stats - Consume rate: 146116.1 msg/s -  117.1 Mbps
INFO[16:20:28.663] Stats - Consume rate: 144571.8 msg/s -  115.8 Mbps
INFO[16:20:38.662] Stats - Consume rate: 144215.5 msg/s -  115.5 Mbps
INFO[16:20:48.662] Stats - Consume rate: 143453.2 msg/s -  114.9 Mbps
INFO[16:20:58.663] Stats - Consume rate: 147724.5 msg/s -  118.3 Mbps
INFO[16:21:08.663] Stats - Consume rate: 148170.2 msg/s -  118.7 Mbps

Flame Graph

Before this patch:

[Two flame graph screenshots]

We can see that `internalSendRequest` takes about 40% of the total time, which in turn causes the `runEventLoop` method to take 8.87% of the total.

After this patch:

[Flame graph screenshot]

We can see that `internalSendRequest` now takes only 1.26% of the total. The bottleneck has shifted to the performance of the ACK grouping tracker itself.

Conclusion

From my observation, the backlog sometimes still grows slowly, and once consumption falls back to catch-up reads, performance can degrade quickly in my local environment. Still, this is a big improvement, and it can be pushed further by adjusting the ACK grouping options.
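
As a follow-up for anyone experimenting with that tuning, the sketch below shows where the ACK grouping knobs would be set when creating a consumer. The `EnableBatchIndexAcknowledgment` and `AckGroupingOptions` fields (with `MaxSize`/`MaxTime`) are written from memory of the Go client's `ConsumerOptions` and may not match the current API exactly, so check the client source before copying it.

```go
package main

import (
	"log"
	"time"

	"github.com/apache/pulsar-client-go/pulsar"
)

func main() {
	client, err := pulsar.NewClient(pulsar.ClientOptions{URL: "pulsar://localhost:6650"})
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Larger MaxSize / MaxTime values mean fewer, bigger ACK requests,
	// trading acknowledgment latency for throughput.
	consumer, err := client.Subscribe(pulsar.ConsumerOptions{
		Topic:                          "my-topic",
		SubscriptionName:               "sub",
		EnableBatchIndexAcknowledgment: true,
		AckGroupingOptions: &pulsar.AckGroupingOptions{
			MaxSize: 1000,                   // flush after this many pending ACKs
			MaxTime: 100 * time.Millisecond, // or after this much time has passed
		},
	})
	if err != nil {
		log.Fatal(err)
	}
	defer consumer.Close()
}
```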

@BewareMyPower BewareMyPower marked this pull request as draft March 9, 2023 06:08
### Motivation

Currently, when `EnableBatchIndexAck` is true, the ACK performance is
very poor. There are two main reasons:
1. Acknowledgment by list is not supported. This means that even if N
   MessageIDs are grouped, there are still N ACK requests to send.
2. The implementation of the ACK grouping tracker is wrong. Given a batch
   of N messages, when batch index ACK is enabled, each MessageID is
   cached. However, after all N MessageIDs have arrived, the current
   implementation does not clear them.

### Modifications
- Add a `func(id []*pb.MessageIdData)` callback to the ACK grouping tracker.
  When flushing individual ACKs, construct the slice and wrap it into a
  single `CommandAck` directly.
- Refactor the implementation of the ACK grouping tracker:
  - Do not save each MessageID instance; instead, use the ledger id and
    the entry id as the key of `pendingAcks`.
  - Release the mutex before calling the ACK functions.
- Add `TestTrackerPendingAcks` to verify the list of MessageIDs to ACK.

After this change, the ACK order can no longer be guaranteed, so the
acknowledged MessageIDs are sorted in `ack_grouping_tracker_test.go`.
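
A small sketch of the "release the mutex before calling the ACK functions" point above, reusing the illustrative tracker and assumed imports from the earlier sketches: the pending positions are drained while the lock is held, but the callback that actually sends the ACK runs outside the critical section, so a slow network send cannot block goroutines that are adding new MessageIDs.

```go
// flush drains pendingAcks under the lock, then invokes the ACK callback
// (the func([]*pb.MessageIdData) added in this patch) without holding it.
func (t *tracker) flush(ackFn func([]*pb.MessageIdData)) {
	t.mu.Lock()
	ids := make([]*pb.MessageIdData, 0, len(t.pendingAcks))
	for pos := range t.pendingAcks {
		ids = append(ids, &pb.MessageIdData{
			LedgerId: proto.Uint64(pos.ledgerID),
			EntryId:  proto.Uint64(pos.entryID),
		})
		delete(t.pendingAcks, pos)
	}
	t.mu.Unlock() // release before the potentially slow network call

	if len(ids) > 0 {
		ackFn(ids) // e.g. wraps the slice into a single CommandAck and sends it
	}
}
```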
@BewareMyPower BewareMyPower force-pushed the bewaremypower/batch-index-ack-perf branch from 0e77afd to a8289cf on March 9, 2023 08:32
@BewareMyPower BewareMyPower marked this pull request as ready for review March 9, 2023 08:32
@BewareMyPower BewareMyPower merged commit 352c463 into apache:master Mar 10, 2023
@BewareMyPower BewareMyPower deleted the bewaremypower/batch-index-ack-perf branch March 10, 2023 01:57