[pulsar-broker] capture managed-ledger add-latency #4419
Conversation
rerun integration tests
```diff
@@ -201,7 +199,7 @@ public void closeComplete(int rc, LedgerHandle lh, Object ctx) {
     }

     private void updateLatency() {
-        ml.mbean.addAddEntryLatencySample(System.nanoTime() - startTime, TimeUnit.NANOSECONDS);
+        ml.mbean.addAddEntryLatencySample(System.nanoTime() - lastInitTime, TimeUnit.NANOSECONDS);
```
I think both latencies could be reported.
Right now, I think the addEntry is correct in that it's the time it takes to persist an entry, including the time of ledger append and the eventual queuing, if we don't have a ledger ready.
It would be good to add a new metric to account just for the Ledger.addEntry() operation, though we shouldn't remove the current one since, in the end, it's the total time that matters to users.
Sure, added new metrics for ledger::addEntry.
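As an editorial sketch of what reporting both samples could look like (the member names `createdTime` and `addLedgerAddEntryLatencySample` are assumptions here, not necessarily the actual OpAddEntry/MBean API):

```java
// Sketch only: records both the total add-entry latency (including queue
// wait) and the pure ledger-append latency as separate samples.
// `createdTime` (set at enqueue), `lastInitTime` (set when the BK request is
// issued) and `addLedgerAddEntryLatencySample` are assumed names.
private void updateLatency() {
    long now = System.nanoTime();
    // existing metric: enqueue -> persisted, the total time users observe
    ml.mbean.addAddEntryLatencySample(now - createdTime, TimeUnit.NANOSECONDS);
    // new metric: BK request issued -> persisted, the bookie round-trip only
    ml.mbean.addLedgerAddEntryLatencySample(now - lastInitTime, TimeUnit.NANOSECONDS);
}
```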
@rdhabalia do we need this issue for 2.4.0, or can we move it to 2.5.0?
ping @rdhabalia
Moving to 2.5 for now; I will address the change.
Force-pushed from 64b314d to 5a8541a
Addressed all changes. Can you please review this PR?
rerun java8 tests
Force-pushed from 789743a to 0a2c9a0
@rdhabalia Looks like this PR is related to #6705; can you help confirm this?
@rdhabalia I moved this PR to 2.7.0. It looks like #6705 has added these metrics. Feel free to move it back if needed.
@codelipenghui This adds additional metrics to capture the managed-ledger add latency.
/pulsarbot run-failure-checks
@rdhabalia It looks like the failed CI can't be triggered again; I'm not sure what the problem is. I also tried merging apache/master into this branch, but still nothing happens on this branch. Interesting!
add additional metrics for ledger::addEntry
@codelipenghui I have rebased the PR. I think this should fix it.
Hi @momo-jun, can you follow up on the docs? Thanks!
Motivation
With #4290, the broker now has the capability to capture end-to-end publish latency from the moment a publish request arrives until it completes. We can now also capture BookKeeper-client latency to get an exact breakdown of broker-to-bookie latency. The broker already captures ml-add latency, but that timer starts as soon as the add operation is inserted into the queue, so it includes queue waiting time and does not reflect the true broker-to-bookie latency.
Modification
To capture broker-to-bookie latency, start the ml-add-op latency timer when the BookKeeper add-entry request is initiated.
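A rough sketch of the timing change, using assumed names that mirror the diff above (`lastInitTime`, `initiate()`, and the callback wiring are illustrative, not necessarily the exact OpAddEntry code):

```java
// Illustrative only: reset the latency timer when the BookKeeper add-entry
// request is actually issued, so queue wait time (e.g. while a new ledger is
// being created) is excluded from the new ledger-add-entry metric.
public void initiate() {
    lastInitTime = System.nanoTime();       // timer starts here, not at enqueue
    ledger.asyncAddEntry(data, this, ctx);  // hand off to the BookKeeper client
}

@Override
public void addComplete(int rc, LedgerHandle lh, long entryId, Object ctx) {
    // on completion, report only the BookKeeper round-trip for this attempt
    ml.mbean.addLedgerAddEntryLatencySample(
            System.nanoTime() - lastInitTime, TimeUnit.NANOSECONDS);
}
```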
Result
With this change, the broker can report bookie persistence latency. It adds a new set of metrics, `brk_ml_LedgerAddEntryLatencyBuckets`:

```
"brk_ml_AddEntryErrors": 0.0,
"brk_ml_AddEntryLatencyBuckets_0.0_0.5": 0.0,
"brk_ml_AddEntryLatencyBuckets_0.5_1.0": 0.0,
"brk_ml_AddEntryLatencyBuckets_1.0_5.0": 0.0,
"brk_ml_AddEntryLatencyBuckets_10.0_20.0": 0.0,
"brk_ml_AddEntryLatencyBuckets_100.0_200.0": 0.0,
"brk_ml_AddEntryLatencyBuckets_20.0_50.0": 0.0,
"brk_ml_AddEntryLatencyBuckets_200.0_1000.0": 0.0,
"brk_ml_AddEntryLatencyBuckets_5.0_10.0": 0.0,
"brk_ml_AddEntryLatencyBuckets_50.0_100.0": 0.0,
"brk_ml_AddEntryLatencyBuckets_OVERFLOW": 0.0,
"brk_ml_AddEntryMessagesRate": 0.0,
"brk_ml_AddEntrySucceed": 0.0,
"brk_ml_EntrySizeBuckets_0.0_128.0": 0.0,
"brk_ml_EntrySizeBuckets_1024.0_2084.0": 0.0,
"brk_ml_EntrySizeBuckets_102400.0_1232896.0": 0.0,
"brk_ml_EntrySizeBuckets_128.0_512.0": 0.0,
"brk_ml_EntrySizeBuckets_16384.0_102400.0": 0.0,
"brk_ml_EntrySizeBuckets_2084.0_4096.0": 0.0,
"brk_ml_EntrySizeBuckets_4096.0_16384.0": 0.0,
"brk_ml_EntrySizeBuckets_512.0_1024.0": 0.0,
"brk_ml_EntrySizeBuckets_OVERFLOW": 0.0,
"brk_ml_LedgerAddEntryLatencyBuckets_0.0_0.5": 0.0,
"brk_ml_LedgerAddEntryLatencyBuckets_0.5_1.0": 0.0,
"brk_ml_LedgerAddEntryLatencyBuckets_1.0_5.0": 0.0,
"brk_ml_LedgerAddEntryLatencyBuckets_10.0_20.0": 0.0,
"brk_ml_LedgerAddEntryLatencyBuckets_100.0_200.0": 0.0,
"brk_ml_LedgerAddEntryLatencyBuckets_20.0_50.0": 0.0,
"brk_ml_LedgerAddEntryLatencyBuckets_200.0_1000.0": 0.0,
"brk_ml_LedgerAddEntryLatencyBuckets_5.0_10.0": 0.0,
"brk_ml_LedgerAddEntryLatencyBuckets_50.0_100.0": 0.0,
"brk_ml_LedgerAddEntryLatencyBuckets_OVERFLOW": 0.0,
```
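As a self-contained illustration (not the broker's actual implementation), the bucket scheme implied by these metric names places each sample, in milliseconds, into the first bucket whose upper bound covers it, with anything above 1000 ms counted as OVERFLOW:

```java
// Standalone sketch of the bucket scheme implied by the metric names above;
// bounds in milliseconds: 0.5, 1, 5, 10, 20, 50, 100, 200, 1000, then OVERFLOW.
public class LatencyBucketsSketch {
    private static final double[] BOUNDS_MS = {0.5, 1, 5, 10, 20, 50, 100, 200, 1000};
    private final long[] counts = new long[BOUNDS_MS.length + 1]; // last slot = OVERFLOW

    public void record(long latencyNanos) {
        double ms = latencyNanos / 1_000_000.0;
        for (int i = 0; i < BOUNDS_MS.length; i++) {
            if (ms <= BOUNDS_MS[i]) {
                counts[i]++;
                return;
            }
        }
        counts[BOUNDS_MS.length]++; // > 1000 ms -> OVERFLOW
    }
}
```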