
[pulsar-broker] capture managed-ledger add-latency #4419

Merged
merged 1 commit into apache:master from ml_stat on Nov 4, 2020

Conversation

Contributor

@rdhabalia rdhabalia commented May 31, 2019

Motivation

With #4290, the broker can capture end-to-end publish latency from the moment a publish request arrives until it completes. We can now also capture the bk-client latency to get an exact breakdown of broker-to-bookie latency. The broker already captures ml-add latency, but that timer starts as soon as the add op is inserted into the queue, so the measurement includes queue waiting time and does not reflect the true broker-to-bookie latency.

Modification

To capture broker-to-bookie latency, start the ml-add-op latency timer when the bk add-entry request is initiated.

Result

With this change, the broker can report bookie-persistence latency.

It adds a new set of metrics, brk_ml_LedgerAddEntryLatencyBuckets:

 "brk_ml_AddEntryErrors": 0.0,
        "brk_ml_AddEntryLatencyBuckets_0.0_0.5": 0.0,
        "brk_ml_AddEntryLatencyBuckets_0.5_1.0": 0.0,
        "brk_ml_AddEntryLatencyBuckets_1.0_5.0": 0.0,
        "brk_ml_AddEntryLatencyBuckets_10.0_20.0": 0.0,
        "brk_ml_AddEntryLatencyBuckets_100.0_200.0": 0.0,
        "brk_ml_AddEntryLatencyBuckets_20.0_50.0": 0.0,
        "brk_ml_AddEntryLatencyBuckets_200.0_1000.0": 0.0,
        "brk_ml_AddEntryLatencyBuckets_5.0_10.0": 0.0,
        "brk_ml_AddEntryLatencyBuckets_50.0_100.0": 0.0,
        "brk_ml_AddEntryLatencyBuckets_OVERFLOW": 0.0,
        "brk_ml_AddEntryMessagesRate": 0.0,
        "brk_ml_AddEntrySucceed": 0.0,
        "brk_ml_EntrySizeBuckets_0.0_128.0": 0.0,
        "brk_ml_EntrySizeBuckets_1024.0_2084.0": 0.0,
        "brk_ml_EntrySizeBuckets_102400.0_1232896.0": 0.0,
        "brk_ml_EntrySizeBuckets_128.0_512.0": 0.0,
        "brk_ml_EntrySizeBuckets_16384.0_102400.0": 0.0,
        "brk_ml_EntrySizeBuckets_2084.0_4096.0": 0.0,
        "brk_ml_EntrySizeBuckets_4096.0_16384.0": 0.0,
        "brk_ml_EntrySizeBuckets_512.0_1024.0": 0.0,
        "brk_ml_EntrySizeBuckets_OVERFLOW": 0.0,
        "brk_ml_LedgerAddEntryLatencyBuckets_0.0_0.5": 0.0,
        "brk_ml_LedgerAddEntryLatencyBuckets_0.5_1.0": 0.0,
        "brk_ml_LedgerAddEntryLatencyBuckets_1.0_5.0": 0.0,
        "brk_ml_LedgerAddEntryLatencyBuckets_10.0_20.0": 0.0,
        "brk_ml_LedgerAddEntryLatencyBuckets_100.0_200.0": 0.0,
        "brk_ml_LedgerAddEntryLatencyBuckets_20.0_50.0": 0.0,
        "brk_ml_LedgerAddEntryLatencyBuckets_200.0_1000.0": 0.0,
        "brk_ml_LedgerAddEntryLatencyBuckets_5.0_10.0": 0.0,
        "brk_ml_LedgerAddEntryLatencyBuckets_50.0_100.0": 0.0,
        "brk_ml_LedgerAddEntryLatencyBuckets_OVERFLOW": 0.0,

@rdhabalia rdhabalia added this to the 2.4.0 milestone May 31, 2019
@rdhabalia rdhabalia self-assigned this May 31, 2019
@rdhabalia
Contributor Author

rerun integration tests

```diff
@@ -201,7 +199,7 @@ public void closeComplete(int rc, LedgerHandle lh, Object ctx) {
     }

     private void updateLatency() {
-        ml.mbean.addAddEntryLatencySample(System.nanoTime() - startTime, TimeUnit.NANOSECONDS);
+        ml.mbean.addAddEntryLatencySample(System.nanoTime() - lastInitTime, TimeUnit.NANOSECONDS);
```
Contributor
I think both latencies could be reported. Right now the addEntry metric is correct in that it measures the time it takes to persist an entry, including the ledger append and any eventual queuing when a ledger isn't ready. It would be good to add a new metric for just the Ledger.addEntry() operation, though we shouldn't remove the current one since, in the end, it's the total time that matters to users.

Contributor Author
Sure, added a new metric for ledger::addEntry.
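With both histograms exported, the gap between them gives a rough estimate of the time an add op spends queued before the bookie write begins. A hypothetical helper, not part of this PR (the method name and values are illustrative):

```java
// Hypothetical helper: given mean latencies (in ms) read from the broker
// stats, the difference between the total add-entry latency and the
// ledger-only add-entry latency approximates the queue wait time.
public class QueueWaitEstimate {
    static double queueWaitMs(double addEntryLatencyMs, double ledgerAddEntryLatencyMs) {
        // Clamp at zero: sampling noise can make the ledger-only mean larger.
        return Math.max(0.0, addEntryLatencyMs - ledgerAddEntryLatencyMs);
    }

    public static void main(String[] args) {
        System.out.println(queueWaitMs(4.5, 3.0)); // prints 1.5
    }
}
```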

@codelipenghui
Contributor

@rdhabalia do we need this for 2.4.0, or can we move it to 2.5.0?

@sijie
Member

sijie commented Jun 14, 2019

ping @rdhabalia

@rdhabalia
Contributor Author

Moving to 2.5 for now; I will address the changes.

@rdhabalia rdhabalia modified the milestones: 2.4.0, 2.5.0 Jun 14, 2019
@sijie sijie modified the milestones: 2.5.0, 2.6.0 Nov 25, 2019
@rdhabalia rdhabalia force-pushed the ml_stat branch 2 times, most recently from 64b314d to 5a8541a Compare January 30, 2020 01:19
@rdhabalia
Contributor Author

rdhabalia commented Jan 30, 2020

Addressed all changes. Can you please review this PR?

@rdhabalia
Contributor Author

rerun java8 tests
rerun integration tests

@rdhabalia rdhabalia force-pushed the ml_stat branch 2 times, most recently from 789743a to 0a2c9a0 Compare April 15, 2020 02:23
@codelipenghui
Contributor

@rdhabalia It looks like this PR is related to #6705; can you help confirm?

@codelipenghui
Contributor

@rdhabalia I moved this PR to 2.7.0. It looks like #6705 has already added these metrics. Feel free to move it back if needed.

@codelipenghui codelipenghui modified the milestones: 2.6.0, 2.7.0 Jun 4, 2020
@rdhabalia
Contributor Author

@codelipenghui these are additional metrics to capture ml-add latency.

@rdhabalia
Contributor Author

/pulsarbot run-failure-checks

2 similar comments
@rdhabalia
Contributor Author

/pulsarbot run-failure-checks

@rdhabalia
Contributor Author

/pulsarbot run-failure-checks

@rdhabalia
Contributor Author

/pulsarbot run-failure-checks

@codelipenghui
Contributor

/pulsarbot run-failure-checks

@codelipenghui
Contributor

@rdhabalia It looks like the failed CI can't be triggered again; I'm not sure what the problem is. I also tried merging apache/master into this branch, but still nothing happens on this branch. Interesting!

add additional metrics for ledger::addEntry
@rdhabalia
Contributor Author

@codelipenghui I have rebased the PR. I think this should fix it.

@codelipenghui codelipenghui added the doc-required Your PR changes impact docs and you will update later. label Nov 4, 2020
@codelipenghui codelipenghui merged commit 04b6468 into apache:master Nov 4, 2020
flowchartsman pushed a commit to flowchartsman/pulsar that referenced this pull request Nov 17, 2020
@Anonymitaet
Member

Hi @momo-jun, can you follow up on the docs? Thanks.

@Anonymitaet Anonymitaet added doc-complete Your PR changes impact docs and the related docs have been already added. and removed doc-required Your PR changes impact docs and you will update later. labels Feb 23, 2022
Labels: area/broker, doc-complete

5 participants