segment/unavailable/count metric is misleading during handoff #9677

Open
jihoonson opened this issue Apr 11, 2020 · 2 comments

Comments

jihoonson commented Apr 11, 2020

Affected Version

Probably all versions.

Description

The segment/unavailable/count metric is computed as the number of published segments that are not being served by historicals. This is not the correct definition of "available", which should also cover segments served by realtime tasks. As a result, the metric is misleading during handoff: unavailability appears to spike before historicals load the new segments, even though every segment remains continuously available on some combination of realtime tasks and historicals.

See https://druid.apache.org/docs/latest/design/architecture.html#indexing-and-handoff for definitions of terms that we should be using here.
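
To make the discrepancy concrete, here is a minimal sketch of the two definitions using plain Python sets. It is not Druid code: the segment IDs, variable names, and function names are all hypothetical and chosen only for illustration. It models one segment that is published and still served by a realtime task while historicals have not yet loaded it.

```python
# Illustrative model only; none of these names come from Druid's codebase.

# Hypothetical segment IDs. The last one is mid-handoff: published and still
# served by a realtime task, but not yet loaded by any historical.
published = {"seg_2020-04-10", "seg_2020-04-11", "seg_2020-04-12"}
served_by_historicals = {"seg_2020-04-10", "seg_2020-04-11"}
served_by_realtime_tasks = {"seg_2020-04-12"}


def unavailable_count_current(published, historicals):
    """Definition described in this issue: published but not on historicals."""
    return len(published - historicals)


def unavailable_count_proposed(published, historicals, realtime):
    """Suggested definition: published but served by neither kind of server."""
    return len(published - (historicals | realtime))


current = unavailable_count_current(published, served_by_historicals)
proposed = unavailable_count_proposed(
    published, served_by_historicals, served_by_realtime_tasks
)
print(current, proposed)  # prints "1 0"
```

Under the first definition the segment in handoff is counted as unavailable even though it can still be queried; under the second it is not.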

glasser commented Apr 20, 2020

I see this mentioned in the 0.18 release notes, but should I take your "affected version" above to mean that you don't believe this is a recent regression?

@jihoonson

@glasser no, I don't think it's a regression. I believe this issue has been there ever since that metric was supported.
