
Emit metrics for distribution of number of rows per segment #12730

Merged

Conversation

@TSFenwick (Contributor) commented on Jul 1, 2022

Adds the ability to see the number of rows per segment.

Description

Adds metrics to help Druid users understand how many rows there are in a segment. As part of this, we report the average number of rows per segment and a distribution of segment row counts across predefined buckets.

Adding metrics

Buckets were chosen because the average alone isn't enough to understand segment row counts, since outliers can skew it. I looked into reporting min/max/median, but that would require too much memory: it means tracking the row count of every segment for every datasource. Buckets give a good picture of the distribution while only requiring a fixed number of counters.
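The fixed-counter idea above can be sketched as follows. This is a minimal illustration, not the actual SegmentRowCountDistribution implementation; the bucket bounds here are borrowed from the reviewer's suggestion later in this thread and are an assumption.

```java
import java.util.Arrays;

// Hypothetical sketch of a fixed-bucket row-count distribution.
// Memory is O(number of buckets), regardless of how many segments exist.
public class RowCountBuckets {
    // Upper bounds (exclusive) of each bucket; the final bucket is open-ended.
    private static final long[] UPPER_BOUNDS = {
        0, 10_000, 2_000_000, 4_000_000, 6_000_000, 10_000_000
    };
    // One counter per bound, plus one for the open-ended top bucket.
    private final int[] counts = new int[UPPER_BOUNDS.length + 1];

    private int bucketIndex(long numRows) {
        for (int i = 0; i < UPPER_BOUNDS.length; i++) {
            if (numRows < UPPER_BOUNDS[i]) {
                return i;
            }
        }
        return UPPER_BOUNDS.length; // the 10M+ bucket
    }

    public void addSegment(long numRows) {
        counts[bucketIndex(numRows)]++;
    }

    public void removeSegment(long numRows) {
        counts[bucketIndex(numRows)]--;
    }

    public int[] snapshot() {
        return Arrays.copyOf(counts, counts.length);
    }
}
```

Incrementing on load and decrementing on drop keeps the histogram current without ever rescanning segments.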

Doing this in SegmentManager whenever we load or drop a segment avoids counting the number of rows (an expensive operation, since it scans the segment) more than once. The downside is handling lazily loaded segments: these metrics will not make sense if lazy loading is enabled.

We chose not to handle lazy loading, since it is hard to track a segment that is loaded twice (once lazily, then actually), and also hard to know whether a dropped segment was originally lazy loaded. Instead, this throws an exception if SegmentStatsMonitor is enabled while lazyLoadOnStart is true.
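The fail-fast behavior described above might look like this sketch. The class and message are illustrative assumptions; the actual Druid configuration wiring and property names differ.

```java
// Hypothetical sketch of the startup check: refuse to run the monitor
// when lazy loading is on, because row counts of lazily loaded segments
// are unknown at load time.
public class SegmentStatsMonitorConfigCheck {
    public static void validate(boolean monitorEnabled, boolean lazyLoadOnStart) {
        if (monitorEnabled && lazyLoadOnStart) {
            throw new IllegalStateException(
                "SegmentStatsMonitor does not support lazy loading: "
                + "row counts of lazily loaded segments are unknown at load time");
        }
    }
}
```

Failing at startup surfaces the misconfiguration immediately, rather than silently emitting misleading metrics.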


Key changed/added classes in this PR
  • SegmentRowCountDistribution.java
  • SegmentStatsMonitor.java

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious to an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

return counts of segments that have rowcount in a bucket size for a datasource
return average value of rowcount per segment in a datasource
added unit test
naming could use a lot of work
buckets right now are not finalized
added javadocs
altered metrics.md
@TSFenwick (Contributor, Author):
The bucket sizing hasn't been decided yet so please give lots of input on those

@TSFenwick changed the title from "initial commit of bucket dimensions for number of rows in a segment" to "num of rows in a segment dimensions" on Jul 1, 2022
Review thread on docs/operations/metrics.md (outdated, resolved)
@suneet-s (Contributor) left a comment:

I really like this bucket idea that shows the distribution of row count in segments!

I've provided some naming suggestions, but I'm not married to those names; however, I do think the current metric names need some workshopping.

Since this metric is new and not applicable in all deployments (e.g. clusters with lazy loading), what are your thoughts on introducing this as a new monitor so that Druid operators can opt in to it? As written, there is no escape hatch in case it starts producing too many metrics for clusters with thousands of very small datasources, where this metric is maybe not as important.

EDIT: As an alternative to making this a separate monitor, we could introduce a property that defines the buckets, set to null by default, so that cluster operators have flexibility in choosing the bucket sizes. A cluster operator could then add something like

druid.historical.metrics.segments.ranges=0,10000,2000000,4000000,6000000,10000000 to create buckets < 0, 0-10k, 10k-2M, 2M-4M, 4M-6M, 6M-10M, 10M+
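Parsing such a property could look like the sketch below. Both the property name and this parser are hypothetical, taken from the suggestion above; they are not part of Druid's actual configuration API.

```java
import java.util.Arrays;

// Hypothetical parser for a comma-separated list of bucket upper bounds,
// e.g. "0,10000,2000000". Returns null when the property is unset, which
// would leave the distribution metric disabled by default.
public class RangeConfigParser {
    public static long[] parseRanges(String value) {
        if (value == null || value.trim().isEmpty()) {
            return null;
        }
        long[] bounds = Arrays.stream(value.split(","))
                .mapToLong(s -> Long.parseLong(s.trim()))
                .toArray();
        // Reject misordered bounds early rather than emitting nonsense buckets.
        for (int i = 1; i < bounds.length; i++) {
            if (bounds[i] <= bounds[i - 1]) {
                throw new IllegalArgumentException(
                    "bucket bounds must be strictly increasing");
            }
        }
        return bounds;
    }
}
```

Validating at parse time mirrors the fail-fast choice elsewhere in this PR: a bad bucket list should stop startup, not produce misleading metrics.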

Review threads on docs/operations/metrics.md (outdated, resolved)
add monitor test
move added functionality to new monitor
update docs
@maytasm (Contributor) left a comment:

+1 on the overall design. I like the idea of creating a property that defines the buckets, set to null by default. Please also verify that this works when encountering tombstone segments.

@suneet-s (Contributor) left a comment:

+1 after CI

Review thread on docs/operations/metrics.md (outdated, resolved)
@TSFenwick (Contributor, Author):

@suneet-s made changes to get tests to pass. Please take a look

@suneet-s (Contributor):

Thanks @TSFenwick !

@suneet-s changed the title from "num of rows in a segment dimensions" to "Emit metrics for distribution of number of rows per segment" on Jul 12, 2022
@suneet-s suneet-s merged commit 8c02880 into apache:master Jul 12, 2022
@abhishekagarwal87 abhishekagarwal87 added this to the 24.0.0 milestone Aug 26, 2022