Add metrics for bytes and objects written during compaction #360

Merged

Contributor

@mdisibio mdisibio commented Nov 19, 2020

What this PR does:
Adds three new metrics to track compaction performance and help estimate capacity. Each metric carries a level label that records the input compaction level.

  • tempodb_compaction_objects_written and tempodb_compaction_bytes_written: updated when the compactor flushes to the backend.
  • tempodb_compaction_blocks_total: count of input blocks compacted.
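
As a rough illustration of how counters like these behave, here is a minimal, dependency-free sketch. It is not Tempo's code: `levelCounter`, `compactionMetrics`, `recordFlush`, and `recordBlocks` are hypothetical names, and the map-backed counter only stands in for a Prometheus counter partitioned by a level label.

```go
package main

import (
	"fmt"
	"strconv"
	"sync"
)

// levelCounter is a hypothetical stand-in for a counter partitioned by a
// "level" label. It is an assumption for illustration, not Tempo's type.
type levelCounter struct {
	mu     sync.Mutex
	values map[string]float64
}

func newLevelCounter() *levelCounter {
	return &levelCounter{values: make(map[string]float64)}
}

// Add increases the counter for one level. Counters only go up, so
// negative deltas are ignored.
func (c *levelCounter) Add(level string, delta float64) {
	if delta < 0 {
		return
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	c.values[level] += delta
}

// Get returns the current value for one level (0 if never incremented).
func (c *levelCounter) Get(level string) float64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.values[level]
}

// compactionMetrics groups the three counters named in the description.
type compactionMetrics struct {
	objectsWritten *levelCounter // tempodb_compaction_objects_written
	bytesWritten   *levelCounter // tempodb_compaction_bytes_written
	blocksTotal    *levelCounter // tempodb_compaction_blocks_total
}

func newCompactionMetrics() *compactionMetrics {
	return &compactionMetrics{
		objectsWritten: newLevelCounter(),
		bytesWritten:   newLevelCounter(),
		blocksTotal:    newLevelCounter(),
	}
}

// recordFlush is called once per flush to the backend, so the object and
// byte counters are updated in batches rather than per object.
func (m *compactionMetrics) recordFlush(level, objects, bytes int) {
	l := strconv.Itoa(level)
	m.objectsWritten.Add(l, float64(objects))
	m.bytesWritten.Add(l, float64(bytes))
}

// recordBlocks counts the input blocks of one compaction at their level.
func (m *compactionMetrics) recordBlocks(level, blocks int) {
	m.blocksTotal.Add(strconv.Itoa(level), float64(blocks))
}

func main() {
	m := newCompactionMetrics()
	m.recordBlocks(0, 4)          // four level-0 blocks enter compaction
	m.recordFlush(0, 1000, 1<<20) // first flush
	m.recordFlush(0, 500, 1<<19)  // second flush
	fmt.Printf("level 0: blocks=%.0f objects=%.0f bytes=%.0f\n",
		m.blocksTotal.Get("0"), m.objectsWritten.Get("0"), m.bytesWritten.Get("0"))
}
```

Keying every counter by the input level is what lets a dashboard break compaction work down per level.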

Removes the tempodb_compaction_duration_seconds metric, which the new metrics make obsolete.

Which issue(s) this PR fixes:
n/a

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Member

@joe-elliott joe-elliott left a comment

Should we drop metricCompactionDuration? I feel like that metric loses its meaning with the variable block compactions.

Also, what do you think about tempodb_compaction_blocks_total{level=''}? It would be a counter that increments once for every processed block at each level. We could drop the duration histogram while still having some visibility into which compactors are compacting blocks at which levels.

Oh, and feel free to add whichever of these make sense to the operational dashboard. I think there's a compaction section now.

@mdisibio
Contributor Author

Should we drop metricCompactionDuration? I feel like that metric loses its meaning with the variable block compactions.

Also, what do you think about tempodb_compaction_blocks_total{level=''}? It would be a counter that increments once for every processed block at each level. We could drop the duration histogram while still having some visibility into which compactors are compacting blocks at which levels.

Oh, and feel free to add whichever of these make sense to the operational dashboard. I think there's a compaction section now.

Agreed. I added the level label and the new metric, and added both new metrics to the operational dashboard. I am not 100% satisfied with sum(increase(tempodb_compaction_blocks_total[5m])), but other range intervals ($__interval, $__rate_interval, [1m]) produced poor-quality graphs in local testing.

I'm inclined to leave the duration metric for now, while we settle on the best KPIs for compaction.
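
To make the sum(increase(...[5m])) query above more concrete, here is a simplified sketch of what it computes: the growth of each counter series inside the window, summed across series. The `sample` type and both function names are hypothetical. Unlike Prometheus's real increase(), this version does not extrapolate to the window edges; it only adds up raw deltas and treats a drop in value as a counter reset.

```go
package main

import "fmt"

// sample is one scraped counter value; ts is in seconds. Hypothetical type.
type sample struct {
	ts    int64
	value float64
}

// rawIncrease returns how much one counter series grew between the samples
// that fall inside [end-window, end]. Samples must be sorted by time.
// A value lower than its predecessor is treated as a counter reset.
func rawIncrease(samples []sample, end, window int64) float64 {
	var total, prev float64
	havePrev := false
	for _, s := range samples {
		if s.ts < end-window || s.ts > end {
			continue
		}
		if havePrev {
			if s.value >= prev {
				total += s.value - prev
			} else {
				total += s.value // counter restarted from zero
			}
		}
		prev = s.value
		havePrev = true
	}
	return total
}

// sumIncrease mirrors the outer sum(): it adds the per-series increases,
// for example one series per compactor and level.
func sumIncrease(series [][]sample, end, window int64) float64 {
	var total float64
	for _, s := range series {
		total += rawIncrease(s, end, window)
	}
	return total
}

func main() {
	compactorA := []sample{{0, 0}, {60, 2}, {120, 5}, {180, 1}, {240, 3}}
	compactorB := []sample{{0, 10}, {120, 10}, {240, 14}}
	fmt.Println("5m window:", sumIncrease([][]sample{compactorA, compactorB}, 240, 300))
	fmt.Println("1m window:", sumIncrease([][]sample{compactorA, compactorB}, 240, 60))
}
```

A window needs at least two samples of a series to yield any increase, so a short range over an infrequently updated counter captures little of its growth. That may be one reason the shorter intervals graphed poorly, though the thread does not confirm the cause.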

Member

@joe-elliott joe-elliott left a comment

A couple of final touches and it will be ready to merge. Also make sure to list the removed metric in the changelog. Thanks!

Review threads (outdated, resolved):
operations/tempo-mixin/tempo-operational.json
operations/tempo-mixin/tempo-operational.json
tempodb/compactor.go
…objects_written metrics when flushing for efficiency
@mdisibio mdisibio force-pushed the compaction-objects-processed-metric branch from d942072 to 47d7f97 Compare November 23, 2020 13:52
@mdisibio mdisibio changed the title Add metric tempodb_compaction_objects_processed_total Add metrics for bytes and objects written during compaction Nov 23, 2020
@joe-elliott joe-elliott merged commit 82bf5fc into grafana:master Nov 23, 2020
@mdisibio mdisibio deleted the compaction-objects-processed-metric branch February 3, 2021 18:09