New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
persist: add some more useful metrics #13244
Conversation
@@ -37,6 +37,8 @@ pub struct Metrics { | |||
pub cmds: CmdsMetrics, | |||
/// Metrics for each retry loop. | |||
pub retries: RetriesMetrics, | |||
/// Metrics for various encodings and decodings. | |||
pub codec: CodecMetrics, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh! I should make this a set of vec counters with state, key, val as the labels. also I should add trace batch encode/decode to this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These make a lot of sense! I can only say that the ones you added are good, I can't say which ones we should yet add. 😅
CodecsMetrics { | ||
state: self.codec_metrics("state"), | ||
batch: self.codec_metrics("batch"), | ||
// Maybe {key,val}_{encode,decode} too. Probably not time or diff. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now, or just a TODO for later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any thoughts on whether we should do this? easy enough to do but I wasn't sure if we thought the overhead of the measurement would be worth it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unless we're concerned with metrics overhead, as a user I'd always be in favor of more metrics 😁
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wfm! we can always rip it out later. added a helper for actually doing the measurements to CodecMetrics
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -279,6 +283,7 @@ where | |||
>( | |||
&mut self, | |||
cmd: &CmdMetrics, | |||
metrics: &Metrics, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
curious - is there a reason this needs to be passed in as a param vs accessing the metrics via self
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤦 it's been a week, apparently. done!
CodecsMetrics { | ||
state: self.codec_metrics("state"), | ||
batch: self.codec_metrics("batch"), | ||
// Maybe {key,val}_{encode,decode} too. Probably not time or diff. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unless we're concerned with metrics overhead, as a user I'd always be in favor of more metrics 😁
This adds prometheus metric coverage for encoding and decoding of state (might have helped diagnose the CI slowness we investigated while trying to land MaterializeInc#13160). It also adds coverage for CaS retries as a result of expectation mismatch (ditto).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTRs!
@@ -279,6 +283,7 @@ where | |||
>( | |||
&mut self, | |||
cmd: &CmdMetrics, | |||
metrics: &Metrics, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤦 it's been a week, apparently. done!
CodecsMetrics { | ||
state: self.codec_metrics("state"), | ||
batch: self.codec_metrics("batch"), | ||
// Maybe {key,val}_{encode,decode} too. Probably not time or diff. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wfm! we can always rip it out later. added a helper for actually doing the measurements to CodecMetrics
This adds prometheus metric coverage for encoding and decoding of state
(might have helped diagnose the CI slowness we investigated while trying
to land #13160). It also adds coverage for CaS retries as a result of
expectation mismatch (ditto).
Motivation
This PR adds a feature that has not yet been specified.
[Write a brief specification for the feature, including justification
for its inclusion in Materialize, as if you were writing the original
feature specification.]
Testing
Release notes
This PR includes the following user-facing behavior changes: