metricstarttimeprocessor: Copy true reset strategy from prometheus receiver #37855

dashpole · 2025-02-11T18:10:21Z

Description

Copy the "metric adjuster" from the prometheus receiver into the metricstarttimeprocessor, where it becomes a new "true_reset_point" strategy.

Link to tracking issue

Part of #37186

Testing

Copied unit tests

Documentation

Updated the README.

Notes for reviewers

I would recommend reviewing commit-by-commit. The first commit is copied verbatim, with minimal changes other than splitting into multiple files.
I changed the function signature of AdjustMetrics to match the processMetrics function so that it doesn't need an additional wrapper layer.
I removed the MetricsAdjuster interface, since it isn't needed (we implement the processor function for metrics now).
I had to remove the validation of job + instance being present because it didn't pass generated tests. Regardless, we should not rely on those in a generic processor.

dashpole · 2025-02-12T18:42:26Z

cc @ridwanmsharif @jmacd @ArthurSens

ridwanmsharif · 2025-02-13T18:32:31Z

processor/metricstarttimeprocessor/README.md

+    metricstarttime:
+
+        # specify the strategy to use for setting the start time
+        strategy: true_reset_point


What other strategies do you envision this processor supporting? Using another metric's value for the start time (process_start_time_seconds)? Wouldn't those also need to use the true_reset_point strategy to deal with resets?

dashpole · 2025-02-13

If a metric's value resets, it seems like we should always reset the start time, but I don't think that should involve inserting a "true reset" point. I think it should be implicit for all adjustment strategies (maybe with an on/off config option, similar to fallback to collector start time?).

jmacd · 2025-02-25T18:01:11Z

I looked at https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/data-model.md#resets-and-gaps hoping to find a term for the other common mode -- but didn't find one.

If I had to make one up, it would be "missing_start_point" or similar, which is nearly identical logic to the presentation in this PR, except the value is unchanged. The first time the processor sees the series w/ a missing start time, it will enter the first timestamp it sees as the start time w/ unchanged value. Subsequent resets will be detected using the heuristic for cumulative values.

dashpole · 2025-03-01

I think this option should only determine how to handle the initial missing start point. If we know a series has reset (e.g. because the value decreases), we should transition to a generic "use the most recent point's start timestamp going forward" handler.
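
The behavior discussed in this thread can be sketched roughly as follows: the first point seen for a series supplies the start time, values are left unchanged, and a decrease in a cumulative value is treated as a reset. This is not the adjuster code from the PR. The `tracker`, `seriesState`, and `point` types are hypothetical, and using the reset point's own timestamp as the new start time is an assumption made for the example.

```go
package main

import "fmt"

// point is a simplified cumulative data point; times are Unix nanoseconds.
type point struct {
	start, ts int64
	value     float64
}

// seriesState is what this sketch remembers per series.
type seriesState struct {
	start     int64
	lastValue float64
}

// tracker fills in missing start times for cumulative series (hypothetical type).
type tracker struct {
	series map[string]*seriesState
}

func newTracker() *tracker { return &tracker{series: map[string]*seriesState{}} }

// adjust sets p.start and leaves p.value untouched.
func (t *tracker) adjust(key string, p *point) {
	st, seen := t.series[key]
	switch {
	case !seen:
		// First observation: use the point's own timestamp as the start time.
		st = &seriesState{start: p.ts}
		t.series[key] = st
	case p.value < st.lastValue:
		// A cumulative value went down, so treat it as a reset.
		// Assumption: the new start time is the reset point's timestamp.
		st.start = p.ts
	}
	st.lastValue = p.value
	p.start = st.start
}

func main() {
	t := newTracker()
	pts := []point{{ts: 100, value: 5}, {ts: 200, value: 9}, {ts: 300, value: 2}, {ts: 400, value: 4}}
	for i := range pts {
		t.adjust("http_requests_total", &pts[i])
		fmt.Printf("ts=%d start=%d value=%v\n", pts[i].ts, pts[i].start, pts[i].value)
	}
}
```

In this run the third point (value 2 after 9) is detected as a reset, so it and the following point get a start time of 300 while the first two keep 100.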

ridwanmsharif

LGTM overall

I think the JobsMap should be renamed and use a hash of the resource attributes instead but I saw you created a TODO for it. Feel free to create an issue and assign it to me.

ridwanmsharif · 2025-02-24T16:44:13Z

processor/metricstarttimeprocessor/internal/truereset/adjuster.go

+func (a *Adjuster) AdjustMetrics(_ context.Context, metrics pmetric.Metrics) (pmetric.Metrics, error) {
+	for i := 0; i < metrics.ResourceMetrics().Len(); i++ {
+		rm := metrics.ResourceMetrics().At(i)
+		// TODO: Produce a hash of all resource attributes, rather than just job + instance.


[nit] Create an issue for this and assign to me?

jmacd · 2025-02-25T18:02:29Z

fwiw I like to use the OTel Go attribute.Set for this sort of mapping.
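
As a rough illustration of the TODO above, the sketch below derives a lookup key from every resource attribute instead of only job + instance. It is a standard-library stand-in, not the approach that was eventually implemented: `resourceKey` is a hypothetical helper, and attributes are simplified to string values, whereas real resource attributes are typed.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// resourceKey hashes every attribute, visiting keys in sorted order so that
// Go's randomized map iteration does not affect the result.
func resourceKey(attrs map[string]string) uint64 {
	keys := make([]string, 0, len(attrs))
	for k := range attrs {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := fnv.New64a()
	for _, k := range keys {
		// A zero byte separates fields so ("ab","c") and ("a","bc") hash differently.
		h.Write([]byte(k))
		h.Write([]byte{0})
		h.Write([]byte(attrs[k]))
		h.Write([]byte{0})
	}
	return h.Sum64()
}

func main() {
	a := map[string]string{"service.name": "api", "service.instance.id": "pod-1"}
	b := map[string]string{"service.instance.id": "pod-1", "service.name": "api"}
	c := map[string]string{"service.name": "api", "service.instance.id": "pod-2"}
	fmt.Println(resourceKey(a) == resourceKey(b)) // same attributes, same key
	fmt.Println(resourceKey(a) == resourceKey(c)) // different instance, different key
}
```

A 64-bit hash can collide in principle, which is one reason a comparable set type, as suggested in the comment above, may be preferable as a map key.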

ridwanmsharif · 2025-02-24T18:56:42Z

processor/metricstarttimeprocessor/internal/truereset/adjuster.go

+		}
+
+		if currentDist.Flags().NoRecordedValue() {
+			// TODO: Investigate why this does not reset.


[nit] Is there an issue for this already? Not sure if this is being tracked anywhere yet (I also see the TODO in the prometheusreceiver, just don't want to lose this when we remove the adjuster from there)

jmacd · 2025-02-25T18:03:05Z

I assume this is copied from existing Prometheus code?

dashpole · 2025-03-01

This is copied from the prometheus receiver. There isn't an issue for this. Not sure it is big enough to warrant one.

andrzej-stencel · 2025-03-03T09:27:41Z

processor/metricstarttimeprocessor/config.go

-type Config struct{}
+type Config struct {
+	Strategy   string        `mapstructure:"strategy"`
+	GCInterval time.Duration `mapstructure:"gc_interval"`


Shouldn't the gc_interval configuration option be documented?

Also it looks like this option is specific to the true_reset_point strategy and not a general processor option?

dashpole · 2025-03-04

gc_interval will be used by other strategies once they are added. @ridwanmsharif can you document gc_interval in a follow-up?
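
For readers following the gc_interval question, here is a hedged sketch of how the two options in the hunk above could be defaulted and validated. Only the field names and mapstructure tags come from the diff. The `defaultConfig` helper, the 10-minute default, the `Validate` rules, and the description of gc_interval as bounding how long state for unseen series is kept are all assumptions for illustration, since the thread notes the option is not yet documented.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

const strategyTrueResetPoint = "true_reset_point"

type Config struct {
	// Strategy selects how missing start times are filled in.
	Strategy string `mapstructure:"strategy"`
	// GCInterval is assumed here to bound how long state for unseen series is kept.
	GCInterval time.Duration `mapstructure:"gc_interval"`
}

// defaultConfig uses a hypothetical 10-minute interval; the real default is
// not stated in this thread.
func defaultConfig() *Config {
	return &Config{Strategy: strategyTrueResetPoint, GCInterval: 10 * time.Minute}
}

// Validate rejects unknown strategies and non-positive intervals.
func (c *Config) Validate() error {
	if c.Strategy != strategyTrueResetPoint {
		return fmt.Errorf("unsupported strategy %q", c.Strategy)
	}
	if c.GCInterval <= 0 {
		return errors.New("gc_interval must be positive")
	}
	return nil
}

func main() {
	cfg := defaultConfig()
	fmt.Println(cfg.Validate())
	cfg.GCInterval = 0
	fmt.Println(cfg.Validate())
}
```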

github-actions bot added the processor/metricstarttime label Feb 11, 2025

dashpole force-pushed the true_reset branch 5 times, most recently from 3260ddf to 769ca83 on February 11, 2025 20:45

dashpole added the enhancement label Feb 12, 2025

dashpole marked this pull request as ready for review February 12, 2025 18:42

dashpole requested a review from a team as a code owner February 12, 2025 18:42

dashpole requested a review from andrzej-stencel February 12, 2025 18:42

github-actions bot assigned andrzej-stencel Feb 12, 2025

dashpole force-pushed the true_reset branch from cfc027d to fe149bf on February 12, 2025 19:02

ridwanmsharif reviewed Feb 13, 2025

ridwanmsharif approved these changes Feb 24, 2025

jmacd approved these changes Feb 25, 2025

dashpole added 9 commits March 1, 2025 20:42

copy metric adjuster from prometheus reciever

7724e3f

remove created timestamp support, because it is being removed

7b40c6b

add configuration, and plumb true reset to processor

1d20fc0

rename InitialPointAdjuster to Adjuster

b6d5c18

add documentation

cef464f

relax requirement to have job + instance to pass generated tests

4840b59

fix presubmits

3a3344f

fix lint errors

5a95f05

gofmt

725677f

dashpole force-pushed the true_reset branch from fe149bf to 725677f on March 1, 2025 20:55

dashpole mentioned this pull request Mar 1, 2025

[metricstarttimeprocessor] Identify resource by hash of all attributes #38286

Closed

add issue

e522705

dashpole added the ready to merge label Mar 2, 2025

andrzej-stencel approved these changes Mar 3, 2025

andrzej-stencel merged commit a2f2b8c into open-telemetry:main Mar 3, 2025
167 checks passed

github-actions bot added this to the next release milestone Mar 3, 2025

dashpole deleted the true_reset branch March 3, 2025 14:46
