[METRICS] Drop stale async attribute sets from cumulative exports #4140

Open
pranitaurlam wants to merge 2 commits into open-telemetry:main from pranitaurlam:fix/async-cumulative-stale-attributes

@pranitaurlam

Fixes #4108

Summary

The OpenTelemetry SDK spec states:

The implementation SHOULD NOT produce aggregated metric data for a previously-observed attribute set which is not observed during a successful callback.

Bug: Under cumulative temporality, async instruments (ObservableCounter, ObservableGauge, ObservableUpDownCounter) kept emitting an attribute set indefinitely after the callback stopped reporting it. Every subsequent export repeated the last-observed value (and, for ObservableGauge, the last-observed timestamp as well).

Root cause in TemporalMetricStorage::buildMetrics(): the cumulative merge unconditionally carried every entry from last_reported_metrics_ into the output even if it was absent from the current delta.

Changes (3 files)

temporal_metric_storage.h / .cc

  • Add is_async_ boolean (default false) to TemporalMetricStorage
  • In the cumulative merge lambda, add an else if (!is_async_) guard. Sync instruments still carry forward all attribute sets (existing behaviour, spec-correct); async instruments skip stale entries that are not present in the current delta.
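
The guard described above can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: `PointMap`, `AttributeKey`, and `MergeCumulative` are hypothetical stand-ins, with attribute sets reduced to string keys and aggregations reduced to integer sums.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical simplification: an attribute set is a string key and an
// aggregation is a plain integer sum. The SDK uses richer types.
using AttributeKey = std::string;
using PointMap     = std::map<AttributeKey, std::int64_t>;

// Merge the previously reported cumulative points with the delta collected
// in the current cycle. `is_async` plays the role of the is_async_ flag.
PointMap MergeCumulative(const PointMap &last_reported,
                         const PointMap &delta,
                         bool is_async)
{
  PointMap merged = delta;
  for (const auto &entry : last_reported)
  {
    auto it = merged.find(entry.first);
    if (it != merged.end())
    {
      // Seen this cycle: add the previous total to the new delta.
      it->second += entry.second;
    }
    else if (!is_async)
    {
      // Sync instrument: carry the unchanged cumulative value forward.
      merged.emplace(entry.first, entry.second);
    }
    // Async instrument and not seen this cycle: the entry is dropped.
  }
  return merged;
}
```

With `last_reported = {a: 5, b: 7}` and `delta = {a: 2}`, the sync path yields both `a` and `b`, while the async path yields only `a`.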

async_metric_storage.h

  • Pass is_async = true when constructing TemporalMetricStorage
  • In Collect(), rebuild cumulative_hash_map_ to contain only entries present in delta_metrics. This prevents unbounded memory growth and keeps the delta computation correct if an attribute set reappears after a gap.
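
As a rough illustration of why pruning the cumulative map matters, the sketch below models one async collection cycle. It assumes the storage derives each delta by subtracting the previously stored cumulative value from the new observation. `AsyncState` and `RecordAndPrune` are hypothetical names, not the SDK's API.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical simplification: string keys for attribute sets, integer sums
// for aggregations.
using PointMap = std::map<std::string, std::int64_t>;

struct AsyncState
{
  PointMap cumulative;  // last observed cumulative value per attribute set
};

// Convert one callback's observations into deltas, then keep only the
// attribute sets observed in this cycle as the new cumulative state.
PointMap RecordAndPrune(AsyncState &state, const PointMap &observed)
{
  PointMap delta;
  PointMap next_cumulative;
  for (const auto &obs : observed)
  {
    auto prev = state.cumulative.find(obs.first);
    // An attribute set with no stored history starts from zero.
    std::int64_t previous = (prev != state.cumulative.end()) ? prev->second : 0;
    delta[obs.first]           = obs.second - previous;
    next_cumulative[obs.first] = obs.second;
  }
  // Entries absent from `observed` are discarded here, so the map cannot
  // grow without bound and a reappearing key is treated as new.
  state.cumulative = std::move(next_cumulative);
  return delta;
}
```

In this model, an attribute set that disappears and later returns with value 6 produces a delta of 6 rather than a difference against its stale pre-gap value, which matches the reappearance case the bullet above describes.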

Behaviour

| Scenario | Before | After |
| --- | --- | --- |
| Async cumulative, attribute dropped by callback | Emitted forever with stale value | Dropped from next export |
| Sync cumulative, attribute not measured this cycle | Carried forward (correct) | Unchanged |
| Delta temporality (async or sync) | Correct (unaffected) | Unchanged |

Sister SDKs: opentelemetry-dotnet #6883 and opentelemetry-rust #2618 have shipped equivalent fixes.

cc @marcalff @lalitb @ThomsonTan @kyusic

@pranitaurlam pranitaurlam requested a review from a team as a code owner June 7, 2026 18:49
@pranitaurlam
Author

Hi @marcalff @lalitb @ThomsonTan @kyusic — could you please approve the workflows for PR #4140 (fix #4108)? Thank you!

@codecov

codecov Bot commented Jun 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.03%. Comparing base (4352a63) to head (830579f).

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #4140      +/-   ##
==========================================
+ Coverage   82.01%   82.03%   +0.03%     
==========================================
  Files         385      385              
  Lines       16093    16103      +10     
==========================================
+ Hits        13197    13209      +12     
+ Misses       2896     2894       -2     
| Files with missing lines | Coverage Δ |
| --- | --- |
| ...telemetry/sdk/metrics/state/async_metric_storage.h | 92.31% <100.00%> (+1.40%) ⬆️ |
| sdk/src/metrics/state/temporal_metric_storage.cc | 97.78% <100.00%> (+0.06%) ⬆️ |

... and 1 file with indirect coverage changes

Development

Successfully merging this pull request may close these issues.

[METRICS SDK] Async instruments don't drop unreported attribute sets under Cumulative temporality