Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[processor/cumulativetodelta]: exclude the first point after a restart in monotonic sums #18298

Merged

Conversation

sigilioso
Copy link
Contributor

Description:

Following up #18224, this PR drops the first datapoint after a restart is detected, so unexpected spikes are avoided.

Simple example: before the changes, 100, 105, 120, 100, 110 would be converted to 5, 15, 100, 10, when we omit the value when a restart is detected it would be 5, 15, 10.

Link to tracking Issue:
#17190, #18053

Testing:

  • A new test case was included in the proccesor to check that the value is actually omitted.
  • The tracker's convert tests were updated to reflect the new behavior.
❯ go test ./... -race
ok      github.com/open-telemetry/opentelemetry-collector-contrib/processor/cumulativetodeltaprocessor  0.454s
ok      github.com/open-telemetry/opentelemetry-collector-contrib/processor/cumulativetodeltaprocessor/internal/tracking        1.268s

Documentation:

@sigilioso sigilioso requested review from a team and TylerHelmuth as code owners February 6, 2023 07:05
@github-actions github-actions bot added the processor/cumulativetodelta Cumulative To Delta processor label Feb 6, 2023
@runforesight
Copy link

runforesight bot commented Feb 6, 2023

Foresight Summary

    
Major Impacts

build-and-test-windows duration(10 seconds) has decreased 43 minutes 5 seconds compared to main branch avg(43 minutes 15 seconds).
View More Details

⭕  build-and-test-windows workflow has finished in 10 seconds (41 minutes 56 seconds less than main branch avg.) and finished at 23rd Feb, 2023.


Job Failed Steps Tests
windows-unittest-matrix -     🔗  N/A See Details
windows-unittest -     🔗  N/A See Details

✅  check-links workflow has finished in 47 seconds (1 minute 41 seconds less than main branch avg.) and finished at 23rd Feb, 2023.


Job Failed Steps Tests
changed files -     🔗  N/A See Details
check-links -     🔗  N/A See Details

✅  telemetrygen workflow has finished in 57 seconds (2 minutes 7 seconds less than main branch avg.) and finished at 23rd Feb, 2023.


Job Failed Steps Tests
build-dev -     🔗  N/A See Details
publish-latest -     🔗  N/A See Details
publish-stable -     🔗  N/A See Details

✅  changelog workflow has finished in 2 minutes 38 seconds and finished at 23rd Feb, 2023.


Job Failed Steps Tests
changelog -     🔗  N/A See Details

✅  prometheus-compliance-tests workflow has finished in 3 minutes 35 seconds (5 minutes 43 seconds less than main branch avg.) and finished at 23rd Feb, 2023.


Job Failed Steps Tests
prometheus-compliance-tests -     🔗  ✅ 21  ❌ 0  ⏭ 0    🔗 See Details

✅  e2e-tests workflow has finished in 11 minutes 45 seconds (4 minutes 9 seconds less than main branch avg.) and finished at 23rd Feb, 2023.


Job Failed Steps Tests
kubernetes-test (v1.26.0) -     🔗  N/A See Details
kubernetes-test (v1.25.3) -     🔗  N/A See Details
kubernetes-test (v1.24.7) -     🔗  N/A See Details
kubernetes-test (v1.23.13) -     🔗  N/A See Details

✅  load-tests workflow has finished in 26 minutes 39 seconds (⚠️ 9 minutes 10 seconds more than main branch avg.) and finished at 23rd Feb, 2023.


Job Failed Steps Tests
loadtest (TestMetric10kDPS|TestMetricsFromFile) -     🔗  ✅ 6  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestBallastMemory|TestLog10kDPS) -     🔗  ✅ 18  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestIdleMode) -     🔗  ✅ 1  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestMetricResourceProcessor|TestTrace10kSPS) -     🔗  ✅ 12  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestTraceAttributesProcessor) -     🔗  ✅ 3  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestTraceNoBackend10kSPS|TestTrace1kSPSWithAttrs) -     🔗  ✅ 8  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestTraceBallast1kSPSWithAttrs|TestTraceBallast1kSPSAddAttrs) -     🔗  ✅ 10  ❌ 0  ⏭ 0    🔗 See Details
setup-environment -     🔗  N/A See Details

✅  build-and-test workflow has finished in 46 minutes 12 seconds (26 minutes 21 seconds less than main branch avg.) and finished at 1st Mar, 2023.


Job Failed Steps Tests
lint-matrix (other) -     🔗  N/A See Details
lint -     🔗  N/A See Details
cross-compile (darwin, amd64) -     🔗  N/A See Details
cross-compile (darwin, arm64) -     🔗  N/A See Details
cross-compile (linux, 386) -     🔗  N/A See Details
cross-compile (linux, amd64) -     🔗  N/A See Details
cross-compile (linux, arm) -     🔗  N/A See Details
cross-compile (linux, arm64) -     🔗  N/A See Details
cross-compile (linux, ppc64le) -     🔗  N/A See Details
cross-compile (windows, 386) -     🔗  N/A See Details
cross-compile (windows, amd64) -     🔗  N/A See Details
build-package (deb) -     🔗  N/A See Details
build-package (rpm) -     🔗  N/A See Details
windows-msi -     🔗  N/A See Details
publish-check -     🔗  N/A See Details
publish-stable -     🔗  N/A See Details
publish-dev -     🔗  N/A See Details

🔎 See details on Foresight

*You can configure Foresight comments in your organization settings page.

@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Feb 21, 2023
@sigilioso
Copy link
Contributor Author

sigilioso commented Feb 21, 2023

As I see it, this approach would be consistent with dropping the first value (as set in #18224). Does it make sense to you?

@TylerHelmuth
Copy link
Member

@sigilioso please rebase to re-trigger CI

Copy link
Member

@mx-psi mx-psi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Monotonic means it should be increasing, not must, so this could cause trouble if we ever have a monotonic sum that is not increasing (see e.g. the discussion on #14956 for sums coming from statsd counts). Still, I think it makes sense to try this out for consistency; we can always reconsider if we get these edge cases in the future.

@mx-psi
Copy link
Member

mx-psi commented Feb 22, 2023

unittest (1.20) is not running (needs a rebase maybe?)

@Aneurysm9
Copy link
Member

Monotonic means it should be increasing, not must, so this could cause trouble if we ever have a monotonic sum that is not increasing

Not sure this is correct.

mon·o·ton·ic
/ˌmänəˈtänik/
adjective: monotonic

  1. Mathematics
    (of a function or quantity) varying in such a way that it either never decreases or never increases.

Monotonic means it won't go down. 0 is a perfectly fine result from a cumulative-to-delta conversion of a monotonic sum, but not -1. This is already accounted for in the existing checks.

@mx-psi
Copy link
Member

mx-psi commented Feb 23, 2023

@Aneurysm9 that's also how I use the term, but the spec uses 'SHOULD':

A flag denoting whether the Sum is monotonic. In this case of metrics, this means the sum is nominally increasing, which we assume without loss of generality.

  • For delta monotonic sums, this means the reader SHOULD expect non-negative values.
  • For cumulative monotonic sums, this means the reader SHOULD expect values that are not less than the previous value.

So (although I find this quite counterintuitive) I don't think we can say that an OTLP monotonic sum will never go down

@mx-psi
Copy link
Member

mx-psi commented Mar 1, 2023

@Aneurysm9 Is this PR good to go from your side?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
processor/cumulativetodelta Cumulative To Delta processor
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants