chore: increase backoff iterations on tracer flare tests cleanup #15203
Signed-off-by: Juanjo Alvarez <juanjo.alvarezmartinez@datadoghq.com>
Bootstrap import analysis
Comparison of import times between this PR and base.
Summary
The average import time from this PR is: 261 ± 3 ms.
The average import time from base is: 267 ± 4 ms.
The import time difference between this PR and base is: -6.8 ± 0.1 ms.
Import time breakdown
The following import paths have shrunk:
Performance SLOs
Comparing candidate juanjux/retry-flares-cleanup-increase-backoff-iterations (2276668) with baseline main (73f2611)

📈 Performance Regressions (1 suite)
📈 iastaspectsospath - 24/24
✅ ospathbasename_aspect - Time: ✅ 4.296µs (SLO: <10.000µs 📉 -57.0%) vs baseline: +0.6%; Memory: ✅ 39.872MB (SLO: <41.000MB -2.8%) vs baseline: +4.9%
✅ ospathbasename_noaspect - Time: ✅ 1.090µs (SLO: <10.000µs 📉 -89.1%) vs baseline: +0.1%; Memory: ✅ 39.872MB (SLO: <41.000MB -2.8%) vs baseline: +5.1%
✅ ospathjoin_aspect - Time: ✅ 6.135µs (SLO: <10.000µs 📉 -38.7%) vs baseline: -1.2%; Memory: ✅ 39.793MB (SLO: <41.000MB -2.9%) vs baseline: +4.5%
✅ ospathjoin_noaspect - Time: ✅ 2.309µs (SLO: <10.000µs 📉 -76.9%) vs baseline: +0.7%; Memory: ✅ 39.813MB (SLO: <41.000MB -2.9%) vs baseline: +4.8%
✅ ospathnormcase_aspect - Time: ✅ 3.572µs (SLO: <10.000µs 📉 -64.3%) vs baseline: +0.1%; Memory: ✅ 39.833MB (SLO: <41.000MB -2.8%) vs baseline: +4.6%
✅ ospathnormcase_noaspect - Time: ✅ 0.569µs (SLO: <10.000µs 📉 -94.3%) vs baseline: +1.4%; Memory: ✅ 39.931MB (SLO: <41.000MB -2.6%) vs baseline: +4.8%
✅ ospathsplit_aspect - Time: ✅ 5.797µs (SLO: <10.000µs 📉 -42.0%) vs baseline: 📈 +20.0%; Memory: ✅ 39.872MB (SLO: <41.000MB -2.8%) vs baseline: +4.8%
✅ ospathsplit_noaspect - Time: ✅ 1.589µs (SLO: <10.000µs 📉 -84.1%) vs baseline: +0.2%; Memory: ✅ 39.852MB (SLO: <41.000MB -2.8%) vs baseline: +4.7%
✅ ospathsplitdrive_aspect - Time: ✅ 3.748µs (SLO: <10.000µs 📉 -62.5%) vs baseline: +0.2%; Memory: ✅ 39.734MB (SLO: <41.000MB -3.1%) vs baseline: +4.9%
✅ ospathsplitdrive_noaspect - Time: ✅ 0.703µs (SLO: <10.000µs 📉 -93.0%) vs baseline: +0.9%; Memory: ✅ 39.911MB (SLO: <41.000MB -2.7%) vs baseline: +5.0%
✅ ospathsplitext_aspect - Time: ✅ 4.625µs (SLO: <10.000µs 📉 -53.8%) vs baseline: +0.1%; Memory: ✅ 39.911MB (SLO: <41.000MB -2.7%) vs baseline: +5.1%
✅ ospathsplitext_noaspect - Time: ✅ 1.394µs (SLO: <10.000µs 📉 -86.1%) vs baseline: +0.9%; Memory: ✅ 39.892MB (SLO: <41.000MB -2.7%) vs baseline: +4.8%

🟡 Near SLO Breach (2 suites)
🟡 otelspan - 22/22
✅ add-event - Time: ✅ 38.609ms (SLO: <47.150ms 📉 -18.1%) vs baseline: -0.6%; Memory: ✅ 39.009MB (SLO: <47.000MB 📉 -17.0%) vs baseline: +4.8%
✅ add-metrics - Time: ✅ 259.207ms (SLO: <344.800ms 📉 -24.8%) vs baseline: +1.0%; Memory: ✅ 43.273MB (SLO: <47.500MB -8.9%) vs baseline: +4.9%
✅ add-tags - Time: ✅ 315.736ms (SLO: <321.000ms 🟡 -1.6%) vs baseline: +0.2%; Memory: ✅ 43.332MB (SLO: <47.500MB -8.8%) vs baseline: +4.9%
✅ get-context - Time: ✅ 79.009ms (SLO: <92.350ms 📉 -14.4%) vs baseline: +0.1%; Memory: ✅ 39.337MB (SLO: <46.500MB 📉 -15.4%) vs baseline: +5.2%
✅ is-recording - Time: ✅ 36.204ms (SLO: <44.500ms 📉 -18.6%) vs baseline: +0.3%; Memory: ✅ 38.826MB (SLO: <47.500MB 📉 -18.3%) vs baseline: +4.9%
✅ record-exception - Time: ✅ 56.974ms (SLO: <67.650ms 📉 -15.8%) vs baseline: ~same; Memory: ✅ 39.469MB (SLO: <47.000MB 📉 -16.0%) vs baseline: +5.1%
✅ set-status - Time: ✅ 42.363ms (SLO: <50.400ms 📉 -15.9%) vs baseline: -0.2%; Memory: ✅ 38.890MB (SLO: <47.000MB 📉 -17.3%) vs baseline: +5.2%
✅ start - Time: ✅ 35.153ms (SLO: <43.450ms 📉 -19.1%) vs baseline: -0.4%; Memory: ✅ 38.796MB (SLO: <47.000MB 📉 -17.5%) vs baseline: +5.1%
✅ start-finish - Time: ✅ 81.937ms (SLO: <88.000ms -6.9%) vs baseline: ~same; Memory: ✅ 36.628MB (SLO: <46.500MB 📉 -21.2%) vs baseline: +5.0%
✅ start-finish-telemetry - Time: ✅ 83.446ms (SLO: <89.000ms -6.2%) vs baseline: -0.2%; Memory: ✅ 36.648MB (SLO: <46.500MB 📉 -21.2%) vs baseline: +4.9%
✅ update-name - Time: ✅ 36.780ms (SLO: <45.150ms 📉 -18.5%) vs baseline: -0.2%; Memory: ✅ 38.899MB (SLO: <47.000MB 📉 -17.2%) vs baseline: +4.9%

🟡 telemetryaddmetric - 30/30
✅ 1-count-metric-1-times - Time: ✅ 2.943µs (SLO: <20.000µs 📉 -85.3%) vs baseline: -0.6%; Memory: ✅ 34.603MB (SLO: <35.500MB -2.5%) vs baseline: +5.0%
✅ 1-count-metrics-100-times - Time: ✅ 198.316µs (SLO: <220.000µs -9.9%) vs baseline: -1.9%; Memory: ✅ 34.505MB (SLO: <35.500MB -2.8%) vs baseline: +4.9%
✅ 1-distribution-metric-1-times - Time: ✅ 3.406µs (SLO: <20.000µs 📉 -83.0%) vs baseline: +0.6%; Memory: ✅ 34.564MB (SLO: <35.500MB -2.6%) vs baseline: +5.1%
✅ 1-distribution-metrics-100-times - Time: ✅ 217.381µs (SLO: <220.000µs 🟡 -1.2%) vs baseline: -0.7%; Memory: ✅ 34.564MB (SLO: <35.500MB -2.6%) vs baseline: +5.3%
✅ 1-gauge-metric-1-times - Time: ✅ 2.170µs (SLO: <20.000µs 📉 -89.2%) vs baseline: -0.3%; Memory: ✅ 34.505MB (SLO: <35.500MB -2.8%) vs baseline: +4.7%
✅ 1-gauge-metrics-100-times - Time: ✅ 136.835µs (SLO: <150.000µs -8.8%) vs baseline: -0.1%; Memory: ✅ 34.505MB (SLO: <35.500MB -2.8%) vs baseline: +5.0%
✅ 1-rate-metric-1-times - Time: ✅ 3.135µs (SLO: <20.000µs 📉 -84.3%) vs baseline: +0.8%; Memory: ✅ 34.505MB (SLO: <35.500MB -2.8%) vs baseline: +4.6%
✅ 1-rate-metrics-100-times - Time: ✅ 214.538µs (SLO: <250.000µs 📉 -14.2%) vs baseline: -0.6%; Memory: ✅ 34.544MB (SLO: <35.500MB -2.7%) vs baseline: +5.0%
✅ 100-count-metrics-100-times - Time: ✅ 20.063ms (SLO: <22.000ms -8.8%) vs baseline: +0.9%; Memory: ✅ 34.524MB (SLO: <35.500MB -2.7%) vs baseline: +4.8%
✅ 100-distribution-metrics-100-times - Time: ✅ 2.272ms (SLO: <2.300ms 🟡 -1.2%) vs baseline: +0.3%; Memory: ✅ 34.544MB (SLO: <35.500MB -2.7%) vs baseline: +4.9%
✅ 100-gauge-metrics-100-times - Time: ✅ 1.413ms (SLO: <1.550ms -8.8%) vs baseline: -0.2%; Memory: ✅ 34.465MB (SLO: <35.500MB -2.9%) vs baseline: +4.8%
✅ 100-rate-metrics-100-times - Time: ✅ 2.181ms (SLO: <2.550ms 📉 -14.5%) vs baseline: ~same; Memory: ✅ 34.485MB (SLO: <35.500MB -2.9%) vs baseline: +4.7%
✅ flush-1-metric - Time: ✅ 4.643µs (SLO: <20.000µs 📉 -76.8%) vs baseline: +0.7%; Memory: ✅ 34.426MB (SLO: <35.500MB -3.0%) vs baseline: +4.6%
✅ flush-100-metrics - Time: ✅ 175.771µs (SLO: <250.000µs 📉 -29.7%) vs baseline: +0.7%; Memory: ✅ 34.583MB (SLO: <35.500MB -2.6%) vs baseline: +5.0%
✅ flush-1000-metrics - Time: ✅ 2.113ms (SLO: <2.500ms 📉 -15.5%) vs baseline: +0.1%; Memory: ✅ 35.370MB (SLO: <36.500MB -3.1%) vs baseline: +4.8%
emmettbutler
left a comment
Hopefully this makes the test suite a little more reliable!
Voiceover: it didn't. It seems the failure to delete the tracer flares dir on CI is not a timing problem. Let's hope the switch to the native implementation for tracer flares fixes this.
Description
The previous backoff ran for 5 iterations starting at 0.1 seconds, which tops out at 0.5 seconds. However, tests are still failing, just a little less often, so let's increase it to 10 iterations, which should top out at 5.5 seconds. If this doesn't fix the issue, we probably have a problem with the tracer flare cleanup itself, at least in CI (I could not reproduce it locally).
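For reference, the two figures above are consistent with a linear backoff where attempt `i` waits `0.1 * i` seconds: with 5 iterations the longest single wait is 0.5 s, and with 10 iterations the waits add up to 5.5 s (the longest single wait being 1.0 s). This is only an assumption about the schedule; the actual cleanup helper is not shown here, and `linear_backoff_delays` and `remove_dir_with_backoff` are hypothetical names used purely for illustration.

```python
import shutil
import time
from pathlib import Path


def linear_backoff_delays(iterations, step=0.1):
    """Delays for an ASSUMED linear backoff: step, 2*step, ..., iterations*step."""
    return [round(step * i, 10) for i in range(1, iterations + 1)]


def remove_dir_with_backoff(path, iterations=10, step=0.1, sleep=time.sleep):
    """Hypothetical cleanup helper: retry deleting a directory, waiting longer each time.

    Returns True once the directory is gone, False if it still exists
    after all iterations.
    """
    for delay in linear_backoff_delays(iterations, step):
        # ignore_errors=True so a transient failure leads to a retry, not an exception
        shutil.rmtree(path, ignore_errors=True)
        if not Path(path).exists():
            return True
        sleep(delay)
    return not Path(path).exists()


old = linear_backoff_delays(5)
new = linear_backoff_delays(10)
print(f"5 iterations: longest wait {max(old)} s, total {sum(old):.1f} s")
print(f"10 iterations: longest wait {max(new)} s, total {sum(new):.1f} s")
```

Under this assumption, doubling the iteration count raises the worst-case total wait per cleanup from about 1.5 s to about 5.5 s, which is the cost the test suite pays when the directory never becomes deletable.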
Testing
Risks
Additional Notes