Tag retried-test non-final attempts as skip in JUnit report by mhdatie · Pull Request #11443 · DataDog/dd-trace-java

mhdatie · 2026-05-21T20:02:37Z

What Does This Do

Add JUnitReport.tagRetriedTests() and wire it into the ResultCollector post-processing pipeline between tagSyntheticFailures() and tagFinalStatuses().

The new method scans every <testcase> element and, when a later sibling shares the same (classname, name), marks the earlier one with dd_tags[test.final_status]=skip via the existing addFinalStatusProperty(..., APPEND_TO_TESTCASE) helper.

Net effect: only the final attempt of a retried test keeps its real status; earlier attempts are tagged skip.

Ordering matters — tagRetriedTests must run before tagFinalStatuses, otherwise the per-testcase fallback tagger would overwrite the skip with fail.
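To make the scan concrete, here is a minimal sketch of the tagging pass, not the code in this PR. The class name `RetriedTestTagger` and the `addFinalStatus` helper are hypothetical stand-ins (the real change goes through `addFinalStatusProperty(..., APPEND_TO_TESTCASE)`). For brevity it matches duplicates across the whole document rather than only among siblings.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class RetriedTestTagger {
  static final String FINAL_STATUS = "dd_tags[test.final_status]";

  /** Tags every non-final attempt of a (classname, name) pair as skip; returns how many were tagged. */
  static int tagRetriedTests(Document doc) {
    NodeList all = doc.getElementsByTagName("testcase");
    Set<List<String>> seenLater = new HashSet<>();
    int tagged = 0;
    // Walk backwards: the first time a key is seen is its final attempt, which is left untouched.
    for (int i = all.getLength() - 1; i >= 0; i--) {
      Element testcase = (Element) all.item(i);
      List<String> key = List.of(testcase.getAttribute("classname"), testcase.getAttribute("name"));
      if (!seenLater.add(key)) {
        addFinalStatus(doc, testcase, "skip");
        tagged++;
      }
    }
    return tagged;
  }

  // Hypothetical stand-in for the existing helper: appends a <properties><property/></properties> child.
  static void addFinalStatus(Document doc, Element testcase, String status) {
    Element properties = doc.createElement("properties");
    Element property = doc.createElement("property");
    property.setAttribute("name", FINAL_STATUS);
    property.setAttribute("value", status);
    properties.appendChild(property);
    testcase.appendChild(properties);
  }

  static Document parse(String xml) throws Exception {
    return DocumentBuilderFactory.newInstance()
        .newDocumentBuilder()
        .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
  }

  public static void main(String[] args) throws Exception {
    Document doc = parse(
        "<testsuite>"
            + "<testcase classname=\"A\" name=\"flaky\"><failure/></testcase>"
            + "<testcase classname=\"A\" name=\"flaky\"/>"
            + "<testcase classname=\"A\" name=\"other\"/>"
            + "</testsuite>");
    // Only the first A.flaky attempt is non-final, so this prints 1.
    System.out.println("non-final attempts tagged: " + tagRetriedTests(doc));
  }
}
```

Walking the list in reverse with a set keeps the pass linear in the number of testcases. A nested forward scan gives the same result for small reports.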

Motivation

When Develocity's testRetry plugin retries a failed test, the JUnit XML contains one <testcase> per attempt, all sharing the same (classname, name). CI treats the test as green if the final attempt passes, but Test Optimization was tagging the earlier failed attempts as final_status=fail and surfacing them as real failures.

Example trace where a retried-then-passed test shows up as a failure

Additional Notes

Alternatives considered

Path B: generalize and delete the initializationError branch from tagSyntheticFailures. The existing initializationError handling is a strict subset of the new logic: same group-and-skip-all-but-last pattern, just restricted to one literal name. Folding it into tagRetriedTests would remove a few lines, but was rejected because:

The overlap is cost-free: addFinalStatusProperty short-circuits when a final_status property already exists, so the second tag against the same testcase is a no-op.
The initializationError branch carries source-pinned links to JUnit 4 and Gradle that justify why that specific literal is treated as a framework synthetic. Deleting it sheds context that was deliberately added in "Stop skipping "test exception" test cases, they are not synthetic" (#11427).
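The short-circuit behaviour described above can be illustrated with a small model. This is a sketch under assumptions, not the real helper: it represents a testcase's properties as a plain map, and `FinalStatusTags` and `addFinalStatusIfAbsent` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class FinalStatusTags {
  static final String FINAL_STATUS = "dd_tags[test.final_status]";

  /** Sets the final status only when none exists yet; returns true if the map changed. */
  static boolean addFinalStatusIfAbsent(Map<String, String> properties, String status) {
    return properties.putIfAbsent(FINAL_STATUS, status) == null;
  }

  public static void main(String[] args) {
    Map<String, String> properties = new LinkedHashMap<>();
    // First tagger wins: the retry pass marks the attempt as skip.
    System.out.println(addFinalStatusIfAbsent(properties, "skip")); // true
    // A later tagger finds an existing final_status and does nothing.
    System.out.println(addFinalStatusIfAbsent(properties, "fail")); // false
    System.out.println(properties.get(FINAL_STATUS)); // skip
  }
}
```

Under this model, whichever pass tags a testcase first decides its final status. That is why overlapping taggers are harmless and why the pass order matters.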

Scope

Touches only .gitlab/collect-result/ (the build-time JUnit XML post-processor that uploads results to Test Optimization). No agent / tracer code paths are affected — CI tooling, not user-visible.

Jira ticket: APMLP-1297

The Develocity testRetry plugin emits one <testcase> per attempt sharing the same (classname, name); CI treats only the final attempt as authoritative but Test Optimization was surfacing earlier failed attempts as real failures. Mark non-final attempts with final_status=skip in the JUnit XML post-processor so reports match what CI considers the outcome.

Jira: APMLP-1297
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

mhdatie · 2026-05-21T20:05:19Z

@bric3 Tagged you for early review so you can compare this change with the alternative described in the PR description before I open it up for wider review 🙇.

bric3

I'm getting a tad concerned by the lack of visibility on retries.

Also, as a follow-up on the initiative, here's the previous work from @cbeauchesne

bric3 · 2026-05-22T16:12:26Z

    var reportChangedBeforeFinalStatus = report.addFileAttribute(sourceFile);
    reportChangedBeforeFinalStatus |= report.normalizeStableTestNames();
    report.tagSyntheticFailures();
+    report.tagRetriedTests();


issue: This runs after normalizeStableTestNames(), so two different tests could end up with the same name, and one of them could be silently skipped.

E.g.

localhost:12345 with final_status=fail => after normalization: localhost:PORT, and this one would incorrectly switch to final_status=skip

localhost:23456 with final_status=pass => after normalization: localhost:PORT

Great catch.

Do we normalize localhost failures and then just look at the count in Test Optimization?

What do you think of excluding the normalized cases from this operation?
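To illustrate both the collision and the proposed exclusion, here is a rough sketch. It assumes the normalization rewrites `localhost:<digits>` to `localhost:PORT`, as in the example above, and that the pre-normalization names are still available to the retry pass. `NormalizedRetryGuard` and its methods are hypothetical, not part of the collector.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

final class NormalizedRetryGuard {
  // Assumed normalization rule, modelled on the localhost:PORT example.
  static final Pattern PORT = Pattern.compile("localhost:\\d+");

  static String normalize(String name) {
    return PORT.matcher(name).replaceAll("localhost:PORT");
  }

  /**
   * Returns the indexes of attempts that would be tagged skip: earlier duplicates of a name,
   * ignoring any name that normalization would change.
   */
  static List<Integer> nonFinalAttempts(List<String> rawNames) {
    Set<String> seenLater = new HashSet<>();
    List<Integer> toSkip = new ArrayList<>();
    for (int i = rawNames.size() - 1; i >= 0; i--) {
      String raw = rawNames.get(i);
      if (!normalize(raw).equals(raw)) {
        continue; // normalized names may merge distinct tests, so they are excluded
      }
      if (!seenLater.add(raw)) {
        toSkip.add(i);
      }
    }
    return toSkip;
  }

  public static void main(String[] args) {
    List<String> names = List.of(
        "connects to localhost:12345", // distinct test, collides only after normalization
        "connects to localhost:23456",
        "flaky",                       // genuine retry: same raw name twice
        "flaky");
    System.out.println(nonFinalAttempts(names));
  }
}
```

With the guard in place, only the first `flaky` attempt is a candidate for skip. The two `localhost` tests keep their own statuses even though they share a name after normalization.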

I'm getting a tad concerned by the lack of visibility on retries.

Can you elaborate? It would be great to auto-detect retries. Are we detecting these manually today? (cc @cbeauchesne) I know that flaky tests are labelled in Datadog, and from there we can tell whether they were retried (grouped by name).

On catching retries proactively, I'm not sure whether the @DataDog/ci-app-libraries team has plans to auto-detect these retries.

I did some planning with Claude, and the suggestion was to:

Spend time understanding how detection can be implemented with Develocity's testRetry plugin (Gradle TestListener, JUnit 5 TestExecutionListener, etc)

Use CiVisibilityMetricCollector and CiVisibilityCountMetric since they have the metadata/tags to detect retries

Document ownership

Claude Details

Data model — trivial

Add a RetryReason.develocity enum value in internal-api/.../telemetry/tag/RetryReason.java. The existing TEST_EVENT_FINISHED count metric and Tags.TEST_RETRY_REASON span tag pick it up automatically.

Detection — the real work

Identifying that a given test execution is a Develocity retry requires one of:

A new instrumentation module hooking the plugin's retry executor class.

Extending the existing Gradle build-system instrumentation in agent-ci-visibility to detect the testRetry extension on the Test task and propagate a sysprop into forked test JVMs.

A same-session re-execution heuristic — fallback only, if neither of the above is viable.

Wiring

The resulting signal lands in TestEventsHandlerImpl.java:263, and must not override existing atr / efd / attemptToFix reasons.

Open decisions

Whether to also set IsRetry — probably yes.

Whether to set HasFailedAllRetries — probably no, since Develocity controls the retry budget out-of-process.

Validation

A smoke test under dd-smoke-tests/ with the plugin applied to a flaky test, plus a telemetry assertion that event_finished carries retry_reason:develocity_test_retry.

Ownership

RetryReason.java, TestEventsHandlerImpl.java, and anything under agent-ci-visibility/ are owned by @DataDog/ci-app-libraries, so the change will require their review even if apm-lang-platform-java drives it.

Recommendation

Do a short Phase-0 spike on the Develocity plugin's internals before committing to a detection approach.

daniel-mohedano · 2026-05-25

Hey @mhdatie! Unfortunately, what Claude suggested here seems related to the testing instrumentation inside the tracer itself. In our case it wouldn't apply, because we don't instrument the tracer's tests with the tracer itself. The test reporting is based entirely on the JUnit XML report generated by the testing framework. I've discussed this with the team, and I'm not sure there would be an easy way to detect the retries on our side during ingestion unless the report included some additional information to inform this. There are several limitations: for example, the ingestion order does not necessarily follow the execution order, which could lead to retries being tagged incorrectly.

Copilot

Pull request overview

This PR updates the .gitlab/collect-result JUnit XML post-processor so that when the same test (classname, name) appears multiple times (as produced by Develocity’s testRetry), only the final attempt retains its computed status while earlier attempts are tagged with dd_tags[test.final_status]=skip to avoid surfacing them as real failures in Test Optimization.

Changes:

Adds JUnitReport.tagRetriedTests() to tag non-final retry attempts as skip.
Wires tagRetriedTests() into ResultCollector between tagSyntheticFailures() and tagFinalStatuses() to prevent later fallback tagging from overriding the skip status.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
`.gitlab/collect-result/ResultCollector.java`	Calls `tagRetriedTests()` during report post-processing, before final status tagging.
`.gitlab/collect-result/JUnitReport.java`	Implements `tagRetriedTests()` to identify duplicate `(classname, name)` testcases and tag earlier ones as `skip`.


mhdatie · 2026-05-22T19:51:54Z

+    for (var i = 0; i < all.size(); i++) {
+      var current = all.get(i);
+      var classname = current.getAttribute("classname");
+      var name = current.getAttribute("name");
+      for (var j = i + 1; j < all.size(); j++) {


Will apply based on outcome of above convo #11443 (review)

mhdatie added the labels tag: ai generated (Largely based on code generated by an AI or LLM), tag: no release notes (Changes to exclude from release notes), type: bug (Bug report and fix), and comp: tooling (Build & Tooling) · May 21, 2026

mhdatie requested a review from bric3 May 21, 2026 20:04


bric3 requested a review from cbeauchesne May 22, 2026 15:58

bric3 reviewed May 22, 2026


mhdatie requested a review from Copilot May 22, 2026 19:24

Copilot started reviewing on behalf of mhdatie May 22, 2026 19:25

Copilot AI reviewed May 22, 2026

mhdatie wants to merge 1 commit into master from mo.atie/apmlp-1297-skip-on-retried

daniel-mohedano May 25, 2026
