[MINOR] Upgrade Surefire to 3.5.2 and harden fork shutdown #2473

Merged: Baunsgaard merged 1 commit into apache:main from Baunsgaard:fix-component-c-fork-hang on May 28, 2026

@Baunsgaard
Contributor

  • Bump maven-surefire-plugin from 3.0.0 to 3.5.2.
  • Switch the fork <-> Surefire control channel to TCP via SurefireForkNodeFactory. The default pipe-based channel can deadlock on fork shutdown when child threads are still writing to stdout/err (SUREFIRE-1722).
  • Add forkedProcessExitTimeoutInSeconds=30 so a fork that still has live non-daemon threads after its test class completes is forcibly killed instead of stalling the run.
  • Enable enableProcessChecker=native so each fork detects a dead parent and self-terminates, and Surefire can more reliably kill stuck forks on Linux.

These changes target the intermittent ~26 minute hangs between test classes seen in the component.c suite.
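
As a rough illustration, the settings listed above might appear in the `maven-surefire-plugin` section of `pom.xml` along these lines. This is a sketch, not the PR's actual diff: the placement inside `<configuration>` and the fully qualified `forkNode` class name are assumptions based on the Surefire documentation, and any other configuration the project already has is omitted.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <version>3.5.2</version>
  <configuration>
    <!-- Use a TCP control channel between Maven and the fork instead of the default pipes. -->
    <forkNode implementation="org.apache.maven.plugin.surefire.extensions.SurefireForkNodeFactory"/>
    <!-- Kill a fork that has not exited 30 seconds after its tests completed. -->
    <forkedProcessExitTimeoutInSeconds>30</forkedProcessExitTimeoutInSeconds>
    <!-- Let the fork check whether the parent Maven process is still alive. -->
    <enableProcessChecker>native</enableProcessChecker>
  </configuration>
</plugin>
```

Besides `native`, `enableProcessChecker` also accepts `ping` and `all`; which one suits a given CI environment is a judgment call.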

@codecov

codecov Bot commented May 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 71.39%. Comparing base (88c26e2) to head (bc702e0).

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #2473      +/-   ##
============================================
+ Coverage     71.37%   71.39%   +0.01%     
- Complexity    48749    48780      +31     
============================================
  Files          1571     1571              
  Lines        188912   188912              
  Branches      37067    37067              
============================================
+ Hits         134845   134868      +23     
+ Misses        43601    43595       -6     
+ Partials      10466    10449      -17     


Fix intermittent ~26 minute hangs in the **.component.c**.** GitHub
Actions job. The forks were finishing their test classes but failing
to exit cleanly (leaked non-daemon threads from test executors), and
Surefire 3.0.0 did not reliably enforce its fork shutdown timeout.

- Bump maven-surefire-plugin from 3.0.0 to 3.5.2.
- Set forkedProcessExitTimeoutInSeconds=30 so a fork that still has
  live threads after its test class completes is forcibly killed
  instead of stalling the run.
- Set enableProcessChecker=native so each fork detects a dead parent
  and self-terminates, and Surefire can more reliably kill stuck
  forks on Linux.

Verified across a full Java test matrix: component.c is stable, no
job exceeds its 30 minute cap, and aggregate compute is unchanged.

Baunsgaard force-pushed the fix-component-c-fork-hang branch from 1b65808 to bc702e0 on May 28, 2026 11:11
Baunsgaard merged commit 9a4e2a3 into apache:main on May 28, 2026
50 checks passed
github-project-automation Bot moved this from In Progress to Done in SystemDS PR Queue on May 28, 2026