Skip to content

Conversation

@jason-lynch
Copy link
Member

@jason-lynch jason-lynch commented Jan 25, 2026

Summary

Simplifies our Makefile by combining some targets and adds an option to rerun flaky tests. This option is useful because we have some occasional port conflict issues when the OS assigns an ephemeral port that we're going to use for an instance of Etcd or other service. These issues are extremely difficult to avoid because we can't easily eliminate the gap between our port allocation and when the service binds to the ports. Simply re-running the tests is a much easier option.

Testing

See the .circleci.yaml file for the changed commands.

Summary by CodeRabbit

  • Chores

    • CI now automatically reruns failing tests (up to 2 attempts) to reduce flakes.
    • Testing and linting workflow streamlined for more consistent CI output.
  • Bug Fixes

    • Improved test diagnostics: richer health reporting and standardized container log capture for easier debugging.

✏️ Tip: You can customize this high-level summary in your review settings.

Simplifies our Makefile by combining some targets and adds an option to
rerun flaky tests. This option is useful because we have some occasional
port conflict issues when the OS assigns an ephemeral port that we're
going to use for an instance of Etcd or other service. These issues are
extremely difficult to avoid because we can't easily eliminate the gap
between our port allocation and when the service binds to the ports.
Simply re-running the tests is a much easier option.
@coderabbitai
Copy link

coderabbitai bot commented Jan 25, 2026

📝 Walkthrough

Walkthrough

The CI and Makefile gains test rerun support (TEST_RERUN_FAILS), linting is unified, health assertions now include JSON-formatted component diagnostics, and container log handling is refactored into a reusable test helper.

Changes

Cohort / File(s) Summary
CI Configuration
.circleci/config.yml
Adds TEST_RERUN_FAILS=2 to CI make invocations for CI, cluster, and e2e jobs.
Build & Test Makefile
Makefile
Adds CI and TEST_RERUN_FAILS variables (default 0); introduces gotestsum and golangci-lint declarations; adds --rerun-fails/--rerun-fails-max-failures flags and --packages usage across test targets; consolidates lint target and updates ci dependency.
Common Make settings
common.mk
Renames golangci-lint variable from golangcilint to golangci-lint (same value).
Cluster test diagnostics
clustertest/cluster_test.go
Enhances Cluster.AssertHealthy to include marshaled JSON (indented) of host.Status.Components in failure messages; adds import for encoding/json.
Host test logging helper
clustertest/host_test.go
Replaces inline container-log handling with new printContainerLogs(ctx, t, hostID, container) helper; centralizes log extraction and printing in error/cleanup paths.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • rshoemaker

Poem

🐇 I nibble logs and tidy trails,
Retrying tests when error hails,
JSON lights the health-check night,
Containers sing their output bright,
CI hops onward—snug and spry!

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: adding test rerun functionality for flaky tests in the CI system.
Description check ✅ Passed The description is mostly complete with Summary and Testing sections, but is missing the Changes, Checklist, and Notes for Reviewers sections from the template.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@clustertest/host_test.go`:
- Around line 121-124: The cleanup path uses t.Context() (which is canceled)
when calling printContainerLogs, causing log extraction to fail; update
printContainerLogs calls to use the active cleanup context instead of
t.Context() — either change the helper signature to accept a ctx (e.g.,
printContainerLogs(ctx, t, id, ctr)) and pass the non-canceled cleanup ctx, or
have printContainerLogs create a short-lived background/timeout context
internally (context.WithTimeout(context.Background(), ...)) before fetching
logs; apply the same fix to the other printContainerLogs usages in this test.

@jason-lynch jason-lynch force-pushed the test/rerun-flaky-tests branch from d333f55 to 146650c Compare January 25, 2026 19:12
Copy link
Contributor

@rshoemaker rshoemaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approved.

@jason-lynch jason-lynch merged commit 6aae525 into main Jan 27, 2026
3 checks passed
@jason-lynch jason-lynch deleted the test/rerun-flaky-tests branch January 27, 2026 13:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants