test: deflake logs -f follow tests and kubo restart-race detect window by Rinse12 · Pull Request #78 · bitsocialnet/bitsocial-cli

Rinse12 · 2026-06-08T11:41:59Z

Fixes the two flaky CI failures seen on master run 27133947713 (macos + ubuntu). Both tests pass locally; the failures were CI-runner-load timing races, confirmed by reading the job logs.

Changes

`logs -f` follow tests (`logs.test.ts`) — macos failure

The three bitsocial logs -f log file rotation tests killed the CLI on a fixed 8s timer anchored to process spawn. That window had to cover CLI cold-start (oclif/pkc-js import, several seconds) plus printing the initial marker, detecting the file change, and flushing the second marker. On a slow runner, startup ate the budget and the process was SIGINT'd before the awaited line was emitted — macos saw the initial line but never APPENDED_LINE.

Replaced the inline duplicated promise logic with a shared runFollowUntil() helper that:

kills the process as soon as the awaited output is observed (doneWhen), decoupling correctness from startup latency;
supports an optional graceMs so the --stdout filter test can still catch an erroneously-unfiltered stderr line before killing (negative assertion stays meaningful);
has a generous 25s safety cap (under the 30s testTimeout so a genuine hang still fails as an assertion, not a timeout).

These tests now finish in ~1–5s locally instead of always 8s, while tolerating up to 25s of startup slowness on CI.

kubo restart-race test (`daemon-kubo-restart-race.test.ts`) — ubuntu failure

The failing line (expect(kuboRestarted).toBe(true)) is a setup precondition, not the assertion under test: after /shutdown, the daemon must auto-restart kubo before we can verify it dies on SIGTERM. The 30s detect window was too tight given the test injects PKC_CLI_TEST_IPFS_READY_DELAY_MS=5000 and the runner may be loaded with parallel kubo nodes. It ran 43.8s before failing on ubuntu CI but lands in ~13s locally. Widened to 60s — still well within the test's 120s cap.

No production code changed; test-harness robustness only.

Verification

Both files pass locally after the change:

logs.test.ts: 29/29 passed (file 51s → 36s)
daemon-kubo-restart-race.test.ts: 4/4 passed

Refs #77

Summary by CodeRabbit

Tests
- Improved daemon shutdown test reliability by expanding timing detection windows to reduce CI flakiness
- Enhanced log-following tests with output-based detection mechanisms for better test robustness

The logs -f rotation tests killed the CLI on a fixed 8s timer anchored to process spawn, so the budget for the awaited output had to share that window with CLI cold-start (oclif/pkc-js import). On a slow CI runner startup ate the budget and the process was killed before the second marker was emitted (macos failure: stdout had the initial line but not APPENDED_LINE). Replace the fixed window with a shared runFollowUntil() helper that kills as soon as the awaited output is observed (with an optional grace period and a generous 25s safety cap), decoupling correctness from startup latency. Tests now finish in ~1-5s locally instead of always 8s. Also widen the restart-race setup precondition (restarted-kubo detect) from 30s to 60s: it must absorb the injected 5s ready-delay plus restart under runner load. Ran 43.8s before failing on ubuntu CI; ~13s locally. Still well within the test's 120s cap. Refs #77

coderabbitai · 2026-06-08T11:42:11Z

Warning

Review limit reached

@Rinse12, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 28 minutes and 57 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ae191b21-9e63-41ae-85c6-16e8b8444fcb

📥 Commits

Reviewing files that changed from the base of the PR and between 89371c6 and c26e21f.

📒 Files selected for processing (2)

test/cli/daemon.test.ts
test/cli/logs.test.ts

📝 Walkthrough

Walkthrough

This PR hardens two test suites to reduce CI timing flakiness: the kubo restart-race test gains a documented, expanded detection window, and three log-follow tests replace fixed-timeout kill logic with output-anchored termination via a new runFollowUntil helper.

Changes

Test Timing Hardening

Layer / File(s)	Summary
Kubo Restart Detection Window Expansion `test/cli/daemon-kubo-restart-race.test.ts`	Adds explanatory comment describing the `/shutdown` precondition and restart timing requirements, then increases the `waitForCondition` timeout for detecting kubo restart from 30s to 60s to reduce slow-CI flakiness.
Output-Anchored Log-Follow Test Helper `test/cli/logs.test.ts`	Introduces `runFollowUntil` helper to spawn `bitsocial logs -f` and monitor accumulated stdout/stderr, terminating via SIGINT when a `doneWhen` predicate is satisfied (with optional `graceMs` and `maxMs` safety). Replaces fixed 8s timeouts in three log-follow tests: the first rotation test now detects "NEW_FILE_MARKER" in stdout and "switched to new log file" in stderr; the stdout-filtered rotation test watches for "new stdout msg" with a grace period to catch stderr leaks; the file-append test terminates when "APPENDED_LINE" appears.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related issues

bitsocialnet/bitsocial-cli#77: The changes implement the exact fixes proposed: expanding the restart-detection timeout and replacing fixed-delay log-follow kill timers with output-anchored runFollowUntil completion.
bitsocialnet/bitsocial-cli#75: The runFollowUntil refactoring directly addresses the Windows CI flakiness by driving subprocess shutdown off observed output rather than a fixed wall-clock deadline.

Possibly related PRs

bitsocialnet/bitsocial-cli#43: Updates test/cli/logs.test.ts log-follow tests using output-anchored completion, exercising the follow-mode file-reading behavior in src/cli/commands/logs.ts.
bitsocialnet/bitsocial-cli#71: Both PRs modify test/cli/daemon-kubo-restart-race.test.ts kubo restart/shutdown timing for issue #70; the main PR's expanded restart-detection window builds on the regression test coverage from the retrieved PR.
bitsocialnet/bitsocial-cli#41: The main PR updates test/cli/logs.test.ts to follow bitsocial logs -f by observing output markers, complementing the retrieved PR's watchFile → userspace polling implementation change.

Poem

🐰 A log file watched by timing's cruel hand,
Now springs to life on markers it can understand.
No more the clock; let output be the guide—
CI flakes flee fast when polls turn the tide! ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the two main changes: deflaking the logs -f follow tests and increasing the kubo restart-race detection window, matching the PR objectives.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch test/harden-flaky-follow-restart-tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

The 'stops kubo when daemon exits during a restart cycle' test failed on ubuntu CI at line 418 (expect(kuboRestarted).toBe(true)) — the same restart-detect precondition class as the sibling test in daemon-kubo-restart-race.test.ts. Two problems found by reading the CI job log: 1. Diagnostics gap: the daemon DEBUG-log dump only fired when the LATER assertion (kuboStoppedAfterKill, line 451) failed, not the restart-detect at line 418 that actually fails. So the CI log contained only the bare assertion with no daemon timeline — couldn't see what kubo did. Extracted the dump into dumpDiagnostics() and call it on BOTH precondition failures. 2. Window too tight under CI contention: vitest runs test files in parallel forks (fileParallelism) and a 2-vCPU ubuntu runner spawns many kubo nodes concurrently (this test ran alongside daemon-kubo-restart-race.test.ts's 4 kubo nodes). Restart lands in ~13s locally but was observed taking >20s on CI. Widened the detect window 20s -> 45s and raised the test timeout 60s -> 120s to match the sibling test. If it still fails, dumpDiagnostics() now prints the daemon's kubo restart/kill timeline so we can distinguish contention from a real bug. Refs #77

The 'continues watching old file' test failed again on ubuntu CI, this time waiting the full 25s safety cap with APPENDED_LINE never appearing — so it is NOT a timing-window flake. The two sibling follow tests (which detect a NEW file via the 3s full-reread poll) passed fast; only this one, which appends to the file already being followed (surfaced by the incremental byte-offset poll), fails. Cannot reproduce locally even under CPU saturation (32 cores). Add targeted diagnostics that fire when APPENDED_LINE is missing, to pin down which of three causes it is on the next CI run: - append never reached disk (appendTriggered / appendError + on-disk content) - logs -f never detected it (on-disk has the line, captured stdout does not) - output truncated on SIGINT exit (timedOut + line on disk but not captured) Also surface appendFile rejections (previously swallowed) and have the helper report whether it hit the safety cap. Refs #77

Rinse12 mentioned this pull request Jun 8, 2026

Flaky CI: logs -f follow tests and kubo restart-race test fail under CI runner load #77

Open

Rinse12 added 2 commits June 8, 2026 11:59

Rinse12 merged commit 9767222 into master Jun 8, 2026
4 checks passed

Rinse12 mentioned this pull request Jun 9, 2026

fix(logs): close race that dropped appends in -f follow mode #79

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test: deflake logs -f follow tests and kubo restart-race detect window#78

test: deflake logs -f follow tests and kubo restart-race detect window#78
Rinse12 merged 3 commits into
masterfrom
test/harden-flaky-follow-restart-tests

Rinse12 commented Jun 8, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 8, 2026 •

edited

Loading

Review limit reached

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Possibly related PRs

Poem

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Rinse12 commented Jun 8, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes

logs -f follow tests (logs.test.ts) — macos failure

kubo restart-race test (daemon-kubo-restart-race.test.ts) — ubuntu failure

Verification

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Possibly related PRs

Poem

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Rinse12 commented Jun 8, 2026 •

edited by coderabbitai Bot

Loading

`logs -f` follow tests (`logs.test.ts`) — macos failure

kubo restart-race test (`daemon-kubo-restart-race.test.ts`) — ubuntu failure

coderabbitai Bot commented Jun 8, 2026 •

edited

Loading