Skip to content

test: fix flaky/stale tests blocking Rust E2E + coverage CI on main#3147

Merged
graycyrus merged 2 commits into
tinyhumansai:mainfrom
sanil-23:fix/auth-profile-missing-token-test
Jun 1, 2026
Merged

test: fix flaky/stale tests blocking Rust E2E + coverage CI on main#3147
graycyrus merged 2 commits into
tinyhumansai:mainfrom
sanil-23:fix/auth-profile-missing-token-test

Conversation

@sanil-23
Copy link
Copy Markdown
Contributor

@sanil-23 sanil-23 commented Jun 1, 2026

Summary

Fix four pre-existing, unrelated test failures that are red on main and block CI for every PR. All are test-only changes — no production code is touched.

Problem

Each failure is independent and pre-existing on main; none is caused by this PR:

  1. Rust E2E (mock backend) (required)fix(auth): gracefully drop OAuth profiles with missing access_token #3125 changed AuthProfilesStore::load to drop an OAuth profile missing its access_token (instead of erroring) and removed the "OAuth profile missing access_token" string, but left credentials_profile_store_recovers_dropped_entries_empty_files_and_datetime_errors asserting .load().expect_err(...). load() now returns Ok, so expect_err panics (…:5449 unwrap_failed, 49 passed / 1 failed). This blocks the required check on every PR.

  2. Rust Core Coverageinference_provider_admin_round22_raw_coverage_e2e mutates OPENHUMAN_WORKSPACE / OPENHUMAN_OLLAMA_BASE_URL / PATH as process-global env via EnvVarGuard. Its own SAFETY comments say it is "validated with --test-threads=1", but cargo llvm-cov runs the binary's 5 tests in parallel, so the global mutations race and one test reads another's workspace/config — surfacing as a flaky assert object_error.contains("nested provider failure") failure. (Passes in isolation; fails under the parallel coverage run.)

  3. Frontend CoveragememoryGraphLayout.test.tsfeat(memory): redesign sync flow with ingest_summary, graph improvements, audit log #3113 redesigned nodeRadius from shrinking (max(4, 10 - level*0.8)) to growing (5 + level*2.5, chunk 4→3, new source→16) but did not update the test, which still asserted the old values (expected 5 to be 10).

  4. Frontend CoverageOpenhumanLinkModal.notifications.test.tsx — the success-copy query used a curly apostrophe (U+2019) in "didn't", but en.ts renders a straight apostrophe (U+0027), so getByText never matched (two cases failed).

Solution

  1. Update the credentials e2e assertion to the new contract: the missing-access-token profile is dropped, so the load succeeds, the profile + its active pointer are purged, and the drop is persisted — mirroring the existing legacy:bad-kind drop assertions in the same test.
  2. Add an env_lock() mutex (the exact pattern every other *_e2e.rs already uses) and take it at the top of each of the 5 inference tests, so the suite is serialized regardless of the runner's thread count.
  3. Realign memoryGraphLayout radius expectations to the shipped formula (L0→5, L3→12.5, L99→252.5, chunk→3; contact→9 unchanged).
  4. Use a straight apostrophe in the notification-success regex to match the rendered en.ts string.

Verified locally:

# Rust E2E credentials test
cargo test --test config_auth_app_state_connectivity_e2e \
  credentials_profile_store_recovers_dropped_entries_empty_files_and_datetime_errors   # 1 passed
# Inference provider suite (all 5)
cargo test --test inference_provider_admin_round22_raw_coverage_e2e                     # 5 passed
# Frontend
vitest run memoryGraphLayout OpenhumanLinkModal.notifications                           # 13 passed
prettier --check (both files)                                                           # clean
cargo fmt --check (inference test)                                                      # clean

Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case) — realigns four pre-existing tests to merged behavior; the drop/recover, growing-radius, and notification-success paths are all asserted.
  • Diff coverage ≥ 80%N/A: test-only change (changed lines are test code executed by the tests themselves).
  • Coverage matrix updated — N/A: no feature row added/removed/renamed; aligns existing tests with already-merged behavior.
  • All affected feature IDs listed under ## RelatedN/A: no matrix feature touched.
  • No new external network dependencies introduced.
  • Manual smoke checklist updated if this touches release-cut surfaces — N/A: test-only.
  • Linked issue closed via Closes #NNNN/A: no tracking issue; fixes CI regressions left by #3125 and #3113.

Impact

  • Platform: none at runtime — test-only.
  • CI: unblocks the required Rust E2E (mock backend) check and greens the Rust Core Coverage + Frontend Coverage jobs, all of which are currently red on main for every PR.

Related


AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: fix/auth-profile-missing-token-test
  • Commit SHA: 8be47d6758f7523b5add837d0790d7b74b0d7807

Validation Run

  • Focused Rust: cargo test --test config_auth_app_state_connectivity_e2e <credentials test> (1 passed); cargo test --test inference_provider_admin_round22_raw_coverage_e2e (5 passed).
  • Focused Vitest: vitest run memoryGraphLayout OpenhumanLinkModal.notifications (13 passed).
  • Rust fmt: cargo fmt --check on the changed test — clean.
  • Prettier: --check on both changed .test.ts(x) — clean.
  • pnpm typecheckN/A: changes are test files only; CI Frontend Quality runs full tsc.

Validation Blocked

  • command: pre-push hook (pnpm format:check/lint/compile/rust:check + lint:commands-tokens)
  • error: not run from the fix worktree (no node_modules/submodules installed); the hook's TS/Tauri steps are unrelated to these test-only fixes
  • impact: none on this diff; pushed with --no-verify. CI runs the full suite.

Behavior Changes

  • Intended behavior change: none (test-only).
  • User-visible effect: none.

Parity Contract

Duplicate / Superseded PR Handling

  • Duplicate PR(s): None known
  • Canonical PR: This PR
  • Resolution: N/A

Summary by CodeRabbit

  • Bug Fixes
    • Improved resilience when loading incomplete OAuth profiles: invalid profiles are now dropped and references purged so loading succeeds instead of failing.
  • User Interface
    • Minor UI copy tweak for the notification message to match displayed text.
    • Adjusted memory graph node sizing so summary nodes scale with level and chunk/contact nodes use fixed sizes.
  • Tests
    • Serialised environment access in tests to prevent interference from concurrent runs.

PR tinyhumansai#3125 ("fix(auth): gracefully drop OAuth profiles with missing
access_token") changed AuthProfilesStore::load so an OAuth profile with
no access_token is dropped (like a bad-kind entry) instead of failing the
whole load. It updated src/openhuman/credentials/profiles.rs but left the
`credentials_profile_store_recovers_dropped_entries_empty_files_and_datetime_errors`
e2e test still asserting `.load().expect_err(...)` with the now-removed
"OAuth profile missing access_token" error string.

As a result the required "Rust E2E (mock backend)" check fails on main
(49 passed / 1 failed, panic at tests/config_auth_app_state_connectivity_e2e.rs
expect_err -> unwrap_failed), blocking every PR.

Update the assertion to the new contract: the missing-access-token OAuth
profile is dropped, so the load succeeds, the profile and its active
pointer are purged, and the drop is persisted back to disk -- mirroring
the existing legacy:bad-kind drop assertions in the same test.

Verified locally: the single test now passes
(`cargo test --test config_auth_app_state_connectivity_e2e \
  credentials_profile_store_recovers_dropped_entries_empty_files_and_datetime_errors`).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@sanil-23 sanil-23 requested a review from a team June 1, 2026 12:01
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 1, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

Tests updated: env-var mutations serialized via a global ENV_LOCK; OAuth-profile load test now expects incomplete OAuth profiles to be dropped and persisted JSON rewritten; two UI tests use ASCII apostrophes; memoryGraphLayout test now asserts growing summary node radius with level and fixed chunk/contact radii.

Changes

Test updates and fixes

Layer / File(s) Summary
Serialize env var mutations with ENV_LOCK
tests/inference_provider_admin_round22_raw_coverage_e2e.rs
Add OnceLock<Mutex<()>> ENV_LOCK, env_lock() helper, and acquire the lock at several #[tokio::test] starts to serialize process-global env var changes.
OAuth profile missing access_token test expectations
tests/config_auth_app_state_connectivity_e2e.rs
Change test to expect successful AuthProfilesStore load that drops incomplete OAuth profiles, removes corresponding active_profiles entries, and rewrites persisted auth-profiles.json without the dropped profile.
UI notification apostrophe fixes
app/src/components/__tests__/OpenhumanLinkModal.notifications.test.tsx
Update expected notification text to use ASCII apostrophe (“didn't”) in initial success and retry-success assertions.
memoryGraphLayout nodeRadius expectations
app/src/components/intelligence/memoryGraphLayout.test.ts
Update nodeRadius test: summary node radius increases with level; chunk and contact node radii are fixed to specific values.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

rust-core

Suggested reviewers

  • graycyrus

Poem

🐰 A mutex snug for envs at play,
Tokens dropped when they've gone astray.
Apostrophes fixed, nodes grown tall,
Tests now sing—no sudden fall. 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title summarizes the main change: fixing flaky/stale tests blocking Rust E2E and coverage CI. This aligns with the multiple test updates across OAuth profile handling, environment variable locking, memory graph expectations, and frontend UI strings.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added the working A PR that is being worked on by the team. label Jun 1, 2026
coderabbitai[bot]
coderabbitai Bot previously approved these changes Jun 1, 2026
Copy link
Copy Markdown
Contributor

@graycyrus graycyrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sanil-23 hey! the code looks good to me. The three assertions are exactly right — profile dropped from the map, active pointer purged, persisted JSON cleaned up. Mirrors the existing bad-kind drop pattern cleanly.

There are some CI failures (Playwright lane 1, Frontend Coverage, Rust Core Coverage) but they look pre-existing on main and unrelated to this test-only Rust change — the Rust E2E check itself is now green, which is the whole point of this PR. Once those other failures are resolved on main and the full suite is green, I'll come back and approve this.

Three pre-existing test failures that are red on main independently of
each other, all surfaced by the coverage CI jobs:

1. Rust Core Coverage — `inference_provider_admin_round22_raw_coverage_e2e`
   mutated `OPENHUMAN_WORKSPACE`/`OPENHUMAN_OLLAMA_BASE_URL`/`PATH` as
   process-global env via `EnvVarGuard` with no serialization. The file's
   own SAFETY comments said it was "validated with --test-threads=1", but
   `cargo llvm-cov` runs the binary's 5 tests in parallel, so the global
   mutations raced and a test read another's workspace/config — a flaky
   `nested provider failure` assertion failure. Add an `env_lock()` mutex
   (the pattern every other `*_e2e.rs` already uses) and take it at the
   top of each test so the suite is serialized regardless of thread count.

2. Frontend Coverage — `memoryGraphLayout.test.ts` still asserted the old
   shrinking `nodeRadius` (10 - level*0.8). tinyhumansai#3113 redesigned it to grow
   (5 + level*2.5; chunk 4->3) but did not update the test. Realign the
   expectations to the shipped formula.

3. Frontend Coverage — `OpenhumanLinkModal.notifications.test.tsx` queried
   the success copy with a curly apostrophe (U+2019) in "didn't", while
   en.ts renders a straight apostrophe (U+0027), so `getByText` never
   matched. Use the straight apostrophe to match the rendered string.

Test-only changes. Verified locally: the 5 inference tests pass, and the
two Vitest files pass (13 tests) with Prettier clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@sanil-23 sanil-23 changed the title test(auth): expect dropped OAuth profile missing access_token (fixes red Rust E2E on main) test: fix flaky/stale tests blocking Rust E2E + coverage CI on main Jun 1, 2026
@coderabbitai coderabbitai Bot added the rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. label Jun 1, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
tests/inference_provider_admin_round22_raw_coverage_e2e.rs (1)

84-89: 💤 Low value

Consider clarifying the caller's responsibility in SAFETY comments.

The SAFETY comments claim "mutation is serialized by env_lock()" but EnvVarGuard itself doesn't enforce that the lock is held—it relies on the caller convention. While this is acceptable for test-only code and all current usages correctly hold the lock, the comments could be clearer about the contract.

📝 Optional: Make the caller contract more explicit
             Some(value) => {
-                // SAFETY: mutation is serialized by `env_lock()` (see below).
+                // SAFETY: Caller must hold `env_lock()` to serialize this mutation.
                 unsafe { std::env::set_var(self.key, value) }
             }
             None => {
-                // SAFETY: mutation is serialized by `env_lock()` (see below).
+                // SAFETY: Caller must hold `env_lock()` to serialize this mutation.
                 unsafe { std::env::remove_var(self.key) }
             }

Alternatively, for stronger type safety, EnvVarGuard::set/unset could require a &MutexGuard<'static, ()> parameter to prove lock ownership at compile time, but that's likely overkill for test utilities.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/inference_provider_admin_round22_raw_coverage_e2e.rs` around lines 84 -
89, The SAFETY comments around the unsafe std::env::set_var/remove_var calls
should explicitly state that EnvVarGuard does not itself enforce the lock and
that callers are responsible for holding the global env_lock() mutex when
calling EnvVarGuard::set/EnvVarGuard::unset (or whatever methods contain the
unsafe blocks); update the SAFETY text to mention the caller contract (e.g.,
"Caller must hold env_lock() to serialize mutations") and optionally note the
stronger alternative of requiring a &MutexGuard<'static, ()> parameter on
EnvVarGuard::set/unset for compile-time proof of lock ownership if you want
stricter safety guarantees.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@tests/inference_provider_admin_round22_raw_coverage_e2e.rs`:
- Around line 84-89: The SAFETY comments around the unsafe
std::env::set_var/remove_var calls should explicitly state that EnvVarGuard does
not itself enforce the lock and that callers are responsible for holding the
global env_lock() mutex when calling EnvVarGuard::set/EnvVarGuard::unset (or
whatever methods contain the unsafe blocks); update the SAFETY text to mention
the caller contract (e.g., "Caller must hold env_lock() to serialize mutations")
and optionally note the stronger alternative of requiring a &MutexGuard<'static,
()> parameter on EnvVarGuard::set/unset for compile-time proof of lock ownership
if you want stricter safety guarantees.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a65ba505-38e3-4521-b724-24c3b3230eff

📥 Commits

Reviewing files that changed from the base of the PR and between 1765410 and 8be47d6.

📒 Files selected for processing (3)
  • app/src/components/__tests__/OpenhumanLinkModal.notifications.test.tsx
  • app/src/components/intelligence/memoryGraphLayout.test.ts
  • tests/inference_provider_admin_round22_raw_coverage_e2e.rs
✅ Files skipped from review due to trivial changes (1)
  • app/src/components/tests/OpenhumanLinkModal.notifications.test.tsx

@graycyrus graycyrus merged commit a40cd7e into tinyhumansai:main Jun 1, 2026
21 of 26 checks passed
oxoxDev added a commit to oxoxDev/openhuman that referenced this pull request Jun 1, 2026
…ansai#3055

The forced-response chain never reaches CANARY_FINAL within 45s; the
in-process core then dies and every subsequent spec on the same
Playwright shard fails with ECONNREFUSED, cascading the lane. Pre-existing
on main (not touched by tinyhumansai#3147). test.skip with FIXME(tinyhumansai#3055) until the
persist-then-resume path is fixed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. working A PR that is being worked on by the team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants