test: expand unit + e2e coverage from test-map (batch 3)#1778
Conversation
Adds Rust unit tests and mega-flow scenarios for 7 high-priority features picked from test-map.md, continuing the pattern of tinyhumansai#1724 (batch 1) and tinyhumansai#1752 (batch 2). Rust (124 new tests, all passing): - update/ops_tests.rs (11): update.version contract — CARGO_PKG_VERSION, target triple, openhuman-core-{triple} asset prefix, no remote calls; update.run orchestration result shapes (already_current, missing_asset, apply_failure, supervisor strategy); update_run rejected when rpc_mutations_enabled=false. - service/mock_tests.rs (15): OPENHUMAN_SERVICE_MOCK truthy/falsy parsing; state machine (NotInstalled → Stopped → Running → Stopped → NotInstalled); forced-failure injection via JSON state file; dispatch routing to mock when env flag set. - notifications/bus_tests.rs (14): event_to_notification mapping for WebhookProcessed (silent <400, emits ≥400, error wins over status, error text replaces status code, deep_link target); CronJobCompleted id format and deep_link; SubagentCompleted/Failed deep_links + error truncation; subscriber publish/subscribe roundtrip. - notifications/store_tests.rs (13): exists_recent dedup window — true for identical (provider, account, title, body) within 60s, false past 60s boundary, false when any field differs, NULL account_id handling; insert_if_not_recent preserves first entry, allows re-insert after expiry, allows distinct provider/body. - tokenjuice/text_tests.rs (26): strip_ansi multi-byte safety (CJK/emoji/Arabic/ZWJ); dedupe_adjacent edge cases; clamp_text_middle grapheme safety + omission marker; 3-layer overlay precedence (project > user > builtin); rule loader gracefully handles invalid regex (single bad pattern dropped, all-bad → empty, disk-loaded bad regex doesn't panic). - agent/harness/session/transcript_tests.rs (+4): find_root_transcript for thread isolation, empty thread_id, newest-when-multiple, excludes subagent files. E2E (mega-flow.spec.ts, +200 lines): - Scenario 12: openhuman.update_version RPC contract — semver, target_triple, asset prefix; no outbound mock request to update host. - Scenario 13: openhuman.notification_ingest dedup — same payload twice yields ≤1 entry on notification_list. - Scenario 14: threads create_new → message_append → messages_list roundtrip on a single thread within one session. Invalid / out-of-scope items (reported, not faked): - service.shutdown flush-window delay logic — the delay lives in ShutdownSubscriber, not the RPC handler; the existing inline tests already cover the event-publish contract. - Frontend ServiceBlockingGate state transitions — TS/React component, no Rust counterpart. - Turn-state mirror persistence and startup interruption marking — no Rust persistence struct; handled at the UI hydration layer. - service.* RPC mock-backend lifecycle scenario — env var read at process boot, can't flip mid-session; no service namespace in any controller schema either way. - Pick 7 (conversation threads RPCs) — already fully covered by an existing conversations/store_tests.rs (37+ tests) from a prior batch; not duplicated. Mega-flow scenario 14 still added as a thin end-to-end smoke through the threads namespace.
📝 WalkthroughWalkthroughAdds unit tests for notifications::bus (webhook translation, deep-link/ID invariants, publish/subscribe) and for tokenjuice text utilities and rule loader (ANSI stripping, dedupe, grapheme-safe truncation, overlay precedence, invalid-regex handling). All changes are test-only. ChangesNotifications bus tests
Tokenjuice text & rule loader tests
🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (1)
app/test/e2e/specs/mega-flow.spec.ts (1)
606-658: ⚡ Quick winAdd explicit failure diagnostics for the new E2E scenarios.
These scenarios log happy-path details, but they don’t capture structured diagnostics on assertion failure (request-log snapshot + accessibility dump), which slows triage for flaky/CI-only failures.
A small helper wrapper per scenario (try/catch → print
getRequestLog()+dumpAccessibilityTree()→ rethrow) would satisfy the diagnostics rule consistently.As per coding guidelines, "Add failure diagnostics (request logs,
dumpAccessibilityTree()) in E2E specs for faster debugging by agents".Also applies to: 667-721, 731-790
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/e2e/specs/mega-flow.spec.ts` around lines 606 - 658, The test lacks structured failure diagnostics; wrap the body of the "it('update.version: returns version...')" test (and the other affected specs) in a try/catch that on catch logs getRequestLog() and await dumpAccessibilityTree(), then rethrows so failures still fail the test; implement a small helper (e.g. withFailureDiagnostics(async () => { ... })) that accepts the test async function, runs it, and on error prints JSON.stringify(getRequestLog(), null, 2) and the result of await dumpAccessibilityTree() before rethrowing; update the calls around callOpenhumanRpc(), getRequestLog(), and dumpAccessibilityTree() in this spec and the other noted specs (lines ~667-721 and ~731-790) to use the helper.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@app/test/e2e/specs/mega-flow.spec.ts`:
- Around line 650-654: The current no-egress check only inspects getRequestLog()
from the mock server and misses direct outbound HTTP calls (so regressions in
openhuman.update_version could still egress); update the test to force all
outbound HTTP through a controllable interceptor and assert zero hits there —
e.g., configure the test runtime to use an HTTP proxy or set the update host
override to point to the mock server (or inject a mocked HTTP client used by
openhuman.update_version), then replace the getRequestLog() assertion with an
assertion against the proxy/interceptor's request log (or verify the injected
client received no requests) to ensure no real network egress occurs.
In `@src/openhuman/notifications/bus_tests.rs`:
- Around line 198-237: The test is order-dependent because it assumes the first
message from subscribe_core_notifications() is the one just published; change
the assertion to wait for and assert on the matching CoreNotificationEvent by
looping over rx.try_recv() (or rx.recv_timeout) until you find an event with id
== "test-123" (or until a short timeout/attempt limit) and ignore other
unrelated events; update the code around publish_core_notification(evt.clone()),
replacing the single rx.try_recv() assert with a loop that discards non-matching
events and only asserts the id/title/category when a matching
CoreNotificationEvent is received (fail if timeout/attempts exhausted).
In `@src/openhuman/tokenjuice/text_tests.rs`:
- Around line 126-145: The tests clamp_text_middle_output_is_valid_utf8 and
clamp_text_middle_does_not_split_emoji currently only assert UTF-8 validity
(tautological for String); change them to assert grapheme-boundary correctness:
call clamp_text_middle(&s, n) and then verify that the retained prefix and
suffix each end/start on full grapheme cluster boundaries (e.g., using
unicode_segmentation::UnicodeSegmentation::graphemes or GraphemeCursor) and that
concatenating those preserved graphemes equals the same sequence of grapheme
clusters from the original string at those ranges; additionally add a specific
check for ZWJ-family clusters so neither prefix nor suffix contains a partial
ZWJ sequence. Ensure you update both test functions
(clamp_text_middle_output_is_valid_utf8 and
clamp_text_middle_does_not_split_emoji) and the similar checks mentioned for
lines 183-191.
- Around line 401-406: The test currently uses an if-let that silently passes if
the "test/disk-bad-regex" rule is missing; change it to assert that a rule with
rule.id == "test/disk-bad-regex" is present and then assert its
compiled.skip_patterns is empty. Specifically, replace the if-let lookup over
rules.iter().find(|r| r.rule.id == "test/disk-bad-regex") with a firm assertion
that the find returned Some(...) (or use unwrap()/expect with a clear message),
then check bad.compiled.skip_patterns.is_empty() with the existing failure
message so the test fails when the bad-rule is dropped as well as when its
skip_patterns are non-empty.
---
Nitpick comments:
In `@app/test/e2e/specs/mega-flow.spec.ts`:
- Around line 606-658: The test lacks structured failure diagnostics; wrap the
body of the "it('update.version: returns version...')" test (and the other
affected specs) in a try/catch that on catch logs getRequestLog() and await
dumpAccessibilityTree(), then rethrows so failures still fail the test;
implement a small helper (e.g. withFailureDiagnostics(async () => { ... })) that
accepts the test async function, runs it, and on error prints
JSON.stringify(getRequestLog(), null, 2) and the result of await
dumpAccessibilityTree() before rethrowing; update the calls around
callOpenhumanRpc(), getRequestLog(), and dumpAccessibilityTree() in this spec
and the other noted specs (lines ~667-721 and ~731-790) to use the helper.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 77d20c0c-c2bc-4aa6-a762-e3a3fc226e4a
📒 Files selected for processing (12)
app/test/e2e/specs/mega-flow.spec.tssrc/openhuman/agent/harness/session/transcript_tests.rssrc/openhuman/notifications/bus.rssrc/openhuman/notifications/bus_tests.rssrc/openhuman/notifications/store.rssrc/openhuman/notifications/store_tests.rssrc/openhuman/service/mock.rssrc/openhuman/service/mock_tests.rssrc/openhuman/tokenjuice/mod.rssrc/openhuman/tokenjuice/text_tests.rssrc/openhuman/update/ops.rssrc/openhuman/update/ops_tests.rs
- notifications/bus_tests: drain up to 64 events when looking for the test event so the assertion is order-independent on the shared global broadcast bus (parallel tests can interleave). - tokenjuice/text_tests: replace tautological from_utf8 checks in the three clamp_text_middle grapheme tests with real boundary assertions — every output grapheme must equal a complete grapheme of the source (or the omission marker), and ZWJ family clusters must never split. - tokenjuice/text_tests: tighten invalid_regex_loaded_from_disk_is_ skipped_not_fatal — assert the bad-regex rule is present (not silently dropped) and that its compiled skip_patterns is empty.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/openhuman/tokenjuice/text_tests.rs (1)
137-142: ⚡ Quick winTighten the ASCII whitelist to avoid false positives in grapheme-safety assertion.
This
continuepath allows any ASCII grapheme, so unexpected ASCII corruption inclampedcan still pass. Prefer assertingclampedassource_prefix + omission_marker + source_suffixat grapheme boundaries instead of broadly whitelisting ASCII.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/tokenjuice/text_tests.rs` around lines 137 - 142, The current grapheme-safety check lets any ASCII grapheme (g) pass, which can hide corruption; instead assert that clamped equals the concatenation of source_prefix + omission_marker + source_suffix at grapheme cluster boundaries: compute grapheme clusters for clamped and the original source, verify that clamped is exactly the prefix clusters + omission_marker + suffix clusters (using the same grapheme segmentation) rather than allowing any ASCII via g.chars().all(...); update the test around the variables g, clamped, source_prefix, omission_marker, and source_suffix to perform this exact-match assertion on grapheme clusters.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/openhuman/tokenjuice/text_tests.rs`:
- Around line 137-142: The current grapheme-safety check lets any ASCII grapheme
(g) pass, which can hide corruption; instead assert that clamped equals the
concatenation of source_prefix + omission_marker + source_suffix at grapheme
cluster boundaries: compute grapheme clusters for clamped and the original
source, verify that clamped is exactly the prefix clusters + omission_marker +
suffix clusters (using the same grapheme segmentation) rather than allowing any
ASCII via g.chars().all(...); update the test around the variables g, clamped,
source_prefix, omission_marker, and source_suffix to perform this exact-match
assertion on grapheme clusters.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0bfd1958-7213-41be-a9ac-3f36890c27c6
📒 Files selected for processing (2)
src/openhuman/notifications/bus_tests.rssrc/openhuman/tokenjuice/text_tests.rs
Summary
Continues the test-map.md coverage push from #1724 (batch 1) and #1752 (batch 2). Adds 124 Rust unit tests plus 3 mega-flow E2E scenarios across 7 high-priority features picked from test-map.md.
What's covered
Rust unit tests (all passing locally):
src/openhuman/update/ops_tests.rssrc/openhuman/service/mock_tests.rssrc/openhuman/notifications/bus_tests.rssrc/openhuman/notifications/store_tests.rssrc/openhuman/tokenjuice/text_tests.rssrc/openhuman/agent/harness/session/transcript_tests.rsE2E (
app/test/e2e/specs/mega-flow.spec.ts, +200 lines):openhuman.update_versionRPC contract — semver,target_triple,openhuman-core-<triple>asset prefix; no outbound mock request to update host.openhuman.notification_ingestdedup — same payload twice yields ≤1 entry onnotification_list.openhuman.threads_create_new→threads_message_append→threads_messages_listroundtrip.Items reported as invalid / out-of-scope
service.shutdownflush-window delay — lives inShutdownSubscriber, not the RPC handler; existing inline tests cover the publish contract.ServiceBlockingGatestate transitions — TS/React component, no Rust counterpart.service.*mock-backend lifecycle as an E2E scenario — env var read at process boot, can't flip mid-session; noservicenamespace in any controller schema either way.conversations/store_tests.rs(37+ tests) from a prior batch; mega-flow scenario 14 added as a thin smoke instead.Submission Checklist
Impact
Test-only PR. No runtime behavior change. Adds 124 Rust unit tests (all green via
cargo test --lib) and 3 new mega-flow E2E scenarios in the existing single-session spec.Related
AI Authored PR Metadata
Linear Issue
Commit & Branch
test/expand-coverage-batch-3Validation Run
cargo check --tests— clean (warnings pre-existing)cargo test --lib bus_tests store_tests mock_tests ops_tests transcript_tests text_tests— 124 new tests, 0 failedBehavior Changes
Parity Contract
Summary by CodeRabbit