fix(security): round command-log truncation to UTF-8 boundary#1817
Conversation
…mansai#1813) Five log::warn!/info!/debug! sites in SecurityPolicy::validate_command_execution truncated `command` with a naked byte slice `&command[..command.len().min(80)]`, which panicked the core thread on multi-byte UTF-8 characters straddling byte 80 — e.g. the 3-byte `'魔'` in real Sentry repros (OPENHUMAN-TAURI-GW). Route each site through `crate::openhuman::util::floor_char_boundary`, the same helper `markdown_body_preview()` uses for the same class of bug (tinyhumansai#1751). Adds two regression tests in policy_tests.rs covering the multi-byte cmd panic and a short-CJK command edge case. Closes tinyhumansai#1813
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughReplaces byte-length-based command truncation with UTF‑8 character-boundary-safe truncation in validate_command_execution logging and adds regression tests for multibyte UTF‑8 characters at and below the 80-byte truncation boundary. ChangesUTF-8 Safe Command Logging
Possibly related issues
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/openhuman/security/policy_tests.rs`:
- Around line 282-291: The test's "high-risk-blocked path" never reaches the
high-risk truncating log because the inner command (e.g., "findstr") isn't in
the policy allowlist, so validation fails earlier; update the SecurityPolicy
used in this block (the SecurityPolicy instance built with
AutonomyLevel::Supervised and variable high_risk_policy) so its allowed_commands
includes the inner command used in cmd (or replace cmd with an allowlisted
command that still contains the chain operator '|'), then call
high_risk_policy.validate_command_execution(cmd, true) as before to ensure the
chain operator triggers the high-risk branch and hits the truncating warn! site.
Ensure references to SecurityPolicy, allowed_commands,
AutonomyLevel::Supervised, and validate_command_execution are the ones you
modify.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9eeccfcb-d136-4bf6-a18c-3eedd0069d94
📒 Files selected for processing (2)
src/openhuman/security/policy.rssrc/openhuman/security/policy_tests.rs
… test (addresses @coderabbitai on policy_tests.rs:282) The previous high-risk-blocked path used `cmd` + pipe (`|`) which failed allowlist validation on `findstr` before reaching the high-risk risk gate, so the truncating warn! at block_high_risk_commands was never executed. Replace with a `curl`-based fixture that: - passes allowlist (curl is in allowed_commands) - triggers CommandRiskLevel::High (curl is a high-risk command) - hits the block_high_risk_commands warn! branch with a multi-byte char straddling byte 80, confirming floor_char_boundary prevents the panic there Also add assertions that the result is Err and the error contains "high-risk", turning the branch into a proper regression guard.
CI failure diagnosis — both failures are pre-existing on
|
Summary
&command[..command.len().min(80)]log-truncation slices inSecurityPolicy::validate_command_executionthat aborted the core thread when the command contained a multi-byte UTF-8 char straddling byte 80.crate::openhuman::util::floor_char_boundaryhelper — the same helper that fixed the equivalent body-preview bug in fix(memory_tree, e2e tests ): deterministic query_topic ordering + robust CEF cleanup #1751.policy_tests.rscovering the real Sentry repro (CJK in acmd /ccommand) and a short multi-byte edge case.Problem
OPENHUMAN-TAURI-GW fires a
level=fatal,mechanism=panic,handled=noevent on Windows whenever a Chinese / Japanese / Korean user's command happens to put a multi-byte UTF-8 character across byte 80 of the command string. Two real-world repros:cmd /c "dir /b "%USERPROFILE%\Desktop\*.lnk" 2>nul | findstr /i "Warcraft WoW é”å…½ Battle"—'é”'lives at bytes 78..81, slice at 80 panics.python -c "import os; os.makedirs(r'D:\项目1', exist_ok=True); ..."—'ç›®'lives at bytes 78..81, same panic.Root cause is five identical naked byte slices used purely for log truncation in
src/openhuman/security/policy.rs(lines 501, 512, 519, 535, 546).Solution
Replace each
&command[..command.len().min(80)]with&command[..floor_char_boundary(command, 80)]. The helper already lives insrc/openhuman/util.rsand is documented as rounding a byte index down to the nearest UTF-8 boundary — exactly the semantics needed for a log preview that should never expand past its cap.This is the same helper
markdown_body_preview()started using in #1751 after the equivalent panic family hit memory-tree ingest (OPENHUMAN-TAURI-G2/GR/GQ/GK/KQ).Submission Checklist
policy.rsis exercised by the two new tests (which fire all three truncating log paths) plus the five pre-existingvalidate_command_executiontests.## Related— N/A: no coverage-matrix feature IDs were changed.docs/RELEASE-MANUAL-SMOKE.md) — N/A: pure log-string fix, no release-cut surface touched.Closes #NNNin the## RelatedsectionImpact
floor_char_boundarywalks backwards at most 3 bytes (max UTF-8 char width − 1); cost is dwarfed by the surroundinglog::warn!formatter.Related
markdown_body_preview_respects_utf8_boundary_and_byte_captest insrc/openhuman/memory/tree/ingest.rs:501(fix(memory_tree, e2e tests ): deterministic query_topic ordering + robust CEF cleanup #1751).AI Authored PR Metadata (required for Codex/Linear PRs)
Linear Issue
Commit & Branch
fix/1813-policy-char-boundary-panic8cb9c09dValidation Run
pnpm --filter openhuman-app format:check— N/A: noapp/files changed.pnpm typecheck— N/A: no TS files changed.cargo test --lib -- openhuman::security::policy→ 6 passed; 0 failed.cargo fmt --check -- policy.rs policy_tests.rsclean;cargo checkclean (only pre-existing unrelated warnings inslack_backfill.rs).Validation Blocked
command:N/Aerror:N/Aimpact:N/ABehavior Changes
SecurityPolicy::validate_command_executionno longer panics when a command contains a multi-byte UTF-8 character at byte 80 of its log preview.Parity Contract
floor_char_boundaryreturnss.len()when the index is past the end).Duplicate / Superseded PR Handling
Summary by CodeRabbit
Summary by CodeRabbit
Bug Fixes
Tests