Add generated tool provenance and admission checks#2549
Conversation
📝 WalkthroughWalkthroughAdds optional provenance/capability/risk/policy fields to generated tool definitions, a GeneratedToolRisk enum, an admission pipeline that normalizes and validates definitions against configurable policies, Tool::external_effect() mapping from risk, and unit tests covering admission outcomes and risk behavior. ChangesGenerated Tool Provenance and Admission
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/openhuman/tools/generated.rs (1)
206-212: ⚡ Quick winAdd structured debug logs for admission decisions.
This branch decides whether a generated tool enters the registry, but accepted/rejected outcomes are silent today. A single structured
debug!on each arm would make policy failures much easier to diagnose in production.🪵 Suggested logging shape
match validate_admission(&definition, config, &mut seen) { - Ok(()) => admitted.push(definition), - Err(reason) => rejected.push(GeneratedToolAdmissionRejection { tool_name, reason }), + Ok(()) => { + log::debug!( + "[generated_tools] admission accepted tool_name={} provider_id={:?} capability_id={:?}", + definition.name, + definition.provider_id, + definition.capability_id + ); + admitted.push(definition); + } + Err(reason) => { + log::debug!( + "[generated_tools] admission rejected tool_name={} provider_id={:?} capability_id={:?} reason={}", + tool_name, + definition.provider_id, + definition.capability_id, + reason + ); + rejected.push(GeneratedToolAdmissionRejection { tool_name, reason }); + } }As per coding guidelines, "Use log / tracing at debug or trace level on RPC entry and exit, error paths, state transitions, and any branch that is hard to infer from tests alone."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/tools/generated.rs` around lines 206 - 212, Add structured debug logging for each admission decision in the loop that iterates over definitions: after computing tool_name and calling validate_admission(&definition, config, &mut seen), emit a debug-level structured log in both arms — for Ok(()) log that the tool was admitted with tool_name and any relevant fields (e.g., definition.id/name/version or normalized state via normalize_definition), and for Err(reason) log that the tool was rejected with tool_name and the reason and also create the GeneratedToolAdmissionRejection as before; place these logs adjacent to the match arms that push into admitted and rejected so the state transitions (admitted/rejected vectors) are observable.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/openhuman/tools/generated.rs`:
- Around line 66-69: The is_external_effect impl for GeneratedToolRisk currently
only matches ExternalWrite and Dangerous, so the Execute variant is incorrectly
considered non-external; update the matches! in
GeneratedToolRisk::is_external_effect to include Self::Execute (and mirror the
same change in the other identical implementation around the 177-181 area) so
Execute returns true, and add a small regression test next to
generated_tool_marks_external_risk_as_external_effect asserting that Execute is
treated as external.
---
Nitpick comments:
In `@src/openhuman/tools/generated.rs`:
- Around line 206-212: Add structured debug logging for each admission decision
in the loop that iterates over definitions: after computing tool_name and
calling validate_admission(&definition, config, &mut seen), emit a debug-level
structured log in both arms — for Ok(()) log that the tool was admitted with
tool_name and any relevant fields (e.g., definition.id/name/version or
normalized state via normalize_definition), and for Err(reason) log that the
tool was rejected with tool_name and the reason and also create the
GeneratedToolAdmissionRejection as before; place these logs adjacent to the
match arms that push into admitted and rejected so the state transitions
(admitted/rejected vectors) are observable.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b6b4243d-8336-441a-ab29-f7de4542517d
📒 Files selected for processing (1)
src/openhuman/tools/generated.rs
|
Review note on provider/provenance coupling: this PR trims generated-tool |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
src/openhuman/tools/generated.rs (2)
330-341: 💤 Low valueSilent filtering of invalid provider IDs in config sets may cause confusion.
normalize_provider_setusesfilter_mapwhich silently drops entries that fail normalization. Iftrusted_providerscontains only invalid entries (e.g.,["bad/provider"]), the normalized set becomes empty and all providers will be rejected as "not trusted" — without any indication that the config itself was malformed.Consider logging a warning during admission when config entries are dropped, or validate the config at construction time. Given the planned provider registry in
#2541, this is minor, but a debug log would help diagnose misconfigurations.fn normalize_provider_set(values: &BTreeSet<String>) -> BTreeSet<String> { values .iter() - .filter_map(|value| normalize_provider_id(value)) + .filter_map(|value| { + let normalized = normalize_provider_id(value); + if normalized.is_none() { + log::debug!( + "[generated_tools] invalid provider_id in config set dropped value={value}" + ); + } + normalized + }) .collect() }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/tools/generated.rs` around lines 330 - 341, normalize_provider_set silently drops entries that fail normalization which can make config errors (e.g., trusted_providers = ["bad/provider"]) invisible and cause unexpected "not trusted" rejections; update the admission logic around normalize_provider_set (used for config.trusted_providers and config.disabled_providers when checking provider_id in generated.rs) to detect and report dropped/invalid entries: compute the set of invalid inputs (or compare input length vs normalized length) and emit a warning/debug log naming the affected config field (trusted_providers/disabled_providers) and the invalid entries, or alternatively perform this validation at config construction time so malformed provider IDs are surfaced earlier.
424-665: 💤 Low valueConsider extracting tests to a sibling
generated_test.rsfile.The file is 666 lines, exceeding the ~500 line guideline. The test module (~240 lines) can be moved to a sibling file per the coding guidelines for Rust test organization.
// At end of generated.rs, replace the inline test module with: #[cfg(test)] #[path = "generated_test.rs"] mod tests;This keeps the implementation focused and reduces cognitive load when reviewing production code.
As per coding guidelines: "When extracting Rust tests out of an implementation file, prefer a sibling
*_test.rsfile wired in with#[cfg(test)] #[path = "..._test.rs"] mod tests;" and "File size should not exceed approximately 500 lines."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/tools/generated.rs` around lines 424 - 665, The inline #[cfg(test)] mod tests { ... } in generated.rs (~240 lines) should be moved into a sibling file named generated_test.rs containing the entire test module body (keeping imports like use super::* and the EchoAdapter, sample_definition, admission_config, and all tests unchanged); then replace the removed inline module in generated.rs with a single module declaration wired to the sibling test file: #[cfg(test)] #[path = "generated_test.rs"] mod tests; and confirm all symbols referenced (GeneratedTool, GeneratedToolDefinition, admit_generated_tool_definitions, etc.) remain accessible (adjust pub/use visibility if the compiler complains).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/openhuman/tools/generated.rs`:
- Around line 330-341: normalize_provider_set silently drops entries that fail
normalization which can make config errors (e.g., trusted_providers =
["bad/provider"]) invisible and cause unexpected "not trusted" rejections;
update the admission logic around normalize_provider_set (used for
config.trusted_providers and config.disabled_providers when checking provider_id
in generated.rs) to detect and report dropped/invalid entries: compute the set
of invalid inputs (or compare input length vs normalized length) and emit a
warning/debug log naming the affected config field
(trusted_providers/disabled_providers) and the invalid entries, or alternatively
perform this validation at config construction time so malformed provider IDs
are surfaced earlier.
- Around line 424-665: The inline #[cfg(test)] mod tests { ... } in generated.rs
(~240 lines) should be moved into a sibling file named generated_test.rs
containing the entire test module body (keeping imports like use super::* and
the EchoAdapter, sample_definition, admission_config, and all tests unchanged);
then replace the removed inline module in generated.rs with a single module
declaration wired to the sibling test file: #[cfg(test)] #[path =
"generated_test.rs"] mod tests; and confirm all symbols referenced
(GeneratedTool, GeneratedToolDefinition, admit_generated_tool_definitions, etc.)
remain accessible (adjust pub/use visibility if the compiler complains).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 44ec2ef0-670f-4d01-ae27-bbeba94f4f13
📒 Files selected for processing (1)
src/openhuman/tools/generated.rs
|
Okay getting more eyes on this @graycyrus can you review this? |
…ed-tool-admission # Conflicts: # app/test/e2e/specs/mega-flow.spec.ts
sanil-23
left a comment
There was a problem hiding this comment.
Approving from shepherd queue (round 2). All CI green or known flakes re-running; pnpm review fix round complete. Happy to merge once required reviewers concur.
- Log dropped invalid provider IDs in `normalize_provider_set` so config with only invalid entries no longer silently rejects every provider as "not trusted". The helper now takes a `field` label (trusted vs disabled) so the debug line names which set the bad value came from. - Move the 240-line inline test module out to a sibling `generated_tests.rs` wired with `#[cfg(test)] #[path = ...] mod tests;`, matching the local convention used by `ops_tests.rs` / `schema_tests.rs`. Drops `generated.rs` back under the ~500-line guideline. Both items are CodeRabbit nitpicks from the May 24 review on PR tinyhumansai#2549. Co-Authored-By: Claude <noreply@anthropic.com>
|
@YOMXXX — agreed that the long-term home for trusted-provider state is the registry in #2541, not a parallel
Holding the merge on the registry decision SGTM if a maintainer wants to gate it that way; happy to rebase on top of #2541 once it lands. |
sanil-23
left a comment
There was a problem hiding this comment.
Re-approving after rfix-fix push 'bfc4ea8c' addressing CodeRabbit nitpicks (admission log-on-drop + test-mod extraction to sibling file). Diff is small, all CI/tests green, no new findings. YOMXXX's architectural concern is correctly deferred to #2541. Approval stands.
|
Hey @graycyrus — could you take a look at this one? vaddisrinivas's generated-tool admission PR is CI fully green (29 success / 0 fail) on head bfc4ea8. CodeRabbit's most recent automated review came back with 0 actionable comments after the rfix push (test mod extracted to sibling, admission logging on dropped invalid provider IDs). YOMXXX's architectural concern about #2541 provider registry is acknowledged as a follow-up. Ready for a maintainer's eye. |
graycyrus
left a comment
There was a problem hiding this comment.
Solid infrastructure addition. The provenance metadata model is clean, the normalization logic handles edge cases well (separator-start/end rejection, case normalization, invalid chars), and the legacy fallback path is correctly preserved when enforcement is disabled. The external_effect impl now correctly includes Execute and Dangerous. That said, there are test coverage gaps that don't match the acceptance criteria in #2542 before this can merge.
The two blockers are both in the test coverage — specific code paths that the issue's acceptance criteria explicitly call out as required, but which have no tests in this PR.
Minor: normalize_provider_set is called twice per tool inside validate_admission (once for disabled_providers, once for trusted_providers), which means the config sets are re-normalized on every iteration of the admission loop. Pre-normalize both sets once before the loop in admit_generated_tool_definitions and pass them in, or memoize in the config. Not a correctness issue but it's unnecessary O(N×M) work.
# Conflicts: # src/openhuman/tools/generated.rs
…sion tests + pre-normalize provider sets Addresses @graycyrus's CHANGES_REQUESTED review on tinyhumansai#2549: - admission_rejects_disabled_provider test (was missing per tinyhumansai#2542 ACs) - admission_rejects_disabled_capabilities test (companion) - Pre-normalize trusted_providers + disabled_providers once before the admission loop instead of per-tool Co-Authored-By: Claude <noreply@anthropic.com>
sanil-23
left a comment
There was a problem hiding this comment.
Re-approving on f311cad — addressed all 3 of @graycyrus's items: added admission_rejects_disabled_provider test (his verbatim code), added companion admission_rejects_disabled_capabilities test, and pre-normalized trusted_providers + disabled_providers once before the admission loop (O(N×M)→O(N+M)). 43 generated tests pass; upstream/main merged in (1 conflict in generated.rs was layout-only, resolved with our superset).
graycyrus identified that commits 6226ed4, 21a7bf3, 4663136 (~393 lines in src/openhuman/tools/generated.rs) came from vaddisrinivas's pr/2549 (generated-tool admission, issue tinyhumansai#2542) and got cross-pollinated into this wallet-balances PR via an upstream/main merge that pulled in the codex/oh-2542 branch tip. Reverting restores the PR scope to wallet balances only. Once tinyhumansai#2549 merges to upstream/main, those changes will arrive here naturally via the next main merge. Refs: graycyrus's review comment on tinyhumansai#2686 (CHANGES_REQUESTED 2026-05-28T18:27Z)
graycyrus
left a comment
There was a problem hiding this comment.
Excellent work addressing all prior feedback! Both missing tests are now in place and passing:
admission_rejects_disabled_providervalidates the disabled_providers pathadmission_rejects_disabled_capabilitiesvalidates the disabled_capabilities path
The pre-normalization optimization is solid too — avoiding O(N×M) redundant work per tool.
Status: The code looks good to me. All prior issues resolved, new tests comprehensive, and the generated-tool admission infrastructure is solid. The acceptance criteria for #2542 are fully met.
Blocker: CI is currently failing on an unrelated test in core::legacy_aliases::tests::frontend_legacy_aliases_match_server_alias_table (9958 tests passed, 1 failed). This isn't caused by your PR — the generated-tool tests all pass — but it's blocking CI. Once that's resolved, I'll come back and approve this.
Great structure on the admission pipeline. The risk classification, deduplication, and provider/capability enforcement are clean and maintainable.
sanil-23
left a comment
There was a problem hiding this comment.
Re-approving after merging upstream/main (incl. #2865 parser fix that clears the inherited test/Rust Core Tests + Quality + Rust Core Coverage failures). 21 legacy_aliases tests pass locally; all #graycyrus's earlier items still satisfied (admission_rejects_disabled_provider + admission_rejects_disabled_capabilities tests + pre-normalize providers).
|
Posted #2869 with the root-cause fix: #2830's merge inadvertently reverted #2786's SessionExpired classification back to BackendUserError, so the |
sanil-23
left a comment
There was a problem hiding this comment.
Re-approving after merging upstream/main (incl. my #2869 fix that restores the SessionExpired classification for the embedding 401 'Invalid token' wire shape). The inherited test/Rust Core Tests + Quality + Rust Core Coverage failures clear with this merge — verified locally that the 4K5 test passes on the merged head.
Summary
Problem
Solution
GeneratedToolAdmissionConfig,GeneratedToolAdmissionReport, andadmit_generated_tool_definitions.Submission Checklist
## Related: N/A, no coverage-matrix feature ID applies.docs/RELEASE-MANUAL-SMOKE.md): N/A, no release-cut UI/runtime smoke surface changed.Closes #NNNin the## Relatedsection.Impact
Toolobjects.GeneratedToolDefinition::newcallers continue to work with metadata fields defaulted toNone.Related
AI Authored PR Metadata (required for Codex/Linear PRs)
Linear Issue
Commit & Branch
codex/oh-2542-generated-tool-admission46631365d44929e2eef03935020cdcc5690abca6Validation Run
pnpm --filter openhuman-app format:check: N/A, Rust-only backend tool-infrastructure change.pnpm typecheck: N/A, Rust-only backend tool-infrastructure change.cargo test --manifest-path Cargo.toml generated_tool --lib;cargo test --manifest-path Cargo.toml admission_ --libcargo fmt --manifest-path Cargo.toml --all --check;git diff --checknode scripts/check-pr-checklist.mjs /tmp/openhuman-pr-body-2542.mdValidation Blocked
command:N/Aerror:N/Aimpact:N/ABehavior Changes
Parity Contract
Duplicate / Superseded PR Handling
vaddisrinivas:codex/oh-2542-generated-tool-admission.Summary by CodeRabbit
New Features
Tests