Skip to content

feat: apply max budget enforcement to all safe-output tools#91

Merged
jamesadevine merged 3 commits intomainfrom
feat/generic-max-budget
Mar 30, 2026
Merged

feat: apply max budget enforcement to all safe-output tools#91
jamesadevine merged 3 commits intomainfrom
feat/generic-max-budget

Conversation

@jamesadevine
Copy link
Copy Markdown
Collaborator

@jamesadevine jamesadevine commented Mar 30, 2026

Summary

Previously only update-work-item and comment-on-work-item had the max frontmatter option and budget enforcement in Stage 2 execution. The other four tools (create-work-item, create-pull-request, create-wiki-page, update-wiki-page) had no limits.

This PR raises max to a first-class concept on the ToolResult trait, making budget enforcement automatic and consistent across all safe-output tools.

Design

Each tool declares its default budget via the ToolResult trait. The tool_result! macro accepts an optional default_max parameter. Operators can still override max in front-matter YAML. In execute.rs, a register_budgets! macro builds the budget HashMap from concrete types.

Changes

  • src/tools/result.rs: Added DEFAULT_MAX to ToolResult trait; extended tool_result! macro with optional default_max arm
  • src/tools/*.rs: Removed max field and default_max() from all 6 config structs; simplified to derive(Default) where possible
  • src/execute.rs: Replaced per-tool budget tracking with type-driven register_budgets! macro + resolve_max() helper
  • src/execute.rs: Fixed UTF-8 panic in title truncation; sanitize control chars in log context
  • AGENTS.md: Documented max for all safe-output tools

Testing

  • All existing tests pass
  • New tests: DEFAULT_MAX trait constant, resolve_max with/without config, UTF-8 title truncation, control char stripping, create-work-item budget enforcement, mixed-tool independent budgets

Move the max safe-output budget from per-config-struct fields to the
ToolResult trait as DEFAULT_MAX, making it a first-class concept that
each tool declares at the type level.

- Add DEFAULT_MAX associated constant (default: 1) to ToolResult trait
- Extend tool_result! macro with optional default_max parameter
- Remove max field and default_max() from all 6 config structs
- Replace MaxConfig deserialization with resolve_max() that reads
  the operator's frontmatter override, falling back to T::DEFAULT_MAX
- Use register_budgets! macro to build budget map from concrete types
- Simplify config structs to derive(Default) where possible

Adding a new budgeted tool now only requires:
1. Set default_max in the tool_result! macro (or override DEFAULT_MAX)
2. Add the type to register_budgets!() in execute.rs

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jamesadevine jamesadevine force-pushed the feat/generic-max-budget branch from b318654 to 2bf3cf8 Compare March 30, 2026 16:42
@github-actions
Copy link
Copy Markdown
Contributor

🔍 Rust PR Review

Summary: Solid refactor — the generic budget enforcement approach is clean and well-tested, but there's one panic-risk bug and a couple of maintainability concerns worth addressing.

Findings

🐛 Bugs / Logic Issues

  • src/execute.rs:304extract_entry_context uses a byte-index slice &title[..40] to truncate titles. title.len() returns bytes, not characters, so a multi-byte UTF-8 character that straddles byte index 40 will cause a runtime panic. Titles with Chinese characters, emoji, or accented text can trigger this.

    // Current (panics on non-ASCII):
    let truncated = if title.len() > 40 { &title[..40] } else { title };
    
    // Safe fix:
    let truncated = if title.chars().count() > 40 {
        &title[..title.char_indices().nth(40).map(|(i, _)| i).unwrap_or(title.len())]
    } else {
        title
    };

⚠️ Suggestions

  • src/tools/comment_on_work_item.rs:467 — The test that previously asserted config.max == 1 (the default) was changed to assert!(config.target.is_some()). That's weaker than intended: the YAML being parsed is "max: 3\ntarget: \"*\"". Now there's no test verifying that an operator-supplied max in the front-matter config JSON is correctly honoured by resolve_max. Consider adding a test_resolve_max_for_comment_on_work_item test (the generic test_resolve_max_uses_config_override in execute.rs covers the mechanism, so this is low priority).

  • src/tools/result.rs / macro — The new default_max = $default_max arm of tool_result! is defined but none of the existing callers use it (all six tools fall back to the trait default of 1). This is fine as a future extension point, but it means the macro arm is currently dead code. A doc comment on the macro explaining when to use each arm would be helpful.

  • src/execute.rsregister_budgets! — This is a clear improvement, but adding a new ToolResult implementor still requires two manual steps: implementing the trait and adding to register_budgets!. If the two get out of sync (a new tool is added without registering its budget), there's no compile-time warning. A comment above register_budgets! explicitly calling this out would reduce the chance of a future omission.

✅ What Looks Good

  • The resolve_max design (reading max from the raw config JSON rather than requiring it as a typed field on every config struct) is a nice way to avoid scattering #[serde(default = "default_max")] boilerplate across six structs.
  • Budget is consumed before execution (counting failed attempts against the limit) — the existing comment explains the rationale well.
  • The new tests (test_budget_enforcement_create_work_item_max, test_budget_enforcement_mixed_tools_independent_budgets) give solid coverage of the new behaviour.
  • Removing hand-rolled Default impls in favour of #[derive(Default)] is correct and cleaner.

Generated by Rust PR Reviewer for issue #91 ·

- Fix UTF-8 panic in extract_entry_context: use char_indices() for
  safe truncation instead of byte-offset slicing
- Add safety comment above register_budgets! noting the manual sync
  requirement when adding new ToolResult implementors
- Expand tool_result! macro doc comment to explain both arms
  (with and without default_max)
- Add test for multi-byte UTF-8 title truncation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown
Contributor

🔍 Rust PR Review

Summary: Looks good — the budget generalization is correct and well-tested. One ongoing maintenance concern worth flagging.

Findings

⚠️ Suggestions

  • src/execute.rs:register_budgets! — no compile-time guard for missing tools
    The comment correctly notes the problem: "There is no compile-time guard for this." If someone adds a new ToolResult implementor and forgets to add it to register_budgets!, the budget is silently unenforced. One way to harden this would be an exhaustive match at compile time (e.g. a const assertion or a match arm in execute_safe_output that can't compile without updating both places), but given the comment is explicit and visible, this is acceptable as-is — just worth knowing.

  • src/execute.rs:extract_entry_context — unsanitized path appears in log output
    extract_entry_context reads the raw path field from the NDJSON entry before execute_sanitized is called (sanitization only happens inside execute_impl). If an agent injects a path containing newlines or other control characters, the warn! call in check_budget could produce misleading log entries. A cheap fix is path.replace(['\n', '\r'], " ") before formatting. Very low severity, but worth noting given the security-conscious design.

  • PR description mentions a MaxConfig helper struct that isn't in the final code
    The description says "Added MaxConfig helper struct to extract max from any tool config JSON generically" and "Config structs - added max: u32 (default: 1)" — but neither is in the final implementation. The max is read generically by resolve_max directly from the JSON blob (consistent with the pre-existing design in UpdateWorkItemConfig). The final approach is actually cleaner; the description just describes an intermediate state.

✅ What Looks Good

  • UTF-8 truncation fix is correctchar_indices().nth(40) is the right way to truncate at character boundaries; the 41-emoji regression test proves it.
  • Budget consumed before execution is intentional and documented (prevents retry abuse against failing endpoints).
  • Independent per-tool budgetstest_budget_enforcement_mixed_tools_independent_budgets correctly verifies that create-work-item and create-wiki-page budgets don't bleed into each other.
  • AGENTS.md updated correctlymax is documented for all four previously unlimited tools.
  • noop / missing-tool / missing-data correctly excluded from the budget map.
  • resolve_max safely handles max: 0 (disables the tool) and invalid values (falls back to default via as_u64() returning None).

Generated by Rust PR Reviewer for issue #91 ·

extract_entry_context runs before execute_sanitized, so raw NDJSON
strings could contain newlines or other control characters that
produce misleading log entries. Strip control chars from title and
path fields before formatting into log messages.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown
Contributor

🔍 Rust PR Review

Summary: Looks good — clean, correct refactor with a couple of minor observations worth noting.

Findings

⚠️ Suggestions

  • src/execute.rs:319path context is not truncated, unlike title

    In extract_entry_context, the title branch caps output at 40 chars (UTF-8 safe), but the path branch does not:

    // title: truncated to 40 chars ✓
    let truncated: &str = if clean.chars().count() > 40 { ... };
    
    // path: no truncation — full string is logged as-is
    let clean: String = path.chars().filter(|c| !c.is_control()).collect();
    return format!(" (path: {})", clean);

    An agent could emit a path entry with hundreds of characters (e.g. a long wiki path) and it would land verbatim in the log line. Not a security issue given log-only scope, but the inconsistency is surprising and could make logs messy. Consider adding a similar cap (e.g. 80 chars for paths).

  • src/execute.rs:96-97 — No compile-time guard for register_budgets!

    The comment itself calls this out: "There is no compile-time guard for this." The current list covers all live ToolResult implementors, but noop, missing-data, and missing-tool are intentionally absent — future contributors adding a new tool need to know to add it here. Consider a doc comment on the ToolResult trait itself pointing to register_budgets!, so the trait definition is the canonical place to look.

  • src/tools/create_pr.rs:189-190 — Manual ToolResult impl doesn't use the macro

    CreatePrResult implements ToolResult by hand rather than via tool_result!. It only sets NAME and inherits DEFAULT_MAX = 1 from the trait default — which is correct — but it's an inconsistency that could quietly diverge if the trait gains new associated items in future.

✅ What Looks Good

  • UTF-8 safe truncation in extract_entry_context is correctly implemented using char_indices().nth(40), and the emoji boundary test is a nice regression guard.
  • resolve_max correctly reads from raw JSON in ctx.tool_configs, meaning max: in front-matter YAML continues to work for all tools even though it was removed from the typed structs (UpdateWorkItemConfig, CommentOnWorkItemConfig). Backwards compatibility is preserved.
  • #[derive(Default)] replacements for the config structs are correct — all bool fields default to false and all Option(T) fields default to None, matching the previous hand-written impls exactly.
  • Budget consumed before execution (the pre-decrement pattern) is preserved and prevents unbounded retries against failing endpoints.
  • All tests pass.

Generated by Rust PR Reviewer for issue #91 ·

Warning

⚠️ Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • dev.azure.com

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "dev.azure.com"

See Network Configuration for more information.

@jamesadevine jamesadevine merged commit e88d8da into main Mar 30, 2026
11 checks passed
@jamesadevine jamesadevine deleted the feat/generic-max-budget branch March 30, 2026 20:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant