docs(governance): audit-trail internal-vs-shareable views + Addie anonymous knowledge tools#3175
Merged
docs(governance): audit-trail internal-vs-shareable views + Addie anonymous knowledge tools#3175
Conversation
…nymous knowledge tools Three related changes from a governance WG question on how to design audit logging: 1. New doc page docs/governance/campaign/audit-trail.mdx — explains the internal-vs-shareable view split for get_plan_audit_logs with a field-by-field tagging table and four worked examples generated by scripts/gen-governance-audit-examples.ts: clean buy, security-shaped denial (seller_compliance), coaching-shaped denial (Annex III prerequisite), and the enforce/advisory/audit mode comparison. All schema-tagged JSON blocks validate against the canonical schemas via tests/json-schema-validation. 2. Wire anonymous web-chat callers to receive search_docs, get_doc, search_repos, search_resources, and get_recent_news. Anonymous web chat previously had directory tools only; the system prompt told Addie to call search_docs and the tool wasn't registered, producing speculative answers instead of grounded ones. The MCP chat path already exposed the same set; this aligns the web path. Anonymous tool count: 7 → 13. 3. Constraints rule "Tool Unavailable Is Not 'No Result'" — distinguishes tool-returned-empty (say "I didn't find it") from tool-unavailable (say "I couldn't reach docs search; sign in for grounded answers"), and caps retries at one. Generalizes beyond search_docs to every tool. 4. skills/adcp-governance/SKILL.md gains the campaign-governance task surface (sync_plans, check_governance, report_plan_outcome, get_plan_audit_logs) and three operator-facing invariants: inline policies cannot relax registry policies, effective_date enables informational-before-enforcement, governance_context is the seller-visible correlation token while plan-level data is buyer-side. Filed and resolved upstream from this work: #3139, #3140, #3156 (→ #3160 merged), #3162 (→ #3163 merged), #3169 merged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rom anonymous knowledge tools Security fixes from expert review of PR #3175: 1. **Private working-group docs leaking via search_docs** server/src/db/working-group-db.ts: getIndexedDocumentsWithContent now filters wg.is_private = false. Without this, exposing search_docs to anonymous web-chat (the headline change in this PR) widens the surface from "anyone with an MCP client" to "anyone on the public web" for committee minutes, brand-confidential briefs, and draft policy entries indexed from private working groups. Authenticated WG members still access their content via the WG-specific pages — that path doesn't ride this index. 2. **Prompt-injection pipeline via user-bookmarked URLs** bookmark_resource (Slack-authenticated) queues arbitrary URLs that get fetched, summarized into addie_notes, and surfaced via search_resources/ get_recent_news. Anonymous Addie inherits that surface and the notes land in its prompt as "Addie's Take." Two-layer fix: - DB layer: searchCuratedResources / getRecentNews accept excludeUserSubmitted to drop source_type='web_search' (and 'community' for news) from results. - Handler layer: createKnowledgeToolHandlers gains an `anonymous` option; when true, passes excludeUserSubmitted through and strips addie_notes from formatted output. Both addie-chat.ts (web) and chat-tool.ts (MCP) pass anonymous: true on the global registration. Authenticated callers get the full handler via per-request override (claude-client.ts:594 "last wins" merge). Doc / prompt-engineering nits: 3. audit-trail.mdx: add entries[].mode and entries[].purchase_type rows to the field-tagging table (mode is the whole point of #3160 #3156); note that budget.utilization_pct is a one-step inverse problem (just as leaky as raw amounts); soften "on request" wording for drift_metrics to match the §103 "protocol does not define a regulator API" caveat. 4. constraints.md: tighten "Tool Unavailable Is Not 'No Result'" rule — drop fragile back-reference, replace upsell script with a behavioral shape (don't pitch; one line is enough), make the no-retry rule unambiguous ("Do not retry. One failure is the signal."). 5. SKILL.md: rename "Three invariants to lead with" to "Three invariants for audit and disclosure decisions" (the skill loads into orchestrator context, not a chat persona); demote effective_date to a doc-link note and promote plan_hash as the third load-bearing invariant for counterparty disclosure. 6. addie-chat.ts: anonymousTools log line now uses claudeClient.getRegisteredTools().length as source of truth instead of a hand-rolled sum that would silently drift. 7. gen-governance-audit-examples.ts: FIXME annotation referencing FRAMEWORK_MIGRATION.md so the in-flight server.server._requestHandlers migration sweep catches this script. Verified end-to-end: anonymous local Addie answering the original WG question still produces grounded retrieval (6 tool calls, 0 errors, cites the audit-trail doc) — the security scoping doesn't break the legitimate use case. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
Author
|
Pushed Must-Fix (security)
Must-Fix (doc gap)
Should-Fix
Nits skipped (per reviewer note)
Changeset bump — keeping Verified end-to-end: anonymous local Addie answering the original WG question produces grounded retrieval (6 tool calls, 0 errors, cites the audit-trail doc) — the security scoping doesn't break the legitimate use case. CI re-running. |
This was referenced Apr 27, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Three related changes from a governance WG question on Slack about how to design audit logging.
What's in this PR
1. New doc:
docs/governance/campaign/audit-trail.mdxExplains the internal-vs-shareable view split for
get_plan_audit_logswith:scripts/gen-governance-audit-examples.ts:seller_compliancefinding when an unauthorized seller tries to commit)data_subject_contestationmissing — denial doubles as actionable guidance)enforce/advisory/audit)All 6 schema-tagged JSON blocks validate against the canonical schemas via
tests/json-schema-validation.test.cjs.2. Anonymous web-chat gets
search_docs/get_doc/search_repos/search_resources/get_recent_newsPreviously: Addie's system prompt told her to call
search_docsto ground answers, but the web-chat anonymous path didn't register the tool. Result: she'd call a non-existent tool, get "Unknown tool", and fall through to in-prompt speculation. The MCP chat path already exposed these read-only tools to anonymous callers; this aligns the web path.Anonymous tool count: 7 → 13. The full knowledge-search surface (Slack history, bookmarking) still requires authentication.
Verified the fix end-to-end: production Addie answering this WG question produced speculation about Policy Registry versioning ("are policies versioned by timestamp, version number, or content hash?"); local Addie with the fix produced a grounded answer naming
policy_id,version,effective_date,enforcement— fields she retrieved viasearch_docs.3. Constraints rule: "Tool Unavailable Is Not 'No Result'"
Adds a section to
server/src/addie/rules/constraints.mddistinguishing three outcomes:Generalizes beyond
search_docsto every tool. Caps the failure mode where Addie tries the same broken tool three times in a row.4. Skill update:
skills/adcp-governance/SKILL.mdAdds the campaign-governance task surface (
sync_plans,check_governance,report_plan_outcome,get_plan_audit_logs) and three operator-facing invariants:enforcementlevelseffective_dateenables informational-before-enforcement (the "minimal restrictions initially" pattern)governance_contextis the seller-visible correlation token; plan-level data (budget aggregates, drift metrics, channel allocation) is buyer-sideIssues filed and resolved during this work
governance-mode.jsonx-status: experimental+ per-check description tighteningTest plan
npm run test:docs-navpassesnpm run test:json-schemapasses (255 schema-tagged blocks validate)npm run typecheckpassesserver/tests/unit/training-agent.test.ts— all governance audit-log tests passnode tests/json-schema-validation.test.cjs --file docs/governance/campaign/audit-trail.mdx— 6/6 schema-tagged blocks valid.context/addie-before.jsonand.context/addie-after-real-fix.json(anonymous chat went from speculation to grounded retrieval)🤖 Generated with Claude Code