Fix issue #176: Make security documentation accessible #329

Merged
bokelley merged 4 commits into main from bokelley/fix-issue-176 Dec 17, 2025

@bokelley bokelley commented Dec 17, 2025

Summary

Fixes #176 (SECURITY.md links to non-existent security documentation) and addresses #177 (risk and security model concerns).

Changes for Issue #176

  • Add docs/reference/security page to Mintlify navigation (was missing, causing 404 errors)
  • Fix documentation links from https://docs.adcontextprotocol.org/reference/security to the correct path, /docs/reference/security
  • Fix GitHub link in security.mdx (was pointing to .mdx instead of .md)
  • Update authorization scopes table with missing tasks: list_authorized_properties, build_creative, preview_creative, and activate_signal

Changes for Issue #177

  • Add Risk Classification section categorizing operations as high/medium/low risk
  • Recommend OAuth 2.0 or mTLS (SHOULD) for financial operations (create_media_buy, update_media_buy)
  • Add Bearer Token Risks section documenting domain hijacking, MITM, and token replay vulnerabilities
  • Add mitigations table and request signing example for high-value transactions
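
The request signing example added to the security page is not reproduced in this conversation. As a rough illustration of the idea only, here is a minimal sketch of HMAC-SHA256 request signing with a timestamp to bound replay. The `SignedRequest` shape, the canonical string layout, the shared-secret model, and the 300-second skew window are all assumptions made for this sketch, not taken from the AdCP documentation.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical request shape: field names are illustrative assumptions.
interface SignedRequest {
  task: string;      // e.g. "create_media_buy"
  body: string;      // raw JSON payload, signed byte-for-byte
  timestamp: number; // Unix seconds, bounds the replay window
  nonce: string;     // unique per request; a server would also reject repeats
  signature: string; // hex-encoded HMAC-SHA256
}

// Assumed canonical form: fields joined by newlines, body last.
function canonical(task: string, body: string, timestamp: number, nonce: string): string {
  return [task, String(timestamp), nonce, body].join("\n");
}

function signRequest(
  secret: string,
  task: string,
  body: string,
  timestamp: number,
  nonce: string,
): SignedRequest {
  const signature = createHmac("sha256", secret)
    .update(canonical(task, body, timestamp, nonce))
    .digest("hex");
  return { task, body, timestamp, nonce, signature };
}

function verifyRequest(
  secret: string,
  req: SignedRequest,
  now: number,
  maxSkewSeconds = 300,
): boolean {
  // Reject stale or future-dated requests before doing any crypto.
  if (Math.abs(now - req.timestamp) > maxSkewSeconds) return false;
  const expected = createHmac("sha256", secret)
    .update(canonical(req.task, req.body, req.timestamp, req.nonce))
    .digest();
  const given = Buffer.from(req.signature, "hex");
  // timingSafeEqual throws on length mismatch, so check length first.
  return given.length === expected.length && timingSafeEqual(given, expected);
}
```

A captured bearer token alone is not enough to forge a request under this scheme, because the attacker also needs the signing secret. Nonce tracking (rejecting a nonce seen before) is left out of the sketch and would need server-side storage.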

Testing

All tests pass including schema validation, example validation, and JSON schema compliance checks.

🤖 Generated with Claude Code

bokelley and others added 4 commits December 17, 2025 05:56

Add security page to Mintlify navigation and fix broken documentation links.

Changes:
- Add docs/reference/security to docs.json navigation (fixes #176)
- Fix documentation URLs from /reference/security to /docs/reference/security
- Fix GitHub link to SECURITY.md (was pointing to .mdx extension)
- Update authorization scopes table with missing tasks (list_authorized_properties, build_creative, preview_creative, activate_signal)

🤖 Generated with Claude Code

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

Addresses concerns raised in #177 about token-based authentication risks.

Changes:
- Add Risk Classification section categorizing operations as high/medium/low risk
- Recommend OAuth 2.0 or mTLS (SHOULD) for financial operations
- Document bearer token risks: domain hijacking, MITM, token replay
- Add mitigations table and request signing example for high-value transactions

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@bokelley bokelley merged commit 9d675b4 into main Dec 17, 2025
6 checks passed

bokelley added a commit that referenced this pull request May 11, 2026
…4358)

Polls http://worker.process.<app>.internal:8080/internal/jobs from web every
60s and emits logger.error after 3 consecutive failures. logger.error
auto-routes to #admin-errors via the existing posthog notifier, so the
alert lands in Slack without any extra wiring.

Closes the silent-death gap behind escalation #329: the worker
crashlooped for 6 days. Every scheduled job (compliance heartbeat,
escalation triage, weekly digest, announcement handlers, etc.) silently
stopped firing and the only signal a user got was Evgeny's stale "comply
re-runner" status. workerUnreachable was already in the /api/admin/jobs
response shape but nothing was polling it.

Fire-once semantics: alert when streak crosses threshold, info-level
recovery line on the first successful tick after an alert. Without these
a flapping worker would page on every tick.

5 vitest cases cover threshold-crossing, single-alert dedup, recovery
transition, non-2xx as failure, and streak reset on transient success.
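
The fire-once behavior described in this commit message can be sketched as a small state machine. This is an illustrative reconstruction, not the committed code: the class name `WorkerHealthTracker` and the `TickOutcome` values are hypothetical, and the caller is assumed to pass `false` for any non-2xx response or network error.

```typescript
// Hypothetical outcome labels: "alert" maps to logger.error,
// "recovered" to an info-level line, "none" to no log output.
type TickOutcome = "none" | "alert" | "recovered";

class WorkerHealthTracker {
  private streak = 0;      // consecutive failed polls
  private alerted = false; // true once an alert fired for the current outage
  private readonly threshold: number;

  constructor(threshold = 3) {
    this.threshold = threshold;
  }

  // Record one poll result and report what, if anything, should be logged.
  record(ok: boolean): TickOutcome {
    if (ok) {
      const wasAlerted = this.alerted;
      this.streak = 0;
      this.alerted = false;
      // Only the first success after an alert produces a recovery line.
      return wasAlerted ? "recovered" : "none";
    }
    this.streak += 1;
    if (this.streak >= this.threshold && !this.alerted) {
      this.alerted = true; // dedup: stay quiet until the next recovery
      return "alert";
    }
    return "none";
  }
}
```

Because a success resets the streak, a worker that fails twice and then answers once never alerts, and a worker that stays down alerts exactly once per outage.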

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

bokelley added a commit that referenced this pull request May 11, 2026
…p) (#4361)

One-off script that explains why a given agent URL isn't being picked up
by the compliance-heartbeat queue. Mirrors the getAgentsDueForCheck SQL
so the output matches what the heartbeat would see:

  1. Union-source presence — discovered_agents, agent_registry_metadata,
     member_profiles.agents (the three places the heartbeat reads from)
  2. Registry metadata filters — lifecycle_stage, compliance_opt_out,
     monitoring_paused, check_interval_hours
  3. Current agent_compliance_status row
  4. Position in the due queue (row_number with batch size 10/tick)

--requeue flag clears last_checked_at to force pickup on the next tick.
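
To make the "position in the due queue" check concrete, here is a minimal in-memory sketch. It assumes the queue orders agents by `last_checked_at` ascending with NULLs first, which is consistent with the "queued behind NULL last_checked_at agents" case described in this message but is not confirmed by it. The `AgentRow` type and the `queuePosition` helper are hypothetical, and the real script works in SQL with `row_number`.

```typescript
// Hypothetical row shape mirroring the columns named in the commit message.
interface AgentRow {
  url: string;
  lastCheckedAt: Date | null; // null = never checked (or cleared by --requeue)
}

interface QueuePosition {
  position: number;         // 1-based rank in the due queue
  ticksUntilPickup: number; // assumes no new agents jump ahead meanwhile
}

function queuePosition(rows: AgentRow[], url: string, batchSize = 10): QueuePosition | null {
  // Assumed ordering: never-checked agents first, then oldest check first.
  const sorted = rows.slice().sort((a, b) => {
    if (a.lastCheckedAt === null && b.lastCheckedAt === null) return 0;
    if (a.lastCheckedAt === null) return -1;
    if (b.lastCheckedAt === null) return 1;
    return a.lastCheckedAt.getTime() - b.lastCheckedAt.getTime();
  });
  const index = sorted.findIndex((row) => row.url === url);
  if (index === -1) return null; // not in any source table or filtered out
  const position = index + 1;
  return { position, ticksUntilPickup: Math.ceil(position / batchSize) };
}
```

Under these assumptions, an agent ranked 13th with a batch size of 10 is picked up on the second tick, and clearing its `last_checked_at` moves it into the NULL group at the front.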

Background: escalation #329 — Evgeny's agent showed last_checked_at =
May 4 despite multiple heartbeat ticks running in the 6+ hour window
after the worker came back. Needed a way to distinguish (a) filtered
out, (b) queued behind NULL last_checked_at agents, (c) not in any
source table. Existing endpoints required workos admin auth that the
ADMIN_API_KEY user identity can't satisfy.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

bokelley added a commit that referenced this pull request May 11, 2026
…o keys (#4364)

* fix(compliance): rewrite deriveStoryboardStatuses for SDK 6.x scenario keys

The compliance heartbeat has been writing zero rows to
agent_storyboard_status since the SDK switched comply() to storyboard-
driven testing. The SDK emits one TestResult per phase of each storyboard,
keyed `<storyboard_id>/<phase_id>` in result.tracks[].scenarios[].scenario
(see @adcp/sdk compliance/storyboard-tracks.ts). The old implementation
walked the YAML's per-step `comply_scenario` field (bare names like
`signals_flow`, `capability_discovery`) and looked them up in the SDK's
scenario map. Every lookup missed → testedCount === 0 → every storyboard
skipped at the `continue` guard.

Effect across the registry:
  agent_storyboard_status total rows: 6  (across 4 agents)
  rows written by triggered_by='heartbeat': 0
  rows surviving were legacy bare-name keys from old manual runs

This silently broke the AAO Verified badge pipeline (no storyboard rows
→ deriveVerificationStatus has nothing to verify against) and every
agent's dashboard `storyboards_passing: 0 / N` was misleading: the
runner wasn't failing storyboards, the parser was dropping them.

Surfaced by escalation #329: Evgeny's agent was running 30/30 scenarios
clean but showing `degraded` because specialism_status.signal-owned read
'untested' from a never-populated agent_storyboard_status row.

Fix: read SDK output directly. Group scenarios by storyboard id, roll
per-step pass counts up from each phase's `steps` array, fall back to
phase-level counts when steps are absent. The `storyboardIds` override
is preserved for explicit-IDs callers that need an `untested` entry
when the runner didn't run a requested storyboard. The unused YAML
`comply_scenario` field is no longer load-bearing for status mapping
(the SDK already knows which storyboards it ran).
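
The grouping and roll-up described above can be sketched roughly as follows. This is not the committed `deriveStoryboardStatuses`: the type shapes, the `passed` fields, the status labels other than `untested`, and the name `deriveStatuses` are assumptions chosen for illustration. Only the `<storyboard_id>/<phase_id>` key format, the `steps` array, the phase-level fallback, and the `storyboardIds` override come from the description.

```typescript
// Hypothetical shapes; the real SDK types live in @adcp/sdk.
interface StepResult { passed: boolean }
interface ScenarioResult { scenario: unknown; passed?: boolean; steps?: StepResult[] }
interface TrackResult { scenarios?: ScenarioResult[] }
interface ComplyResult { tracks?: TrackResult[] }

interface StoryboardStatus {
  storyboardId: string;
  stepsPassed: number;
  stepsTotal: number;
  status: "passing" | "partial" | "failing" | "untested";
}

function deriveStatuses(result: ComplyResult, storyboardIds?: string[]): StoryboardStatus[] {
  const totals = new Map<string, { passed: number; total: number }>();
  for (const track of result.tracks ?? []) {
    for (const sc of track.scenarios ?? []) {
      const key = sc.scenario;
      if (typeof key !== "string") continue; // null, numbers, etc. are skipped
      const slash = key.indexOf("/");
      if (slash <= 0) continue; // legacy bare-name keys carry no storyboard id
      const id = key.slice(0, slash);
      const agg = totals.get(id) ?? { passed: 0, total: 0 };
      if (sc.steps && sc.steps.length > 0) {
        // Preferred: real per-step counts for this phase.
        agg.total += sc.steps.length;
        agg.passed += sc.steps.filter((s) => s.passed).length;
      } else {
        // Fallback: count the whole phase as a single unit.
        agg.total += 1;
        agg.passed += sc.passed ? 1 : 0;
      }
      totals.set(id, agg);
    }
  }
  // Explicit ids force an "untested" entry for storyboards the runner skipped.
  const ids = storyboardIds && storyboardIds.length > 0
    ? storyboardIds
    : Array.from(totals.keys());
  return ids.map((storyboardId): StoryboardStatus => {
    const agg = totals.get(storyboardId) ?? { passed: 0, total: 0 };
    const status =
      agg.total === 0 ? "untested"
      : agg.passed === agg.total ? "passing"
      : agg.passed === 0 ? "failing"
      : "partial";
    return { storyboardId, stepsPassed: agg.passed, stepsTotal: agg.total, status };
  });
}
```

Because some rows carry real step counts and others carry phase-level fallbacks, `stepsPassed` and `stepsTotal` are not directly comparable across storyboards in this sketch.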

Tests: 9 cases covering all-pass, partial, all-fail, phase-only fallback,
legacy bare-name skip, empty input, and explicit-IDs untested gap.

Stack note: this is orthogonal to Emma's #4247 compliance-state
unification stack (#4250, #4263, #4264, #4268, #4274) which collapses
agent_test_history into agent_compliance_runs. Different files; rebases
cleanly in either order.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(scripts): test-comply-storyboard-statuses — local harness for the fix

Runs comply() against an agent URL and prints what
deriveStoryboardStatuses would produce, without DB writes. Used to
validate the SDK-6.x scenario-key fix against real agents
(adcp-signals-adaptor.evgeny-193.workers.dev/mcp and
wonderstruck.sales-agent.scope3.com/mcp) before merging.

Will stay useful for future SDK upgrades that touch scenario emission
or storyboard-track aggregation — same pattern as the
diagnose-agent-comply-queue script from #4361.

Usage:
  npx tsx server/src/scripts/test-comply-storyboard-statuses.ts <agent-url> [<agent-url> ...]

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(compliance): code review nits — clarify steps doc, hoist explicit-ids check, add 3 edge tests

Addresses code-reviewer feedback on PR #4364:
- JSDoc on deriveStoryboardStatuses now calls out that steps_passed/total
  are not directly comparable across rows (some rows are real step counts,
  some are phase-level fallbacks when the SDK omits per-step data).
- Comment pinning the storyboard-id invariant (flat ids, no `/`) so the
  indexOf split stays correct as new storyboards land.
- Defensive `result.tracks ?? []` so a malformed result doesn't throw.
- Hoist `storyboardIds && length > 0` into a single `hasExplicitIds`
  const used at both the toEmit decision and the no-data fallback.
- Three new test cases:
  * same storyboard split across multiple tracks aggregates correctly
  * result.tracks absent → []
  * non-string scenario values (null, number) → skipped without throwing

12/12 vitest passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

bokelley added a commit that referenced this pull request May 11, 2026
…4374)

Adds an "X / Y storyboards passing" element between the SDK headline
("2 silent" etc.) and the track pills, with a tooltip explaining the
relationship:

  storyboards = canonical conformance unit
                (each applicable specialism + protocol baseline +
                universal check is one storyboard, pass or fail)
  track pills = SDK's coarse roll-up that can read as "passing" even
                when underlying storyboards are partial — useful for
                quick glance but misleading in isolation

Track pills gain their own tooltip pointing readers at the Verification
panel for per-storyboard detail.

Resolves the Evgeny-shape disconnect from escalation #329: track
summary showed "2 silent / 30 of 30 scenarios passing" while the
agent's signal_owned specialism storyboard was 1/5 steps. With the
data flowing correctly after PR #4364, this surface change closes the
loop on the adtech-product reviewer's "deprecate track summary on the
public dashboard, keep it operator-only" call by making the storyboard
count visually prominent and clarifying that the SDK track pills are
debug context.

Push A item 4 of 4 in the compliance reporting fidelity initiative.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

SECURITY.md links to comprehensive security documentation that does not exist

1 participant