fix(registry): server-side agent-type resolution at registration#3498

Merged
EmmaLouise2018 merged 3 commits into main from EmmaLouise2018/agent-type-guard
Apr 29, 2026

@EmmaLouise2018

Summary

Refs #3495. Merges after #3496 and #3497 (third in the stack).

This is the prevention layer. PR #3496 fixed the discovered-side inference; PR #3497 backfilled member-registered rows. Without this PR, a client can re-introduce the bug at any time by PUT-ing type: 'buying' for a sales agent — member-profiles.ts previously accepted any string verbatim.

Root cause of the prevention gap

  • PUT /api/me/member-profile (line 439) — no validation of agents[].type
  • POST /api/me/member-profile (line 183) — same
  • PUT /api/admin/member-profiles/:id (line 1696) — same
  • MemberDatabase.normalizeAgentConfig (server/src/db/member-db.ts:38-39) — copied the string through with as AgentConfig['type']
  • AgentConfig.type declared as AgentType | 'buyer' (server/src/types.ts:379) — the legacy 'buyer' slack normalized into the type system
  • Hard-coded 'buying' defaults in dev-setup.ts:88, 109-110 (Training Agent + Acme seeds) and adcp-tools.ts:767 (OAuth context creation before discovery)

Changes

server/src/routes/member-profiles.ts: resolveAgentTypes() helper

New helper called from all three write paths. For each agent in the array:

  1. Look up agent_capabilities_snapshot for the URL via AgentSnapshotDatabase.bulkGetCapabilities
  2. If a snapshot exists and its inferred_type validates against the AgentType enum, override the client's type with it. The capability snapshot is ground truth — the client's value is a hint at best.
  3. If the client's type is a string but doesn't validate, replace with 'unknown'
  4. Otherwise leave as-is

Wired into POST /, PUT /, and admin PUT /:id.
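
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the merged implementation: the `AGENT_TYPES` list contains only the values named in this PR (the real `AgentType` enum may have more), and the `snapshots` map is a hypothetical stand-in for the result of `AgentSnapshotDatabase.bulkGetCapabilities`.

```typescript
// Assumption: only the enum members mentioned in this PR are listed.
const AGENT_TYPES = ['buying', 'sales', 'unknown'] as const;
type AgentType = (typeof AGENT_TYPES)[number];

function isValidAgentType(value: unknown): value is AgentType {
  return (
    typeof value === 'string' &&
    (AGENT_TYPES as readonly string[]).indexOf(value) !== -1
  );
}

interface AgentInput {
  url: string;
  type?: unknown;
}

interface Snapshot {
  inferred_type: string | null;
}

// Hypothetical signature: the snapshots are passed in as a pre-fetched map
// keyed by agent URL instead of being read from the database.
function resolveAgentTypes(
  agents: AgentInput[],
  snapshots: Map<string, Snapshot>,
): AgentInput[] {
  return agents.map((agent) => {
    const snapshot = snapshots.get(agent.url);
    // Step 2: a valid inferred type overrides the client's value.
    if (snapshot && isValidAgentType(snapshot.inferred_type)) {
      return { ...agent, type: snapshot.inferred_type };
    }
    // Step 3: a string that is not in the enum is stored as 'unknown'.
    if (typeof agent.type === 'string' && !isValidAgentType(agent.type)) {
      return { ...agent, type: 'unknown' };
    }
    // Step 4: otherwise the client's value is left as-is.
    return agent;
  });
}
```

Keeping the lookup outside the function makes the resolution logic pure, which is convenient for unit tests; the real helper presumably performs the bulk lookup itself.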

server/src/db/member-db.ts — defense in depth

normalizeAgentConfig (called on every JSONB → AgentConfig deserialization) now validates type via isValidAgentType and drops invalid values instead of casting them through. So even pre-existing legacy strings ('buyer', 'seller') won't propagate to the API response.
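
A rough sketch of what "validate and drop" could look like at the deserialization boundary. The field set of `AgentConfig` and the raw row shape here are simplified assumptions, and the enum list contains only the values named in this PR.

```typescript
// Assumption: only the enum members mentioned in this PR are listed.
const AGENT_TYPES = ['buying', 'sales', 'unknown'] as const;
type AgentType = (typeof AGENT_TYPES)[number];

// Accepts `unknown` so it can narrow JSONB-derived values without a cast.
function isValidAgentType(value: unknown): value is AgentType {
  return (
    typeof value === 'string' &&
    (AGENT_TYPES as readonly string[]).indexOf(value) !== -1
  );
}

// Simplified: the real AgentConfig carries more fields.
interface AgentConfig {
  url: string;
  type?: AgentType;
}

// Hypothetical raw shape of one element of the JSONB agents array.
function normalizeAgentConfig(raw: { url: string; type?: unknown }): AgentConfig {
  const config: AgentConfig = { url: raw.url };
  // Only copy the type when it passes the guard; legacy strings such as
  // 'buyer' or 'seller' are dropped rather than cast through.
  if (isValidAgentType(raw.type)) {
    config.type = raw.type;
  }
  return config;
}
```

With this shape, `normalizeAgentConfig({ url, type: 'buyer' })` yields a config with no `type` key, while `'sales'` passes through unchanged.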

server/src/types.ts — tighten the type

  • AgentConfig.type from AgentType | 'buyer' to AgentType
  • isValidAgentType parameter widened from string | undefined | null to unknown so it can be used as a type guard in deserialization where the input shape is JSONB-derived

server/src/dev-setup.ts — fix wrong seed values

  • Training Agent: type: 'buying' -> 'sales' (it's an embedded sales-agent reference impl per mcp-tools.ts:1200)
  • Acme: type: 'buyer' -> 'buying', type: 'seller' -> 'sales' (those legacy values aren't in the AgentType enum)
  • Acme offerings: seller_agent -> sales_agent (seller_agent isn't in VALID_MEMBER_OFFERINGS, so the prior value was being silently rejected during validation)

server/src/addie/mcp/adcp-tools.ts:767

OAuth-context creation no longer hardcodes agent_type: 'buying' — defaults to 'unknown' and lets the discovery probe set the real value when it runs.

Why this stacks on top of #3496 + #3497

Each PR is independent codewise; the stacking is purely runtime/correctness ordering.

Test plan

bokelley previously approved these changes Apr 29, 2026

@bokelley bokelley left a comment

Approved.

Defense-in-depth done right: prevention at write (resolveAgentTypes), normalization at deserialization (normalizeAgentConfig), and tightening AgentConfig.type from AgentType | 'buyer' to AgentType so the legacy slack can't sneak back in via the type system.

The seed/default fixes are real bug catches:

  • Acme seller_agent was being silently rejected against VALID_MEMBER_OFFERINGS — that's the kind of thing you only find when you tighten the type
  • adcp-tools.ts:767 defaulting OAuth context to 'buying' before discovery is exactly the wrong direction; 'unknown' until probed is the principled answer

One runtime dependency to call out: resolveAgentTypes overrides the client's type with agent_capabilities_snapshot.inferred_type. Existing snapshot rows for sales agents currently store 'buying' (populated by the buggy inferTypeFromProfile before #3496). Asked Emma over on #3496 to extend migration 453 to backfill that column too — without it, this PR's prevention layer will enforce the stale wrong answer on member registrations between deploy and next crawl cycle.

With that snapshot backfill landed in #3496, this is the right capstone for the stack. Merge order: 3496 → 3497 → 3498.

EmmaLouise2018 added a commit that referenced this pull request Apr 29, 2026
Per follow-up review: agent_capabilities_snapshot.inferred_type was
populated by the same buggy inferTypeFromProfile call (crawler.ts:577),
and is read in two places that would otherwise propagate the stale
'buying' value:

1. registry-api.ts:3465-3467 — uses inferred_type to fill in agent type
   when the registered type is 'unknown'. Sales agents would surface as
   'buying' in the public registry until they were re-probed.
2. PR #3498's resolveAgentTypes() — uses inferred_type as the
   authoritative type source on every member-profile write. Stale rows
   there would silently override a correctly client-supplied 'sales'
   back to 'buying', defeating the prevention layer for any agent
   probed before this fix shipped.

Cheapest place to land it is here in the existing discovery-side
migration. Same root cause, same fix pattern, same migration.
EmmaLouise2018 added a commit that referenced this pull request Apr 29, 2026
Main merged migration 453 (PR #2153, AAO Verified agent badge system)
while this stack was open, causing CI to fail with:

  Duplicate migration version 453:
    453_agent_verification_badges.sql
    453_fix_misclassified_sales_agents.sql

Renumbered our migration to 454. PR #3497 and #3498 will need similar
bumps to 455 and 456 respectively to keep the stack working.
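
For context, a duplicate-version check like the one that failed CI might look like the sketch below. The function name and the `<version>_<description>.sql` naming rule are assumptions drawn from the two filenames in the error message; the project's actual check is not shown in this PR.

```typescript
// Assumption: migration files are named "<version>_<description>.sql".
function findDuplicateVersions(filenames: string[]): Map<number, string[]> {
  const byVersion = new Map<number, string[]>();
  for (const name of filenames) {
    const match = /^(\d+)_/.exec(name);
    if (!match) continue; // ignore files that do not follow the naming rule
    const version = Number(match[1]);
    const group = byVersion.get(version) ?? [];
    group.push(name);
    byVersion.set(version, group);
  }
  // Keep only versions claimed by more than one file.
  const duplicates = new Map<number, string[]>();
  byVersion.forEach((names, version) => {
    if (names.length > 1) duplicates.set(version, names);
  });
  return duplicates;
}
```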
Prevent clients from registering an agent with a wrong `type` (e.g. a sales
agent typed 'buying') by overriding the request payload with the inferred
type from the capability snapshot whenever we have probed the URL. Closes
the prevention gap that #3496 (discovery-side fix) and PR #3497 (member-
side data backfill) cannot cover on their own.

- routes/member-profiles.ts: new resolveAgentTypes() helper called from
  POST /, PUT /, and PUT /:id (admin). Reads agent_capabilities_snapshot,
  uses inferred_type when present, otherwise validates the client value
  against the AgentType enum and stores 'unknown' for non-enum strings.
- db/member-db.ts: normalizeAgentConfig validates type via isValidAgentType
  on deserialization, dropping legacy 'buyer'/'seller' rather than leaking
  them through the AgentConfig['type'] cast.
- types.ts: tighten AgentConfig.type to AgentType (was AgentType | 'buyer').
  Widen isValidAgentType signature to unknown so the deserialization-layer
  guard compiles.
- dev-setup.ts: Training Agent type is 'sales' (it's a sales reference
  impl); Acme Buyer/Seller agents use the canonical 'buying'/'sales'
  values; offerings array uses 'sales_agent' (the invalid 'seller_agent'
  was being silently rejected by VALID_MEMBER_OFFERINGS).
- addie/mcp/adcp-tools.ts: OAuth-context creation defaults agent_type to
  'unknown' instead of hardcoding 'buying' before any discovery has run.
…+ drop legacy types

Red-team review found three issues:

BLOCKER — resolveAgentTypes fell through to client-supplied type when a
snapshot existed but inferred_type was null (probe failed / OAuth-required
/ unclassified). That's exactly the smuggle window: a malicious client
could register a sales agent as 'buying' for any URL whose probe didn't
classify cleanly. Fixed: snapshot rows with null inferred_type now force
'unknown'. Trust silence over the client.
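
A sketch of the per-agent decision after this fix, under stated assumptions: `resolveType` is a hypothetical helper name, the enum list contains only the values named in this PR, and treating a non-null but invalid inferred_type the same as null is an assumption beyond what this message states.

```typescript
// Assumption: only the enum members mentioned in this PR are listed.
const AGENT_TYPES = ['buying', 'sales', 'unknown'] as const;
type AgentType = (typeof AGENT_TYPES)[number];

function isValidAgentType(value: unknown): value is AgentType {
  return (
    typeof value === 'string' &&
    (AGENT_TYPES as readonly string[]).indexOf(value) !== -1
  );
}

interface Snapshot {
  inferred_type: string | null;
}

// Hypothetical helper: resolves the stored type for a single agent.
function resolveType(clientType: unknown, snapshot: Snapshot | undefined): unknown {
  if (snapshot) {
    // A snapshot row exists: never fall through to the client's value.
    // A null inferred_type (probe failed or unclassified) forces 'unknown'.
    return isValidAgentType(snapshot.inferred_type)
      ? snapshot.inferred_type
      : 'unknown';
  }
  // No snapshot at all: validate the client's string against the enum.
  if (typeof clientType === 'string' && !isValidAgentType(clientType)) {
    return 'unknown';
  }
  return clientType;
}
```

The key difference from the pre-fix behavior is the first branch: the existence of a snapshot row, not the presence of a classification, decides whether the client's value is trusted.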

Tighten — FederatedAgent.type still allowed AgentType | 'buyer'. Same
shape as AgentConfig.type, same fix.

Migration 455 — pre-PR rows can carry type 'buyer' / 'seller' (older dev
seeds + early API callers). After this PR's normalizeAgentConfig change,
those silently disappear on read. Backfill them in-place so the registry
reflects user intent: 'buyer' -> 'buying', 'seller' -> 'sales'.
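
The migration itself is SQL and is not reproduced here. The mapping it applies can be illustrated in TypeScript as below; `backfillLegacyTypes` and `StoredAgent` are hypothetical names used only for this sketch.

```typescript
// Assumption: only these two legacy spellings are rewritten, per the text above.
const LEGACY_TYPE_MAP = new Map<string, string>([
  ['buyer', 'buying'],
  ['seller', 'sales'],
]);

interface StoredAgent {
  url: string;
  type?: string;
}

// Rewrites legacy type strings in place of dropping them, so the stored
// row keeps the intent the user originally expressed.
function backfillLegacyTypes(agents: StoredAgent[]): StoredAgent[] {
  return agents.map((agent) => {
    if (agent.type === undefined) return agent;
    const mapped = LEGACY_TYPE_MAP.get(agent.type) ?? agent.type;
    return { ...agent, type: mapped };
  });
}
```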
Cascade from main's 453 collision: PR #3496 took 454, PR #3497 took
455, so this PR's migration bumps to 456 to stay above the stack.
@EmmaLouise2018 EmmaLouise2018 force-pushed the EmmaLouise2018/agent-type-guard branch from 5d5c94b to 34a2532 Compare April 29, 2026 16:55
@EmmaLouise2018 EmmaLouise2018 merged commit 95e6f21 into main Apr 29, 2026
15 checks passed
@EmmaLouise2018 EmmaLouise2018 deleted the EmmaLouise2018/agent-type-guard branch April 29, 2026 16:58
EmmaLouise2018 added a commit that referenced this pull request Apr 30, 2026
…reclassify-on-disagreement (#3541)

Refs #3538.

PR #3498 added `resolveAgentTypes()` server-side, but it only runs on writes
(POST/PUT to /api/me/member-profile). Rows saved before #3498 never get
re-evaluated. The crawler's type-update path at `crawler.ts:580` only wrote
back when the stored type was missing — once any non-unknown value was set,
the row was frozen.

This is the cleanup for Problem 1 in #3538.

## Crawler type-update policy (crawler.ts)

Old: write back only when no stored type and inferred is non-unknown.

New:
- Promote when (stored is missing OR stored is 'unknown') AND inferred is
  non-unknown. Same intent as before, broadened to cover the 'unknown' case
  that was previously frozen.
- Log a warning on disagreement (stored non-unknown != inferred non-unknown).
  Do NOT auto-flip — single probes can be wrong; auto-flipping would corrupt
  good rows on a transient bad probe. Operator runs the backfill explicitly
  to reconcile.
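
A minimal sketch of the policy as a pure decision function. The name `decideTypeUpdate` and the three-valued result are illustrative assumptions, not the code in `crawler.ts`.

```typescript
type TypeUpdateDecision = 'promote' | 'warn' | 'keep';

// Hypothetical helper: decides what the crawler does with an inferred type.
function decideTypeUpdate(
  stored: string | null | undefined,
  inferred: string,
): TypeUpdateDecision {
  const storedKnown = stored != null && stored !== 'unknown';
  const inferredKnown = inferred !== 'unknown';

  // An 'unknown' inference never changes anything.
  if (!inferredKnown) return 'keep';
  // Missing or 'unknown' stored type: safe to write the inferred value.
  if (!storedKnown) return 'promote';
  // Both sides are concrete and disagree: log, but do not auto-flip.
  if (stored !== inferred) return 'warn';
  return 'keep';
}
```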

## Backfill script (server/scripts/backfill-member-agent-types.ts)

Walks every `member_profiles` row, calls `resolveAgentTypes()` on its
`agents[]`, writes back any agent whose stored type disagrees with the
snapshot's inferred type. Idempotent. Has a `--dry-run` mode.

```
npx tsx server/scripts/backfill-member-agent-types.ts --dry-run
npx tsx server/scripts/backfill-member-agent-types.ts
```
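
The dry-run and idempotency behavior can be sketched as follows. This is a simplified in-memory model, not the script itself: the real script calls `resolveAgentTypes()` and writes to the database, whereas here `inferredByUrl` and the row shapes are assumptions made for illustration.

```typescript
interface Agent {
  url: string;
  type: string;
}

interface ProfileRow {
  id: string;
  agents: Agent[];
}

interface Change {
  profileId: string;
  url: string;
  from: string;
  to: string;
}

// Hypothetical model of the backfill loop. Returns the list of changes;
// only applies them when dryRun is false.
function backfillAgentTypes(
  rows: ProfileRow[],
  inferredByUrl: Map<string, string>,
  dryRun: boolean,
): Change[] {
  const changes: Change[] = [];
  for (const row of rows) {
    for (const agent of row.agents) {
      const inferred = inferredByUrl.get(agent.url);
      // Skip agents with no snapshot or whose stored type already agrees.
      if (inferred === undefined || inferred === agent.type) continue;
      changes.push({ profileId: row.id, url: agent.url, from: agent.type, to: inferred });
      if (!dryRun) agent.type = inferred;
    }
  }
  return changes;
}
```

Because agents that already agree with the snapshot are skipped, a second real run reports no changes, which is what makes the operation idempotent.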

## Export

`resolveAgentTypes` is now exported from `member-profiles.ts` so the script
can reuse it. The backfill is the same logic as the write path; pushing the
abstraction up rather than duplicating it.

## Test plan

- New: `server/tests/unit/crawler-type-update-policy.test.ts` — pins the
  promote/disagreement matrix. 5/5 pass.
- `npx tsc --noEmit -p server/tsconfig.json` — clean.

## Operator note

Run `--dry-run` first on staging to see the diff, then again on prod.
Bidcliq and Swivel (stored as 'buying' but actually sales agents) are the known cases.