fix(registry): server-side agent-type resolution at registration #3498
bokelley
left a comment
Approved.
Defense-in-depth done right: prevention at write (resolveAgentTypes), normalization at deserialization (normalizeAgentConfig), and tightening AgentConfig.type from AgentType | 'buyer' to AgentType so the legacy slack can't sneak back in via the type system.
The seed/default fixes are real bug catches:
- Acme seller_agent was being silently rejected against VALID_MEMBER_OFFERINGS. That's the kind of thing you only find when you tighten the type.
- adcp-tools.ts:767 defaulting OAuth context to 'buying' before discovery is exactly the wrong direction; 'unknown' until probed is the principled answer.
One runtime dependency to call out: resolveAgentTypes overrides the client's type with agent_capabilities_snapshot.inferred_type. Existing snapshot rows for sales agents currently store 'buying' (populated by the buggy inferTypeFromProfile before #3496). Asked Emma over on #3496 to extend migration 453 to backfill that column too — without it, this PR's prevention layer will enforce the stale wrong answer on member registrations between deploy and next crawl cycle.
With that snapshot backfill landed in #3496, this is the right capstone for the stack. Merge order: 3496 → 3497 → 3498.
Per follow-up review: agent_capabilities_snapshot.inferred_type was populated by the same buggy inferTypeFromProfile call (crawler.ts:577), and is read in two places that would otherwise propagate the stale 'buying' value:

1. registry-api.ts:3465-3467: uses inferred_type to fill in agent type when the registered type is 'unknown'. Sales agents would surface as 'buying' in the public registry until they were re-probed.
2. PR #3498's resolveAgentTypes(): uses inferred_type as the authoritative type source on every member-profile write. Stale rows there would silently override a correctly client-supplied 'sales' back to 'buying', defeating the prevention layer for any agent probed before this fix shipped.

Cheapest place to land it is here in the existing discovery-side migration. Same root cause, same fix pattern, same migration.
Main merged migration 453 (PR #2153, AAO Verified agent badge system) while this stack was open, causing CI to fail with:

    Duplicate migration version 453:
      453_agent_verification_badges.sql
      453_fix_misclassified_sales_agents.sql

Renumbered our migration to 454. PR #3497 and #3498 will need similar bumps to 455 and 456 respectively to keep the stack working.
Prevent clients from registering an agent with a wrong `type` (e.g. a sales agent typed 'buying') by overriding the request payload with the inferred type from the capability snapshot whenever we have probed the URL. Closes the prevention gap that #3496 (discovery-side fix) and PR #3497 (member-side data backfill) cannot cover on their own.

- routes/member-profiles.ts: new resolveAgentTypes() helper called from POST /, PUT /, and PUT /:id (admin). Reads agent_capabilities_snapshot, uses inferred_type when present, otherwise validates the client value against the AgentType enum and stores 'unknown' for non-enum strings.
- db/member-db.ts: normalizeAgentConfig validates type via isValidAgentType on deserialization, dropping legacy 'buyer'/'seller' rather than leaking them through the AgentConfig['type'] cast.
- types.ts: tighten AgentConfig.type to AgentType (was AgentType | 'buyer'). Widen isValidAgentType signature to unknown so the deserialization-layer guard compiles.
- dev-setup.ts: Training Agent type is 'sales' (it's a sales reference impl); Acme Buyer/Seller agents use the canonical 'buying'/'sales' values; offerings array uses 'sales_agent' (the invalid 'seller_agent' was being silently rejected by VALID_MEMBER_OFFERINGS).
- addie/mcp/adcp-tools.ts: OAuth-context creation defaults agent_type to 'unknown' instead of hardcoding 'buying' before any discovery has run.
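The deserialization-layer guard described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `AGENT_TYPES` list contains only the three values named in this thread (the real `AgentType` enum may have more members), and the `AgentConfig` shape and the simplified `normalizeAgentConfig` signature are assumptions made for the example.

```typescript
// Assumption: only the values mentioned in this PR; the real enum may list more.
const AGENT_TYPES = ['buying', 'sales', 'unknown'] as const;
type AgentType = (typeof AGENT_TYPES)[number];

// Hypothetical, trimmed-down config shape for illustration.
interface AgentConfig {
  url: string;
  type?: AgentType;
}

// Takes `unknown` so it can guard JSONB-derived values of any shape.
function isValidAgentType(value: unknown): value is AgentType {
  return typeof value === 'string' && (AGENT_TYPES as readonly string[]).includes(value);
}

// Keeps `type` only when it passes the guard; legacy strings such as
// 'buyer' or 'seller' are dropped instead of being cast through.
function normalizeAgentConfig(raw: { url?: unknown; type?: unknown }): AgentConfig | null {
  if (typeof raw.url !== 'string') return null;
  const config: AgentConfig = { url: raw.url };
  if (isValidAgentType(raw.type)) config.type = raw.type;
  return config;
}
```

With this shape, `normalizeAgentConfig({ url: 'https://agent.example', type: 'seller' })` yields a config with no `type` field, while a `type` of `'sales'` is preserved.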
…+ drop legacy types

Red-team review found three issues:

- BLOCKER: resolveAgentTypes fell through to client-supplied type when a snapshot existed but inferred_type was null (probe failed / OAuth-required / unclassified). That's exactly the smuggle window: a malicious client could register a sales agent as 'buying' for any URL whose probe didn't classify cleanly. Fixed: snapshot rows with null inferred_type now force 'unknown'. Trust silence over the client.
- Tighten: FederatedAgent.type still allowed AgentType | 'buyer'. Same shape as AgentConfig.type, same fix.
- Migration 455: pre-PR rows can carry type 'buyer' / 'seller' (older dev seeds + early API callers). After this PR's normalizeAgentConfig change, those silently disappear on read. Backfill them in-place so the registry reflects user intent: 'buyer' -> 'buying', 'seller' -> 'sales'.
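The resolution rule after the BLOCKER fix can be illustrated with a small per-agent sketch. It is not the PR's `resolveAgentTypes()`: the function name `resolveAgentType`, the `SnapshotRow` shape, and the three-value `AGENT_TYPES` list are hypothetical, and treating a missing client type as `'unknown'` is a simplification made for the example.

```typescript
// Assumption: only the values mentioned in this PR; the real enum may list more.
const AGENT_TYPES = ['buying', 'sales', 'unknown'] as const;
type AgentType = (typeof AGENT_TYPES)[number];

function isValidAgentType(value: unknown): value is AgentType {
  return typeof value === 'string' && (AGENT_TYPES as readonly string[]).includes(value);
}

// Hypothetical snapshot row; inferred_type is null when the probe
// failed, required OAuth, or could not classify the agent.
interface SnapshotRow {
  inferred_type: string | null;
}

function resolveAgentType(clientType: unknown, snapshot: SnapshotRow | undefined): AgentType {
  if (snapshot) {
    // A probed URL never falls back to the client-supplied value:
    // a null (or invalid) inferred_type forces 'unknown'.
    return isValidAgentType(snapshot.inferred_type) ? snapshot.inferred_type : 'unknown';
  }
  // No snapshot yet: accept the client value only if it is an enum member.
  return isValidAgentType(clientType) ? clientType : 'unknown';
}
```

Under these assumptions, `resolveAgentType('buying', { inferred_type: 'sales' })` returns `'sales'`, `resolveAgentType('buying', { inferred_type: null })` returns `'unknown'`, and `resolveAgentType('buyer', undefined)` returns `'unknown'`.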
5d5c94b to 34a2532
…reclassify-on-disagreement (#3541)

Refs #3538.

PR #3498 added `resolveAgentTypes()` server-side, but it only runs on writes (POST/PUT to /api/me/member-profile). Rows saved before #3498 never get re-evaluated. The crawler's type-update path at `crawler.ts:580` only wrote back when the stored type was missing; once any non-unknown value was set, the row was frozen. This is the cleanup for Problem 1 in #3538.

## Crawler type-update policy (crawler.ts)

Old: write back only when no stored type and inferred is non-unknown.

New:
- Promote when (stored is missing OR stored is 'unknown') AND inferred is non-unknown. Same intent as before, broadened to cover the 'unknown' case that was previously frozen.
- Log a warning on disagreement (stored non-unknown != inferred non-unknown). Do NOT auto-flip: single probes can be wrong, and auto-flipping would corrupt good rows on a transient bad probe. Operator runs the backfill explicitly to reconcile.

## Backfill script (server/scripts/backfill-member-agent-types.ts)

Walks every `member_profiles` row, calls `resolveAgentTypes()` on its `agents[]`, writes back any agent whose stored type disagrees with the snapshot's inferred type. Idempotent. Has a `--dry-run` mode.

```
npx tsx server/scripts/backfill-member-agent-types.ts --dry-run
npx tsx server/scripts/backfill-member-agent-types.ts
```

## Export

`resolveAgentTypes` is now exported from `member-profiles.ts` so the script can reuse it. The backfill is the same logic as the write path; pushing the abstraction up rather than duplicating it.

## Test plan

- New: `server/tests/unit/crawler-type-update-policy.test.ts` pins the promote/disagreement matrix. 5/5 pass.
- `npx tsc --noEmit -p server/tsconfig.json`: clean.

## Operator note

Run `--dry-run` first on staging to see the diff, then again on prod. Bidcliq and Swivel ('buying' but actually sales) are the known cases.
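The promote/warn policy described in this commit can be expressed as a small pure decision function. The sketch below is illustrative only: `decideTypeUpdate`, the `TypeUpdateDecision` labels, and the three-value `AgentType` union are hypothetical names chosen for the example, not identifiers from `crawler.ts`.

```typescript
// Assumption: only the values mentioned in this thread; the real enum may list more.
type AgentType = 'buying' | 'sales' | 'unknown';

// 'promote': write the inferred type back to the row.
// 'warn':    log the disagreement and leave the stored value untouched.
// 'keep':    nothing to do.
type TypeUpdateDecision = 'promote' | 'warn' | 'keep';

function decideTypeUpdate(
  stored: AgentType | null | undefined,
  inferred: AgentType,
): TypeUpdateDecision {
  // An unclassified probe never changes or challenges the stored value.
  if (inferred === 'unknown') return 'keep';
  // Missing or 'unknown' stored type: safe to promote to the inferred one.
  if (stored == null || stored === 'unknown') return 'promote';
  // Both are concrete: never auto-flip, only flag a disagreement.
  return stored === inferred ? 'keep' : 'warn';
}
```

For example, `decideTypeUpdate('unknown', 'sales')` returns `'promote'`, while `decideTypeUpdate('buying', 'sales')` returns `'warn'` and leaves reconciliation to the explicit backfill run.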
Summary
Refs #3495. Merges after #3496 and #3497 (third in the stack).
This is the prevention layer. PR #3496 fixed the discovered-side inference; PR #3497 backfilled member-registered rows. Without this PR, a client can re-introduce the bug at any time by PUT-ing `type: 'buying'` for a sales agent: `member-profiles.ts` previously accepted any string verbatim.

Root cause of the prevention gap
- `PUT /api/me/member-profile` (line 439): no validation of `agents[].type`
- `POST /api/me/member-profile` (line 183): same
- `PUT /api/admin/member-profiles/:id` (line 1696): same
- `MemberDatabase.normalizeAgentConfig` (`server/src/db/member-db.ts:38-39`): copied the string through with `as AgentConfig['type']`
- `AgentConfig.type` declared as `AgentType | 'buyer'` (`server/src/types.ts:379`): the legacy `'buyer'` slack normalized into the type system
- `'buying'` defaults in `dev-setup.ts:88, 109-110` (Training Agent + Acme seeds) and `adcp-tools.ts:767` (OAuth context creation before discovery)

Changes
`server/src/routes/member-profiles.ts`: `resolveAgentTypes()` helper

New helper called from all three write paths. For each agent in the array:

- Read `agent_capabilities_snapshot` for the URL via `AgentSnapshotDatabase.bulkGetCapabilities`
- If `inferred_type` validates against the `AgentType` enum, override the client's `type` with it. The capability snapshot is ground truth; the client's value is a hint at best.
- If `type` is a string but doesn't validate, replace with `'unknown'`

Wired into POST `/`, PUT `/`, and admin PUT `/:id`.

`server/src/db/member-db.ts`: defense in depth

`normalizeAgentConfig` (called on every JSONB → `AgentConfig` deserialization) now validates `type` via `isValidAgentType` and drops invalid values instead of casting them through. So even pre-existing legacy strings (`'buyer'`, `'seller'`) won't propagate to the API response.

`server/src/types.ts`: tighten the type

- `AgentConfig.type` tightened from `AgentType | 'buyer'` to `AgentType`
- `isValidAgentType` parameter widened from `string | undefined | null` to `unknown` so it can be used as a type guard in deserialization where the input shape is JSONB-derived

`server/src/dev-setup.ts`: fix wrong seed values

- Training Agent: `type: 'buying'` → `'sales'` (it's an embedded sales-agent reference impl per `mcp-tools.ts:1200`)
- Acme Buyer/Seller agents: `type: 'buyer'` → `'buying'`, `type: 'seller'` → `'sales'` (those legacy values aren't in the `AgentType` enum)
- Offerings array: `seller_agent` → `sales_agent` (`seller_agent` isn't in `VALID_MEMBER_OFFERINGS`; the prior value was being silently rejected during validation)

`server/src/addie/mcp/adcp-tools.ts:767`

OAuth-context creation no longer hardcodes `agent_type: 'buying'`; it defaults to `'unknown'` and lets the discovery probe set the real value when it runs.

Why this stacks on top of #3496 + #3497
- This PR reads `agent_capabilities_snapshot.inferred_type`, which is populated by the crawler using `inferTypeFromProfile()`. That function only returns `'sales'` for sales agents after #3496 merges. If this PR ships before #3496, the prevention layer would still produce `'buying'` for sales agents: same wrong answer, just enforced server-side.
- #3497 makes agents typed `'sales'` actually render correctly in the public registry. Without #3497, anything this PR labels `'sales'` would render as `'Unclassified'`.

Each PR is independent codewise; the stacking is purely runtime/correctness ordering.
Test plan
- `npm run typecheck`
- `npx vitest run server/tests/unit/`: 2564/2564 pass (one pre-existing flaky `member-context` test passed on retry)
- `npx vitest run server/tests/unit/member-profiles.test.ts`: passes
- `PUT /api/me/member-profile` with `type: 'buying'` for a known sales-agent URL stores `'sales'` in JSONB; new agent registrations surface as `'sales'` in the registry once their capabilities are probed