Commit d540ef2

raahulrahl and claude committed
feat(gateway): fetch AgentCard on first contact — activate DID fallback
Resolves the second high-severity entry in BUGS_AND_KNOWN_ISSUES.md: the `peer.card` fallback at bindu/client/index.ts:196 was dead code because nothing in the codebase ever populated it. Signature verification required a pinnedDID; with none set, maybeVerifySignatures always short-circuited.

New src/bindu/client/agent-card.ts, fetchAgentCard(peerUrl, opts):
- GETs /.well-known/agent.json, Zod-parses against the AgentCard schema, caches per-process.
- 2-second default timeout (short because it blocks the first call per peer; callers can override).
- Every failure mode degrades to null: non-2xx, malformed JSON, schema mismatch, network error, abort. Failures are cached too so a flaky peer doesn't cost one outbound fetch per /plan.

Wired into runCall at bindu/client/index.ts: fetches only when `trust.verifyDID: true` AND peer.card isn't already set. No cost on peers that don't care about verification. Mutation of input.peer.card is safe because PeerDescriptor is built fresh per catalog entry per request.

End-to-end effect: `trust.verifyDID: true` WITHOUT `pinnedDID` is now a real feature. The gateway observes the peer's published DID, resolves its public key via the DID document, and verifies every artifact signature against it. Pinned wins over observed when both are set (pinning is a stronger security claim: the caller vouched for the identity, while observed just trusts what the peer published).

Coverage: tests/bindu/agent-card-fetch.test.ts, 9 cases:
- success path parses valid card
- 404 returns null
- malformed JSON returns null
- schema mismatch returns null
- network throw returns null
- cache hit skips re-fetch (same reference returned)
- negative cache skips re-fetch on known-bad peer
- per-URL isolation: different peers fetched independently
- timeout honored via internal AbortController

BUGS_AND_KNOWN_ISSUES.md: original AgentCard entry marked RESOLVED with cross-refs to the helper and test. A new medium entry now tracks the remaining follow-up: SSE's agent_did field still only reflects pinned DIDs, not observed ones. That's plan-route.ts layer (findPinnedDID → findAgentDID with observed fallback), scheduled next.

Typecheck clean, 210/210 tests pass (+9 agent-card-fetch).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent b0c0b8a commit d540ef2
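
The precedence rule from the commit message (pinned wins over observed, and nothing happens unless `trust.verifyDID` is set) can be sketched in isolation. This is a minimal illustration, not the gateway's code: `TrustConfig`, `CardLike`, `PeerLike`, `observedDID`, and `selectVerificationDID` are hypothetical names, and reading the DID from the card's `id` field is an assumption standing in for the real `getPeerDID`.

```typescript
// Hypothetical, trimmed-down shapes. The gateway's real PeerDescriptor and
// AgentCard types carry more fields than shown here.
interface TrustConfig {
  verifyDID?: boolean
  pinnedDID?: string
}

interface CardLike {
  id?: string
}

interface PeerLike {
  url: string
  trust?: TrustConfig
  card?: CardLike
}

// Stand-in for the codebase's getPeerDID(card). Reading the DID from `id`
// is an assumption made for this sketch.
function observedDID(card: CardLike): string | null {
  return card.id ?? null
}

// Pinned wins over observed; null means there is nothing to verify against.
function selectVerificationDID(peer: PeerLike): string | null {
  if (!peer.trust?.verifyDID) return null
  return peer.trust.pinnedDID ?? (peer.card ? observedDID(peer.card) : null)
}
```

With both a pinned DID and a fetched card present, the sketch returns the pinned value; with only a card, it falls back to the observed one.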

4 files changed, with 313 additions and 13 deletions

gateway/docs/BUGS_AND_KNOWN_ISSUES.md

Lines changed: 53 additions & 13 deletions
@@ -47,26 +47,66 @@ Shipped on branch `feat/gateway-recipes` prior to merge.

 ---

-### 🔴 `peer.card` is never populated — AgentCard-based DID fallback is dead code
+### ✅ RESOLVED — `peer.card` is now populated via `fetchAgentCard`

-**Where:** [`src/bindu/client/index.ts:196`](../src/bindu/client/index.ts:196)
+**Was at:** [`src/bindu/client/index.ts:196`](../src/bindu/client/index.ts:196) (the fallback itself unchanged — it was correct, just starved of input).
+
+**Fixed by:** new [`src/bindu/client/agent-card.ts`](../src/bindu/client/agent-card.ts)
+`fetchAgentCard(peerUrl, opts)` that GETs `<peer_url>/.well-known/agent.json`,
+Zod-parses it, and caches per-process. Wired into `runCall` at
+[`bindu/client/index.ts:153-159`](../src/bindu/client/index.ts:153):

 ```ts
-const did = peer.trust.pinnedDID ?? (peer.card ? getPeerDID(peer.card) : null)
+if (!input.peer.card && input.peer.trust?.verifyDID) {
+  const card = await fetchAgentCard(input.peer.url, { signal: input.signal })
+  if (card) input.peer.card = card
+}
 ```

-The `peer.card` branch is permanently `null` — nothing in the codebase
-fetches `/.well-known/agent.json` or sets `PeerDescriptor.card`. Grep
-confirms: no writes to `peer.card =` anywhere. Either:
+Now `trust.verifyDID: true` *without* a `pinnedDID` is a real feature:
+the gateway observes the peer's published DID, resolves its public key
+via the DID document, and verifies every artifact signature against it.
+The pinned path still wins when both are set.
+
+Design choices:
+- **Fetch only when `verifyDID: true`** — no network cost on peers that
+  don't care about verification.
+- **2-second timeout** — short because it blocks the first call per
+  peer. Failures degrade to null (same behavior as before the fix).
+- **Per-process cache includes negatives** — a flaky peer doesn't cost
+  one outbound fetch per `/plan`.
+- **Mutation of `input.peer.card`** is safe because `PeerDescriptor`
+  is built fresh per catalog entry per request — no cross-session
+  leak.
+
+Coverage: [`tests/bindu/agent-card-fetch.test.ts`](../tests/bindu/agent-card-fetch.test.ts)
+— 9 cases (success / 404 / malformed / schema-mismatch / network
+failure / cache hit / negative cache / per-URL isolation / timeout).
+
+**Remaining follow-up — SSE `agent_did` still null without pinnedDID.**
+`plan-route.ts`'s [`findPinnedDID()`](../src/api/plan-route.ts:314) only
+reads `trust.pinnedDID`. The observed DID from `peer.card` isn't
+surfaced in the SSE stream yet. Separate ticket; tracked below as
+🟠 "SSE `agent_did` doesn't surface observed DIDs".
+
+---
+
+### 🟠 SSE `agent_did` doesn't surface observed DIDs
+
+**Where:** [`src/api/plan-route.ts:314-316`](../src/api/plan-route.ts:314) `findPinnedDID` only reads `trust.pinnedDID` from the request catalog.

-1. Implement AgentCard fetch at first call per session (the "option C"
-   from the earlier design discussion), OR
-2. Delete the fallback and mark `peer.card` as reserved for a future
-   feature — pretending the fallback exists is worse than owning up.
+Consequence: even after `fetchAgentCard` populates `peer.card` inside the
+Bindu client (see the resolved AgentCard-fallback entry above), the SSE
+stream still emits `agent_did: null` unless the caller pinned a DID up
+front. Signature verification works against the observed DID; the
+display layer doesn't know about it.

-The plan-route's [`findPinnedDID()`](../src/api/plan-route.ts:314) also
-only reads `pinnedDID`, so `agent_did` in SSE is `null` whenever the
-caller didn't pin. Users quickly trip on this.
+**Fix:** rename to `findAgentDID(request, agentName)`, add an observed-DID
+path (reads a per-plan cache populated after the first successful call
+publishes a Bus event with the observed DID), and add an optional
+`agent_did_source: "pinned" | "observed" | null` field on SSE frames so
+consumers can tell the provenance. Detailed in the earlier design
+discussion; ~80 LOC including tests.

 ---

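
The follow-up proposed in the 🟠 entry (`findAgentDID` plus an `agent_did_source` field) could look roughly like the sketch below. It is an illustration under stated assumptions, not the planned implementation: the `PlanRequest` and `CatalogEntry` shapes are hypothetical, and passing the per-plan observed-DID cache as a third `Map` argument is an assumption, since the entry only names the `findAgentDID(request, agentName)` signature.

```typescript
// Hypothetical shapes for illustration; the real request catalog in
// plan-route.ts is not shown in this commit.
type DIDSource = "pinned" | "observed" | null

interface CatalogEntry {
  name: string
  trust?: { pinnedDID?: string }
}

interface PlanRequest {
  agents: CatalogEntry[]
}

interface AgentDIDInfo {
  agent_did: string | null
  agent_did_source: DIDSource
}

// `observed` stands in for the per-plan cache of DIDs seen on AgentCards,
// keyed by agent name (an assumption of this sketch).
function findAgentDID(
  request: PlanRequest,
  agentName: string,
  observed: ReadonlyMap<string, string>,
): AgentDIDInfo {
  const entry = request.agents.find((a) => a.name === agentName)
  const pinned = entry?.trust?.pinnedDID
  if (pinned) return { agent_did: pinned, agent_did_source: "pinned" }

  const seen = observed.get(agentName)
  if (seen) return { agent_did: seen, agent_did_source: "observed" }

  return { agent_did: null, agent_did_source: null }
}
```

Keeping the source next to the DID lets an SSE consumer distinguish an identity the caller vouched for from one the peer merely published.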

gateway/src/bindu/client/agent-card.ts

Lines changed: 88 additions & 0 deletions
@@ -0,0 +1,88 @@
+import { AgentCard } from "../protocol/agent-card"
+
+/**
+ * Fetch and parse a peer's AgentCard from `/.well-known/agent.json`.
+ *
+ * Populated via this helper, the `peer.card` field on a PeerDescriptor
+ * activates the DID fallback in `maybeVerifySignatures` — when the
+ * caller sets `trust.verifyDID: true` but didn't pin a DID, the
+ * gateway can recover the peer's published DID from its AgentCard and
+ * verify artifacts against the corresponding public key.
+ *
+ * # Cache
+ *
+ * Per-process, keyed by peer URL. AgentCards are stable for the life
+ * of the peer process; a gateway restart is an acceptable boundary
+ * for picking up a rotated identity. Negative results (404, malformed
+ * JSON, timeout) are cached too so a flaky peer doesn't cost us one
+ * outbound fetch per /plan.
+ *
+ * # Timeout
+ *
+ * 2 seconds by default — the fetch blocks the first call to a new
+ * peer, so we keep it short. Callers can override via the opts.
+ *
+ * # Errors become `null`
+ *
+ * Every failure mode (network error, non-2xx, invalid JSON, schema
+ * mismatch, abort) returns `null` rather than throwing. The fallback
+ * in maybeVerifySignatures degrades gracefully — null peer.card just
+ * means the pinnedDID path is the only option, same behavior as before
+ * this module existed. Errors aren't "safety failures" here; they're
+ * "couldn't enrich."
+ */
+
+const cache = new Map<string, AgentCard | null>()
+
+export interface FetchAgentCardOptions {
+  readonly signal?: AbortSignal
+  readonly timeoutMs?: number
+}
+
+const DEFAULT_TIMEOUT_MS = 2000
+const WELL_KNOWN_PATH = "/.well-known/agent.json"
+
+export async function fetchAgentCard(
+  peerUrl: string,
+  opts: FetchAgentCardOptions = {},
+): Promise<AgentCard | null> {
+  if (cache.has(peerUrl)) return cache.get(peerUrl) ?? null
+
+  const timeoutMs = opts.timeoutMs ?? DEFAULT_TIMEOUT_MS
+  const ac = new AbortController()
+  const timer = setTimeout(() => ac.abort(), timeoutMs)
+  const onUpstreamAbort = () => ac.abort()
+  opts.signal?.addEventListener("abort", onUpstreamAbort, { once: true })
+
+  try {
+    const target = new URL(WELL_KNOWN_PATH, peerUrl).toString()
+    const res = await fetch(target, { signal: ac.signal })
+    if (!res.ok) {
+      cache.set(peerUrl, null)
+      return null
+    }
+    const json = (await res.json()) as unknown
+    const parsed = AgentCard.safeParse(json)
+    if (!parsed.success) {
+      cache.set(peerUrl, null)
+      return null
+    }
+    cache.set(peerUrl, parsed.data)
+    return parsed.data
+  } catch {
+    cache.set(peerUrl, null)
+    return null
+  } finally {
+    clearTimeout(timer)
+    opts.signal?.removeEventListener("abort", onUpstreamAbort)
+  }
+}
+
+/**
+ * Reset the AgentCard cache. Only intended for use in tests — production
+ * callers have no reason to evict (a gateway restart picks up any
+ * identity rotation). Exported on a `__`-prefixed name to signal intent.
+ */
+export function __resetAgentCardCacheForTests(): void {
+  cache.clear()
+}
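
The timeout handling in `fetchAgentCard` links a caller-supplied signal to an internal `AbortController`. That pattern can be sketched as a standalone helper. `linkAbort` and `LinkedAbort` are hypothetical names, not part of this commit. One difference from the code above is an addition of this sketch: it checks `upstream.aborted` before subscribing, because an `abort` listener registered on an already-aborted signal never fires.

```typescript
// Hypothetical helper, not part of the commit: derives one AbortSignal that
// fires on either a timeout or an upstream (caller-supplied) abort.
interface LinkedAbort {
  readonly signal: AbortSignal
  // Clears the timer and detaches the upstream listener.
  dispose(): void
}

function linkAbort(upstream: AbortSignal | undefined, timeoutMs: number): LinkedAbort {
  const ac = new AbortController()
  const timer = setTimeout(() => ac.abort(), timeoutMs)
  const onUpstreamAbort = () => ac.abort()

  if (upstream?.aborted) {
    // Addition in this sketch: a listener attached after the abort already
    // happened would never run, so propagate the abort immediately.
    ac.abort()
  } else {
    upstream?.addEventListener("abort", onUpstreamAbort, { once: true })
  }

  return {
    signal: ac.signal,
    dispose() {
      clearTimeout(timer)
      upstream?.removeEventListener("abort", onUpstreamAbort)
    },
  }
}
```

A caller would pass `linked.signal` to `fetch` and call `linked.dispose()` in a `finally` block, mirroring the `clearTimeout` and `removeEventListener` cleanup in the diff above.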

gateway/src/bindu/client/index.ts

Lines changed: 17 additions & 0 deletions
@@ -10,6 +10,7 @@ import { verifyArtifact } from "../identity/verify"
 import { createResolver, primaryPublicKeyBase58 } from "../identity/resolve"
 import { getPeerDID } from "../protocol/identity"
 import type { AgentCard } from "../protocol/agent-card"
+import { fetchAgentCard } from "./agent-card"

 /**
  * Public Bindu client — the thing the planner's tools invoke.
@@ -148,6 +149,22 @@ async function runCall(
   identity: LocalIdentity | undefined,
   tokenProvider: TokenProvider | undefined,
 ): Promise<CallPeerOutcome> {
+  // Populate peer.card from /.well-known/agent.json on first contact.
+  // Cached per process — subsequent calls to the same peer are free.
+  // This activates the AgentCard-based DID fallback in
+  // maybeVerifySignatures: when the caller enabled `trust.verifyDID`
+  // without pinning a DID, the gateway recovers the peer's published
+  // DID here and verifies against its public key.
+  //
+  // Runs concurrently-safe (cache handles races via last-write-wins,
+  // AgentCards are stable), non-blocking on failure (returns null,
+  // peer.card stays undefined, verification falls through to the
+  // pinnedDID-only path).
+  if (!input.peer.card && input.peer.trust?.verifyDID) {
+    const card = await fetchAgentCard(input.peer.url, { signal: input.signal })
+    if (card) input.peer.card = card
+  }
+
   const parts: Part[] =
     typeof input.input === "string" ? [{ kind: "text", text: input.input }] : input.input

gateway/tests/bindu/agent-card-fetch.test.ts

Lines changed: 155 additions & 0 deletions
@@ -0,0 +1,155 @@
+import { describe, it, expect, beforeEach, vi, afterEach } from "vitest"
+import { fetchAgentCard, __resetAgentCardCacheForTests } from "../../src/bindu/client/agent-card"
+
+/**
+ * Coverage for the AgentCard fetch helper — the piece that activates
+ * maybeVerifySignatures' observed-DID fallback when the caller enabled
+ * `trust.verifyDID: true` without pinning a DID.
+ *
+ * Resolves the "peer.card is never populated" high-severity entry in
+ * BUGS_AND_KNOWN_ISSUES.md: before this module, the fallback at
+ * bindu/client/index.ts:196 was dead code because nothing ever set
+ * peer.card. These tests pin the four real outcomes (success, 404,
+ * malformed body, cache hit) and the degrade-to-null contract on
+ * every failure mode.
+ */
+
+const VALID_CARD = {
+  id: "did:bindu:ops_at_example_com:research:7dc57d21-2c81-f6f5-c679-e51995f97e22",
+  name: "research",
+  description: "Web search and summarize.",
+  skills: [],
+  defaultInputModes: ["text/plain"],
+  defaultOutputModes: ["text/plain"],
+  capabilities: {
+    extensions: [
+      {
+        uri: "did:bindu:ops_at_example_com:research:7dc57d21-2c81-f6f5-c679-e51995f97e22",
+      },
+    ],
+  },
+}
+
+describe("fetchAgentCard", () => {
+  beforeEach(() => {
+    __resetAgentCardCacheForTests()
+  })
+
+  afterEach(() => {
+    vi.restoreAllMocks()
+  })
+
+  it("fetches and parses a valid AgentCard from /.well-known/agent.json", async () => {
+    const fetchSpy = vi.spyOn(global, "fetch").mockResolvedValue(
+      new Response(JSON.stringify(VALID_CARD), {
+        status: 200,
+        headers: { "content-type": "application/json" },
+      }),
+    )
+
+    const result = await fetchAgentCard("http://localhost:3773")
+    expect(result).not.toBeNull()
+    expect(result?.name).toBe("research")
+    expect(fetchSpy).toHaveBeenCalledWith(
+      "http://localhost:3773/.well-known/agent.json",
+      expect.objectContaining({ signal: expect.anything() }),
+    )
+  })
+
+  it("returns null on non-2xx without throwing", async () => {
+    vi.spyOn(global, "fetch").mockResolvedValue(
+      new Response("not found", { status: 404 }),
+    )
+
+    const result = await fetchAgentCard("http://localhost:3773")
+    expect(result).toBeNull()
+  })
+
+  it("returns null on malformed JSON body", async () => {
+    vi.spyOn(global, "fetch").mockResolvedValue(
+      new Response("not a json body, just a string", {
+        status: 200,
+        headers: { "content-type": "application/json" },
+      }),
+    )
+
+    const result = await fetchAgentCard("http://localhost:3773")
+    expect(result).toBeNull()
+  })
+
+  it("returns null when the body doesn't match the AgentCard schema", async () => {
+    vi.spyOn(global, "fetch").mockResolvedValue(
+      new Response(JSON.stringify({ wrong: "shape" }), {
+        status: 200,
+        headers: { "content-type": "application/json" },
+      }),
+    )
+
+    const result = await fetchAgentCard("http://localhost:3773")
+    expect(result).toBeNull()
+  })
+
+  it("returns null and caches the failure when fetch throws", async () => {
+    vi.spyOn(global, "fetch").mockRejectedValue(new Error("ECONNREFUSED"))
+
+    const result = await fetchAgentCard("http://localhost:3773")
+    expect(result).toBeNull()
+  })
+
+  it("caches successful results — second call does not re-fetch", async () => {
+    const fetchSpy = vi.spyOn(global, "fetch").mockResolvedValue(
+      new Response(JSON.stringify(VALID_CARD), {
+        status: 200,
+        headers: { "content-type": "application/json" },
+      }),
+    )
+
+    const first = await fetchAgentCard("http://localhost:3773")
+    const second = await fetchAgentCard("http://localhost:3773")
+
+    expect(first).toBe(second) // same cached reference
+    expect(fetchSpy).toHaveBeenCalledTimes(1)
+  })
+
+  it("caches failures too — second call does not re-fetch a known-bad peer", async () => {
+    const fetchSpy = vi.spyOn(global, "fetch").mockResolvedValue(
+      new Response("", { status: 404 }),
+    )
+
+    await fetchAgentCard("http://localhost:3773")
+    await fetchAgentCard("http://localhost:3773")
+
+    expect(fetchSpy).toHaveBeenCalledTimes(1)
+  })
+
+  it("caches per-URL — a different peer is fetched independently", async () => {
+    const fetchSpy = vi.spyOn(global, "fetch").mockResolvedValue(
+      new Response(JSON.stringify(VALID_CARD), { status: 200 }),
+    )
+
+    await fetchAgentCard("http://localhost:3773")
+    await fetchAgentCard("http://localhost:3775")
+
+    expect(fetchSpy).toHaveBeenCalledTimes(2)
+    const urls = fetchSpy.mock.calls.map((c) => c[0])
+    expect(urls).toContain("http://localhost:3773/.well-known/agent.json")
+    expect(urls).toContain("http://localhost:3775/.well-known/agent.json")
+  })
+
+  it("aborts on timeout", async () => {
+    // Simulate a fetch that never resolves — the helper's internal
+    // timeout should abort. We use a short timeout and expect null.
+    vi.spyOn(global, "fetch").mockImplementation(
+      (_url, init) =>
+        new Promise((_resolve, reject) => {
+          const signal = (init as { signal?: AbortSignal } | undefined)?.signal
+          signal?.addEventListener("abort", () => reject(new Error("aborted")), {
+            once: true,
+          })
+        }),
+    )
+
+    const result = await fetchAgentCard("http://localhost:3773", { timeoutMs: 50 })
+    expect(result).toBeNull()
+  })
+})
