feat: Bindu Gateway + bug-tracking infrastructure by raahulrahl · Pull Request #463 · GetBindu/Bindu

raahulrahl · 2026-04-18T21:45:50Z

What's in this branch

Lands the Bindu Gateway (a new top-level service that sits in front of peer agents and orchestrates planner-driven tool calls against them) plus the bug-tracking infrastructure this repo now uses to track fixes across the whole codebase.

Twelve commits, split into three layers:

Bug-tracking infrastructure (docs only)

Commit	Purpose
`79be162`	Bootstrap `bugs/` folder — README, postmortem template, six dated postmortems for gateway bugs surfaced in the initial review
`26fbd6d`	Expand `known-issues.md` from the 13 initial gateway entries to 36, covering every unfixed item from the review
`1e26bd7`	Collocate `known-issues.md` under `bugs/` alongside postmortems — one folder answers both "what's broken today" and "what broke historically"

Gateway initial landing

Commit	Purpose
`0e651db`	Phase 0 dry-run + Phase 1 Days 1–9 — the gateway itself
`ae77e5d`	Day 10: mock Bindu agent helper, E2E integration test, README

Gateway follow-up fixes

Each of these corresponds to a postmortem under bugs/2026-04-18-*.md:

Commit	Fix
`484b6b8`	Isolate SSE events per /plan session — `sse-cross-contamination`
`9e49d97`	Terminate SSE reader fibers on request end — `spawnreader-fiber-leak`
`ad4f1b5`	Populate `remote_task_id` in audit rows
`bbb1474`	Preserve prior summary across compaction passes — `compaction-lossy-second-pass`
`77603da`	Land compaction cuts on turn boundaries — `compaction-mid-turn-cut`
`0655ac1`	Dedupe concurrent compactions per session — `compaction-concurrent-races`
`857197a`	Constant-time bearer token compare — `timing-unsafe-token-compare`

Downstream PRs that depend on this merge

Three fix branches are open against main that each cherry-picked the three bug-tracking infra commits (since bugs/ doesn't exist on main yet). When this PR lands, each downstream PR's diff will collapse by those three commits via git's cherry detection on rebase:

fix/task-ownership-idor — closes IDOR on task/context endpoints
fix/did-signature-fail-open — DID middleware fail-open
fix/types-populate-by-name — accept snake_case on input, camelCase on wire

Review notes

The six dated postmortems under bugs/ are documentation-only and pair with the seven fix commits in this branch. Useful reading-order: open the postmortem that matches a given commit's subject to see the root cause analysis alongside the fix.
bugs/README.md explains the schema for future postmortems.
The gateway lives entirely under gateway/ — no Python core behavior changes.

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Introduced Bindu Gateway service with POST /plan HTTP endpoint for AI-powered task planning across multiple agents
- Added Server-Sent Events (SSE) for real-time streaming of planning progress, task execution, and results
- Implemented persistent session management supporting multi-turn conversations and session resumption
- Enabled dynamic agent skill discovery and execution with remote peer communication
- Added identity verification using ed25519 signatures for secure agent authentication
Documentation
- Added comprehensive guides, architecture specifications, and phased roadmap for gateway development

Adds a new TypeScript/Bun workspace at `gateway/` — a task-first orchestrator that accepts { question, agents[], preferences? } from an external caller, plans the work with an LLM, calls downstream Bindu agents over A2A, and streams results back as SSE. Based on the plan at gateway/plans/ — calibrated against live Bindu agents via Phase 0 dry-run fixtures captured in scripts/dryrun-fixtures/. ## Phase 0 — Protocol dry-run (scripts/) - scripts/bindu-dryrun.ts: end-to-end polling client against a local Bindu echo agent. Captures AgentCard, DID Doc, Skills, Negotiation, message/send response, and terminal Task including signatures. - Fixtures in scripts/dryrun-fixtures/echo-agent/ drive Phase 1 Zod schemas and integration tests. ## Phase 1 — Gateway implementation (gateway/) Runtime: TypeScript on Bun/Node 22, Effect 4 beta, Hono 4.10, @supabase/supabase-js 2.58, AI SDK 6, @noble/ed25519 + bs58. ### What's fresh (Bindu-native) - bus/ typed event bus (Effect Service + PubSub) - config/ hierarchical config loader with env overrides - db/ Supabase adapter (sessions, messages, tasks) - auth/ keystore on disk for downstream credentials - permission/ wildcard ruleset evaluator - provider/ thin AI SDK wrapper (Anthropic, OpenAI) - tool/ Tool.define + scoped registry - skill/ .md + YAML frontmatter loader - agent/ agent.md loader + Agent.Info schema - session/ 9 files — message types, session service, streamText wrapper, THE LOOP, compaction, summary, overflow detection, revert - bindu/protocol/ Zod for Message, Part, Artifact, Task, HistoryMessage, AgentCard, DID Document, JSON-RPC envelope, BinduError with code classification (auth, schema-mismatch, etc.) - bindu/identity/ ed25519 bootstrap + verify + verifyArtifact + DID resolver with TTL cache - bindu/auth/ PeerAuth (none | bearer | bearer_env) → headers - bindu/client/ HTTP transport + message/send + tasks/get poll loop with camelCase-first + -32700/-32602 flip retry + signature verification when trust.verifyDID - bindu/index.ts barrel (imports identity first to trigger bootstrap) - planner/ agent catalog → dynamic tools, orchestrates SessionPrompt.prompt with compactIfNeeded hook - api/plan-route.ts POST /plan — bearer auth, Zod request validation, SSE emitter for session/plan/task.*/final/done - server/ Hono shell + /health - index.ts Layer graph (Config → DB/Provider/Agent → Session → SessionPrompt/SessionCompaction → Planner) + ManagedRuntime boot ### What's copied from OpenCode (trimmed, vendored @opencode-ai/shared) - effect/, util/, id/, global/, _shared/ — Effect runtime glue, logger, filesystem helpers, ID generators, XDG paths, error types. ~3400 lines of generic infra we don't need to re-derive. ### Migrations - 001_init.sql: gateway_sessions, gateway_messages, gateway_tasks with RLS (service-role bypass) - 002_compaction_revert.sql: compacted/reverted flags on messages and tasks, compaction_summary on sessions, partial indexes for active-row lookups ### Tests Three test files, 20 tests total, all passing: - tests/bindu/protocol.test.ts (12): fixture parsing, casing normalize, DID parse, error code classification - tests/bindu/identity.test.ts (4): REAL signature verification against Phase 0 echo agent artifact, tamper detection - tests/bindu/poll.test.ts (4): mock-fetch polling scenarios (submitted → working → completed, -32700 casing flip, input-required needsAction, -32013 InsufficientPermissions) ## Plan documents (gateway/plans/) - PLAN.md: master plan — architecture, protocol wire spec, config schema, fork-and-extract plan, risks - phase-0..5 detail files: preconditions, work breakdown, code sketches, test plans, phase-specific risks, exit gates - README.md: index ## What's not done yet (future commits) - Day 10: E2E tests + demo docker-compose + README top-level - Phase 2: reconnect, tenancy/RLS enforcement, circuit breakers, rate limits, observability - Phase 3: inbound Bindu server + DID signing + mTLS - Phase 4: registry + trust scoring + cycle limits - Phase 5: payments, negotiation orchestrator, push notifications ## Statistics - 128 files, 16,504 insertions - src/ = ~8700 lines TypeScript, tsc --noEmit green - 20 tests passing (vitest) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…t + README Wraps Phase 1. The gateway now has a runnable quickstart, a CI-friendly integration test against an in-process mock Bindu agent, and a project README for onboarding. ## Added - tests/helpers/mock-bindu-agent.ts — in-process HTTP server implementing the minimum Bindu A2A wire surface (.well-known/agent.json, message/send, tasks/get, tasks/cancel). Configurable respond() function; binds a random port per invocation. - tests/integration/bindu-client-e2e.test.ts — 3 tests that spin up the mock agent and exercise sendAndPoll end-to-end. Covers: - message/send → tasks/get round-trip yields the expected artifact - respond() transform runs server-side (uppercase) - snake_case context_id on the wire normalizes to camelCase contextId on the parsed Task (Phase 0 finding validated in CI) - gateway/README.md — quickstart, prerequisites, Supabase migration steps, architecture overview, test matrix, repo layout, license note. ## Totals - 23/23 tests passing (vitest) - tsc --noEmit green - src/ = ~8700 lines TypeScript Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Concurrent /plan requests shared the global event bus with no filter, so subscribers in one request's SSE stream received text.delta, task.started, task.artifact and final frames from every other in-flight plan. In multi-tenant deployments this was a cross-tenant information disclosure. Split Planner.startPlan into prepareSession + runPlan so the /plan handler learns sessionID BEFORE opening the SSE stream. Every bus.subscribe() is then piped through Stream.filter((e) => e.properties.sessionID === ...), so each request only ever sees its own session's frames. The session row is now emitted as the first SSE event (previously last), letting clients correlate every subsequent frame from the start. Adds tests/api/plan-route-filter.test.ts — two concurrent subscribers with different session IDs; each must see only its own deltas. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

spawnReader in plan-route.ts called Stream.runForEach on an infinite PubSub-backed stream with no termination condition. The `ac.signal.aborted` guard inside the callback only suppressed SSE writes; the underlying fiber kept pulling events from the PubSub forever. Each /plan request leaked five such fibers plus five PubSub subscriptions, which accumulated linearly with request volume. Introduce an `abortEffect(signal)` helper that converts an AbortSignal into an Effect that resolves when the signal fires (via `Effect.callback`, the Effect 4.0 replacement for `Effect.async`). Pipe every reader stream through `Stream.interruptWhen(abortEffect(signal))` so the fiber terminates cleanly when the handler's `finally { ac.abort() }` runs — releasing the PubSub subscription and freeing closure-captured state. Drops the prior 100ms setTimeout flush hack from the success path; the interrupt now gates the lifecycle deterministically. Extends tests/api/plan-route-filter.test.ts with a new case that forks a reader, publishes an event, aborts the signal, and awaits the fiber. If interruptWhen is broken, the await hangs and Vitest fails on timeout. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

gateway_tasks.remote_task_id (and its index) were always NULL. The column exists to correlate the gateway's internal task row with the peer-assigned task id returned by the downstream Bindu agent — the ID that appears in the peer's own logs and is required for tasks/cancel, resume, or any cross-system debugging. recordTask runs BEFORE the peer has issued an id, so the column stays NULL at insert time. finishTask runs AFTER the peer has responded and has the id in outcome.task.id, but the interface had no field for it — so the update never wrote it through. Adds `remoteTaskId?: string` to FinishTaskInput, writes it into the update patch when supplied, and captures `outcome.task.id` in the planner tool path so every successful Bindu call produces an audit row keyed to the peer's task id. Typecheck-only verification; this is pure plumbing — the interface change guarantees the field flows end-to-end at compile time. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Session compaction overwrote gateway_sessions.compaction_summary on every run with only the summary of the newly-added messages. Run #1 summarized turns 1–N into paragraph A; run #2 summarized turns N+1–M into paragraph B, which REPLACED A wholesale. Any load-bearing fact captured in A was permanently lost — long sessions progressively forgot early context (user's original goal, early agent results, translations, pinned facts). An additional latent bug: session.history() prepends the prior summary as a synthetic user-message with a freshly-minted UUID. On pass #2 that synthetic would land in `head` and be re-summarized as part of the body, paraphrase-of-a-paraphrase style; and the subsequent `UPDATE ... WHERE id IN (head_ids)` was a silent no-op for the synthetic id. Fix: - summarize() grows an optional `priorSummary?: string | null`. When present, it's injected as a leading user message tagged `[PRIOR SUMMARY — preserve every fact below]`. - The system prompt gains an explicit fact-preservation clause and "new summary must be a SUPERSET of the prior summary" language. - Closing instruction switches to a union-with-prior variant when a non-empty prior summary is present. - Whitespace-only prior summary is treated as absent (single `hasPrior` flag gates both the marker block and the closing prompt — first regression test caught this edge case). compaction.runCompaction: - Filters synthetic messages out of history before splitHead, so the no-op UPDATE path is gone and the prior summary is not rewritten as part of head. - Reads compaction_summary directly from the session row before summarizing and passes it as priorSummary. - No-ops cleanly when there's nothing new to fold in, avoiding redundant LLM calls that would just re-paraphrase the same content. Overwriting the column is now safe because the new summary is constructed as a SUPERSET of the old one. Adds tests/session/summary.test.ts — three cases covering marker injection, closing-prompt variant selection, and whitespace handling. Verified the "with priorSummary" test catches the regression by temporarily reverting summary.ts and re-running. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

splitHead did a raw `history.slice(0, history.length - keepTail)`. A single planner turn (user → assistant-with-tool_use → tool_result → ... → final-assistant) can span far more than 4 messages — three-tool turns run 8 messages, ten-tool turns run 22. The naive cut routinely landed INSIDE a turn, stranding an assistant tool_use in `head` whose matching tool_result was kept verbatim in `tail`. On the very next model call the provider (Anthropic, OpenAI) rejected the request with "tool_use / tool_result mismatch" — a 400 error that the planner cannot retry its way out of, because the DB state is already broken (head rows are flagged compacted=true). The session was dead until someone manually cleared the flag. Fix: walk LEFT from the naive cut until the message at the split point is a user turn. Since a user message starts a new turn by definition, the invariant is that every assistant tool_use is in the same half as its tool_result. `keepTail` becomes a MINIMUM — we keep more in tail to reach a safe boundary, never fewer. If no user message exists left of the naive cut (entire history is one unbroken turn), we bail with {head: [], tail: history} rather than break the pairing. Adds tests/session/compaction-split.test.ts — five cases covering tool-heavy turns near the tail, single-unbroken-turn histories, and the general keepTail invariant. Verified regression-catching: with the old raw-slice algorithm restored, two cases fail (mid-turn cut and tool-pair-terminal turn). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two concurrent /plan requests on the same session_id both triggered compactIfNeeded. Each read the same pre-compaction history, each called the summarizer LLM (doubling cost), and each UPDATE'd gateway_sessions.compaction_summary — last writer wins. Because LLMs are non-deterministic even at low temperature, the two summaries diverged and whichever paragraph lost the race silently dropped its facts from session state. The head-row UPDATE (`SET compacted=true`) is idempotent so that part was harmless, but the summary-column race was a real data-loss path. Fix: application-layer promise dedupe. A per-process Map<SessionID, Promise<CompactOutcome>> records the in-flight compaction for each session. Second caller finds the existing entry and awaits THE SAME promise — no second LLM call, no second UPDATE, both callers receive the identical CompactOutcome. The map entry is cleared in a finally block, so a resolved (or failed) compaction does not block the next one. Limitation: this is per-process state. A horizontally-scaled gateway fronting a single Supabase could still race across processes. Noted in code comment; Phase 2 can add a Postgres version column or stored-proc-wrapped compaction for cross-process safety. Single-process Phase 1 is correct today. Adds tests/session/compaction-dedupe.test.ts — four cases covering the happy path (same promise reused), post-settle behavior (next call kicks off fresh producer), per-session isolation (different keys don't share), and error-path recovery (rejected promise clears the entry so retry works). Verified the happy-path test catches the regression by disabling the map lookup and re-running. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

plan-route.ts authenticated incoming requests with `authConfig.tokens.includes(token)`, which compares strings byte-by-byte via `===` and short-circuits on the first mismatch. The time difference between "first byte matched" and "first byte didn't" is observable over the network; with enough samples an attacker can recover a bearer token byte-by-byte. Iterating the tokens array with a short-circuiting match additionally leaks which token in the list was a prefix of the guess. Replace with a constant-time validator: 1. SHA-256 both the provided token and each configured token. Hashing normalizes inputs to 32 bytes — removes the length leak and lets timingSafeEqual run without throwing on unequal-length buffers. 2. Run timingSafeEqual against EVERY entry even after a match. Total time becomes O(tokens.length), independent of which token matched or whether any did. 3. OR the results into a single boolean at the end. Exported from plan-route.ts as validateBearerToken so it can be tested directly. The call site in handleRequest() replaces the `includes` check — no behavior change for valid inputs, no timing leak for invalid ones. Adds tests/api/bearer-token.test.ts — six cases covering: single-token match, unknown rejection, empty config (no default-accept), tokens of vastly different lengths (length not leaked), exact-match semantics (no prefix/suffix/case hits), and a loose timing-variance check that runs 10k iterations each of a "byte-0 match" and a "byte-0 mismatch" guess and asserts their ratio stays under 3x. The old includes() would fail that last test because character-by-character compare amplifies the byte-depth difference over thousands of iterations. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Establishes a three-way split for tracking bugs across the repo: - GitHub Issues: volatile, where triage and status live. Source of truth for IS THIS FIXED YET. - bugs/*.md: durable postmortems for bugs that taught us something about a class of failure. One file per bug, indefinite retention. Required template with Symptom / Root cause / Fix sections; Why-tests-missed and Class-of-bug sections strongly encouraged (not CI-enforced yet — medium strictness by intent). - docs/known-issues.md: user-facing heads-up for current limitations. Entries are REMOVED as issues close; this file grows only for things that aren't planned to be fixed soon. Adds bugs/README.md with the template and inclusion rules. Seeds bugs/ with six postmortems for the critical and security bugs resolved in commits 484b6b8 through 857197a (the gateway code-review pass): - sse-cross-contamination: bus.subscribe without tenancy filter - spawnreader-fiber-leak: Stream.runForEach on infinite stream, AbortSignal check inside callback only skipped writes - compaction-lossy-second-pass: overwriting a lossy-compressed column compounds loss; must merge-then-write - compaction-mid-turn-cut: raw-index slice on a message list with semantic (turn) boundaries broke tool_use/tool_result pairing - compaction-concurrent-races: non-idempotent UPDATE on a shared row had no dedupe; LLM non-determinism made the race silent - timing-unsafe-token-compare: .includes() on secrets short- circuits on first mismatch, recoverable byte-by-byte via timing Skipped a postmortem for the remote_task_id fix (commit ad4f1b5) — pure plumbing with no generalizable lesson beyond the commit message. Seeds docs/known-issues.md with thirteen still-open limitations surfaced by the same review pass but not yet fixed (context-window hardcoded, abort-signal propagation to the Bindu client, permission rules not enforced for tool calls, tool-name collisions, agent- catalog overwrite, signature verification semantics, pagination truncation, TTL cleanup, rate limiting, token estimation accuracy, DID resolver stampede, bearer-env error collapse, and the known single-process-only limitation of the compaction-dedupe fix). Structure is repo-wide by intent: bugs/ sits at the top level with `area:` frontmatter tagging the subsystem (gateway/api, gateway/ session, bindu/core, sdks/typescript). docs/known-issues.md has per-subsystem sections. Single archive across Python core, gateway, SDKs, and frontend. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Expands docs/known-issues.md from 13 to 36 gateway entries, covering every unfixed item surfaced in the original code review — not just the subset seeded with the initial commit. Additions by severity: high (new): - poll-budget-unbounded-wall-clock: sendAndPoll's 60 × 10s backoff means one hung peer stalls a tool call for up to 5 min with no overall plan deadline. - no-session-concurrency-guard: two /plan requests on the same session_id interleave writes to gateway_messages; compaction dedupe doesn't cover plan turns themselves. medium (new): - tool-input-sent-as-textpart: JSON.stringify(args) goes as TextPart; skills expecting DataPart fail silently or partially. - prompt-injection-scrubbing-theater: the regex strip of "ignore previous" etc. is trivially bypassable and gives false confidence; real defenses listed in the workaround. - did-resolver-no-key-id-selection: primaryPublicKeyBase58 picks first key, wrong during rotation windows. - no-graceful-shutdown: httpServer.close + runtime.dispose drops in-flight SSE streams mid-frame; no drain, no deadline. - assistant-message-lost-on-stream-error: LLM stream errors drop the partially-completed assistant row even when tool calls have already been billed. gateway_tasks and gateway_messages drift. - json-schema-to-zod-incomplete: enum, oneOf, pattern, numeric bounds, additionalProperties are ignored — planner LLM gets no signal about valid values. medium (expanded): - tool-name-collisions-silent: added note about parseAgentFromTool's non-greedy regex mis-parsing agent names that contain underscores (breaks the task.started SSE event). low (new): - resume-race-duplicate-session: concurrent first-request on a fresh session_id hits the UNIQUE constraint on the second insert; caller sees 500, retry resolves. - cancel-casing-not-retried: poll-exhaust cancel uses camelCase only; peers requiring snake_case get a silent leak. - health-endpoint-no-dependency-probe: /health returns 200 even if Supabase / provider are down. - no-request-id-in-logs: no correlation ID; client/server log joins rely on timestamp + peer URL. - no-config-hot-reload: changes to agents/planner.md or config require a full restart. - resolve-env-limited-to-simple-var: only bare $VAR matches, not ${VAR}/suffix or default-value syntax. - compaction-summary-injected-as-user-role: synthetic message uses user role; system role (or tagged marker) would be safer. - revert-millisecond-ties-nondeterministic: created_at boundary ambiguity under sub-ms insertions. - revert-doesnt-cancel-remote-tasks: local-only revert; peer-side tasks continue consuming resources. - empty-agents-catalog-no-400: agents: [] default accepted; LLM attempts phantom tool calls instead of a clear 400. - no-migration-rollback: migrations are forward-only; paired down.sql does not exist. nit (new): - tasks-recorded-is-dead-state: planner populates an array that's never returned or persisted. - map-finish-reason-pointless-ternary: conditional type is always `any`; simplify in a cleanup pass. - db-effect-promise-swallows-errors: two paths in db/index.ts use Effect.promise which treats rejection as defect, silently resolving. Correctness-adjacent. - test-coverage-gaps: consolidated entry enumerating the missing end-to-end cases (concurrent plans, long-session compaction, revert, non-English, >1000-row sessions, aborted requests, snake_case cancel retry). Not added (already fixed in prior commits): - unused BusEvent import in plan-route.ts (fixed in 484b6b8) - setTimeout(100) flush hack in plan-route.ts (fixed in 9e49d97) Organizational change: entries are now grouped by severity header (High / Medium / Low / Nits) within each subsystem, with a leading note explaining the ordering convention. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Collocates the two bug-tracking artifacts under a single top-level folder: bugs/ ├── README.md (explains both formats) ├── known-issues.md (user-facing, current limitations) ├── 2026-04-18-sse-cross-contamination.md ├── 2026-04-18-spawnreader-fiber-leak.md └── ... (dated postmortems, fixed bugs) Rationale: `bugs/` and `docs/known-issues.md` were answering two closely-related questions ("what's broken today" / "what broke historically"); keeping them in separate folders meant readers had to discover the split and maintainers had to update cross-references every time. One folder, one README, one place to look. File naming distinguishes them at a glance: dated `YYYY-MM-DD-*.md` files are postmortems for FIXED bugs with indefinite retention; `known-issues.md` is a single living file of CURRENT limitations whose entries are REMOVED as the underlying issues get fixed. Changes: - git mv docs/known-issues.md → bugs/known-issues.md (rename preserves history). - Rewrote bugs/README.md intro to describe both artifacts and the two questions they answer; updated the "Relationship to other tracking" cross-reference to use a sibling path. - Updated three postmortem files (compaction-concurrent-races, compaction-lossy-second-pass, compaction-mid-turn-cut) that referenced "docs/known-issues.md" in prose to use the new sibling path. - Fixed three internal links inside known-issues.md that pointed at "../bugs/" (now "./" since it lives in the same folder). No content change in known-issues.md or any postmortem — pure reorganization + link fixups. Tested with `grep -rn "known-issues" bugs/` to confirm no stale paths remain. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-04-18T21:46:08Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a0ee166e-03a4-4ea8-b247-b1fac3a92c97

📥 Commits

Reviewing files that changed from the base of the PR and between a9c4dc3 and a28bf8f.

⛔ Files ignored due to path filters (2)

gateway/package-lock.json is excluded by !**/package-lock.json
scripts/package-lock.json is excluded by !**/package-lock.json

📒 Files selected for processing (134)

gateway/.env.example
gateway/.gitignore
gateway/README.md
gateway/agents/planner.md
gateway/migrations/001_init.sql
gateway/migrations/002_compaction_revert.sql
gateway/package.json
gateway/plans/PLAN.md
gateway/plans/README.md
gateway/plans/phase-0-dryrun.md
gateway/plans/phase-1-mvp.md
gateway/plans/phase-2-production.md
gateway/plans/phase-3-inbound.md
gateway/plans/phase-4-public-network.md
gateway/plans/phase-5-opportunistic.md
gateway/src/_shared/filesystem.ts
gateway/src/_shared/global.ts
gateway/src/_shared/npm.ts
gateway/src/_shared/types.d.ts
gateway/src/_shared/util/array.ts
gateway/src/_shared/util/binary.ts
gateway/src/_shared/util/effect-flock.ts
gateway/src/_shared/util/encode.ts
gateway/src/_shared/util/error.ts
gateway/src/_shared/util/flock.ts
gateway/src/_shared/util/fn.ts
gateway/src/_shared/util/glob.ts
gateway/src/_shared/util/hash.ts
gateway/src/_shared/util/identifier.ts
gateway/src/_shared/util/iife.ts
gateway/src/_shared/util/lazy.ts
gateway/src/_shared/util/module.ts
gateway/src/_shared/util/path.ts
gateway/src/_shared/util/retry.ts
gateway/src/_shared/util/slug.ts
gateway/src/agent/index.ts
gateway/src/api/plan-route.ts
gateway/src/auth/index.ts
gateway/src/bindu/auth/resolver.ts
gateway/src/bindu/client/fetch.ts
gateway/src/bindu/client/index.ts
gateway/src/bindu/client/poll.ts
gateway/src/bindu/identity/bootstrap.ts
gateway/src/bindu/identity/index.ts
gateway/src/bindu/identity/resolve.ts
gateway/src/bindu/identity/verify.ts
gateway/src/bindu/index.ts
gateway/src/bindu/protocol/agent-card.ts
gateway/src/bindu/protocol/identity.ts
gateway/src/bindu/protocol/index.ts
gateway/src/bindu/protocol/jsonrpc.ts
gateway/src/bindu/protocol/normalize.ts
gateway/src/bindu/protocol/types.ts
gateway/src/bus/bus-event.ts
gateway/src/bus/index.ts
gateway/src/config/index.ts
gateway/src/config/loader.ts
gateway/src/config/schema.ts
gateway/src/db/index.ts
gateway/src/db/types.ts
gateway/src/effect/index.ts
gateway/src/effect/instance-registry.ts
gateway/src/effect/logger.ts
gateway/src/effect/runner.ts
gateway/src/global/index.ts
gateway/src/id/id.ts
gateway/src/index.ts
gateway/src/permission/index.ts
gateway/src/planner/index.ts
gateway/src/provider/index.ts
gateway/src/server/index.ts
gateway/src/session/compaction.ts
gateway/src/session/index.ts
gateway/src/session/llm.ts
gateway/src/session/message.ts
gateway/src/session/overflow.ts
gateway/src/session/prompt.ts
gateway/src/session/revert.ts
gateway/src/session/schema.ts
gateway/src/session/summary.ts
gateway/src/skill/index.ts
gateway/src/tool/registry.ts
gateway/src/tool/tool.ts
gateway/src/util/abort.ts
gateway/src/util/archive.ts
gateway/src/util/color.ts
gateway/src/util/data-url.ts
gateway/src/util/defer.ts
gateway/src/util/effect-http-client.ts
gateway/src/util/effect-zod.ts
gateway/src/util/error.ts
gateway/src/util/filesystem.ts
gateway/src/util/fn.ts
gateway/src/util/format.ts
gateway/src/util/iife.ts
gateway/src/util/index.ts
gateway/src/util/lazy.ts
gateway/src/util/local-context.ts
gateway/src/util/locale.ts
gateway/src/util/lock.ts
gateway/src/util/log.ts
gateway/src/util/process.ts
gateway/src/util/queue.ts
gateway/src/util/record.ts
gateway/src/util/schema.ts
gateway/src/util/scrap.ts
gateway/src/util/signal.ts
gateway/src/util/timeout.ts
gateway/src/util/token.ts
gateway/src/util/update-schema.ts
gateway/src/util/wildcard.ts
gateway/tests/api/bearer-token.test.ts
gateway/tests/api/plan-route-filter.test.ts
gateway/tests/bindu/identity.test.ts
gateway/tests/bindu/poll.test.ts
gateway/tests/bindu/protocol.test.ts
gateway/tests/helpers/mock-bindu-agent.ts
gateway/tests/integration/bindu-client-e2e.test.ts
gateway/tests/session/compaction-dedupe.test.ts
gateway/tests/session/compaction-split.test.ts
gateway/tests/session/summary.test.ts
gateway/tsconfig.json
gateway/vitest.config.ts
scripts/.gitignore
scripts/bindu-dryrun.ts
scripts/dryrun-fixtures/echo-agent/NOTES.md
scripts/dryrun-fixtures/echo-agent/agent-card.json
scripts/dryrun-fixtures/echo-agent/did-doc.json
scripts/dryrun-fixtures/echo-agent/final-task.json
scripts/dryrun-fixtures/echo-agent/negotiation.json
scripts/dryrun-fixtures/echo-agent/skill-question-answering-v1.json
scripts/dryrun-fixtures/echo-agent/skills.json
scripts/dryrun-fixtures/echo-agent/submit-response.json
scripts/package.json

📝 Walkthrough

Walkthrough

The PR introduces a complete Bindu Gateway service—a TypeScript/Bun HTTP server that accepts user questions, orchestrates multi-agent task execution via an LLM-based planner, persists conversation state in Supabase, and streams results back via Server-Sent Events. Implementation spans database schemas, Bindu protocol handling with signature verification, session management including compaction and revert, dynamic tool registration, LLM streaming integration, and comprehensive infrastructure.

Changes

Cohort / File(s)	Summary
Core Setup `gateway/.env.example`, `gateway/.gitignore`, `gateway/package.json`, `gateway/tsconfig.json`, `gateway/vitest.config.ts`	Environment and build configuration with Supabase/API key placeholders, Node ≥22 requirement, TypeScript strict mode, Vitest setup with env file loading.
Documentation `gateway/README.md`, `gateway/agents/planner.md`, `gateway/plans/*.md`	Comprehensive operational and design documentation: MVP quickstart, Anthropic Claude planner config, and detailed phase roadmap (0–5) covering protocol dry-run, MVP, production hardening, inbound exposure, discovery/trust, and opportunistic features.
Database & Migrations `gateway/migrations/001_init.sql`, `gateway/migrations/002_compaction_revert.sql`	Supabase schema: `gateway_sessions`, `gateway_messages`, `gateway_tasks` tables with indexes, RLS-enabled; Phase 2 additions for compaction state and message/task revert tracking.
API & Server `gateway/src/server/index.ts`, `gateway/src/api/plan-route.ts`	Hono app factory with `/health` route and SSE-streaming `POST /plan` handler that validates bearer auth, manages sessions, runs the planner, filters bus events by session, and handles abort signals.
Bindu Protocol `gateway/src/bindu/protocol/*.ts`	Zod-based schemas and parsers: `AgentCard`, `Task`, `Message`, `Part`, `Artifact`, `DIDDocument`, `SkillDetail`; JSON-RPC 2.0 request/response; error classification (schema mismatch, auth, retryable); camelCase/snake_case normalization layer.
Bindu Identity & Auth `gateway/src/bindu/identity/*.ts`, `gateway/src/bindu/auth/resolver.ts`	Ed25519 signature verification with bootstrap, DID document resolution with caching, DID parsing (`did:bindu`, `did:key`), authentication header resolution from config.
Bindu Client & Polling `gateway/src/bindu/client/fetch.ts`, `gateway/src/bindu/client/poll.ts`, `gateway/src/bindu/client/index.ts`	HTTP RPC transport with timeout/abort, polling loop with adaptive casing retry on schema-mismatch, task completion detection, signature verification, peer cancellation support.
Configuration `gateway/src/config/schema.ts`, `gateway/src/config/loader.ts`, `gateway/src/config/index.ts`	Zod schemas for server/auth/session/Supabase/limits/agents/permissions; environment and JSON file loading with `$VAR` substitution and override merging.
Database Service `gateway/src/db/types.ts`, `gateway/src/db/index.ts`	Session/message/task CRUD operations backed by Supabase client; row type interfaces; task state enums and terminal-state checks.
Session Management `gateway/src/session/*.ts`	Multi-file session subsystem: schema (branded UUIDs), message model with text/file/tool parts, LLM streaming wrapper, prompt loop with system prompt building and tool execution, compaction with turn-boundary safety, revert/rollback, overflow detection and token accounting, summary generation.
Planning & Tool System `gateway/src/planner/index.ts`, `gateway/src/tool/*.ts`, `gateway/src/skill/index.ts`	Session prompt orchestration with dynamic tool registration per agent skill; tool context and execution result wrapping in `<remote_content>` to prevent injection; skill markdown parsing with YAML frontmatter; tool registry with scoped lifecycle.
Agent & Permission Management `gateway/src/agent/index.ts`, `gateway/src/permission/index.ts`, `gateway/src/auth/index.ts`	Agent loading from markdown with config overlay; wildcard-based permission evaluation (allow/deny/ask); persistent credential store (bearer/API key/OAuth/DID/mTLS) with JSON file backing.
Event Bus `gateway/src/bus/*.ts`	Typed event bus with per-session filtering, PubSub channels, and stream subscribers for prompt lifecycle events.
Effect Infrastructure `gateway/src/effect/*.ts`	Logger wrapper with Effect context, runner state machine for async task coordination, instance registry for cleanup.
Provider & LLM `gateway/src/provider/index.ts`	Model ID parsing, Anthropic/OpenAI factory with optional custom API keys/base URLs, Effect-based service layer.
Shared Utilities `gateway/src/_shared/.ts`, `gateway/src/util/.ts`	Filesystem operations, npm package management with locking, global path computation, encode/hash helpers, binary search, lazy evaluation, glob/path utilities, error formatting, Effect HTTP client wrapping, Zod schema conversion.
Filesystem & Global State `gateway/src/global/index.ts`	XDG-based directory paths (data, cache, config, state) with test home override, cache versioning, flock global state initialization.
ID Generation `gateway/src/id/id.ts`	Prefix-based monotonic IDs with ascending/descending support, timestamp extraction.
Index & Entry Points `gateway/src/index.ts`, `gateway/src/bindu/index.ts`, `gateway/src/**/index.ts`	Barrel modules wiring Effect layers, composing service dependencies, re-exporting subsystems.
Tests `gateway/tests/*/.test.ts`, `gateway/tests/helpers/mock-bindu-agent.ts`	Unit tests for bearer token auth, SSE event filtering, signature verification; integration tests for polling against mock agent; regression tests for compaction turn boundaries, summary preservation; in-process mock Bindu agent HTTP server.
Phase 0 Dry-Run `scripts/bindu-dryrun.ts`, `scripts/package.json`, `scripts/.gitignore`	Protocol validation script fetching agent card, DID doc, skills, negotiation; submitting message and polling with adaptive casing; signature verification; anomaly detection and reporting.
Dry-Run Fixtures `scripts/dryrun-fixtures/echo-agent/*`	Captured wire payloads: agent card JSON, DID document, skill definitions, task lifecycle (submit/poll response), negotiation payload, and protocol findings (NOTES.md).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

🐰 A gateway born of code and dreams so bright,
Where agents dance and plans take flight.
Sessions store their tales with care,
Compaction's logic—turn-aware!
hops with glee 🌟

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch ft-gateway

Post-merge CI on main failed on two pre-commit hooks after #463 landed: 1. detect-secrets: 4 high-entropy strings in gateway dry-run fixtures (echo-agent/NOTES.md, did-doc.json, final-task.json) and one hex expectation in gateway/tests/bindu/protocol.test.ts not present in .secrets.baseline. Plus a fifth flag on a docstring example URL in scripts/backfill_owner_did.py. 2. pydocstyle on scripts/backfill_owner_did.py: D301 (docstring contained literal backslashes without r"" prefix) and D103 (main() missing a docstring). Fixes: * .secrets.baseline — re-run `detect-secrets scan --baseline .secrets.baseline` to pick up the four new fixture/test locations. Entries land as unaudited (is_secret unset), same shape as the existing 22 baseline entries, so the pre-commit hook passes; the pre-push verify-secrets-audited hook is orthogonal and was not in the failing CI step. * scripts/backfill_owner_did.py — docstring made raw (``r"""``) so the ``\`` line-continuations in the usage examples no longer need doubling; backslashes now render as actual shell line continuations in ``--help`` output. main() gains a one-line docstring. The placeholder URL in the docstring keeps a ``# pragma: allowlist secret`` inline so it does not need a second baseline entry. Local checks: * detect-secrets-hook --baseline → exit 0 on the flagged files * pydocstyle scripts/backfill_owner_did.py → exit 0 * pytest tests/unit/ → 795 passed, 3 skipped Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

raahulrahl and others added 12 commits April 17, 2026 22:15

Merge branch 'main' into ft-gateway

a28bf8f

raahulrahl merged commit dd25bf7 into main Apr 18, 2026
1 of 3 checks passed

raahulrahl mentioned this pull request Apr 19, 2026

fix(ci): unblock main — allowlist gateway fixtures + pydocstyle in backfill #464

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Bindu Gateway + bug-tracking infrastructure#463

feat: Bindu Gateway + bug-tracking infrastructure#463
raahulrahl merged 13 commits intomainfrom
ft-gateway

raahulrahl commented Apr 18, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Apr 18, 2026 •

edited

Loading

Review failed

Walkthrough

Changes

Estimated code review effort

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

raahulrahl commented Apr 18, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What's in this branch

Bug-tracking infrastructure (docs only)

Gateway initial landing

Gateway follow-up fixes

Downstream PRs that depend on this merge

Review notes

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Changes

Estimated code review effort

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

raahulrahl commented Apr 18, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 18, 2026 •

edited

Loading