feat: agent-optimized v0.2.0 — composite tools + verification + gating#5

Merged
milstan merged 1 commit into main from milstan/leadbay-agent-skills
Apr 21, 2026

milstan commented Apr 21, 2026

Summary

Adds the agent-facing composite-tool surface that lets Claude (or any MCP client) drive Leadbay end-to-end without needing to know about lens permissions, region routing, polling, selection state, or quota mechanics. Designed through a full /autoplan dual-voice review (CEO + Eng + DX, both Codex and Claude subagent voices) followed by live API exploration against production.

  • Ships 10 composite agent-skill tools (pull_leads, research_lead, bulk_qualify_leads, enrich_titles, adjust_audience, refine_prompt, answer_clarification, recall_ordered_titles, account_status, report_outreach) plus 28 new granular API tools, all gated by LEADBAY_MCP_WRITE=1 / LEADBAY_MCP_ADVANCED=1 (MCP) or exposeWrite / exposeGranular plugin config (OpenClaw).
  • report_outreach requires verification: {source: gmail_message_id|calendar_event_id|user_confirmed, ref} to prevent hallucinated outreach from poisoning the SDR pipeline (cross-phase critical from the review).
  • Client refactored with HTTP header capture, region+latency+retry_after _meta envelope, 429→QUOTA_EXCEEDED mapping (production behavior), 60s /me cache with invalidateMe() called by every cached-field mutator, selection Mutex, region auto-detect on login (us → fr fallback).
  • Tests: 89 unit (54+11+19 → 58+12+19), 10 live read-only smoke, plus end-to-end MCP and OpenClaw plugin live smokes, all passing 100% against the real backend.
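
The 60s /me cache with invalidateMe() can be pictured roughly as follows. This is a minimal sketch, not the code in this PR: the `MeCache` class, the `Me` fields, and the injectable clock are all hypothetical, and only the 60-second TTL and the `invalidateMe()` name come from the description above.

```typescript
// Hypothetical shape of the /me payload; the real fields are not shown in this PR.
interface Me {
  id: string;
  admin: boolean;
}

class MeCache {
  private pending: Promise<Me> | undefined;
  private fetchedAt = 0;
  private readonly fetchMe: () => Promise<Me>;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(
    fetchMe: () => Promise<Me>,
    ttlMs: number = 60_000, // 60s TTL, as stated in the summary
    now: () => number = () => Date.now(), // injectable clock, for tests only
  ) {
    this.fetchMe = fetchMe;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached lookup while it is fresh, otherwise starts a new one.
  get(): Promise<Me> {
    const expired = this.now() - this.fetchedAt >= this.ttlMs;
    if (this.pending !== undefined && !expired) {
      return this.pending;
    }
    const created = this.fetchMe();
    this.pending = created;
    this.fetchedAt = this.now();
    // Do not keep a failed lookup in the cache.
    created.catch(() => {
      if (this.pending === created) {
        this.pending = undefined;
      }
    });
    return created;
  }

  // Meant to be called by any write that mutates a field served from /me.
  invalidateMe(): void {
    this.pending = undefined;
  }
}
```

Caching the promise rather than the resolved value means concurrent callers share one in-flight request, which is one possible design choice among several.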

Test plan

  • pnpm test — 89 unit tests pass across all 3 packages
  • pnpm test:smoke with token — 10 live read-only smoke tests pass against api-us.leadbay.app
  • MCP live e2e (10 steps, 32 assertions): drives account_status → pull_leads → research_lead → recall_ordered_titles → report_outreach (dry_run + verification rejection) → get_lens_filter → list_sectors → get_quota against the real backend, and scans every response and log line for credential leaks
  • OpenClaw plugin live e2e (3 scenarios + 8 tool calls) — verifies default install hides write tools, exposeWrite=true exposes them, full live flow works
  • Contract test enforces every tool description contains "When to use" + "When NOT to use" sections
  • Manifest ↔ code parity test passes (50 declared tools all registered when exposeGranular+exposeWrite are on)
  • No credentials in any tracked file (.context/ and .claude/ are gitignored; the coincidental literal "Password1!" in a docstring was removed)
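
A manifest ↔ code parity check like the one in the test plan could look something like the sketch below. The function name `checkParity` and the report shape are hypothetical; the actual test in the repository may be structured differently.

```typescript
interface ParityReport {
  missingFromCode: string[]; // declared in the manifest but never registered
  missingFromManifest: string[]; // registered in code but not declared
}

// Compares the tool names declared in a manifest with those registered at runtime.
function checkParity(
  declared: readonly string[],
  registered: readonly string[],
): ParityReport {
  const declaredSet = new Set(declared);
  const registeredSet = new Set(registered);
  return {
    missingFromCode: declared.filter((name) => !registeredSet.has(name)),
    missingFromManifest: registered.filter((name) => !declaredSet.has(name)),
  };
}
```

A test would then assert that both arrays are empty when every gate is switched on.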

See packages/mcp/MIGRATION.md for the v0.1 → v0.2 upgrade guide.

🤖 Generated with Claude Code

Adds the agent-facing composite-tool surface that lets Claude (or any MCP
client) drive Leadbay end-to-end without the agent needing to know about
lens permissions, region routing, polling, or selection state. Designed
through a full /autoplan dual-voice review (CEO + Eng + DX) followed by
live API exploration; all 29 approved revisions are in this PR.

New composite tools (agent's default surface):
- pull_leads (replaces find_prospects; adds qualification_summary per lead)
- research_lead (qualification → signals → firmographics → contacts → engagement)
- bulk_qualify_leads (paginates past already-qualified, fan-out + poll, 429 mid-fanout)
- enrich_titles (selection-lifecycle managed, dry_run, 429 handling)
- adjust_audience (admin/non-admin auto-routing, sector free-text resolution, draft fallback)
- refine_prompt + answer_clarification (admin-gated, stale-clarification guard)
- recall_ordered_titles (preview-field path + live-aggregate fallback)
- account_status (quota + admin + intelligence state)
- report_outreach with mandatory verification (gmail_message_id | calendar_event_id | user_confirmed) — prevents pipeline poisoning
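
The mandatory verification on report_outreach can be sketched as a small validator. Only the three source names and the `{source, ref}` shape come from this PR; the function name `parseVerification`, the error messages, and the rule that `ref` must be non-empty for every source are assumptions made for illustration.

```typescript
// Accepted evidence sources, as listed in the PR description.
const VERIFICATION_SOURCES = [
  "gmail_message_id",
  "calendar_event_id",
  "user_confirmed",
] as const;
type VerificationSource = (typeof VERIFICATION_SOURCES)[number];

interface Verification {
  source: VerificationSource;
  ref: string;
}

function isVerificationSource(value: unknown): value is VerificationSource {
  return (
    typeof value === "string" &&
    (VERIFICATION_SOURCES as readonly string[]).includes(value)
  );
}

// Hypothetical validator: throws rather than letting an unverified report through.
function parseVerification(input: unknown): Verification {
  if (typeof input !== "object" || input === null) {
    throw new Error("verification is required");
  }
  const candidate = input as { source?: unknown; ref?: unknown };
  if (!isVerificationSource(candidate.source)) {
    throw new Error(
      "verification.source must be one of: " + VERIFICATION_SOURCES.join(", "),
    );
  }
  // Assumption: ref must be a non-empty string for every source.
  if (typeof candidate.ref !== "string" || candidate.ref.trim() === "") {
    throw new Error("verification.ref must be a non-empty string");
  }
  return { source: candidate.source, ref: candidate.ref };
}
```

Rejecting at the tool boundary keeps an agent from recording outreach it cannot point to evidence for, which is the pipeline-poisoning case the review flagged.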

New granular tools (28): lens filter/scoring/draft/promote/create/update/active,
selection select/deselect/clear/ids, sectors taxonomy, user_prompt CRUD,
clarifications get/pick/dismiss, epilogue set/remove + responses, prospecting actions,
notes read, web_fetch read, bulk-enrichment preview/launch.

Client refactor: HTTP header capture, _meta envelope (region + endpoint +
latency_ms + retry_after), 429→QUOTA_EXCEEDED mapping (was RATE_LIMITED),
60s /me cache with invalidateMe() called by every write tool that mutates
cached fields, selection Mutex (for concurrent enrich_titles), region
auto-detect on login (us → fr fallback).
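
As an illustration of the _meta envelope and the 429→QUOTA_EXCEEDED mapping, here is a minimal sketch. The `RawResponse` and `ClientResult` types, the `toResult` helper, and the `HTTP_<status>` fallback code are hypothetical; the _meta field names and the QUOTA_EXCEEDED code are the ones named above. It assumes header names are already lower-cased and handles only the delay-in-seconds form of Retry-After.

```typescript
interface Meta {
  region: string;
  endpoint: string;
  latency_ms: number;
  retry_after?: number; // seconds, only present when the header was sent
}

interface RawResponse {
  status: number;
  headers: Record<string, string>; // assumed to be lower-cased already
  body: unknown;
}

type ClientResult =
  | { ok: true; data: unknown; _meta: Meta }
  | { ok: false; error: { code: string; status: number }; _meta: Meta };

// Parses the delay-in-seconds form of Retry-After; the HTTP-date form is ignored here.
function parseRetryAfter(value: string | undefined): number | undefined {
  if (value === undefined || value.trim() === "") {
    return undefined;
  }
  const seconds = Number(value);
  return Number.isFinite(seconds) && seconds >= 0 ? seconds : undefined;
}

function toResult(
  raw: RawResponse,
  region: string,
  endpoint: string,
  latencyMs: number,
): ClientResult {
  const meta: Meta = { region, endpoint, latency_ms: latencyMs };
  const retryAfter = parseRetryAfter(raw.headers["retry-after"]);
  if (retryAfter !== undefined) {
    meta.retry_after = retryAfter;
  }
  if (raw.status >= 200 && raw.status < 300) {
    return { ok: true, data: raw.body, _meta: meta };
  }
  // 429 is reported as a quota problem rather than a generic rate limit.
  const code = raw.status === 429 ? "QUOTA_EXCEEDED" : "HTTP_" + raw.status;
  return { ok: false, error: { code, status: raw.status }, _meta: meta };
}
```

Attaching _meta to both success and failure results lets a calling agent decide whether to wait and retry without parsing error strings.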

Gating model:
- LEADBAY_MCP_WRITE=1 — exposes composite + granular write tools (off by default)
- LEADBAY_MCP_ADVANCED=1 — exposes granular API tools (off by default)
- OpenClaw plugin: exposeWrite + exposeGranular config flags (both off by default)
- leadbay_login still hidden from MCP (UC-3, prompt-injection vector)
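
The gating rules above could be expressed as a filter over tool definitions. This is a sketch under assumptions: the `ToolDef` shape and `exposedTools` function are hypothetical, the environment is passed in as a plain object rather than read from the process, and it assumes a granular write tool needs both flags, which the description does not state outright.

```typescript
interface ToolDef {
  name: string;
  write: boolean; // changes state on the backend
  granular: boolean; // low-level API tool rather than a composite
}

type Env = Record<string, string | undefined>;

// Returns the names of the tools an MCP client would see for a given environment.
function exposedTools(tools: readonly ToolDef[], env: Env): string[] {
  const allowWrite = env["LEADBAY_MCP_WRITE"] === "1";
  const allowGranular = env["LEADBAY_MCP_ADVANCED"] === "1";
  return tools
    .filter((tool) => tool.name !== "leadbay_login") // never exposed over MCP
    .filter((tool) => allowWrite || !tool.write)
    .filter((tool) => allowGranular || !tool.granular)
    .map((tool) => tool.name);
}

// Example classification; which tools count as write or granular is assumed here.
const sampleTools: ToolDef[] = [
  { name: "pull_leads", write: false, granular: false },
  { name: "report_outreach", write: true, granular: false },
  { name: "get_lens_filter", write: false, granular: true },
  { name: "hypothetical_granular_write", write: true, granular: true },
  { name: "leadbay_login", write: false, granular: false },
];
```

With an empty environment only the read-only composite remains visible, which matches the "off by default" behavior listed for both flags.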

Mock mode (LEADBAY_MOCK=1) reads fixtures from .context/leadbay-live-shapes/
for agent-author dry-running. dry_run param on every state-changing composite.

Tests: 89 unit (54→58 core, 11→12 leadclaw, 19 mcp), 10 live read-only smoke,
plus end-to-end MCP and OpenClaw plugin live smokes against the real backend.
Per-tool description style enforced ("When to use" + "When NOT to use" sections).
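
The description-style contract could be checked with something like the following sketch. The two section titles come from this PR; the `ToolDescription` shape and the `missingSections` and `findViolations` helpers are hypothetical names for illustration.

```typescript
interface ToolDescription {
  name: string;
  description: string;
}

// Section titles every tool description is expected to contain.
const REQUIRED_SECTIONS: readonly string[] = ["When to use", "When NOT to use"];

function missingSections(description: string): string[] {
  return REQUIRED_SECTIONS.filter((section) => !description.includes(section));
}

// Returns one human-readable line per tool that breaks the contract.
function findViolations(tools: readonly ToolDescription[]): string[] {
  return tools
    .map((tool) => ({ name: tool.name, missing: missingSections(tool.description) }))
    .filter((result) => result.missing.length > 0)
    .map((result) => result.name + ": missing " + result.missing.join(", "));
}
```

A contract test would assert that `findViolations` returns an empty array for the full tool list.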

Live-probe drift documented in .context/leadbay-live-shapes/SHAPE-DRIFT.md
(gitignored). Migration notes in packages/mcp/MIGRATION.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>