refactor(examples/pix-nfse-skeleton): raise bar to match installment-negotiation example #41

Merged: dangazineu merged 1 commit into main from refactor/raise-bar-pix-nfse-skeleton on May 19, 2026

Raises the existing Pix + NFS-e walking skeleton to the bar set by #34 (whatsapp-installment-negotiation). That PR established a set of OSS-demo conventions — exact MCP pins, deep per-call assertions, and a README shape that walks a copying customer from the hook → why-an-agent → run paths → per-turn acceptance criteria → known gaps. This PR aligns the walking-skeleton example with the assertions-and-README parts of that bar.

This demo is intentionally not an agent-thesis demo (no LLM, no aimock, no multi-turn session.send()), so the W2 items that target the aimock / fixture / iterations machinery do not apply here. This PR addresses the parts of the bar that do apply: exact MCP pins, deep per-call assertions, and a README structure with regex literals and exact values.

What changed

Skeleton test depth

Promoted count-and-shape checks into per-call arg + output assertions, mirroring the installment-negotiation pattern:

  • Step 1 (asaas/create_customer) — data.id must match /^cus_demo_/ (was: only result.results[0].success === true).
  • Step 2 (asaas/create_payment) — data.id matches /^pay_demo_/ (tightened from /^pay_/), data.billingType === "PIX", data.value === 150.0, and data.customer is a string (proves step 1's id round-tripped through the bridge).
  • Step 3 (asaas/get_pix_qrcode) — data.payload matches /^00020126/ (the BR-Code static-EMV envelope header, was: only length > 0); data.encodedImage is a non-empty string.
  • Step 4 (nuvem-fiscal/create_nfse) — data.id matches /^nfse_demo_/ (tightened from /^nfse_/), data.status === "autorizada", plus data.numero and data.valorServico are numbers (was: only id + status).
  • Every result.results[i].success is asserted individually, not just result.success in aggregate — catches the case where one step silently fails but the aggregate flag is still computed truthy elsewhere.
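The per-step checks listed above can be sketched as a single assertion function. This is a minimal illustration, not the test code from this PR: the `LoopResult` and `StepResult` shapes, the `check` and `matches` helpers, and the sample fixture values are assumptions made up for the example. Only the regex literals and exact values come from the list.

```typescript
// Hypothetical shapes for illustration; the real SDK result types may differ.
type StepData = Record<string, unknown>;
interface StepResult { success: boolean; data: StepData; }
interface LoopResult { success: boolean; results: StepResult[]; }

function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(`assertion failed: ${message}`);
}

function matches(value: unknown, pattern: RegExp): boolean {
  return typeof value === "string" && pattern.test(value);
}

function assertSkeletonResult(result: LoopResult): void {
  check(result.success === true, "aggregate success");
  check(result.results.length === 4, "expected exactly four steps");
  // Check every step's flag on its own instead of trusting the aggregate.
  result.results.forEach((step, i) =>
    check(step.success === true, `step ${i + 1} success`));

  const [customer, payment, qr, nfse] =
    result.results.map((step) => step.data) as [StepData, StepData, StepData, StepData];

  // Step 1: asaas/create_customer
  check(matches(customer["id"], /^cus_demo_/), "step 1 data.id");
  // Step 2: asaas/create_payment
  check(matches(payment["id"], /^pay_demo_/), "step 2 data.id");
  check(payment["billingType"] === "PIX", "step 2 data.billingType");
  check(payment["value"] === 150.0, "step 2 data.value");
  check(typeof payment["customer"] === "string", "step 2 data.customer");
  // Step 3: asaas/get_pix_qrcode
  check(matches(qr["payload"], /^00020126/), "step 3 data.payload");
  const image = qr["encodedImage"];
  check(typeof image === "string" && image.length > 0, "step 3 data.encodedImage");
  // Step 4: nuvem-fiscal/create_nfse
  check(matches(nfse["id"], /^nfse_demo_/), "step 4 data.id");
  check(nfse["status"] === "autorizada", "step 4 data.status");
  check(typeof nfse["numero"] === "number", "step 4 data.numero");
  check(typeof nfse["valorServico"] === "number", "step 4 data.valorServico");
}

// Made-up fixture values that satisfy every invariant.
const sample: LoopResult = {
  success: true,
  results: [
    { success: true, data: { id: "cus_demo_001" } },
    { success: true, data: { id: "pay_demo_001", billingType: "PIX", value: 150.0, customer: "cus_demo_001" } },
    { success: true, data: { payload: "00020126360014br.gov.bcb.pix", encodedImage: "aGVsbG8=" } },
    { success: true, data: { id: "nfse_demo_001", status: "autorizada", numero: 1, valorServico: 150 } },
  ],
};
assertSkeletonResult(sample);
```

Checking each `success` flag in a loop is what catches a step that fails while the aggregate flag stays true.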

README structure

Rewrote the opening hook around the explicit four-step list, then added a What this is, and what it isn't section that states the wire-contract framing plainly and links the sibling demos that exercise the agent surface (nfse-from-natural-language, whatsapp-installment-negotiation). The Acceptance criteria section was rewritten as numbered per-step invariants with exact values + regex literals so a copying customer reading the README understands what the demo guarantees, not just that it runs.

What was NOT changed

  • MCP version pins were already exact (@codespar/mcp-asaas@0.1.3, @codespar/mcp-nuvem-fiscal@0.2.1) — the "switch caret to exact" part of the W2 bar doesn't apply to this demo's package.json.
  • No runtime, SDK, or MCP server code touched.
  • live.test.ts left as-is — the live smoke is not part of "raise the bar" and tightening it is filed as follow-on (see below).
  • No shared root-level files (package.json, tsconfig.base.json, .github/workflows/) touched.
  • Live smoke (npm run validate:live) NOT run by this PR — costs real Anthropic credits. The default npm run validate (Docker mode against ghcr.io/codespar/codespar:latest) was run locally and passes; a coordinator-side live-smoke pass against api.anthropic.com is the next gate.
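On the first bullet in this list: the "exact pin" requirement can be verified mechanically. The sketch below is hypothetical and not part of this PR. It assumes that "exact" means a plain `x.y.z` version string, so caret, tilde, range, and prerelease specs are all flagged.

```typescript
// Returns the names of dependencies whose spec is not a plain x.y.z pin.
// Assumption: prerelease or build-metadata suffixes also count as "not exact".
function findLoosePins(deps: Record<string, string>): string[] {
  const exact = /^\d+\.\d+\.\d+$/;
  return Object.entries(deps)
    .filter(([, spec]) => !exact.test(spec))
    .map(([name]) => name);
}

// The two pins named in this PR pass; a caret range would be reported.
const pinned = findLoosePins({
  "@codespar/mcp-asaas": "0.1.3",
  "@codespar/mcp-nuvem-fiscal": "0.2.1",
});
const loose = findLoosePins({ "@codespar/mcp-asaas": "^0.1.3" });
```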

Filed as follow-on (out of scope for this PR)

  • Extracting shared validate.sh / tsconfig.json / vitest.config.ts / .gitignore templates into examples/_shared/.
  • Building shared test helpers (assertToolCall, typed defineFixtures).
  • An examples/README.md index and a scripts/new-example.sh generator.
  • Tightening this demo's own live.test.ts to assert step-specific arg + output shapes against real Claude (current live test only asserts at-least-one-call-per-provider).
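As a rough idea of what the shared `assertToolCall` helper named above might look like: only the name comes from this PR. The `ToolCall` shape, the signature, and the matcher rules (RegExp, predicate function, or strict equality) are assumptions for illustration.

```typescript
// Hypothetical record of one tool invocation; the real bridge type may differ.
interface ToolCall {
  server: string;
  tool: string;
  args: Record<string, unknown>;
  output: Record<string, unknown>;
}

type Expected = Record<string, unknown>;

// A field matches by RegExp test, predicate function, or strict equality.
function fieldMatches(actual: unknown, expected: unknown): boolean {
  if (expected instanceof RegExp) {
    return typeof actual === "string" && expected.test(actual);
  }
  if (typeof expected === "function") return Boolean(expected(actual));
  return actual === expected;
}

function assertToolCall(
  call: ToolCall,
  name: string,
  expected: { args?: Expected; output?: Expected },
): void {
  const actualName = `${call.server}/${call.tool}`;
  if (actualName !== name) throw new Error(`expected ${name}, got ${actualName}`);
  for (const section of ["args", "output"] as const) {
    for (const [key, want] of Object.entries(expected[section] ?? {})) {
      if (!fieldMatches(call[section][key], want)) {
        throw new Error(`${name} ${section}.${key} did not match`);
      }
    }
  }
}

// Usage with made-up values mirroring step 2 of the skeleton.
const paymentCall: ToolCall = {
  server: "asaas",
  tool: "create_payment",
  args: { customer: "cus_demo_001", billingType: "PIX", value: 150 },
  output: { id: "pay_demo_001", billingType: "PIX", value: 150 },
};
assertToolCall(paymentCall, "asaas/create_payment", {
  args: { customer: (v: unknown) => typeof v === "string" },
  output: { id: /^pay_demo_/, billingType: "PIX", value: 150 },
});
```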

Local verification

cd examples/pix-nfse-skeleton
npm install
npm run validate   # Docker mode: ghcr.io/codespar/codespar:latest

Test Files 1 passed | 1 skipped (2). The live-smoke describe block stays skipped unless CODESPAR_LIVE_SMOKE=1 is set.
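The skip behavior can be modeled as a small predicate over the environment. This is a sketch, not the contents of `live.test.ts`: the function name is made up, and it assumes only the literal value `1` enables the live block.

```typescript
type Env = Record<string, string | undefined>;

// True only when CODESPAR_LIVE_SMOKE is exactly "1" (assumed gating rule).
function liveSmokeEnabled(env: Env): boolean {
  return env["CODESPAR_LIVE_SMOKE"] === "1";
}

// In a Vitest file this predicate could drive the gate, for example:
//   describe.skipIf(!liveSmokeEnabled(process.env))("live smoke", () => { ... });
const defaultRun = liveSmokeEnabled({});                              // false: block skipped
const liveRun = liveSmokeEnabled({ CODESPAR_LIVE_SMOKE: "1" });       // true: block runs
```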

…negotiation example

Promote count-and-shape checks into per-call arg + output assertions,
matching the pattern set in #34 (whatsapp-installment-negotiation).
A wire-contract demo is the wrong place for "any shape with these
field names" checks — the whole point is catching when the demo
fixture, the bridge, or the loop dispatcher drifts. The new
assertions pin:

- The customer id round-trips through to step 2's `customer` field.
- Step 2 carries `billingType: "PIX"` and `value: 150` exactly, not
  just an id matching /^pay_/.
- Step 3's payload starts with the BR-Code static-EMV envelope
  header (`/^00020126/`) — a generic non-empty-string check would
  pass for any junk the bridge happened to return.
- Step 4 emits both `numero` and `valorServico` as numbers, not just
  the id + status pair.
- Every step's `success` is checked individually, not only
  `result.success` in aggregate.
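
To see why the `/^00020126/` prefix is a stronger check than "non-empty string": a BR-Code payload is a sequence of ID/length/value fields, where `000201` is field `00` (payload format indicator) with length `02` and value `01`, and `26` opens the merchant account information field that Pix uses. The parser below is a minimal top-level sketch under that reading. The sample payload and phone-style key are made up, and it does not validate the CRC or nested fields.

```typescript
interface TlvField { id: string; value: string; }

// Splits a payload into top-level fields: 2-digit id, 2-digit length, value.
function parseTlv(payload: string): TlvField[] {
  const fields: TlvField[] = [];
  let pos = 0;
  while (pos < payload.length) {
    const id = payload.slice(pos, pos + 2);
    const lengthText = payload.slice(pos + 2, pos + 4);
    if (!/^\d{2}$/.test(id) || !/^\d{2}$/.test(lengthText)) {
      throw new Error(`malformed field header at offset ${pos}`);
    }
    const length = Number(lengthText);
    const value = payload.slice(pos + 4, pos + 4 + length);
    if (value.length !== length) throw new Error(`truncated value for field ${id}`);
    fields.push({ id, value });
    pos += 4 + length;
  }
  return fields;
}

// Made-up fragment: field 00 = "01", then field 26 with a 36-character body.
const fragment = "000201" + "2636" + "0014br.gov.bcb.pix" + "0114+5561999999999";
const parsed = parseTlv(fragment);
```

Arbitrary junk such as `"hello"` fails at the first field header, which a length-only check would let through.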

README rewritten to (1) open with the four-step list as a hook,
(2) state explicitly that this is infrastructure validation and not
an agent-thesis demo, with sibling links to the demos that exercise
the agent surface, and (3) restate Acceptance criteria as numbered
per-step invariants with exact values + regex literals so a copying
customer knows what the demo guarantees.

MCP version pins were already exact (`0.1.3`, `0.2.1`) on this demo;
no package.json change is needed for that part of the bar. No
`live.test.ts` or shared root-level files touched.
dangazineu merged commit cb09e32 into main on May 19, 2026
3 checks passed