refactor(examples/pix-nfse-skeleton): raise bar to match installment-negotiation example #41
Merged
…negotiation example

Promote count-and-shape checks into per-call arg + output assertions, matching the pattern set in #34 (whatsapp-installment-negotiation). A wire-contract demo is the wrong place for "any shape with these field names" checks: the whole point is catching when the demo fixture, the bridge, or the loop dispatcher drifts.

The new assertions pin:
- The customer id round-trips through to step 2's `customer` field.
- Step 2 carries `billingType: "PIX"` and `value: 150` exactly, not just an id matching `/^pay_/`.
- Step 3's payload starts with the BR-Code static-EMV envelope header (`/^00020126/`); a generic non-empty-string check would pass for any junk the bridge happened to return.
- Step 4 emits both `numero` and `valorServico` as numbers, not just the id + status pair.
- Every step's `success` is checked individually, not only `result.success` in aggregate.

README rewritten to (1) open with the four-step list as a hook, (2) state explicitly that this is infrastructure validation and not an agent-thesis demo, with sibling links to the demos that exercise the agent surface, and (3) restate Acceptance criteria as numbered per-step invariants with exact values + regex literals so a copying customer knows what the demo guarantees.

MCP version pins were already exact (`0.1.3`, `0.2.1`) on this demo; no package.json change is needed for that part of the bar. No `live.test.ts` or shared root-level files touched.
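The per-step invariants listed in the commit message above could be sketched roughly as follows. This is an illustration only: the `StepResult` and `SkeletonResult` types and the `check` and `assertSkeletonResult` helpers are hypothetical names, not the repository's actual test code, and the sketch compares step 2's `customer` to step 1's `id` for equality, which is one possible reading of "round-trips".

```typescript
// Hypothetical result shape; the real bridge types are not shown in this PR.
interface StepResult {
  success: boolean;
  data: Record<string, unknown>;
}

interface SkeletonResult {
  success: boolean;
  results: StepResult[];
}

// Minimal assertion helper so the sketch needs no test framework.
function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(message);
}

function assertSkeletonResult(result: SkeletonResult): void {
  check(result.results.length === 4, "expected exactly four steps");

  // Check every step's flag individually, not only the aggregate.
  result.results.forEach((step, i) => {
    check(step.success === true, `step ${i + 1} did not succeed`);
  });
  check(result.success === true, "aggregate success flag is false");

  const data = (i: number): Record<string, unknown> =>
    result.results[i]?.data ?? {};
  const customer = data(0);
  const payment = data(1);
  const qrcode = data(2);
  const nfse = data(3);

  // Step 2: exact values, plus the customer id carried over from step 1
  // (assumption: the round-trip is checked by equality).
  check(typeof customer.id === "string", "step 1 id is not a string");
  check(payment.customer === customer.id, "customer id did not round-trip");
  check(payment.billingType === "PIX", "billingType is not PIX");
  check(payment.value === 150, "value is not 150");

  // Step 3: "000201" is the EMV payload format indicator (tag 00, length 02,
  // value 01) and "26" is the merchant account information tag.
  const payload = qrcode.payload;
  check(
    typeof payload === "string" && /^00020126/.test(payload),
    "payload does not start with the BR-Code header",
  );

  // Step 4: both numeric fields must be present as numbers.
  check(typeof nfse.numero === "number", "numero is not a number");
  check(typeof nfse.valorServico === "number", "valorServico is not a number");
}
```

Calling `assertSkeletonResult` on a dispatcher result either returns silently or throws with a message naming the first invariant that failed, which keeps a drifting fixture from passing on shape alone.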
Raises the existing Pix + NFS-e walking skeleton to the bar set by #34 (whatsapp-installment-negotiation). That PR established a set of OSS-demo conventions — exact MCP pins, deep per-call assertions, and a README shape that walks a copying customer from the hook → why-an-agent → run paths → per-turn acceptance criteria → known gaps. This PR aligns the walking-skeleton example with the assertions-and-README parts of that bar.
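For readers unfamiliar with the "exact MCP pins" convention mentioned above, the check can be done mechanically. The sketch below is an assumption about how one might do it, not tooling from this repository; `findNonExactPins` is a hypothetical helper, and it deliberately treats only plain `x.y.z` versions as exact.

```typescript
// Returns the names of dependencies whose version is not a plain x.y.z pin.
// Simplification: pre-release and build-metadata versions are also flagged.
function findNonExactPins(deps: Record<string, string>): string[] {
  const exact = /^\d+\.\d+\.\d+$/;
  return Object.entries(deps)
    .filter(([, range]) => !exact.test(range))
    .map(([name]) => name);
}

// Hypothetical dependency maps for illustration.
const pinned = findNonExactPins({
  "@codespar/mcp-asaas": "0.1.3",
  "@codespar/mcp-nuvem-fiscal": "0.2.1",
}); // → []

const loose = findNonExactPins({
  "@codespar/mcp-asaas": "^0.1.3",
  "@codespar/mcp-nuvem-fiscal": "0.2.1",
}); // → ["@codespar/mcp-asaas"]
```

A caret range such as `^0.1.3` lets the package manager resolve a newer compatible release, which is why a demo that documents exact outputs would prefer exact pins.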
This demo is intentionally not an agent-thesis demo (no LLM, no aimock, no multi-turn `session.send()`), so the W2 items that target the aimock / fixture / iterations machinery do not apply here. The applicable parts of the bar (exact MCP pins, deep per-call assertions, README structure with regex literals and exact values) are what this PR addresses.

What changed
Skeleton test depth
Promoted count-and-shape checks into per-call arg + output assertions, mirroring the installment-negotiation pattern:
- Step 1 (`asaas/create_customer`): `data.id` must match `/^cus_demo_/` (was: only `result.results[0].success === true`).
- Step 2 (`asaas/create_payment`): `data.id` matches `/^pay_demo_/` (tightened from `/^pay_/`), `data.billingType === "PIX"`, `data.value === 150.0`, and `data.customer` is a string (proves step 1's id round-tripped through the bridge).
- Step 3 (`asaas/get_pix_qrcode`): `data.payload` matches `/^00020126/` (the BR-Code static-EMV envelope header; was: only `length > 0`); `data.encodedImage` is a non-empty string.
- Step 4 (`nuvem-fiscal/create_nfse`): `data.id` matches `/^nfse_demo_/` (tightened from `/^nfse_/`), `data.status === "autorizada"`, plus `data.numero` and `data.valorServico` are numbers (was: only id + status).
- Each `result.results[i].success` is asserted individually, not just `result.success` in aggregate. This catches the case where one step silently fails but the aggregate flag is still computed truthy elsewhere.

README structure
Rewrote the opening hook around the explicit four-step list, then added a "What this is, and what it isn't" section that states the wire-contract framing plainly and links the sibling demos that exercise the agent surface (`nfse-from-natural-language`, `whatsapp-installment-negotiation`). The "Acceptance criteria" section was rewritten as numbered per-step invariants with exact values + regex literals so a copying customer reading the README understands what the demo guarantees, not just that it runs.

What was NOT changed
- MCP pins were already exact (`@codespar/mcp-asaas@0.1.3`, `@codespar/mcp-nuvem-fiscal@0.2.1`), so the "switch caret to exact" part of the W2 bar doesn't apply to this demo's `package.json`.
- `live.test.ts` left as-is: the live smoke is not part of "raise the bar", and tightening it is filed as follow-on (see below).
- No shared root-level files (`package.json`, `tsconfig.base.json`, `.github/workflows/`) touched.
- Live smoke (`npm run validate:live`) NOT run by this PR, since it costs real Anthropic credits. The default `npm run validate` (Docker mode against `ghcr.io/codespar/codespar:latest`) was run locally and passes; a coordinator-side live-smoke pass against `api.anthropic.com` is the next gate.

Filed as follow-on (out of scope for this PR)
- … `validate.sh` / `tsconfig.json` / `vitest.config.ts` / `.gitignore` templates into `examples/_shared/`.
- … `assertToolCall`, typed `defineFixtures`).
- `examples/README.md` index and a `scripts/new-example.sh` generator.
- Tighten `live.test.ts` to assert step-specific arg + output shapes against real Claude (current live test only asserts at-least-one-call-per-provider).

Local verification
`Test Files 1 passed | 1 skipped (2)`: the live-smoke describe block stays skipped without `CODESPAR_LIVE_SMOKE=1`.
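The skip behaviour above is the usual environment-gated test pattern. A minimal sketch, assuming the gate is a strict comparison against `"1"` (the actual condition in `live.test.ts` is not shown in this PR, and `liveSmokeEnabled` is a hypothetical helper):

```typescript
// Decides whether the live-smoke suite should run for a given environment.
// Assumption: only the exact string "1" enables it.
function liveSmokeEnabled(env: Record<string, string | undefined>): boolean {
  return env["CODESPAR_LIVE_SMOKE"] === "1";
}

// In a Vitest file this could gate a whole block, for example:
//   describe.skipIf(!liveSmokeEnabled(process.env))("live smoke", () => { ... });
// so a default run reports the block as skipped instead of calling the real API.
```

Keeping the gate in one small function makes the default path cheap and makes the opt-in explicit for whoever runs the paid live pass.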