feat(examples): WhatsApp installment negotiation (multi-turn session.send + Asaas preview) by dangazineu · Pull Request #34 · codespar/codespar-core

dangazineu · 2026-05-17T19:32:31Z

Adds a multi-turn buyer-vs-merchant agent demo at
examples/whatsapp-installment-negotiation/. The buyer asks for a
payment option that was not on the merchant's pre-authored menu
("what about 6x?"), the agent computes the variant in real time by
calling the Asaas installment MCP in preview mode, presents
R$800/month back, the buyer confirms, and the agent commits the
payment, issues the NF-e, and sends the WhatsApp confirmation.

The skeleton test runs end-to-end against the OSS MCP bridge with
MCP_DEMO=true on every server and @copilotkit/aimock standing in
for api.anthropic.com — no real credentials needed. A live.test.ts
file is included for the live-LLM smoke gate (gated on
CODESPAR_LIVE_SMOKE=1) and is required to pass locally before
pushing, per the codespar-core CLAUDE.md workflow rule. Live smoke
against api.anthropic.com (claude-sonnet-4-6) was run separately
and exercised the full four-turn flow.

Why this matters

A BSP flow-builder (Blip, Zenvia, Take) breaks when a buyer asks for
a payment option the merchant did not pre-author. "What about 6x?"
is not on the menu; the flow-builder has no branch for it. The agent
computes the variant from a real MCP tool call, not from prose
hardcoded in the prompt. The runtime drives three buyer messages
through three separate session.send() calls; narrative continuity
across those sends lives in the fixture authoring and the agent's
prompt, not in shared message state (see "Per-send fixture
semantics" below).

Three judgment points the agent navigates:

- Which options to present first: Pix with discount vs. 12x as
  openers. Picking the wrong opener costs deals; no rule fires here.
- How to answer a non-enumerated variant: call Asaas, get
  R$800/month back, present it. A script that anticipates only
  pre-enumerated installment counts misroutes silently.
- When to stop exploring and close: after "confirma, pode
  fechar" the agent commits the payment and issues the NF-e instead
  of proposing another variant.

Per-send fixture semantics (read before reviewing the fixture)

The OSS chat loop resets the messages array on every
session.send() call. turnIndex (which aimock derives from the
number of assistant messages in the current LLM request) therefore
restarts at 0 on each new send rather than running continuously 0-4
across all three buyer turns. The five fixture entries are organised
per-send, and the two "after tool result" continuations sit at
turnIndex 1 within their own send with userMessage
discriminators so they do not collide. Substring matching on
userMessage is case-sensitive — the turn-3 match is Confirma,
not confirm.
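
The reset described above can be sketched in a few lines. This is a minimal simulation, not the runtime's or aimock's actual code: the `ChatMessage` shape, the `"tool"` role, and the `simulateSend` helper are assumptions made for illustration, and the opener text is a placeholder.

```typescript
// Hypothetical message shape; the real runtime and aimock types may differ.
type Role = "user" | "assistant" | "tool";
interface ChatMessage {
  role: Role;
  content: string;
}

// turnIndex as described above: the number of assistant messages
// present in the current LLM request.
function deriveTurnIndex(messages: ChatMessage[]): number {
  return messages.filter((m) => m.role === "assistant").length;
}

// Simulates one session.send() call: a fresh messages array, then one
// LLM request per completion. Returns the turnIndex seen by each request.
function simulateSend(userMessage: string, toolRounds: number): number[] {
  const messages: ChatMessage[] = [{ role: "user", content: userMessage }];
  const seen: number[] = [];
  for (let round = 0; round <= toolRounds; round++) {
    seen.push(deriveTurnIndex(messages));
    if (round < toolRounds) {
      // The assistant requested a tool; the loop appends the request and its result.
      messages.push({ role: "assistant", content: "tool_use" });
      messages.push({ role: "tool", content: "tool_result" });
    }
  }
  return seen;
}

// Three sends: opener (no tools), preview (one tool round), close (one tool round).
const perSend: number[][] = [
  simulateSend("<opener message>", 0),
  simulateSend("what about 6x?", 1),
  simulateSend("Confirma, pode fechar", 1),
];

// Five completions in total, but turnIndex never exceeds 1.
const totalCompletions = perSend.reduce((n, s) => n + s.length, 0);
```

Under these assumptions each send sees `turnIndex` start again at 0, which is why the two continuations both land on `turnIndex` 1 and need a further discriminator.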

The README's "How to extend the fixture for your own multi-turn
flow" subsection walks through the entry-per-completion mapping and
the turnIndex + hasToolResult match-key pattern.
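
As a rough illustration of that match-key pattern, the sketch below selects a fixture entry from `turnIndex`, `hasToolResult`, and a case-sensitive `userMessage` substring. The entry shape, the `LlmRequest` type, the first-match-wins ordering, and the use of the opener as an unkeyed fallback are all assumptions for the example; the real aimock schema and the actual fixture may differ.

```typescript
// Hypothetical fixture-entry shape; consult the aimock docs for the real schema.
interface FixtureEntry {
  label: string;
  turnIndex: number;
  hasToolResult: boolean;
  userMessage?: string; // case-sensitive substring of the latest user text
}

// Hypothetical summary of one incoming LLM request.
interface LlmRequest {
  turnIndex: number;
  hasToolResult: boolean;
  lastUserText: string;
}

// Five entries organised per-send; the two turnIndex-1 continuations are
// told apart by their userMessage discriminator.
const fixtures: FixtureEntry[] = [
  { label: "preview tool_use", turnIndex: 0, hasToolResult: false, userMessage: "6x" },
  { label: "preview reply text", turnIndex: 1, hasToolResult: true, userMessage: "6x" },
  { label: "close tool_uses", turnIndex: 0, hasToolResult: false, userMessage: "Confirma" },
  { label: "final confirmation text", turnIndex: 1, hasToolResult: true, userMessage: "Confirma" },
  // Assumption: the opener carries no discriminator and acts as a fallback.
  { label: "opener text", turnIndex: 0, hasToolResult: false },
];

function matches(entry: FixtureEntry, req: LlmRequest): boolean {
  if (entry.turnIndex !== req.turnIndex) return false;
  if (entry.hasToolResult !== req.hasToolResult) return false;
  // String.prototype.includes is case-sensitive, like the matching described above.
  if (entry.userMessage !== undefined && !req.lastUserText.includes(entry.userMessage)) {
    return false;
  }
  return true;
}

// First matching entry wins; undefined means no fixture applies.
function pickFixture(req: LlmRequest): string | undefined {
  return fixtures.find((entry) => matches(entry, req))?.label;
}

// A lowercase "confirma" does not match the "Confirma" key, so in this
// sketch the request falls through to the opener instead of the close entry.
const wrongCase = pickFixture({
  turnIndex: 0,
  hasToolResult: false,
  lastUserText: "confirma, pode fechar",
});
```

The `wrongCase` lookup shows why the casing of the turn-3 key matters: a mismatch does not fail loudly, it selects a different entry.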

Scaffold inherited from the natural-language NFS-e example

The boilerplate is taken verbatim from
examples/nfse-from-natural-language/:

- three runtime modes in validate.sh (Docker default /
  CODESPAR_BASE_URL / CODESPAR_RUNTIME_DIR)
- the aimock lifecycle
- live.test.ts gated on CODESPAR_LIVE_SMOKE=1
- the README's three-mockability-layers boilerplate
- exact MCP pins as devDeps
- --demo flags in mcp-servers.json (the source of truth for
  demo mode)

What is new here:

- Three session.send() calls instead of one: the test drives
  the buyer's three messages explicitly.
- Five-entry aimock fixture instead of three: opener text →
  preview tool_use → preview reply text → close tool_uses × 3 →
  final confirmation text, organised per-send (see above).
- The Asaas MCP get_installments preview path: the agent
  calls get_installments(value: 4800, installments: 6) without an
  id to get a hypothetical schedule before committing. The
  response shape carries preview: true and status: "PREVIEW" per
  installment so the test (and the agent) can distinguish a preview
  from a real payment schedule.
- package-lock.json committed: npm ci in the
  validate-example-whatsapp-installment-negotiation CI job
  requires it, pinned against the published
  @codespar/mcp-asaas@0.2.0.
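
To make the preview response shape concrete, here is a small sketch of a flat-division preview builder. It is not the `@codespar/mcp-asaas` demo handler: the `previewInstallments` function and the per-installment `number` and `value` field names are assumptions, while `preview`, `installmentCount`, `installmentValue`, `installments`, and `status: "PREVIEW"` follow the description in this PR.

```typescript
interface PreviewInstallment {
  number: number; // hypothetical field name
  value: number; // hypothetical field name
  status: "PREVIEW";
}

interface InstallmentPreview {
  preview: true;
  installmentCount: number;
  installmentValue: number;
  installments: PreviewInstallment[];
}

// Builds a hypothetical schedule without creating a payment.
function previewInstallments(value: number, installments: number): InstallmentPreview {
  if (!Number.isInteger(installments) || installments < 1) {
    throw new RangeError("installments must be a positive integer");
  }
  // Flat division with no interest, rounded to cents.
  const installmentValue = Math.round((value / installments) * 100) / 100;
  return {
    preview: true,
    installmentCount: installments,
    installmentValue,
    installments: Array.from({ length: installments }, (_, i) => ({
      number: i + 1,
      value: installmentValue,
      status: "PREVIEW" as const,
    })),
  };
}

// The 6x variant from the demo: 4800 / 6 = 800 per month.
const schedule = previewInstallments(4800, 6);

// A caller can tell a preview from a committed schedule by the marker fields.
const isPreviewOnly =
  schedule.preview && schedule.installments.every((i) => i.status === "PREVIEW");
```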

Files

| File | Purpose |
| --- | --- |
| `package.json` | Exact MCP pins (`@codespar/mcp-asaas@0.2.0`, `@codespar/mcp-nuvem-fiscal@0.3.0`, `@codespar/mcp-z-api@0.2.1`) plus `@codespar/sdk@^0.9.0` |
| `package-lock.json` | Lockfile against the published `@codespar/mcp-asaas@0.2.0`; required by `npm ci` in CI |
| `mcp-servers.json` | Three stdio servers (`asaas`, `nuvem-fiscal`, `z-api`), all with `--demo` |
| `fixtures/aimock-fixtures.json` | Five fixture entries organised per-send (`turnIndex` restarts at 0 each send, `userMessage` discriminators on the two `turnIndex 1` continuations); turn 2's "R$800,00" text is hardcoded to match the Asaas demo handler's deterministic `installmentValue: 800` for `value: 4800` / `installments: 6` |
| `skeleton.test.ts` | Three-`send()` test asserting the Asaas preview, the `create_payment` with `installments: 6`, the NF-e issuance, and the WhatsApp confirmation |
| `live.test.ts` | Same flow against real Claude, gated on `CODESPAR_LIVE_SMOKE=1`, coarse assertions tolerant of LLM probabilism |
| `scripts/validate.sh` | Same three runtime modes as the NFS-e example, container name updated to `codespar-example-installments-$$` |
| `scripts/validate-live.sh` | Same modes minus aimock, requires `ANTHROPIC_API_KEY` |
| `tsconfig.json` / `vitest.config.ts` / `.npmrc` / `.gitignore` | Identical to the NFS-e example |
| `.github/workflows/ci.yml` | New `validate-example-whatsapp-installment-negotiation` job, same shape as the existing `validate-example-nfse-from-natural-language` job |

Acceptance criteria

The skeleton.test.ts spec asserts:

- Turn 1: string message, no tool calls (opener is conversation only).
- Turn 2: exactly one asaas__get_installments call with
  input.value === 4800 and input.installments === 6, output carrying
  preview: true, installmentCount: 6, installmentValue: 800, and a
  six-entry installments array.
- Turn 3: exactly one asaas__create_payment call with
  billingType: "CREDIT_CARD", value: 4800, installments: 6, output
  carrying id matching /^pay_demo_/, installments: 6,
  installmentValue: 800.
- Turn 3: exactly one nuvem-fiscal__create_nfe call returning
  id matching /^nfe_demo_/ and status === "autorizada".
- Turn 3: at least one z-api__send_text call whose message
  matches /confirm/i.
- Cross-turn: total iterations >= 3 across the three
  session.send() calls.
- Cross-turn: every dispatched tool call records
  status === "success".
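
The turn-3 criteria above could be checked with a helper along these lines. This is a sketch, not the contents of skeleton.test.ts: the `ToolCallRecord` shape, the assumption that the WhatsApp text lives in `input.message`, and the sample ids and message are hypothetical.

```typescript
// Hypothetical record of one dispatched tool call; the SDK's real type may differ.
interface ToolCallRecord {
  name: string;
  status: "success" | "error";
  input: Record<string, unknown>;
  output: Record<string, unknown>;
}

// Returns the single call to `name`, or undefined if there are zero or several.
function exactlyOne(calls: ToolCallRecord[], name: string): ToolCallRecord | undefined {
  const hits = calls.filter((c) => c.name === name);
  return hits.length === 1 ? hits[0] : undefined;
}

function matchesPattern(value: unknown, pattern: RegExp): boolean {
  return typeof value === "string" && pattern.test(value);
}

// Checks the turn-3 criteria; an empty array means every check passed.
function checkTurn3(calls: ToolCallRecord[]): string[] {
  const failures: string[] = [];

  const payment = exactlyOne(calls, "asaas__create_payment");
  if (!payment) {
    failures.push("expected exactly one asaas__create_payment call");
  } else {
    const { input, output } = payment;
    if (input["billingType"] !== "CREDIT_CARD" || input["value"] !== 4800 || input["installments"] !== 6) {
      failures.push("create_payment input mismatch");
    }
    if (!matchesPattern(output["id"], /^pay_demo_/) || output["installments"] !== 6 || output["installmentValue"] !== 800) {
      failures.push("create_payment output mismatch");
    }
  }

  const nfe = exactlyOne(calls, "nuvem-fiscal__create_nfe");
  if (!nfe || !matchesPattern(nfe.output["id"], /^nfe_demo_/) || nfe.output["status"] !== "autorizada") {
    failures.push("expected exactly one authorised nuvem-fiscal__create_nfe call");
  }

  // Assumption: the WhatsApp text is carried in input.message.
  const confirmed = calls.some(
    (c) => c.name === "z-api__send_text" && matchesPattern(c.input["message"], /confirm/i),
  );
  if (!confirmed) failures.push("expected a z-api__send_text confirmation");

  if (!calls.every((c) => c.status === "success")) failures.push("a tool call did not succeed");
  return failures;
}

// Sample data with made-up ids and message text.
const sampleTurn3: ToolCallRecord[] = [
  {
    name: "asaas__create_payment",
    status: "success",
    input: { billingType: "CREDIT_CARD", value: 4800, installments: 6 },
    output: { id: "pay_demo_001", installments: 6, installmentValue: 800 },
  },
  { name: "nuvem-fiscal__create_nfe", status: "success", input: {}, output: { id: "nfe_demo_001", status: "autorizada" } },
  { name: "z-api__send_text", status: "success", input: { message: "Pagamento confirmado: 6x de R$800,00" }, output: {} },
];
const turn3Failures = checkTurn3(sampleTurn3);
```

Collecting failures in a list rather than throwing on the first one keeps all violated criteria visible in a single run.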

Out of scope

- Live test does not yet assert z-api__send_text presence. The
  skeleton test does, but the live test's coarse assertions skip
  this. Filed as a follow-up; not a blocker for merging this example.
- Installment-interest impact on NF-e taxable amount. Brazilian
  credit-card installments often carry juros parcelado that
  increases the NF-e taxable amount. The Asaas demo handler computes
  value / installments flat; the NF-e is issued for the original
  sticker price. Documented in the README as a known gap.
- A delete_payment cleanup path if the buyer backs out after the
  preview. The preview path now exists precisely to avoid creating
  payments tentatively, so the cleanup case is no longer needed. The
  demo never creates a payment until the buyer confirms.
- The Nuvem Fiscal pagamento echo for create_nfe. Not
  required for this example; the existing canned demo response is
  sufficient. Can land as a focused follow-on if a future demo needs
  to assert installment terms round-trip into the NF-e response.
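
For a sense of the size of the installment-interest gap noted above, the sketch below compares the demo's flat split with a financed schedule. It assumes fixed-payment (Price table) amortization and a made-up 2% monthly rate; neither the formula choice nor the rate comes from Asaas or from this PR.

```typescript
// Flat split with no interest, as the demo handler is described.
function flatInstallment(principal: number, n: number): number {
  return principal / n;
}

// Fixed-payment amortization: PMT = P * i / (1 - (1 + i)^-n).
// The monthly rate is a hypothetical input, not a value from Asaas.
function financedInstallment(principal: number, n: number, monthlyRate: number): number {
  if (monthlyRate === 0) return principal / n;
  return (principal * monthlyRate) / (1 - Math.pow(1 + monthlyRate, -n));
}

const flat = flatInstallment(4800, 6); // 800
const financed = financedInstallment(4800, 6, 0.02);

// Total paid under the financed schedule, and the amount the flat demo omits.
const financedTotal = financed * 6;
const interestGap = financedTotal - 4800;
```

With any positive rate `financedTotal` exceeds the sticker price, and `interestGap` is the amount the demo leaves out when it issues the NF-e for the original 4800.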

…send + Asaas preview)
Adds a multi-turn buyer-vs-merchant agent demo where the buyer asks for a payment option that wasn't on the pre-authored menu ("what about 6x?"), the agent computes the variant in real time by calling the Asaas installment MCP in preview mode, the buyer confirms, and the agent commits the payment, issues the NF-e, and sends the WhatsApp confirmation.
Builds on the same scaffold the natural-language NFS-e example shipped (Docker / CODESPAR_BASE_URL / CODESPAR_RUNTIME_DIR runtime modes, aimock lifecycle, three mockability layers, live-LLM smoke gate). The new piece is multi-turn session.send(): three buyer messages drive three separate session.send() calls that share conversation history.
The example pins exact versions of the MCP catalog:
- @codespar/mcp-asaas@0.2.0 (introduces stateful installment fixtures + the get_installments preview path)
- @codespar/mcp-nuvem-fiscal@0.3.0 (already published)
- @codespar/mcp-z-api@0.2.1 (already published)
The Asaas 0.2.0 version ships in a paired mcp-dev-latam PR; this example's npm install will fail until that ships. CI job added but will go red until the dependency resolves.
The live-LLM smoke (npm run validate:live) is required to pass locally before pushing, per the workflow rule in CLAUDE.md. Aimock-driven skeleton.test.ts cannot catch Anthropic tool-name regex, invalid model id, or system-prompt regressions that only surface against real api.anthropic.com.

…from demo customer id Rename cus_demo_buyer_d2 → cus_demo_buyer_001 in the aimock fixture and the live-test prompt. The previous suffix referenced the private demo codename, which must not appear in public-repo artifacts.

…schema + drop emoji
Two findings from the multi-reviewer panel:
1. The create_nfe fixture used a service-style {servico, valor} payload copied from the natural-language NFS-e example. The product NF-e tool actually requires ambiente, natureza_operacao, emitente, destinatario, itens, pagamento (all six are flagged required in its inputSchema). The --demo handler accepts the wrong shape silently, but the fixture would mislead anyone reading it as a template for a real NF-e call. Updated the aimock fixture to use the correct NF-e shape and updated the live-test turn-3 prompt to match.
2. The aimock fixture closing text carried a trailing furniture emoji from the early draft. Workspace convention forbids emojis in code or docs. Removed.

…fixture pattern + flat-math choice
Round 2 aggregate review surfaced two README accuracy issues:
- 'four LLM completion turns' was wrong; the fixture has five entries because each round of tool execution adds one extra completion request. Corrected the opening paragraph.
- The mockability section explained the fixture data flow but didn't give a copying customer enough guidance to extend it to their own multi-turn demos. Added 'How to extend the fixture for your own multi-turn flow' subsection with the entry-per-completion mapping and the turnIndex + hasToolResult match-key pattern.
- Made the flat-math (no juros) demo choice explicit so a reader doesn't assume the absence of interest is an oversight. Linked to the existing 'Known platform gaps' section for the deeper taxable-amount discussion.

…on — mcp-asaas@0.2.0 is now published @codespar/mcp-asaas@0.2.0 is live on the npm registry, so the pinned devDependencies resolve cleanly. Add the lockfile that CI's npm ci step expects, generated against the published version. Unblocks the validate-example-whatsapp-installment-negotiation job.

… on per-send turnIndex semantics The OSS runtime's chat loop starts a fresh messages array on every session.send() call, so turnIndex (which aimock derives from the number of assistant messages in the current LLM request) restarts at 0 on each new send rather than running continuously 0-4 across all three buyer turns. Restructure the five fixture entries so the two 'after tool result' continuations (entries 2 and 4) sit at turnIndex 1 within their own send and add userMessage discriminators so they don't collide. Also fix the casing on the turn-3 user-message match ('confirm' -> 'Confirma') since aimock substring matching is case-sensitive. Updates the README's fixture-pattern table to reflect per-send turnIndex resetting, and corrects the wording that claimed the session carries conversation history across sends — it does not in the current runtime; the narrative continuity lives in the fixture authoring, not in shared message state.

…negotiation example (#41) Raises the existing Pix + NFS-e walking skeleton to the bar set by [#34](#34) (whatsapp-installment-negotiation). That PR established a set of OSS-demo conventions — exact MCP pins, deep per-call assertions, and a README shape that walks a copying customer from the hook → why-an-agent → run paths → per-turn acceptance criteria → known gaps. This PR aligns the walking-skeleton example with the assertions-and-README parts of that bar. This demo intentionally is not an agent-thesis demo (no LLM, no aimock, no multi-turn `session.send()`), so the W2 items that target the aimock / fixture / iterations machinery do not apply here. The applicable parts of the bar — exact MCP pins, deep per-call assertions, README structure with regex literals and exact values — are what this PR addresses.

…tallment-negotiation example (#40) Raises the existing NFS-e-from-natural-language demo to the bar set by [#34](#34) (whatsapp-installment-negotiation). That PR established a set of OSS-demo conventions — exact MCP pins, multi-key aimock matchers, deep per-call assertions, and a README shape that walks a copying customer from the hook → why-an-agent → run paths → per-turn acceptance criteria → known gaps. This PR aligns the earlier nfse-from-natural-language demo with all of that.

dangazineu added 5 commits May 17, 2026 15:31

dangazineu marked this pull request as ready for review May 18, 2026 23:06

This was referenced May 19, 2026

refactor(examples/nfse-from-natural-language): raise bar to match installment-negotiation example #40

Merged

refactor(examples/pix-nfse-skeleton): raise bar to match installment-negotiation example #41

Merged

dangazineu merged commit 5fbe7f6 into main May 19, 2026
4 checks passed

dangazineu merged 6 commits into main from feature/whatsapp-installment-negotiation-example
