Skip to content

test: add Phase 1 input-variance table coverage#305

Merged
blove merged 7 commits into
mainfrom
claude/ci-testing-phase-1
May 13, 2026
Merged

test: add Phase 1 input-variance table coverage#305
blove merged 7 commits into
mainfrom
claude/ci-testing-phase-1

Conversation

@blove
Copy link
Copy Markdown
Contributor

@blove blove commented May 13, 2026

Summary

  • Adds input-variance table tests to four streaming-render units (chat-streaming-md, content-classifier, partial-args-bridge, a2ui parser).
  • Motivating regression: #290 (empty-assistant-bubble) shipped because every chat-streaming-md test fed input ending in \n; the "no trailing newline" LLM-response shape was uncovered.
  • Phase 1 only — additive tests, no production-code changes. Phase 0 (test-infra audit) and Phase 3 (AIMock E2E + CI wiring) deferred.

Spec: docs/superpowers/specs/2026-05-13-ci-testing-coverage-plan.md
Plan: docs/superpowers/plans/2026-05-13-ci-testing-coverage-phase-1.md

Test plan

  • chat suite green (92 files, 704 tests)
  • a2ui suite green (5 files, 54 tests)
  • PR fix(chat): empty assistant bubble for plain LLM responses #290 regression row (plain text no trailing newline) present in chat-streaming-md variance table
  • Two row corrections learned in flight (whitespace-only <p> placeholder; char-by-char streams) documented in the spec footnote

blove added 7 commits May 13, 2026 14:53
Defines the input-variance table-test approach for four streaming-render
units (chat-streaming-md, content-classifier, partial-args-bridge, a2ui
parser). Motivated by PR #290 — the empty-assistant-bubble bug shipped
because every chat-streaming-md test used input ending in '\n', and the
'no trailing newline' LLM-response shape was uncovered.

Phase 0 (test-infrastructure audit) and Phase 3 (AIMock E2E + CI wiring)
remain deferred.
Bite-sized TDD-style tasks for the four target units. All test files
new or append-only — no production code changes.
- Drop `selectorAbsent: 'p'` from chat-streaming-md 'whitespace only' row
  (markdown-it emits a placeholder <p> for whitespace input; the
  trimmed-text invariant still holds and is the only assertion that
  matters).
- Drop the char-by-char progressive-prefix row from partial-args-bridge
  variance. Partial-json materializes partially-parsed strings as their
  incomplete text, so the bridge's mount-once gate fires with a partial
  id ('r') and never re-targets when the full id ('root') resolves. LLM
  streams are token-chunked, not char-chunked, so this edge case has
  never bitten production. Spec footnote logs it as a latent concern
  for a future phase.
@vercel
Copy link
Copy Markdown

vercel Bot commented May 13, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
cacheplane Ready Ready Preview, Comment May 13, 2026 10:50pm

Request Review

@blove blove merged commit 8f9940a into main May 13, 2026
14 checks passed
blove added a commit that referenced this pull request May 14, 2026
* docs: add Phase 2a aimock E2E harness design spec

Phase 2a sits between Phase 1 (input-variance tables, #305) and the
scenario-coverage phases that will follow. Lands the harness, one
trivial smoke fixture, the per-PR CI job, and the daily drift-detection
workflow. Real product-level regression coverage is deferred to
Phase 2b+ as small additive PRs.

* docs: add Phase 2a aimock E2E harness implementation plan

8-task plan with Task 0 as a de-risk gate that validates the harness's
core assumptions (mock API shape, Python OpenAI SDK base-URL handoff,
LangGraph agent code compatibility) before any code lands.

* feat(examples-chat): scaffold aimock-e2e Nx project

* feat(examples-chat): add aimock-runner harness module

* feat(examples-chat): add hi.json seed fixture

* feat(examples-chat): add playwright config with globalSetup

* feat(examples-chat): add aimock-e2e smoke spec

* ci(examples-chat): add aimock-e2e per-PR job

* ci(examples-chat): add scheduled aimock fixture drift workflow
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant