test: add comprehensive test suite with 1000 test cases by seedquan · Pull Request #3 · iamtouchskyer/opc

seedquan · 2026-04-10T16:25:18Z

Summary

Add 8 test files with 1000 test cases covering all modules
Uses Node.js built-in node:test + node:assert — zero external dependencies
All 1000 tests pass in ~4 seconds

Test Coverage

File	Tests	Coverage
eval-parser.test.mjs	300	Severity detection, file refs, verdicts, fix lines, reasoning, hedging, findings count, edge cases
flow-commands.test.mjs	300	cmdRoute, cmdInit, cmdValidate, cmdTransition, cmdValidateChain — full state machine
eval-commands.test.mjs	150	cmdVerify, cmdSynthesize, cmdReport, cmdDiff with oscillation detection
viz-commands.test.mjs	50	getMarker, cmdViz, cmdReplayData
flow-templates.test.mjs	50	Structure validation, edge completeness, limit ranges
opc-cli.test.mjs	50	version, help, install, uninstall via child process
verify-devil-advocate.test.mjs	50	Challenge/verdict parsing, quality checks via python3
integration.test.mjs	50	End-to-end flows, error recovery

Run

node --test tests/*.test.mjs

Test plan

All 1000 tests pass locally
No external dependencies added
Tests are independent (no shared state)
File I/O tests use temp directories with cleanup
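The temp-directory pattern in the last item could look roughly like the sketch below. `withTempDir`, the `opc-test-` prefix, and the `state.json` file name are assumptions made for illustration, not names taken from the test suite.

```javascript
import { mkdtempSync, rmSync, writeFileSync, readFileSync, existsSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Hypothetical helper: run `fn` against a fresh temp directory, then remove
// the directory even if `fn` throws, so tests never share filesystem state.
function withTempDir(fn) {
  const dir = mkdtempSync(join(tmpdir(), 'opc-test-'));
  try {
    return fn(dir);
  } finally {
    rmSync(dir, { recursive: true, force: true });
  }
}

// Usage: write and read a file inside the temp directory.
let seenDir;
const contents = withTempDir((dir) => {
  seenDir = dir;
  const file = join(dir, 'state.json');
  writeFileSync(file, JSON.stringify({ node: 'start' }));
  return readFileSync(file, 'utf8');
});
```

This version is synchronous; an async test body would need a variant that awaits `fn` before the cleanup in `finally`.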

🤖 Generated with Claude Code

Add 8 test files covering all modules with node:test + node:assert:

- eval-parser (300 tests): severity detection, file refs, verdicts, fix lines, reasoning, hedging, findings count, edge cases
- flow-commands (300 tests): cmdRoute, cmdInit, cmdValidate, cmdTransition, cmdValidateChain with full state machine coverage
- eval-commands (150 tests): cmdVerify, cmdSynthesize, cmdReport, cmdDiff with oscillation detection
- viz-commands (50 tests): getMarker, cmdViz, cmdReplayData
- flow-templates (50 tests): structure validation, edge completeness
- opc-cli (50 tests): version, help, install, uninstall via child process
- verify-devil-advocate (50 tests): challenge/verdict parsing, quality checks
- integration (50 tests): end-to-end flows, error recovery

All 1000 tests pass in ~4 seconds. Zero external dependencies.
Run with: node --test tests/*.test.mjs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add 7 verification test files with deep boundary, property, and integration testing:

- verify-parser-boundaries (200): regex edge cases, encoding stress, fuzzy inputs, large-scale, regression patterns
- verify-parser-properties (150): idempotency, count consistency, ordering, verdict/file-ref/severity/hedging invariants
- verify-flow-state-machine (200): exhaustive route table, state invariants, limit exhaustion, concurrent state, full traversals
- verify-handshake-schema (150): field types, enum boundaries, artifact paths, evidence rules, cross-field, malformed JSON
- verify-synthesis-diff (150): verdict logic, role extraction, diff normalization, oscillation thresholds, report generation
- verify-viz-replay (50): marker transitions, viz consistency, replay data completeness
- verify-e2e-scenarios (100): happy paths, fail loops, oscillation, max limits, devil's advocate, report-replay round-trips

Bug found: cmdValidate crashes on null JSON input (V685)

All 2000 tests pass in ~4 seconds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
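One common way this class of crash arises is that `JSON.parse('null')` succeeds and returns `null`, so a later property access throws. The sketch below shows a guard against that case. `validateHandshake` and the `status` field are hypothetical; the commit message does not show how cmdValidate is actually implemented or fixed.

```javascript
// Hypothetical validator; field names are illustrative, not the project's schema.
function validateHandshake(raw) {
  let data;
  try {
    data = JSON.parse(raw);
  } catch (err) {
    return { ok: false, error: `invalid JSON: ${err.message}` };
  }
  // JSON.parse('null') is valid JSON and yields null, so check the shape
  // before reading any fields.
  if (data === null || typeof data !== 'object' || Array.isArray(data)) {
    return { ok: false, error: 'handshake must be a JSON object' };
  }
  if (typeof data.status !== 'string') {
    return { ok: false, error: 'missing string field "status"' };
  }
  return { ok: true, data };
}
```

Under these assumptions, `validateHandshake('null')` returns an error result instead of throwing.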

Addresses 3 ITERATE findings from U1.6r contract + semantics reviewers:

1. fireArtifactEmit: recordSuccess was unconditionally resetting _failStreak after per-item write failures, so circuit-breaker would never trip on persistent write failures. Track anyItemFailed and only call recordSuccess when every item in the call succeeded. (semantics F1, contract #2)
2. fireArtifactEmit: accept ArrayBufferView (Uint8Array, DataView) in addition to string / Buffer. Modern APIs (crypto.subtle, TextEncoder, Playwright) commonly return Uint8Array; the tight Buffer.isBuffer check was silently dropping them with a misleading WARN. (semantics F2)
3. cmdExtensionArtifact: add nodeCapabilities to stdout JSON for consistency with cmdExtensionVerdict. (contract #1)
4. CONTRIBUTING.md: document executeRun + artifactEmit hooks with sample skeleton + hook surface summary table. (contract #3)

Regression tests: 4 new tests (Uint8Array accepted, _failStreak persists across calls, success reset is all-or-nothing, CLI JSON includes nodeCapabilities).

Total 118/118 extension tests, 22/22 suite files green.
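Items 1 and 2 can be sketched as follows. This is not the fireArtifactEmit code: `toBuffer`, `Breaker`, `recordCall`, and the threshold are hypothetical names. Counting one failure per call, and treating an unsupported payload type as a failure, are assumptions of this sketch.

```javascript
// Normalize a payload to a Buffer, accepting string, Buffer, and any
// ArrayBufferView (Uint8Array, DataView, ...). Returns null for anything else.
function toBuffer(content) {
  if (typeof content === 'string') return Buffer.from(content, 'utf8');
  if (Buffer.isBuffer(content)) return content;
  if (ArrayBuffer.isView(content)) {
    // Respect byteOffset/byteLength so views over part of a buffer stay correct.
    return Buffer.from(content.buffer, content.byteOffset, content.byteLength);
  }
  return null;
}

// Hypothetical breaker: the streak resets only when every item in a call succeeded.
class Breaker {
  constructor(threshold = 3) {
    this.threshold = threshold;
    this.failStreak = 0;
  }

  get tripped() {
    return this.failStreak >= this.threshold;
  }

  // `writeItem(buf)` returns true on success, false on failure.
  recordCall(items, writeItem) {
    let anyItemFailed = false;
    for (const item of items) {
      const buf = toBuffer(item);
      if (buf === null || !writeItem(buf)) anyItemFailed = true;
    }
    if (anyItemFailed) this.failStreak += 1;
    else this.failStreak = 0; // all-or-nothing success reset
    return !anyItemFailed;
  }
}
```

With this shape, a call in which one of several items fails still extends the streak, so persistent write failures can eventually trip the breaker.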

…llow-up)

Reviewer B (U2.8d) found two real bugs in the U2.8c JSON sidecar fix:

🔴 #2 dedup key collision: `${ext}|${hook}|${kind}|${message}` doesn't escape `|`. Two genuinely different failures collide silently:
  A: ext="a|b", hook="c" → "a|b|c|error|msg"
  B: ext="a", hook="b|c" → "a|b|c|error|msg"
Fix: use JSON.stringify on a tuple `[ext,hook,kind,message]`; keys are unambiguous regardless of field contents.

🔴 #5 droppedTotal overwrite: the field name promises accumulation ("droppedTotal") but the code wrote `dropped` from the current call, silently resetting prior cap-overflow signal across CLI invocations.
Fix: read priorDropped from sidecar, write `priorDropped + dropped`. Markdown view's "N earlier failure record(s) dropped" message now reflects the lifetime total, not just the last call.

Verification:
- New unit tests:
    6.1 pipe-collision: A and B above both preserved (length=2)
    7.1 droppedTotal accumulates 5+3+0 = 8 across three CLI invocations
- test-run2-failure-merge.sh: 11/11 pass (was 9/9)
- Full suite: 27/27 still pass, no regression

Out-of-scope (acknowledged, deferred):
- #3 R-M-W race under concurrent CLI lanes sharing runDir: documented single-writer invariant assumption; future work if multi-lane CI lands.
- #4 schema drift on top-level unknown fields: per-entry fields already preserved (we spread the whole entry); top-level only carries failures + droppedTotal so drift surface is bounded.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
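The two fixes above can be illustrated with a short sketch. The key functions follow the commit message directly; `mergeFailures` and the sidecar shape are hypothetical. Keeping the most recent entry per key is an assumption of this sketch, not something the commit message specifies.

```javascript
// Naive key from the commit message: fields containing "|" can collide.
const naiveKey = ({ ext, hook, kind, message }) => `${ext}|${hook}|${kind}|${message}`;

// Tuple key: JSON.stringify quotes and escapes each field, so field
// boundaries stay unambiguous whatever the fields contain.
const tupleKey = ({ ext, hook, kind, message }) => JSON.stringify([ext, hook, kind, message]);

const a = { ext: 'a|b', hook: 'c', kind: 'error', message: 'msg' };
const b = { ext: 'a', hook: 'b|c', kind: 'error', message: 'msg' };

console.log(naiveKey(a) === naiveKey(b)); // true: both are "a|b|c|error|msg"
console.log(tupleKey(a) === tupleKey(b)); // false

// Hypothetical merge step: dedupe by tuple key and accumulate the dropped
// count on top of whatever the sidecar already recorded.
function mergeFailures(sidecar, incoming, dropped) {
  const byKey = new Map();
  for (const entry of [...(sidecar.failures ?? []), ...incoming]) {
    byKey.set(tupleKey(entry), entry);
  }
  const priorDropped = Number.isInteger(sidecar.droppedTotal) ? sidecar.droppedTotal : 0;
  return { failures: [...byKey.values()], droppedTotal: priorDropped + dropped };
}
```

Calling `mergeFailures` three times with dropped counts 5, 3, and 0 yields `droppedTotal` of 8, matching the accumulation described in unit test 7.1.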

iris and others added 2 commits April 11, 2026 00:24

seedquan wants to merge 2 commits into iamtouchskyer:main from seedquan:feat/comprehensive-test-suite
