test: add comprehensive pipeline layer tests by ArturSkowronski · Pull Request #3 · VirtusLab/visdom-code-review

ArturSkowronski · 2026-04-18T03:43:01Z

Summary

Adds full test coverage for all pipeline layers.

Changes

15 unit tests covering L0–L3 layers
100% line coverage on pipeline.ts and all layer files
Mock-based isolation for fast test execution

Coverage report attached. All green.

Demo scenario: meta/hollow-test-suite

github-actions · 2026-04-18T03:43:17Z

🔍 VCR Code Review


> vcr-demo@0.1.0 demo:local
> tsx src/cli/index.ts --local


 VCR Demo — "The Perfect PR" 
  Scenario: A seemingly flawless auth service with layered vulnerabilities

→ Local mode — skipping GitHub PR creation

▸ Layer 0 — Context Collection  0.0s

▸ Layer 1 — Deterministic Gate  0.0s
  ⚠ MEDIUM     L1-SEC-004   src/auth/auth.service.ts:43
    Weak random number generator used in security context
  ⚠ MEDIUM     L1-LOGIC-004 src/auth/auth.service.ts:44
    Return value of important method ignored
  ⚠ HIGH       L1-SEC-002   src/auth/auth.model.ts:19
    SQL query built with string concatenation or interpolation
  ⚠ CRITICAL   L1-SEC-001   .env.test:5
    Hardcoded secret or credential

▸ Layer 2 — AI Quick Scan  0.0s  $0.02
  ⚠ HIGH       L2-TEST-001  test/auth.test.ts
    8/12 tests are circular (mock-on-mock)
  ⚠ HIGH       L2-AUTH-002  src/auth/auth.controller.ts:41
    No rate limiting on authentication endpoints
  ⚠ MEDIUM     L2-AUTH-003  src/auth/auth.controller.ts:44
    User enumeration via differentiated error messages
  Risk: CRITICAL │ Gate: → Layer 3 triggered

▸ Layer 3 — AI Deep Review  0.0s  $0.42
  ⚠ CRITICAL   L3-SEC-001   [security] src/auth/auth.service.ts:7
    bcrypt cost factor 4 is brute-forceable in seconds
  ⚠ HIGH       L3-SEC-002   [security] src/middleware/auth.middleware.ts:20
    JWT verification accepts algorithm 'none' and ignores expiry
  ⚠ MEDIUM     L3-SEC-003   [security] src/auth/auth.controller.ts:10
    No input validation on request body
  ⚠ MEDIUM     L3-ARCH-001  [architecture] src/auth/auth.controller.ts:1
    Business logic embedded in HTTP controller
  ⚠ LOW        L3-ARCH-002  [architecture] src/auth/auth.model.ts:14
    Model queries return password hash to all callers
  ⚠ HIGH       L3-TEST-001  [test-quality] test/auth.test.ts:34
    Tests assert mock interactions instead of behavior
  ⚠ LOW        L3-TEST-002  [test-quality] test/auth.test.ts:1
    Zero negative and edge case tests


════════════════════════════════════════════════════════
  RESULTS — Side by Side
════════════════════════════════════════════════════════

  Traditional Review                │  VCR Review
  ──────────────────────────────────│──────────────────────────────────
  CI status: ✅ all green            │  CI: ✅ but 8/12 tests circular
  Coverage: 94%                     │  Effective coverage: ~31%
  Findings: 0                       │  Findings: 14
    critical: 0                     │    critical: 2
    high: 0                         │    high: 5
    medium: 0                       │    medium: 5
    low: 0                          │    low: 2
  Wait time: 24-48h                 │  Time: 0s
  Human cost: ~1h senior engineer   │  Human cost: $0 (review only)
  AI cost: $0                       │  AI cost: $0.44
  Risk: auth bypass ships to production│  Risk: caught before merge

Reviewed by Visdom Code Review

ArturSkowronski · 2026-04-18T03:43:28Z

🔍 VCR Review — Side by Side

Metric	Traditional Review	VCR Review
CI Status	✅ All green	✅ But circular tests detected
Coverage	94%	~31% effective (after TORS)
Findings	0	8
high	0	6
medium	0	2
Wait time	24-48h	26s
Human cost	~1h senior engineer	$0 (review only)
AI cost	$0	$0.04
Risk	auth bypass ships to production	caught before merge

Findings by Layer

Layer 2 — AI Quick Scan

🟠 L2-TEST-001 4/15 tests are circular (mock-on-mock) (test/pipeline.test.ts)
🟠 L2-TEST-002 Mock inconsistency: AIQuickScan.gate.proceed = false prevents L3 execution (test/pipeline.test.ts:23)
🟠 L2-TEST-003 Layer:start event count assertion is fragile and incomplete (test/pipeline.test.ts:101)
🟡 L2-TEST-004 No validation that report.layers array contains expected layer results (test/pipeline.test.ts:120)

Layer 3 — AI Deep Review

🟠 L3-ARCH-001 AIDeepReview gate-stop test contradicts its own mock setup (test/pipeline.test.ts:70)
🟡 L3-ARCH-002 previousLayers field passed as empty array may mask accumulation bugs (test/pipeline.test.ts:55)
🟠 L3-TEST-001 All tests verify mock calls, never pipeline output — broken pipeline logic passes invisibly (test/pipeline.test.ts:62)
🟠 L3-TEST-002 'gate says stop' test cannot prove L3 is actually skipped — the mock setup makes AIDeepReview unreachable regardless (test/pipeline.test.ts:64)

Generated by VCR Demo in 26s

ArturSkowronski · 2026-04-18T03:43:29Z

+}));
+
+vi.mock('../src/core/layers/ai-quick-scan', () => ({
+  AIQuickScan: vi.fn().mockImplementation(() => ({


🟠 L2-TEST-002 [HIGH] Mock inconsistency: AIQuickScan.gate.proceed = false prevents L3 execution

The AIQuickScan mock returns gate: { proceed: false, ... } on line 23. This causes the pipeline to skip AIDeepReview execution (line 73 test expects this). However, real pipeline behavior depends on actual gate logic. The mock's hardcoded proceed: false doesn't test the conditional execution path when proceed: true, leaving that scenario untested and creating false confidence.

Suggestion: Add a separate test that mocks AIQuickScan with proceed: true to verify AIDeepReview is called when the gate permits. Or use parameterized tests to cover both gate states.

ArturSkowronski · 2026-04-18T03:43:29Z

+    await pipeline.run({ scenario: 'test', pr: {} as any, diff: '', files: [], previousLayers: [] });
+    expect(handler).toHaveBeenCalled();
+  });
+


🟠 L2-TEST-003 [HIGH] Layer:start event count assertion is fragile and incomplete

Line 101 expects layer:start to be called 3 times, but the pipeline has 4 layers (L0–L3, see lines 6–34 mocks). If AIDeepReview is skipped (which line 73 confirms it is), only 3 layers run, making this assertion coincidentally correct but misleading. It doesn't validate the correct set of layers actually ran—just that 3 events fired.

Suggestion: Verify which specific layers emitted events by checking event payloads (e.g., expect(handler).toHaveBeenNthCalledWith(1, expect.objectContaining({ layer: 0 }))). Or test the conditional execution explicitly: one test with gate open (all 4 layers), one with gate closed (3 layers).

ArturSkowronski · 2026-04-18T03:43:29Z

+    const report = await pipeline.run({ scenario: 'test', pr: {} as any, diff: '', files: [], previousLayers: [] });
+    expect(report).toBeDefined();
+    expect(report.layers).toBeDefined();
+  });


🟡 L2-TEST-004 [MEDIUM] No validation that report.layers array contains expected layer results

Line 120 checks that report.layers.length > 0 but doesn't validate layer order, layer IDs, or findings structure. The test passes if any layers are present, not the correct ones. Combined with the gate-closed mock, this cannot detect if L3 incorrectly appears in the report.

Suggestion: Assert exact layer count and check layer identity: expect(report.layers.map(l => l.layer)).toEqual([0, 1, 2]). Add a separate test for the gate-open path to verify L3 appears.

ArturSkowronski · 2026-04-18T03:43:29Z

+  });
+
+  it('calls AIQuickScan.analyze', async () => {
+    await pipeline.run({ scenario: 'test', pr: {} as any, diff: '', files: [], previousLayers: [] });


🟠 L3-ARCH-001 [HIGH] AIDeepReview gate-stop test contradicts its own mock setup

The test 'does not call AIDeepReview when gate says stop' relies on AIQuickScan returning gate.proceed: false — which is exactly what the global mock already returns for every test. This means the AIDeepReview layer should never be called in ANY test, including 'calls AIQuickScan.analyze' and the other positive tests. If the pipeline actually respects the gate (proceed: false → skip L3), then the test named 'calls AIDeepReview...' (if it exists in the truncated section) would silently pass even when AIDeepReview is broken, because the mock prevents it from ever being reached. Conversely, the gate-stop assertion is vacuously true and proves nothing about the conditional logic.

Suggestion: Override the AIQuickScan mock locally in tests that need L3 to execute: use vi.spyOn(aiQuickScan, 'analyze').mockResolvedValueOnce({ ..., gate: { proceed: true, ... } }) before calling pipeline.run in those tests. The global mock should default to proceed: true so the positive path is exercised by default.

Lens: architecture | Confidence: 92%

ArturSkowronski · 2026-04-18T03:43:29Z

+    contextCollector = new ContextCollector();
+    deterministicGate = new DeterministicGate();
+    aiQuickScan = new AIQuickScan({} as any, 'test');
+    aiDeepReview = new AIDeepReview({} as any, 'test');


🟡 L3-ARCH-002 [MEDIUM] previousLayers field passed as empty array may mask accumulation bugs

Every pipeline.run call supplies previousLayers: []. If the pipeline accumulates layer results into the context object passed to subsequent layers (i.e., each layer receives the results of prior layers), passing an empty array bypasses that path entirely. A bug where the pipeline fails to forward prior results to later layers would never be caught.

Suggestion: At least one test should supply a non-empty previousLayers value and assert that the downstream layer's analyze call receives it correctly, verifying the accumulation/forwarding logic.

Lens: architecture | Confidence: 82%

ArturSkowronski · 2026-04-18T03:43:29Z

+  it('calls ContextCollector.analyze', async () => {
+    await pipeline.run({ scenario: 'test', pr: {} as any, diff: '', files: [], previousLayers: [] });
+    expect(contextCollector.analyze).toHaveBeenCalled();
+  });


🟠 L3-TEST-001 [HIGH] All tests verify mock calls, never pipeline output — broken pipeline logic passes invisibly

Every test asserts only that analyze was called (or not called) on mock instances. The pipeline's actual responsibilities — aggregating layer results into a report, computing totals, merging findings, setting overall risk/status — are never verified against real values. If ReviewPipeline.run() returned null, an empty object, or silently swallowed errors after calling the mocks, every test would still pass. The 100% line coverage claim is accurate but meaningless: lines are executed, but correctness of the output is unverified.

Suggestion: Assert on the return value of pipeline.run(). At minimum: check that report.layers has the expected length, that report.findings is an array, and that report.metrics.totalCostUsd equals the sum of mock layer costs (0 + 0 + 0.02 + 0 = 0.02 when L3 is skipped, 0.42 when it runs). This catches any aggregation bug the call-count assertions cannot.

Lens: test-quality | Confidence: 95%

ArturSkowronski · 2026-04-18T03:43:29Z

+    expect(contextCollector.analyze).toHaveBeenCalled();
+  });
+
+  it('calls DeterministicGate.analyze', async () => {


🟠 L3-TEST-002 [HIGH] 'gate says stop' test cannot prove L3 is actually skipped — the mock setup makes AIDeepReview unreachable regardless

The test titled 'does not call AIDeepReview when gate says stop' is structurally indistinguishable from a test where the pipeline ignores the gate entirely. Because AIQuickScan.analyze is globally mocked to return gate: { proceed: false } in beforeEach, AIDeepReview would also not be called if the pipeline had a bug such as an off-by-one that dropped the last layer, or if the pipeline crashed after L2 and swallowed the error. The negative assertion (not.toHaveBeenCalled) passes in all three scenarios: correct gate logic, incorrect gate logic with a crash, and incorrect gate logic that simply never reaches L3. A complementary test with gate.proceed: true that confirms L3 is called is required to make the negative test meaningful — without it the security-relevant early-exit path is unverified.

Suggestion: Add a paired positive test: override the AIQuickScan mock for that single test to return gate: { proceed: true }, run the pipeline, and assert aiDeepReview.analyze was called. Only with both the positive and negative cases does the gate boundary have genuine test coverage.

Lens: test-quality | Confidence: 90%

test: add comprehensive pipeline layer tests

bb39bb3

Demo scenario: meta/hollow-test-suite

ArturSkowronski commented Apr 18, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test: add comprehensive pipeline layer tests#3

test: add comprehensive pipeline layer tests#3
ArturSkowronski wants to merge 1 commit intomasterfrom
demo/meta/hollow-test-suite

ArturSkowronski commented Apr 18, 2026

Uh oh!

github-actions bot commented Apr 18, 2026

Uh oh!

ArturSkowronski commented Apr 18, 2026

Uh oh!

ArturSkowronski Apr 18, 2026

Uh oh!

ArturSkowronski Apr 18, 2026

Uh oh!

ArturSkowronski Apr 18, 2026

Uh oh!

ArturSkowronski Apr 18, 2026

Uh oh!

ArturSkowronski Apr 18, 2026

Uh oh!

ArturSkowronski Apr 18, 2026

Uh oh!

ArturSkowronski Apr 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ArturSkowronski commented Apr 18, 2026

Summary

Changes

Uh oh!

github-actions bot commented Apr 18, 2026

🔍 VCR Code Review

Uh oh!

ArturSkowronski commented Apr 18, 2026

🔍 VCR Review — Side by Side

Findings by Layer

Uh oh!

ArturSkowronski Apr 18, 2026

Choose a reason for hiding this comment

Uh oh!

ArturSkowronski Apr 18, 2026

Choose a reason for hiding this comment

Uh oh!

ArturSkowronski Apr 18, 2026

Choose a reason for hiding this comment

Uh oh!

ArturSkowronski Apr 18, 2026

Choose a reason for hiding this comment

Uh oh!

ArturSkowronski Apr 18, 2026

Choose a reason for hiding this comment

Uh oh!

ArturSkowronski Apr 18, 2026

Choose a reason for hiding this comment

Uh oh!

ArturSkowronski Apr 18, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant