test(e2e): harden approved-real manifest boundaries #410
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -105,6 +105,93 @@ describe('chat-flow evidence manifest contract', () => { | |
| }); | ||
| }); | ||
|
|
||
| it('records observed-local rows as read-only and non-real', () => { | ||
| const manifest = buildChatFlowEvidenceManifest({ | ||
| scenario: 'desktop-observed-local', | ||
| surface: 'desktop', | ||
| dataSource: 'observed-hub-replay', | ||
| authExecution: 'local-only', | ||
| rows: [ | ||
| { | ||
| id: 'observed-edge-health', | ||
| claim: 'Local Edge was observed without model/API spend', | ||
| evidenceLevel: 'observed-local', | ||
| realTested: false, | ||
| status: 'passed', | ||
| command: 'pwsh ./scripts/smoke/verify-localhost-real-services.ps1', | ||
| }, | ||
| ], | ||
| }); | ||
|
|
||
| expect(manifest).toMatchObject({ | ||
| evidence_levels: ['observed-local'], | ||
| real_tested: false, | ||
| rows: [ | ||
| { | ||
| id: 'observed-edge-health', | ||
| evidence_level: 'observed-local', | ||
| real_tested: false, | ||
| }, | ||
| ], | ||
| }); | ||
| expect(validateChatFlowEvidenceManifest(manifest)).toEqual({ ok: true, errors: [] }); | ||
| }); | ||
|
|
||
| it('requires approved-real rows to carry approval and real login plus CLI/model evidence claims', () => { | ||
| const manifest = buildChatFlowEvidenceManifest({ | ||
| scenario: 'approved-real-missing-claims', | ||
| surface: 'desktop', | ||
| dataSource: 'approved-real-preflight', | ||
| authExecution: 'approved-real', | ||
| rows: [ | ||
| { | ||
| id: 'approved-real-row', | ||
| claim: 'Approved real path ran', | ||
| evidenceLevel: 'approved-real', | ||
| realTested: true, | ||
| status: 'passed', | ||
| command: 'pwsh ./scripts/verify/verify-approved-real-preflight.ps1 -ManifestPath approved.json', | ||
| }, | ||
| ], | ||
| }); | ||
|
|
||
| expect(validateChatFlowEvidenceManifest(manifest)).toEqual({ | ||
| ok: false, | ||
| errors: [ | ||
| 'approved-real-missing-claims row approved-real-row sets real_tested=true without approval_ref', | ||
| 'approved-real-missing-claims row approved-real-row sets real_tested=true without real_login claim', | ||
| 'approved-real-missing-claims row approved-real-row sets real_tested=true without real_cli_or_model claim', | ||
| ], | ||
| }); | ||
|
Comment on lines +158 to +165
📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win
Avoid pinning validator prose in these tests.
These assertions hard-code the full validation messages, so wording-only changes will fail the suite even when the contract is unchanged. Prefer stable machine-readable reasons/codes, or at least predicate-based checks for the relevant invariant instead of exact prose.
Also applies to: 221-223
Source: Coding guidelines
||
| }); | ||
|
|
||
| it('accepts approved-real rows only when the real evidence claims are explicit', () => { | ||
| const manifest = buildChatFlowEvidenceManifest({ | ||
| scenario: 'approved-real-gold-path', | ||
| surface: 'desktop', | ||
| dataSource: 'approved-real-preflight', | ||
| authExecution: 'approved-real', | ||
| rows: [ | ||
| { | ||
| id: 'approved-real-row', | ||
| claim: 'Approved real login and CLI/model path ran', | ||
| evidenceLevel: 'approved-real', | ||
| realTested: true, | ||
| status: 'passed', | ||
| command: 'pwsh ./scripts/smoke/verify-p0-approved-real-gold-path.ps1', | ||
| approvalRef: 'approval-2026-06-29-001', | ||
| claims: { | ||
| realLogin: true, | ||
| realCliOrModel: true, | ||
| }, | ||
| }, | ||
| ], | ||
| }); | ||
|
|
||
| expect(validateChatFlowEvidenceManifest(manifest)).toEqual({ ok: true, errors: [] }); | ||
| expect(manifest.real_tested).toBe(true); | ||
| }); | ||
|
|
||
| it('rejects packaged Desktop and release claims without matching evidence levels', () => { | ||
| const manifest = buildChatFlowEvidenceManifest({ | ||
| scenario: 'desktop-vite-chat-flow', | ||
|
|
@@ -131,8 +218,35 @@ describe('chat-flow evidence manifest contract', () => { | |
| ok: false, | ||
| errors: [ | ||
| 'desktop-vite-chat-flow row desktop-vite claims packaged Desktop without packaged-release evidence', | ||
| - 'desktop-vite-chat-flow row desktop-vite claims release upload without release evidence', | ||
| + 'desktop-vite-chat-flow row desktop-vite claims release upload without packaged-release evidence', | ||
| ], | ||
| }); | ||
| }); | ||
|
|
||
| it('keeps packaged Desktop and release upload claims on packaged-release evidence only', () => { | ||
| const manifest = buildChatFlowEvidenceManifest({ | ||
| scenario: 'desktop-packaged-release', | ||
| surface: 'desktop', | ||
| dataSource: 'approved-real-preflight', | ||
| authExecution: 'approved-real', | ||
| rows: [ | ||
| { | ||
| id: 'tauri-package', | ||
| claim: 'Tauri package policy and release dry gate passed', | ||
| evidenceLevel: 'packaged-release', | ||
| realTested: false, | ||
| status: 'passed', | ||
| command: 'pwsh ./scripts/release/verify-tauri-package-dry.ps1', | ||
| claims: { | ||
| packagedDesktop: true, | ||
| releaseUpload: true, | ||
| }, | ||
| }, | ||
| ], | ||
| }); | ||
|
|
||
| expect(validateChatFlowEvidenceManifest(manifest)).toEqual({ ok: true, errors: [] }); | ||
| expect(manifest.real_tested).toBe(false); | ||
| expect(manifest.evidence_levels).toEqual(['packaged-release']); | ||
| }); | ||
| }); | ||
📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win
Keep the failure assertions semantic, not prose-coupled.
These cases lock the suite to exact validator wording instead of the contract violation being reported. That makes harmless message edits look like regressions. As per coding guidelines, "**/*.{ts,tsx,js,jsx}: Do not write tests that merely replicate implementation branches, assert constant strings, hard-code error text as behavior, or mock the function under test itself."
Also applies to: 339-344
Source: Coding guidelines
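One way to act on both comments is sketched below. It assumes the validator could return structured violations with a stable `code` field instead of plain strings; the `Violation` type, the code names, `validateRows`, and `violationCodes` are all hypothetical stand-ins for illustration and are not part of `validateChatFlowEvidenceManifest` as it exists in this PR.

```typescript
// Hypothetical violation codes: names are assumptions, not the project's API.
type ViolationCode =
  | 'missing_approval_ref'
  | 'missing_real_login_claim'
  | 'missing_real_cli_or_model_claim';

interface Violation {
  code: ViolationCode; // stable, machine-readable part of the contract
  rowId: string;
  message: string; // human-readable prose, free to change without breaking tests
}

interface Row {
  id: string;
  realTested: boolean;
  approvalRef?: string;
  claims?: { realLogin?: boolean; realCliOrModel?: boolean };
}

interface ValidationResult {
  ok: boolean;
  errors: Violation[];
}

// Stand-in validator covering only the approved-real invariants from the tests above.
function validateRows(scenario: string, rows: Row[]): ValidationResult {
  const errors: Violation[] = [];
  for (const row of rows) {
    if (!row.realTested) continue; // non-real rows carry no approval requirements here
    const prefix = `${scenario} row ${row.id} sets real_tested=true without`;
    if (!row.approvalRef) {
      errors.push({ code: 'missing_approval_ref', rowId: row.id, message: `${prefix} approval_ref` });
    }
    if (!row.claims?.realLogin) {
      errors.push({ code: 'missing_real_login_claim', rowId: row.id, message: `${prefix} real_login claim` });
    }
    if (!row.claims?.realCliOrModel) {
      errors.push({ code: 'missing_real_cli_or_model_claim', rowId: row.id, message: `${prefix} real_cli_or_model claim` });
    }
  }
  return { ok: errors.length === 0, errors };
}

// Projects a result onto the stable part of the contract, ignoring message wording.
function violationCodes(result: ValidationResult): string[] {
  return result.errors.map((e) => `${e.rowId}:${e.code}`).sort();
}

// Same inputs as the "missing claims" and "gold path" cases in the diff.
const missingClaims = validateRows('approved-real-missing-claims', [
  { id: 'approved-real-row', realTested: true },
]);
const goldPath = validateRows('approved-real-gold-path', [
  {
    id: 'approved-real-row',
    realTested: true,
    approvalRef: 'approval-2026-06-29-001',
    claims: { realLogin: true, realCliOrModel: true },
  },
]);
```

With a shape like this, the failing case could assert `expect(violationCodes(result)).toEqual([...])` on the codes alone, so rewording a message would not fail the suite while dropping or adding an invariant still would. If the validator has to keep returning strings, a looser predicate on the row id and the missing field name would give a similar, if weaker, decoupling.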