Skip to content

Proposal: Diagnostic Reporting System#1013

Closed
Alan-Jowett wants to merge 1 commit intovbpf:mainfrom
Alan-Jowett:proposal/diagnostic-reporting
Closed

Proposal: Diagnostic Reporting System#1013
Alan-Jowett wants to merge 1 commit intovbpf:mainfrom
Alan-Jowett:proposal/diagnostic-reporting

Conversation

@Alan-Jowett
Copy link
Copy Markdown
Contributor

@Alan-Jowett Alan-Jowett commented Feb 18, 2026

Summary

This PR proposes a Diagnostic Reporting System to address issues #995 and #1010.

Documents

Problem Statement

Current diagnostic output has two extremes:

Large BPF programs (10k+ instructions) need intermediate verbosity options (#995).

Proposed Solution

Verbosity Hierarchy

Flag Blocks shown Use case
(default) Error only Quick check
--slice\ ~10-50 instructions Understand single failure
--reachable\ ~100s instructions Debug complex control flow
-v\ ALL (fixed to not stop at error) Full audit

Output Formats

  • --format=text\ (default)
  • --format=json\ (machine-readable)
  • --format=sarif\ (IDE integration)
  • --format=dot\ (CFG visualization)

Phased Implementation

  1. Phase 1 (MVP): Fix -v, add --slice\ and --reachable\ (~2-4 weeks)
  2. Phase 2: JSON/SARIF output (~2-3 weeks)
  3. Phase 3: Advanced filtering (~4-6 weeks)
  4. Phase 4: Interactive/server mode (future)

Related Issues

Discussion

Looking for feedback on:

  1. Is the verbosity hierarchy intuitive?
  2. Are the use cases representative?
  3. Is the MVP scope (Phase 1) right?
  4. Any requirements missing?

Summary by CodeRabbit

  • Documentation
    • Added diagnostic reporting system specification outlining feature requirements, data models, and user interfaces
    • Added implementation roadmap with phased approach including core diagnostics, structured outputs, advanced filtering, and interactive modes

This proposal addresses issues vbpf#995 and vbpf#1010 by defining a comprehensive
diagnostic reporting system with:

- Verbosity levels: --quiet, --slice, --reachable, -v
- Structured output formats: text, JSON, SARIF, DOT
- Interactive modes: REPL and server (future)

Documents:
- diagnostic-reporting-spec.md: Feature specification with requirements
- diagnostic-reporting-roadmap.md: Phased implementation plan

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Alan Jowett <alan.jowett@microsoft.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Feb 18, 2026

Walkthrough

Two new documentation files introduce a comprehensive diagnostic reporting system specification and phased implementation roadmap for Prevail, addressing enhanced diagnostic output modes, structured machine-readable formats, filtering capabilities, and interactive features across four sequential phases.

Changes

Cohort / File(s) Summary
Diagnostic System Documentation
docs/proposals/diagnostic-reporting-spec.md, docs/proposals/diagnostic-reporting-roadmap.md
Added specification and phased roadmap for Diagnostic Reporting System. Spec defines problem statement, use cases, functional/non-functional requirements, data model, error codes, UI interfaces, architecture components, and success metrics. Roadmap outlines four phases (core modes, structured formats, advanced filtering, interactive mode) with explicit goals, deliverables, acceptance tests, dependencies, risks, and effort estimates.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly and clearly identifies the main change: introduction of a diagnostic reporting system proposal with two specification documents.
Linked Issues check ✅ Passed The PR documents address both linked issues: #995 (failure slice verbosity mode with minimal semantic slices) and #1010 (invariant printing for loop predecessors) through comprehensive specifications and a phased implementation roadmap.
Out of Scope Changes check ✅ Passed Both added documents are entirely within scope: the spec defines requirements and use cases for issues #995 and #1010, and the roadmap outlines implementation phases directly tied to those objectives.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@elazarg
Copy link
Copy Markdown
Collaborator

elazarg commented Feb 18, 2026

Why is it better for the plan to be in the docs rather than in issues / disucssions?

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/proposals/diagnostic-reporting-roadmap.md`:
- Line 54: The acceptance criterion "check --slice produces ≤ 50 instructions
for typical failures" is ambiguous; update the document to replace "typical"
with a concrete, deterministic test fixture and expected assertion: add a named
example BPF program (e.g., "example-bpf-unbounded-loop.bpf") that reproduces the
failure, include the exact command to run (using check --slice against that
fixture), and state the deterministic assertion that the output slice contains ≤
50 instructions for that fixture; also add guidance to include the fixture under
tests/fixtures and a CI test that runs check --slice on the named fixture and
fails if the instruction count exceeds 50.
- Around line 197-206: Update the roadmap so Phase 3’s dependency targets Phase
1 instead of Phase 2 and add a language specifier to the fenced diagram block:
change the dependency text that currently reads "Phase 3 depends on Phase 2
(filtering builds on structured output)" to "Phase 3 depends on Phase 1
(filtering and DOT output build on slice/CFG infrastructure from Phase 1; can
proceed in parallel with Phase 2)" and ensure the fenced code block containing
the ASCII diagram (the block starting with ``` on the diagram) includes a
language tag (e.g., ```text) to satisfy markdownlint MD040; make these edits in
the document sections that reference Phase 2/Phase 3 dependencies and the ASCII
diagram.

In `@docs/proposals/diagnostic-reporting-spec.md`:
- Around line 165-169: The spec uses inconsistent slice-size bounds for the
--slice option (diagram showing "~20 insns", UC-1 acceptance criteria stating "≤
50 instructions", and the CLI table showing "~10-50 instructions"); unify these
by choosing one authoritative bound (e.g., "≈20 instructions" or "≤50
instructions") and update all references to --slice across the document (the
verbosity hierarchy diagram, UC-1 acceptance criteria, and the CLI comparison
table) so the same phrasing and numeric bound is used consistently.
- Around line 165-169: Add explicit fenced code block language specifiers and
ensure blank-line padding around the blocks flagged by markdownlint: for the
"verbosity hierarchy ASCII art" block (the block showing --quiet ⊂ ... ALL
insns) add ```text before and ``` after with a blank line above and below; for
the CLI usage block around line 316 add ```text; for the example output blocks
around lines 353 and 359 add ```text; for the JSON example around line 374 add
```json; and for the architecture diagram around line 391 add ```text. Locate
these blocks by searching for the ASCII art and example headings in
diagnostic-reporting-spec.md and update each fenced block header and surrounding
blank lines accordingly.
- Around line 160-161: Add explicit behavior for the --slice and --reachable
flags (FR-1.3 and FR-1.4) when verification passes: update
docs/proposals/diagnostic-reporting-spec.md to either add a new FR or a CLI note
stating that these flags are no-ops on successful verification and will emit the
standard success output (e.g., a success message/exit code and optional empty
JSON array/object), or specify the chosen behavior consistently (e.g., "if no
error, --slice/--reachable produce an empty JSON result and exit 0"). Reference
the flag names (--slice, --reachable) and FR IDs (FR-1.3, FR-1.4) in the change
so callers know the exact output and exit semantics.
- Around line 360-371: The slice example is inconsistent: fix the abstract trace
so the path to instruction 184 is reachable and the registers are unambiguous —
ensure the branch at instruction 180 (the `if r2 > r3 goto 190` check) can
evaluate false under the shown invariants (e.g., make r3's range include values
≥80 or reduce r2), and explicitly show how r0 is loaded at 170 (clarify r1 -> r0
assignment and annotate r0 with the loaded shared_region_size or appropriate
pointer/size range) so the access at 184 (`*(r0 + 80)`) follows from the
recorded r0 value; update the state lines for r0 and r3 and the failure message
(`Upper bound must be at most r0.shared_region_size`) to match the corrected
concrete abstract values and control-flow.
- Line 162: FR-1.5 and NFR-4 contradict: FR-1.5 mandates fixing -v/--verbose to
show all blocks (no stop-at-error) while NFR-4 currently requires preserving
existing -v behavior (the bug). Update NFR-4 to clarify that the flag name and
backward-compatible surface (flag name and basic semantics) are preserved but
explicitly allow the FR-1.5 bugfix to change the stop-at-error behavior; e.g.,
reword NFR-4 to state "Preserve the -v/--verbose flag name and general semantics
for compatibility, but permit the change mandated by FR-1.5 so that -v/--verbose
shows all blocks and does not stop at errors." Reference FR-1.5 and NFR-4 when
making the rewording to ensure no conflict remains.
- Line 217: The NFR-1 requirement is incorrect: computing a backward slice
cannot be guaranteed O(slice_size) because traversal of the dependency graph may
require examining edges proportional to program_size; update the NFR text for
clarity by replacing the absolute claim with a realistic complexity bound or
qualifier — e.g., change "SHALL complete in O(slice_size), not O(program_size)"
to something like "SHALL aim for time proportional to the reachable dependency
graph of the error node (bounded by O(program_size) worst-case), and
implementations SHOULD optimize for cases where the reachable subgraph is small
(practical cost ≈ O(slice_size) when indices/caching/incremental invariants are
available)"; reference NFR-1, the terms slice_size and program_size, and include
an explicit note that worst-case complexity is O(program_size) unless additional
indexing or incremental computation is assumed.
- Around line 247-256: The interface named Error shadows the global TypeScript
Error class; rename this type (for example to VerificationError or
DiagnosticError) and update all references to that symbol in the spec and any
related code (e.g., where Error is used in arrays, function signatures, or
types) so consumers don't collide with the built-in Error; ensure the new name
replaces the interface declaration (originally declared as interface Error) and
any usages such as error lists, serializers, or type annotations to maintain
consistency.
- Line 182: FR-2.4's wording is ambiguous; update the spec table entry labeled
"FR-2.4" to replace "Reachable mode SHALL compute forward CFG reachability to
error" with the clearer phrasing "Compute forward CFG reachability from each
block to the error location" (or equivalent wording that explicitly states
forward reachability from blocks toward the error), ensuring consistency with
the roadmap language and avoiding the backward/forward terminology confusion.


- [ ] `check -v` on loop failure shows ALL blocks including those after error PC
- [ ] `check --reachable` shows only blocks that can reach error
- [ ] `check --slice` produces ≤ 50 instructions for typical failures
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Acceptance test criterion is untestable as written.

check --slice produces ≤ 50 instructions for typical failures — "typical" is undefined. Add a concrete test fixture (e.g., a specific BPF program with a known failure) so the acceptance test is deterministic and repeatable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/proposals/diagnostic-reporting-roadmap.md` at line 54, The acceptance
criterion "check --slice produces ≤ 50 instructions for typical failures" is
ambiguous; update the document to replace "typical" with a concrete,
deterministic test fixture and expected assertion: add a named example BPF
program (e.g., "example-bpf-unbounded-loop.bpf") that reproduces the failure,
include the exact command to run (using check --slice against that fixture), and
state the deterministic assertion that the output slice contains ≤ 50
instructions for that fixture; also add guidance to include the fixture under
tests/fixtures and a CI test that runs check --slice on the named fixture and
fails if the instruction count exceeds 50.

Comment on lines +197 to +206
```
Phase 1 ─────► Phase 2 ─────► Phase 3
Phase 4
```

- Phase 2 depends on Phase 1 (output infrastructure)
- Phase 3 depends on Phase 2 (filtering builds on structured output)
- Phase 4 depends on Phase 3 (interactive mode uses all filtering capabilities)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Phase 3's stated dependency on Phase 2 is incorrect and introduces unnecessary delay.

The dependency rationale "filtering builds on structured output" does not hold. Phase 3's deliverables — enhanced backward slice through stack slots, register filter (--filter=r3), label range filter (--range=100:200), and DOT format — all operate on internal CFG/slice data structures already built in Phase 1. None of them require Phase 2's JSON/SARIF output infrastructure. This sequencing forces 2–3 unnecessary weeks of wait time before Phase 3 work can start in parallel with Phase 2.

Also, the fenced code block at line 197 is missing a language specifier, which triggers a markdownlint MD040 warning.

📐 Proposed fix
-```
+```text
 Phase 1 ─────► Phase 2 ─────► Phase 3
                               │
                               ▼
                            Phase 4
-```
+```

-- Phase 2 depends on Phase 1 (output infrastructure)
-- Phase 3 depends on Phase 2 (filtering builds on structured output)
-- Phase 4 depends on Phase 3 (interactive mode uses all filtering capabilities)
+- Phase 2 depends on Phase 1 (output infrastructure)
+- Phase 3 depends on Phase 1 (filtering and DOT output build on slice/CFG infrastructure from Phase 1; can proceed in parallel with Phase 2)
+- Phase 4 depends on Phase 3 (interactive mode uses all filtering capabilities)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
```
Phase 1 ─────► Phase 2 ─────► Phase 3
Phase 4
```
- Phase 2 depends on Phase 1 (output infrastructure)
- Phase 3 depends on Phase 2 (filtering builds on structured output)
- Phase 4 depends on Phase 3 (interactive mode uses all filtering capabilities)
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 197-197: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/proposals/diagnostic-reporting-roadmap.md` around lines 197 - 206,
Update the roadmap so Phase 3’s dependency targets Phase 1 instead of Phase 2
and add a language specifier to the fenced diagram block: change the dependency
text that currently reads "Phase 3 depends on Phase 2 (filtering builds on
structured output)" to "Phase 3 depends on Phase 1 (filtering and DOT output
build on slice/CFG infrastructure from Phase 1; can proceed in parallel with
Phase 2)" and ensure the fenced code block containing the ASCII diagram (the
block starting with ``` on the diagram) includes a language tag (e.g., ```text)
to satisfy markdownlint MD040; make these edits in the document sections that
reference Phase 2/Phase 3 dependencies and the ASCII diagram.

Comment on lines +160 to +161
| FR-1.3 | System SHALL support `--slice` mode (backward slice to error) | P1 |
| FR-1.4 | System SHALL support `--reachable` mode (blocks from which error is reachable) | P1 |
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

--slice and --reachable behavior is undefined when verification passes.

No use case or requirement specifies what these flags do when the program verifies successfully. Should they emit a success message and exit? Silently succeed? Emit an empty JSON object? This needs an explicit FR or a note in the CLI section (e.g., "if no error, flags are no-ops and standard success output is emitted").

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/proposals/diagnostic-reporting-spec.md` around lines 160 - 161, Add
explicit behavior for the --slice and --reachable flags (FR-1.3 and FR-1.4) when
verification passes: update docs/proposals/diagnostic-reporting-spec.md to
either add a new FR or a CLI note stating that these flags are no-ops on
successful verification and will emit the standard success output (e.g., a
success message/exit code and optional empty JSON array/object), or specify the
chosen behavior consistently (e.g., "if no error, --slice/--reachable produce an
empty JSON result and exit 0"). Reference the flag names (--slice, --reachable)
and FR IDs (FR-1.3, FR-1.4) in the change so callers know the exact output and
exit semantics.

| FR-1.2 | System SHALL support default mode (error + location) | P1 |
| FR-1.3 | System SHALL support `--slice` mode (backward slice to error) | P1 |
| FR-1.4 | System SHALL support `--reachable` mode (blocks from which error is reachable) | P1 |
| FR-1.5 | System SHALL fix `-v/--verbose` to show ALL blocks (not stop at error) | P1 |
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

NFR-4 directly contradicts FR-1.5 — one of them preserves the bug being fixed.

FR-1.5 (line 162): "System SHALL fix -v/--verbose to show ALL blocks (not stop at error)"
NFR-4 (line 220): "Existing -v behavior SHALL be preserved for compatibility"

The existing -v behavior IS the bug documented in #1010. Preserving it violates FR-1.5. NFR-4's intent is presumably to preserve the flag name and semantics (not rename -v or drop it), not to freeze the buggy stop-at-error behavior. Rephrase NFR-4 to remove the contradiction:

📐 Proposed fix
-| NFR-4 | Existing `-v` behavior SHALL be preserved for compatibility | P1 |
+| NFR-4 | The `-v`/`--verbose` flag SHALL be retained (not renamed or removed); its corrected behavior (all blocks shown) is the new contract | P1 |

Also applies to: 220-220

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/proposals/diagnostic-reporting-spec.md` at line 162, FR-1.5 and NFR-4
contradict: FR-1.5 mandates fixing -v/--verbose to show all blocks (no
stop-at-error) while NFR-4 currently requires preserving existing -v behavior
(the bug). Update NFR-4 to clarify that the flag name and backward-compatible
surface (flag name and basic semantics) are preserved but explicitly allow the
FR-1.5 bugfix to change the stop-at-error behavior; e.g., reword NFR-4 to state
"Preserve the -v/--verbose flag name and general semantics for compatibility,
but permit the change mandated by FR-1.5 so that -v/--verbose shows all blocks
and does not stop at errors." Reference FR-1.5 and NFR-4 when making the
rewording to ensure no conflict remains.

Comment on lines +165 to +169
```
--quiet ⊂ (default) ⊂ --slice ⊂ --reachable ⊂ -v
│ │ │ │ │
exit code error msg ~20 insns ~100s insns ALL insns
```
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Slice size estimates are inconsistent across the spec.

The verbosity hierarchy diagram (line 168) states ~20 insns for --slice, UC-1's acceptance criteria (line 75) states ≤ 50 instructions, and the CLI comparison table (line 344) states ~10-50 instructions. Pick one authoritative bound and use it consistently throughout.

🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 165-165: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 165-165: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/proposals/diagnostic-reporting-spec.md` around lines 165 - 169, The spec
uses inconsistent slice-size bounds for the --slice option (diagram showing "~20
insns", UC-1 acceptance criteria stating "≤ 50 instructions", and the CLI table
showing "~10-50 instructions"); unify these by choosing one authoritative bound
(e.g., "≈20 instructions" or "≤50 instructions") and update all references to
--slice across the document (the verbosity hierarchy diagram, UC-1 acceptance
criteria, and the CLI comparison table) so the same phrasing and numeric bound
is used consistently.

⚠️ Potential issue | 🟡 Minor

Multiple fenced code blocks are missing language specifiers (and some blank-line padding).

markdownlint MD040/MD031 warnings at lines 165, 316, 353, 359, 374, and 391. Suggested language tags:

  • Line 165 (verbosity hierarchy ASCII art): text
  • Line 316 (CLI usage block): text
  • Lines 353, 359 (example output blocks): text
  • Line 374 (JSON example): json
  • Line 391 (architecture diagram): text

Also applies to: 316-339, 353-371, 374-385, 391-433

🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 165-165: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 165-165: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/proposals/diagnostic-reporting-spec.md` around lines 165 - 169, Add
explicit fenced code block language specifiers and ensure blank-line padding
around the blocks flagged by markdownlint: for the "verbosity hierarchy ASCII
art" block (the block showing --quiet ⊂ ... ALL insns) add ```text before and
``` after with a blank line above and below; for the CLI usage block around line
316 add ```text; for the example output blocks around lines 353 and 359 add
```text; for the JSON example around line 374 add ```json; and for the
architecture diagram around line 391 add ```text. Locate these blocks by
searching for the ASCII art and example headings in diagnostic-reporting-spec.md
and update each fenced block header and surrounding blank lines accordingly.

| FR-2.1 | Slice mode SHALL include only instructions with data dependency to error | P1 |
| FR-2.2 | Slice mode SHALL track dependencies through registers R0-R10 | P1 |
| FR-2.3 | Slice mode SHALL track dependencies through stack memory | P2 |
| FR-2.4 | Reachable mode SHALL compute forward CFG reachability to error | P1 |
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

FR-2.4 terminology is imprecise and conflicts with standard graph theory.

"Forward CFG reachability to error" is ambiguous. Standard terminology: finding nodes that can reach a target = backward reachability from the target (traverse edges in reverse). The roadmap uses the clearer formulation: "Compute forward CFG reachability from each block to error location." Align the spec with that phrasing to avoid implementation confusion.

📐 Proposed fix
-| FR-2.4 | Reachable mode SHALL compute forward CFG reachability to error | P1 |
+| FR-2.4 | Reachable mode SHALL compute, for each block, whether a forward CFG path to the error location exists (equivalently: backward reachability from the error node) | P1 |
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
| FR-2.4 | Reachable mode SHALL compute forward CFG reachability to error | P1 |
| FR-2.4 | Reachable mode SHALL compute, for each block, whether a forward CFG path to the error location exists (equivalently: backward reachability from the error node) | P1 |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/proposals/diagnostic-reporting-spec.md` at line 182, FR-2.4's wording is
ambiguous; update the spec table entry labeled "FR-2.4" to replace "Reachable
mode SHALL compute forward CFG reachability to error" with the clearer phrasing
"Compute forward CFG reachability from each block to the error location" (or
equivalent wording that explicitly states forward reachability from blocks
toward the error), ensuring consistency with the roadmap language and avoiding
the backward/forward terminology confusion.


| ID | Requirement | Priority |
|----|-------------|----------|
| NFR-1 | Slice computation SHALL complete in O(slice_size), not O(program_size) | P1 |
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

NFR-1's O(slice_size) complexity claim is algorithmically infeasible as written.

"Slice computation SHALL complete in O(slice_size), not O(program_size)"

Even post-hoc (with invariants already computed), computing a backward slice requires traversing the dependency graph to determine which instructions ARE and ARE NOT in the slice. This traversal is bounded by the number of reachable edges from the error node — O(program_size) in the worst case, independent of the output size. You cannot know what's excluded from the slice without examining what's included. The requirement conflates output size with computation cost.

📐 Proposed fix
-| NFR-1 | Slice computation SHALL complete in O(slice_size), not O(program_size) | P1 |
+| NFR-1 | Slice computation SHALL be bounded by the reachable dependency graph from the error site; output generation SHALL be proportional to slice_size | P1 |
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
| NFR-1 | Slice computation SHALL complete in O(slice_size), not O(program_size) | P1 |
| NFR-1 | Slice computation SHALL be bounded by the reachable dependency graph from the error site; output generation SHALL be proportional to slice_size | P1 |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/proposals/diagnostic-reporting-spec.md` at line 217, The NFR-1
requirement is incorrect: computing a backward slice cannot be guaranteed
O(slice_size) because traversal of the dependency graph may require examining
edges proportional to program_size; update the NFR text for clarity by replacing
the absolute claim with a realistic complexity bound or qualifier — e.g., change
"SHALL complete in O(slice_size), not O(program_size)" to something like "SHALL
aim for time proportional to the reachable dependency graph of the error node
(bounded by O(program_size) worst-case), and implementations SHOULD optimize for
cases where the reachable subgraph is small (practical cost ≈ O(slice_size) when
indices/caching/incremental invariants are available)"; reference NFR-1, the
terms slice_size and program_size, and include an explicit note that worst-case
complexity is O(program_size) unless additional indexing or incremental
computation is assumed.

Comment on lines +247 to +256
interface Error {
code: ErrorCode; // e.g., "E001_BOUNDS_CHECK"
severity: "error" | "warning";
location: Location;
message: string;
instruction: string;
pre_state?: AbstractState; // If verbose
assertion?: string; // What was checked
suggestion?: string; // How to fix
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Error interface name shadows the TypeScript built-in.

TypeScript (and JavaScript) has a global built-in Error class. Using interface Error for the verification error type will shadow it and create confusion for any implementer working from this schema.

📐 Proposed fix
-interface Error {
+interface VerificationError {
   code: ErrorCode;                    // e.g., "E001_BOUNDS_CHECK"
   severity: "error" | "warning";
   ...
 }

 interface VerificationResult {
   ...
-  errors: Error[];
+  errors: VerificationError[];
   ...
 }

 interface Slice {
-  error: Error;
+  error: VerificationError;
   ...
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/proposals/diagnostic-reporting-spec.md` around lines 247 - 256, The
interface named Error shadows the global TypeScript Error class; rename this
type (for example to VerificationError or DiagnosticError) and update all
references to that symbol in the spec and any related code (e.g., where Error is
used in arrays, function signatures, or types) so consumers don't collide with
the built-in Error; ensure the new name replaces the interface declaration
(originally declared as interface Error) and any usages such as error lists,
serializers, or type annotations to maintain consistency.

Comment on lines +360 to +371
program.o:184: error[E001]: Upper bound must be at most r0.shared_region_size

Failure trace (most recent last):
170: r0 = *(u64 *)(r1 + 0) ; r0: ptr(shared, offset=[0,∞))
175: r2 = 80 ; r2: num(80)
180: if r2 > r3 goto 190 ; r3: num([0,64]) ← bounds not checked
184: r1 = *(u64 *)(r0 + 80) ; ERROR: offset 80 exceeds max 64

State at error:
r0: ptr(shared, offset=[0,∞), size=[0,64])
r3: num([0,64])
```
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Slice example output is logically inconsistent.

The abstract state shown makes instruction 184 unreachable:

  • Instruction 175 sets r2 = 80 (constant)
  • Instruction 180 checks if r2 > r3 goto 190 with r3: num([0,64])
  • Since r2=80 and r3≤64, the condition r2 > r3 is always true → the branch to 190 is always taken in the abstract domain
  • The fall-through path to instruction 184 is therefore dead code under the shown invariants

The example also has a naming collision: r0.shared_region_size at instruction 170 is loading into r0 via r1, then at instruction 184 the access is *(r0 + 80) — but the slice doesn't show what r0 is at that point or how it was loaded. A corrected example should use abstract state values that are consistent with the path condition actually reaching the error instruction.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/proposals/diagnostic-reporting-spec.md` around lines 360 - 371, The
slice example is inconsistent: fix the abstract trace so the path to instruction
184 is reachable and the registers are unambiguous — ensure the branch at
instruction 180 (the `if r2 > r3 goto 190` check) can evaluate false under the
shown invariants (e.g., make r3's range include values ≥80 or reduce r2), and
explicitly show how r0 is loaded at 170 (clarify r1 -> r0 assignment and
annotate r0 with the loaded shared_region_size or appropriate pointer/size
range) so the access at 184 (`*(r0 + 80)`) follows from the recorded r0 value;
update the state lines for r0 and r3 and the failure message (`Upper bound must
be at most r0.shared_region_size`) to match the corrected concrete abstract
values and control-flow.

@Alan-Jowett
Copy link
Copy Markdown
Contributor Author

@elazarg moving it to the discussion. Will close this pr

@coveralls
Copy link
Copy Markdown

Pull Request Test Coverage Report for Build 22152365554

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 87.142%

Totals Coverage Status
Change from base Build 22081081213: 0.0%
Covered Lines: 9319
Relevant Lines: 10694

💛 - Coveralls

@Alan-Jowett Alan-Jowett deleted the proposal/diagnostic-reporting branch March 17, 2026 19:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Invariant output stops at first error, hiding loop predecessor state needed for debugging Add Intermediate Verbosity Mode for Minimal Failure Slices

3 participants