Fix Windows backslash path separators in JSON output and diagnostics#354

Merged: cscheid merged 6 commits into main from bugfix/bd-dff27o04-windows-json-writer-emits on Jul 1, 2026
Conversation

@cderv cderv commented Jul 1, 2026

Member

On Windows, pampa's JSON writer and ariadne diagnostics show backslash-separated file paths, diverging from the forward-slash convention used everywhere else in the codebase (quarto_util::to_forward_slashes, already used for Lua/HTML/DocumentProfile paths) and from insta snapshots recorded on Unix. TypeScript Quarto normalizes the same way via pathWithForwardSlashes for the same reason.

Root Cause

ASTContext::with_filename stores whatever filename string it's handed verbatim. On Windows, callers that derive the filename from a real path (glob() results in the snapshot-test harness, CLI arguments) produce backslash-separated strings, which the JSON writer and SourceContext-based diagnostics then read back out unchanged.

Fix

Normalizes at the ASTContext ingress points (with_filename, add_filename), which covers the JSON writer and diagnostics built from SourceContext for free. Two more un-normalized ingress points surfaced during end-to-end verification and review, and got the same treatment:

  • main.rs's ad hoc fallback SourceContext on the hard-parse-error CLI path, built directly from the raw CLI argument, bypassing ASTContext entirely.
  • The qmd and commonmark readers each re-add the filename into context.source_context right after with_filename had already normalized it — for qmd this fully replaced the normalized entry, so warnings on an otherwise-successful parse still showed backslashes.

Left main.rs's "Missing Newline at End of File" CLI warning alone — it interpolates the raw CLI argument into a plain sentence, echoing the user's own typed path back to them rather than emitting a portable identifier, which matches how rustc/cat handle this.
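
A minimal sketch of what normalizing at the ingress points could look like. The `to_forward_slashes` function below is a hypothetical stand-in for `quarto_util::to_forward_slashes` (its real signature is not shown in this PR), and `AstContext` is a simplified model of `ASTContext` that keeps only a filename list.

```rust
/// Hypothetical stand-in for `quarto_util::to_forward_slashes`.
/// Assumption: a plain character replacement is enough for illustration.
fn to_forward_slashes(path: &str) -> String {
    path.replace('\\', "/")
}

/// Simplified model of `ASTContext`; only the filename list is represented.
#[derive(Debug, Default)]
struct AstContext {
    filenames: Vec<String>,
}

impl AstContext {
    /// Ingress point 1: create a context from a single filename.
    fn with_filename(filename: impl AsRef<str>) -> Self {
        let mut ctx = Self::default();
        ctx.add_filename(filename);
        ctx
    }

    /// Ingress point 2: register another filename and return its index.
    /// Normalizing here means every later reader sees forward slashes.
    fn add_filename(&mut self, filename: impl AsRef<str>) -> usize {
        self.filenames.push(to_forward_slashes(filename.as_ref()));
        self.filenames.len() - 1
    }
}

fn main() {
    let ctx = AstContext::with_filename(r"tests\snapshots\doc.qmd");
    println!("{}", ctx.filenames[0]); // prints "tests/snapshots/doc.qmd"
}
```

Because the stored string is already normalized, anything that reads it back later (a JSON writer, a diagnostic renderer) needs no separator handling of its own.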

Test Plan

  • Confirm astContext.files[].name in JSON output uses forward slashes for a backslash-separated input path
  • Confirm ariadne diagnostics (both the hard-parse-error path and warnings on a successful parse) show forward-slash paths on Windows
  • Confirm the full pampa crate test suite passes (aside from the pre-existing, unrelated CRLF byte-offset failures)

cderv added 6 commits July 1, 2026 14:40
Confirms root cause: ASTContext::with_filename stores raw filename
strings with no normalization; both JSON writer emission sites read
them verbatim. Single-point fix at with_filename ingress using the
existing quarto_util::to_forward_slashes helper.
…-dff27o04)

Windows callers pass backslash-separated paths into ASTContext::with_filename
(from glob() results, CLI args, etc). The JSON writer and ariadne diagnostics
both read that string verbatim, so output diverges from insta snapshots
recorded on Unix and from the forward-slash convention used everywhere else
in the codebase (quarto_util::to_forward_slashes, already used for Lua/HTML/
DocumentProfile paths). TypeScript Quarto normalizes the same way via
pathWithForwardSlashes for the same reason.

Normalizes at with_filename/add_filename (the ASTContext ingress point,
covers the JSON writer and SourceContext-based diagnostics for free), plus a
second ingress point found during end-to-end verification: main.rs's ad hoc
fallback SourceContext on the hard-parse-error path, which bypassed
ASTContext entirely. Left the "Missing Newline" CLI warning alone since it
echoes the user's own typed path back to them rather than emitting a
portable identifier.

roborev review (job 1726) caught that both readers rebuild or re-add to
context.source_context using the raw filename right after
ASTContext::with_filename already normalized it — so a successful parse
that emits warnings still rendered ariadne diagnostics with Windows
backslashes, even though the earlier fix covered the JSON writer and the
hard-parse-error fallback in main.rs.

qmd.rs's case was live: it fully replaces context.source_context, so the
normalized entry with_filename created was discarded. commonmark.rs's case
was latent (a second, unused add_file call creates a dead entry at index 1
since FileId(0) is hardcoded) but fixed anyway for consistency, since
leaving raw backslashes there is a footgun if that assumption ever changes.

Both now reuse context.filenames[0], the value with_filename already
normalized, rather than re-deriving from the raw filename argument.
roborev review (job 1730) caught that commonmark_reader_source_context_uses_forward_slashes
asserted FileId(0), but ASTContext::with_filename populates FileId(0)
internally before commonmark::read's own add_file call ever runs — the
entry that call actually writes is FileId(1). The test passed regardless
of whether that call normalized its filename, giving zero protection for
the fix in f5a93fa. Verified by reverting just that one line and
confirming the test still passed; asserting FileId(1) instead now fails
correctly when reverted.
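
Why asserting on index 0 gave no protection can be illustrated with a small model. The function `commonmark_like_read` and its `normalize_in_reader` flag are hypothetical; the flag simulates reverting the one-line fix, and a plain `Vec<String>` stands in for the file table indexed by `FileId`.

```rust
/// Hypothetical stand-in for `quarto_util::to_forward_slashes`.
fn to_forward_slashes(path: &str) -> String {
    path.replace('\\', "/")
}

/// Models the file table after a commonmark-style read.
/// Index 0 is written by the `with_filename` step; index 1 is written by the
/// reader's own `add_file` call. `normalize_in_reader = false` simulates
/// reverting the fix in the reader.
fn commonmark_like_read(raw: &str, normalize_in_reader: bool) -> Vec<String> {
    // Entry 0: always normalized by the `with_filename` step.
    let mut files = vec![to_forward_slashes(raw)];
    // Entry 1: the reader's own call, which is what the fix actually changed.
    let second = if normalize_in_reader {
        files[0].clone()
    } else {
        raw.to_string()
    };
    files.push(second);
    files
}

fn main() {
    let raw = r"docs\page.md";
    let fixed = commonmark_like_read(raw, true);
    let reverted = commonmark_like_read(raw, false);

    // An assertion on index 0 passes in both cases, so it cannot detect a revert.
    assert_eq!(fixed[0], "docs/page.md");
    assert_eq!(reverted[0], "docs/page.md");

    // An assertion on index 1 distinguishes the two.
    assert_eq!(fixed[1], "docs/page.md");
    assert_ne!(reverted[1], "docs/page.md");
}
```
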
@cderv cderv marked this pull request as ready for review July 1, 2026 16:02
@cderv cderv requested a review from cscheid July 1, 2026 16:02
cderv commented Jul 1, 2026

Member Author

@cscheid, asking for review to check that this is the right approach: normalize all paths to '/'.

I think this is the right call to make our life easier.

@cscheid cscheid left a comment

Member

Looks good to me, excellent. I'm slightly terrified of the following question but I will ask it anyway, ha: are forward slashes valid characters in windows file names? I love the idea of normalizing everything to / at the deepest level of filename resolution, but I'm genuinely unsure what the actual rules for filenames on windows are.

@cscheid cscheid merged commit 0efb27e into main Jul 1, 2026
5 checks passed
@cscheid cscheid deleted the bugfix/bd-dff27o04-windows-json-writer-emits branch July 1, 2026 16:25
cderv commented Jul 1, 2026

Member Author

Good question! Here is what I remember from last time I checked this:

/ is actually one of the reserved characters that Windows forbids inside an individual file or directory name component (my go-to page is Naming Files, Paths, and Namespaces), so it can never collide with a legitimate filename.

As a path separator though, / is accepted interchangeably with \ by ordinary Windows path parsing. This is true at the Win32 level and in PowerShell (which explicitly documents allowing either for cross-platform compatibility). I also checked in Rust, and std::path works that way.
So normalizing our output to / is safe: nothing we emit could get misinterpreted if fed back into a real Windows path API.

However, and this is the big BUT: under the \\?\ verbatim prefix, used sometimes for long absolute paths, / stops being treated as a separator and becomes a literal character instead. I think that prefix is only relevant to raw OS-API calls though, not to what this PR touches, so we're not at risk here.

Remote paths on network drives (UNC paths) accept // in place of \\ as well; the same interchangeability rule applies there.
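
For reference, a small sketch of how `std::path` splits a mixed-separator string. The helper `component_names` is a hypothetical name introduced for illustration. On Windows both `/` and `\` act as separators; on Unix only `/` does, so a backslash stays inside a single component.

```rust
use std::path::{Component, Path};

/// Collects the ordinary (non-root, non-prefix) components of a path string.
fn component_names(path: &str) -> Vec<String> {
    Path::new(path)
        .components()
        .filter_map(|c| match c {
            Component::Normal(s) => Some(s.to_string_lossy().into_owned()),
            _ => None,
        })
        .collect()
}

fn main() {
    let parts = component_names(r"docs/chapters\intro.qmd");
    if cfg!(windows) {
        // Both separators are recognized on Windows.
        assert_eq!(parts, ["docs", "chapters", "intro.qmd"]);
    } else {
        // On Unix the backslash is an ordinary filename character.
        assert_eq!(parts, ["docs", r"chapters\intro.qmd"]);
    }
    println!("{:?}", parts);
}
```

The platform difference is one reason a string-level normalization of stored filenames is useful here: output produced on Windows then matches snapshots recorded on Unix without relying on host-specific path parsing.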
