docs(pacman): pin down delimiter selection and value trimming#13

Open
tylerbutler wants to merge 8 commits into main from docs/pacman-underspecifications

Conversation

@tylerbutler
Contributor

@tylerbutler tylerbutler commented Apr 30, 2026

Summary

The Pacman parser page described consume_until('=') and the raw return of consume_until_dedent at the pseudocode level, but left two concrete behaviors underspecified — gaps an independent Haskell port had to reverse-engineer from the shared fixtures. Adds two subsections between the pseudocode and the worked trace, each with the rule and fixture-derived examples.

Changes

  • Picking the = delimiter (Pacman page: specify the '=' delimiter selection rule #5): organized around the two delimiter modes. delimiter_first_equals (default, matching the OCaml reference) is just "first = wins" — the pseudocode is literal. delimiter_prefer_spaced prefers a = with actual space on both sides, falling back to first-=; under that strict definition, leading-= headings like == Section Header = fall through to first-= cleanly with no carve-out needed. Examples cover == Section Header = and a URL=URL key.
  • Value trimming (Pacman page: specify value trimming after consume_until_dedent #7): first-line lstrip (spaces + tabs), whole-value rstrip, preserve interior whitespace. Calls out the two ways a naive whole-value strip() corrupts multi-line values (eats the leading newline of a sub-block; shifts the first continuation's indent).
  • Notes → Multi-line keys: clarified that the resulting key is just the raw mouthful with edges trimmed — interior newlines and indentation are preserved verbatim, not folded.
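
The two delimiter modes in the first bullet can be sketched roughly as follows. This is a minimal Python sketch, not the page's pseudocode or the OCaml reference: the names `split_index` and `split_raw`, the `prefer_spaced` flag, and the URL-style input are hypothetical, and it assumes "spaced" means a literal space character on both sides. Trimming of the two halves is deliberately left out.

```python
def split_index(text: str, prefer_spaced: bool = False) -> int:
    """Return the index of the '=' to split on, or -1 if there is none.

    prefer_spaced=False models delimiter_first_equals: the first '=' wins.
    prefer_spaced=True models delimiter_prefer_spaced: the first '=' with a
    literal space on both sides, falling back to the first '='.
    """
    if prefer_spaced:
        # Start at 1 and stop before the last index so both neighbours exist;
        # start-of-input and end-of-input never count as a space.
        for i in range(1, len(text) - 1):
            if text[i] == "=" and text[i - 1] == " " and text[i + 1] == " ":
                return i
    return text.find("=")


def split_raw(text: str, prefer_spaced: bool = False) -> tuple[str, str]:
    """Split into (raw key, raw value) without trimming either side."""
    i = split_index(text, prefer_spaced)
    if i < 0:
        raise ValueError("no '=' in input")
    return text[:i], text[i + 1:]


header = "== Section Header ="
url = "http://x.test/?a=1 = value"  # hypothetical input, not a fixture

print(split_raw(header))                     # ('', '= Section Header =')
print(split_raw(header, prefer_spaced=True)) # ('', '= Section Header =')
print(split_raw(url))                        # ('http://x.test/?a', '1 = value')
print(split_raw(url, prefer_spaced=True))    # ('http://x.test/?a=1 ', ' value')
```

The header splits identically in both modes because its trailing `=` has end-of-input on the right, so only the URL-style input shows the modes diverging.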

Issue #6 originally proposed a "split on \n, strip each, drop empty, join with space" multi-line-key normalization. Verifying against the OCaml reference (printf 'my\n key\n= val' | dump.exe → key="my\n key", not "my key") showed that the rule isn't part of the canonical algorithm and that no implementation actually folds interior whitespace that way. The Pacman page now states the actual behavior plainly. Closing #6 with that clarification rather than a normative rule.
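
The difference between the two behaviors is small enough to show side by side. The sketch below uses Python's `str.strip()` as a stand-in for the reference's edge trim (an assumption; the two may differ on exotic whitespace), and `edge_trim` and `fold_key` are hypothetical names, the latter for the rule proposed in issue #6.

```python
def edge_trim(raw_key: str) -> str:
    """Trim only the outer edges; interior newlines and indentation survive."""
    return raw_key.strip()


def fold_key(raw_key: str) -> str:
    """The rule from issue #6: split on newline, strip each part, drop
    empties, join with a single space. Shown for contrast only."""
    parts = (part.strip() for part in raw_key.split("\n"))
    return " ".join(part for part in parts if part)


raw_key = "my\n key\n"  # everything before the '=' in 'my\n key\n= val'

print(repr(edge_trim(raw_key)))  # 'my\n key'
print(repr(fold_key(raw_key)))   # 'my key'
```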

Closes #5, closes #6, closes #7.

…ules

Closes #5, #6, #7. The Pacman page described `consume_until('=')`,
`key.strip()`, and the raw return of `consume_until_dedent` at the
pseudocode level, but the fixtures expect specific behavior in three
places that the page didn't pin down. Adds three subsections between
the pseudocode and the worked trace:

- "Picking the `=` delimiter" — the start-of-line override, then
  `delimiter_prefer_spaced`, then first-`=` fallback.
- "Multi-line key normalization" — split on `\n`, strip each part,
  drop empties, join with single space.
- "Value trimming" — first-line lstrip, whole-value rstrip, preserve
  interior whitespace.

Also tightens the existing Notes → Multi-line keys paragraph to point
at the new normalization subsection instead of implying the raw
mouthful is the final key.
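
Of the three rules listed in this commit message, value trimming is the one whose wording matches the PR summary unchanged. A minimal Python sketch of it, under stated assumptions: `trim_value` is a hypothetical name, the raw values are made-up inputs rather than fixtures, and "whole-value rstrip" is taken to mean stripping all trailing whitespace, which the source does not spell out.

```python
def trim_value(raw: str) -> str:
    """First-line lstrip (spaces and tabs), whole-value rstrip,
    interior whitespace preserved."""
    first, newline, rest = raw.partition("\n")
    # Only the first line loses leading spaces/tabs; a leading newline and
    # the indentation of continuation lines are left alone.
    return (first.lstrip(" \t") + newline + rest).rstrip()


single = " \tval  "
nested = "\n  a = 1\n  b = 2\n"  # value that starts with an indented sub-block

print(repr(trim_value(single)))  # 'val'
print(repr(trim_value(nested)))  # '\n  a = 1\n  b = 2'
print(repr(nested.strip()))      # 'a = 1\n  b = 2'
```

The last line shows the naive whole-value `strip()`: it removes the leading newline and also the indent of the first continuation line, so that line no longer lines up with the ones after it.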
@netlify

netlify Bot commented Apr 30, 2026

Deploy Preview for strong-hotteok-5d8677 ready!

Name Link
🔨 Latest commit 01bd1d0
🔍 Latest deploy log https://app.netlify.com/projects/strong-hotteok-5d8677/deploys/69f3887a6ce5100008bdeb3a
😎 Deploy Preview https://deploy-preview-13--strong-hotteok-5d8677.netlify.app

…paced

The previous wording presented the start-of-line override as a third
rule applying under both delimiter modes. Under `delimiter_first_equals`
it's vacuous — first-`=` already lands at position 0 — so the override
is meaningful only as a `delimiter_prefer_spaced` carve-out preventing
`== Section Header =` from splitting at the trailing spaced `=`.
Restructured the section around the two delimiter modes instead of a
three-rule precedence list.
…tion

The OCaml reference parser (ccl-ocaml/lib/parser.ml) is just
`many (not_char '=')` — first `=` wins, no carve-outs. The fixtures
that motivated issue #5's start-of-line override are tagged with
empty `behaviors`, meaning they pin the default `delimiter_first_equals`
behavior, where they pass naturally.

Under `delimiter_prefer_spaced`, "spaced" means an actual space on
*both* sides — not start-of-input or end-of-input. Under that strict
definition, `== Section Header =` has no spaced `=` (position 18's
right side is end-of-input, not a space), so the fallback to first-`=`
already produces the correct split. No section-header carve-out
needed.

Restructured the section to describe the two modes cleanly and
swapped the awkward `= = spaced equals` example for a URL-style
example that better illustrates the divergence between modes.
Both '=' in 'a = b = c' are spaced, so under delimiter_prefer_spaced
the first-spaced-wins rule lands on position 2 — same split as
delimiter_first_equals. The example didn't actually distinguish the
two modes.
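
The claim is easy to check mechanically. A small Python sketch, where `spaced_equals_positions` is a hypothetical helper that assumes "spaced" means a literal space on both sides:

```python
def spaced_equals_positions(text: str) -> list[int]:
    """Indices of every '=' that has a literal space on both sides."""
    return [
        i
        for i in range(1, len(text) - 1)
        if text[i] == "=" and text[i - 1] == " " and text[i + 1] == " "
    ]


text = "a = b = c"
print(spaced_equals_positions(text))  # [2, 6]
print(text.find("="))                 # 2
```

The first spaced `=` and the first `=` are the same index, so both modes split at position 2.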
The 'split on \n, strip each, drop empty, join with space' rule from
issue #6 isn't what the OCaml reference actually does. Confirmed by
running the reference parser on the fixtures in question:

  'my\n key\n= val'  → key='my\n key' (fixture wants 'my key')
  'a\n b\n c\n= val' → key='a\n b\n c' (fixture wants 'a b c')

The OCaml ref just String.trims the raw key bytes — interior newlines
and per-line indentation are preserved verbatim. The fixtures that do
pass do so because trim happens to land on a clean single word; the
multi-word multi-line cases actively diverge.

The collapse-interior-whitespace rule belongs to the multiline_keys
feature (already documented at /reference/features#multiline_keys),
not the Pacman page's strategy contract. Updated the Notes paragraph
to point at that feature for impls that want stricter normalization.
@tylerbutler tylerbutler changed the title docs(pacman): pin down delimiter, multi-line key, and value trimming docs(pacman): pin down delimiter selection and value trimming Apr 30, 2026
Earlier draft said impls implementing multiline_keys 'collapse interior
whitespace runs into a single space' — that's not a real behavior any
implementation does. Just state plainly that the key bytes are
edge-trimmed and otherwise preserved.
…examples

Examples 1 and 5 used '\n...' to mean 'more input I'm not specifying',
but the documented value only holds for a particular interpretation
of '...' (specifically: dedent at column 0, ending the value).
Replaced with concrete fully-specified inputs whose outputs were
verified against the OCaml reference parser.
