docs(pacman): pin down delimiter selection and value trimming #13
Open
tylerbutler wants to merge 8 commits into main
…ules

Closes #5, #6, #7.

The Pacman page described `consume_until('=')`, `key.strip()`, and the raw return of `consume_until_dedent` at the pseudocode level, but the fixtures expect specific behavior in three places that the page didn't pin down.

Adds three subsections between the pseudocode and the worked trace:

- "Picking the `=` delimiter": the start-of-line override, then `delimiter_prefer_spaced`, then first-`=` fallback.
- "Multi-line key normalization": split on `\n`, strip each part, drop empties, join with single space.
- "Value trimming": first-line lstrip, whole-value rstrip, preserve interior whitespace.

Also tightens the existing Notes → Multi-line keys paragraph to point at the new normalization subsection instead of implying the raw mouthful is the final key.
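The "Value trimming" rule named in this commit message (first-line lstrip, whole-value rstrip, interior whitespace preserved) can be sketched in Python as below. This is a minimal sketch, not the page's pseudocode: `trim_value` is a hypothetical helper name, and it assumes the leading trim removes only spaces and tabs while the trailing trim removes all trailing whitespace.

```python
def trim_value(raw: str) -> str:
    """Trim a raw value: lstrip the first line, rstrip the whole value."""
    # Split off the first line; `sep` is "\n" if a newline was present, else "".
    first, sep, rest = raw.partition("\n")
    # Assumption: the leading trim removes only spaces and tabs on the first
    # line, so a leading newline that opens a sub-block is preserved.
    first = first.lstrip(" \t")
    # Assumption: the trailing trim removes all trailing whitespace,
    # newlines included. Interior whitespace is never touched.
    return (first + sep + rest).rstrip()


raw = " \n  a = 1\n  b = 2\n"
print(repr(trim_value(raw)))  # '\n  a = 1\n  b = 2'
print(repr(raw.strip()))      # 'a = 1\n  b = 2'
```

The second print shows what a plain `.strip()` does to the same input: the leading newline is gone and the first continuation line has lost its indent relative to the second.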
✅ Deploy Preview for strong-hotteok-5d8677 ready!
…paced

The previous wording presented the start-of-line override as a third rule applying under both delimiter modes. Under `delimiter_first_equals` it's vacuous (first-`=` already lands at position 0), so the override is meaningful only as a `delimiter_prefer_spaced` carve-out preventing `== Section Header =` from splitting at the trailing spaced `=`. Restructured the section around the two delimiter modes instead of a three-rule precedence list.
…tion

The OCaml reference parser (ccl-ocaml/lib/parser.ml) is just `many (not_char '=')`: first `=` wins, no carve-outs. The fixtures that motivated issue #5's start-of-line override are tagged with empty `behaviors`, meaning they pin the default `delimiter_first_equals` behavior, where they pass naturally.

Under `delimiter_prefer_spaced`, "spaced" means an actual space on *both* sides, not start-of-input or end-of-input. Under that strict definition, `== Section Header =` has no spaced `=` (position 18's right side is end-of-input, not a space), so the fallback to first-`=` already produces the correct split. No section-header carve-out needed.

Restructured the section to describe the two modes cleanly and swapped the awkward `= = spaced equals` example for a URL-style example that better illustrates the divergence between modes.
Both '=' in 'a = b = c' are spaced, so under delimiter_prefer_spaced the first-spaced-wins rule lands on position 2 — same split as delimiter_first_equals. The example didn't actually distinguish the two modes.
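The two delimiter modes discussed in these commits can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the reference implementation: `find_delimiter` and its `prefer_spaced` flag are hypothetical names, "spaced" is taken to mean a literal space character on both sides, and the URL-style input is made up for illustration (it is not the example used on the page).

```python
def find_delimiter(text: str, prefer_spaced: bool = False) -> int:
    """Return the index of the '=' that separates key from value, or -1."""
    if prefer_spaced:
        # "Spaced" means a literal space on both sides of the '='.
        # Start-of-input and end-of-input do not count as spaces.
        hit = text.find(" = ")
        if hit != -1:
            return hit + 1  # index of the '=' itself
    # Default mode, and the fallback when no spaced '=' exists: first '=' wins.
    return text.find("=")


header = "== Section Header ="
print(find_delimiter(header), find_delimiter(header, prefer_spaced=True))  # 0 0

chained = "a = b = c"
print(find_delimiter(chained), find_delimiter(chained, prefer_spaced=True))  # 2 2

# Hypothetical input where the modes diverge: the key itself contains '='.
url = "https://example.com/?q=1 = search"
i = find_delimiter(url)
j = find_delimiter(url, prefer_spaced=True)
print(repr(url[:i]))  # 'https://example.com/?q'
print(repr(url[:j]))  # 'https://example.com/?q=1 '
```

The first two inputs split identically under both modes, which is why neither distinguishes them; only an input with an unspaced `=` before a spaced one does.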
The 'split on \n, strip each, drop empty, join with space' rule from issue #6 isn't what the OCaml reference actually does. Confirmed by running the reference parser on the fixtures in question:

- 'my\n key\n= val' → key='my\n key' (fixture wants 'my key')
- 'a\n b\n c\n= val' → key='a\n b\n c' (fixture wants 'a b c')

The OCaml ref just String.trims the raw key bytes; interior newlines and per-line indentation are preserved verbatim. The fixtures that do pass do so because trim happens to land on a clean single word; the multi-word multi-line cases actively diverge.

The collapse-interior-whitespace rule belongs to the multiline_keys feature (already documented at /reference/features#multiline_keys), not the Pacman page's strategy contract. Updated the Notes paragraph to point at that feature for impls that want stricter normalization.
Earlier draft said impls implementing multiline_keys 'collapse interior whitespace runs into a single space' — that's not a real behavior any implementation does. Just state plainly that the key bytes are edge-trimmed and otherwise preserved.
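The difference between edge-trimming and the withdrawn fold rule from issue #6 can be shown with a short Python sketch. Both function names are hypothetical, the raw key is assumed to be everything consumed before the `=`, and the character set passed to `strip` mirrors what OCaml's `String.trim` removes (space, tab, newline, carriage return, form feed).

```python
# The characters OCaml's String.trim strips from both ends.
OCAML_TRIM_CHARS = " \t\n\r\x0c"


def trim_key(raw_key: str) -> str:
    """Edge-trim only: interior newlines and indentation are kept verbatim."""
    return raw_key.strip(OCAML_TRIM_CHARS)


def fold_key(raw_key: str) -> str:
    """The rule proposed in issue #6, shown only for contrast."""
    parts = (part.strip() for part in raw_key.split("\n"))
    return " ".join(part for part in parts if part)


raw = "my\n key\n"            # everything before '=' in 'my\n key\n= val'
print(repr(trim_key(raw)))    # 'my\n key'
print(repr(fold_key(raw)))    # 'my key'
```

For a single-word key the two functions agree, which matches the observation that the passing fixtures pass only because the trim happens to land on a clean single word.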
…examples

Examples 1 and 5 used '\n...' to mean 'more input I'm not specifying', but the documented value only holds for a particular interpretation of '...' (specifically: dedent at column 0, ending the value). Replaced with concrete fully-specified inputs whose outputs were verified against the OCaml reference parser.
Summary

The Pacman parser page described `consume_until('=')` and the raw return of `consume_until_dedent` at the pseudocode level, but left two concrete behaviors underspecified. These are gaps an independent Haskell port had to reverse-engineer from the shared fixtures. Adds two subsections between the pseudocode and the worked trace, each with the rule and fixture-derived examples.

Changes

- `=` delimiter (Pacman page: specify the '=' delimiter selection rule #5): organized around the two delimiter modes. `delimiter_first_equals` (default, matching the OCaml reference) is just "first `=` wins"; the pseudocode is literal. `delimiter_prefer_spaced` prefers a `=` with actual space on both sides, falling back to first-`=`; under that strict definition, leading-`=` headings like `== Section Header =` fall through to first-`=` cleanly with no carve-out needed. Examples cover `== Section Header =` and a URL=URL key.
- Value trimming (#7): `.strip()` corrupts multi-line values (eats the leading newline of a sub-block; shifts the first continuation's indent).

Issue #6 originally proposed a "split on `\n`, strip each, drop empty, join with space" multi-line-key normalization. Verifying against the OCaml reference (`printf 'my\n key\n= val' | dump.exe` → `key="my\n key"`, not `"my key"`) showed that rule isn't part of the canonical algorithm and no implementation actually folds interior whitespace that way. The Pacman page now states the actual behavior plainly. Closing #6 with that clarification rather than a normative rule.

Closes #5, closes #6, closes #7.