fix: discard leading comment/whitespace preamble in parseYaml stream #867

Open
ricardbejarano wants to merge 1 commit into google:master from bejaratommy:fix/parseyaml-leading-comment-null

@ricardbejarano

Summary

Fixes #660.

When a YAML document stream begins with comment lines or blank lines before the first --- document-start marker, std.parseYaml incorrectly emits a spurious null as the first element of the result array.

Root cause: YAMLReader.read() accumulates all input lines into a buffer until it encounters a --- marker. At that point, if the buffer is non-empty it returns the buffered bytes as a document and resets. When those bytes contain only YAML comments (#...) or blank lines, yaml.Unmarshal yields nil, which is then appended as null to the result.
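The buffering behaviour described above can be modelled with a small standalone sketch. It is a simplified illustration, not the actual YAMLReader code: `splitOnMarkers` is a hypothetical name, and the real reader works on a byte stream rather than a string. It shows how a comment-only preamble ends up as its own chunk, which would then be unmarshalled to nil.

```go
package main

import (
	"fmt"
	"strings"
)

// splitOnMarkers is a hypothetical, simplified model of the buffering
// described in the root-cause paragraph. It accumulates lines until it
// sees a "---" marker and emits whatever has been buffered so far.
func splitOnMarkers(input string) []string {
	var chunks []string
	var buf strings.Builder
	for _, line := range strings.SplitAfter(input, "\n") {
		if strings.TrimRight(line, "\r\n") == "---" {
			// Marker reached: a non-empty buffer is emitted as a document,
			// even when it holds nothing but comments or blank lines.
			if buf.Len() > 0 {
				chunks = append(chunks, buf.String())
				buf.Reset()
			}
			continue
		}
		buf.WriteString(line)
	}
	// Emit whatever follows the last marker.
	if buf.Len() > 0 {
		chunks = append(chunks, buf.String())
	}
	return chunks
}

func main() {
	for i, c := range splitOnMarkers("# Test\n---\nfoo: bar\n---\nbaz: cuux\n") {
		fmt.Printf("chunk %d: %q\n", i, c)
	}
}
```

With the input from issue #660, the first chunk is `"# Test\n"`, which contains no YAML node at all; this corresponds to the spurious leading null.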

Fix: Introduce isCommentOrWhitespace to detect when the accumulated buffer contains only comments and blank lines (valid YAML preamble per §9.2 of the YAML spec). When that is the case, discard the buffer rather than returning it as a document.
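For readers without the diff at hand, a helper of this kind might look roughly like the sketch below. The function name is taken from the description, but the body is an assumption: the implementation in yaml.go may differ, for example in how it treats carriage returns or indented comments.

```go
package main

import (
	"bytes"
	"fmt"
)

// isCommentOrWhitespace reports whether buf contains only blank lines and
// lines whose first non-space character is '#'. This is an illustrative
// sketch, not the code from the pull request.
func isCommentOrWhitespace(buf []byte) bool {
	for _, line := range bytes.Split(buf, []byte("\n")) {
		trimmed := bytes.TrimSpace(line)
		if len(trimmed) == 0 || trimmed[0] == '#' {
			continue // blank line or comment: still preamble
		}
		return false // anything else is real document content
	}
	return true
}

func main() {
	fmt.Println(isCommentOrWhitespace([]byte("# Test\n\n  # more\n"))) // true
	fmt.Println(isCommentOrWhitespace([]byte("# Test\nfoo: bar\n")))   // false
}
```

Under this sketch, a line such as a `%YAML` directive is not treated as preamble, so a buffer containing one would still be returned as part of a document.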

Before:

std.parseYaml("# Test\n---\nfoo: bar\n---\nbaz: cuux\n")
// => [null, {"foo": "bar"}, {"baz": "cuux"}]

After:

std.parseYaml("# Test\n---\nfoo: bar\n---\nbaz: cuux\n")
// => [{"foo": "bar"}, {"baz": "cuux"}]

Changes

  • yaml.go: add isCommentOrWhitespace helper; skip preamble-only buffer in YAMLReader.read()
  • testdata/parseYaml.jsonnet: add two regression test cases (single leading comment; multiple comments + blank lines)
  • testdata/parseYaml.golden: update expected output accordingly

All existing tests continue to pass (go test ./...).

When a YAML document stream begins with comment lines or blank lines
before the first '---' document-start marker, the YAMLReader would
accumulate those lines in its buffer and emit them as a document when
the marker was encountered. yaml.Unmarshal of comment-only content
returns nil, resulting in a spurious null element at the start of the
parsed array.

Per YAML spec §9.2, comments and blank lines that appear before the
first document-start marker are preamble and do not form a document.
Add isCommentOrWhitespace to detect this case and discard the buffer
instead of returning it as a document.

Fixes google#660

Signed-off-by: Tommy <tommy@bejara.net>
@google-cla

google-cla bot commented Apr 1, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

Development

Successfully merging this pull request may close these issues.

parseYaml inserts null when document stream starts with a comment

2 participants