option to allow trailing characters while parsing by tanmaykm · Pull Request #439 · JuliaIO/JSON.jl

tanmaykm · 2026-03-10T03:10:55Z

This exposes the existing internal isroot parameter of _lazy as a keyword argument on JSON.lazy (and by extension JSON.parse).

By default, isroot=true, which means the parser expects the entire buffer to be a single valid JSON value — trailing characters after the root value will raise an error. Setting isroot=false parses only the first JSON value from the buffer and silently ignores any trailing characters.
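The difference between the two settings can be sketched as follows. This is a minimal illustration, assuming a JSON.jl release that already includes the `isroot` keyword from this PR; the variable names are made up for the example, and the exact exception type raised in strict mode is not assumed.

```julia
using JSON  # assumption: a JSON.jl version that accepts the isroot keyword

# Two concatenated JSON objects with no delimiter between them.
input = "{\"a\":1}{\"b\":2}"

# Default (isroot=true): the whole buffer must be one JSON value, so the
# second object counts as trailing input and parsing is expected to throw.
strict_failed = try
    JSON.parse(input)
    false
catch
    true
end

# isroot=false: only the first JSON value is parsed; the rest is ignored.
first_value = JSON.parse(input; isroot=false)

println(strict_failed)
println(first_value["a"])
```

Note that this only recovers the first value; nothing in this PR description says how to continue parsing from where the first value ended.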

Useful for:

- Parsing buffers that contain multiple concatenated JSON objects without a delimiter, e.g. {"a":1}{"b":2}
- Parsing JSON followed by non-JSON content, e.g. {"a":1} : some annotation...
- Restoring the pre-1.x behavior of this package

julia> JSON.parse("{\"hello\": \"world\"} extra stuff", isroot=false)
  JSON.Object with 1 entry:
  "hello" => "world"

codecov · 2026-03-10T03:13:46Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.26%. Comparing base (f4fbb5a) to head (5b0ddcd).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files

@@           Coverage Diff           @@
##           master     #439   +/-   ##
=======================================
  Coverage   90.26%   90.26%           
=======================================
  Files           7        7           
  Lines        1366     1366           
=======================================
  Hits         1233     1233           
  Misses        133      133

This adds an `allowtrailing` option that tolerates trailing characters in the buffer while parsing JSON. It is off by default, which keeps the parser strict: it tries to parse the entire buffer as JSON. When switched on, it parses a valid JSON value from the beginning of the buffer and ignores any characters that follow. This is useful when a buffer contains multiple JSON objects without a delimiter, e.g. `{"name": "value"}{"name": "value"}`, or JSON followed by other characters, e.g. `{"name": "value"} : this is...`. This also matches the pre-1.x behavior of this package.

quinnj · 2026-03-16T15:57:43Z

I don't love introducing a new keyword argument/option for this, especially when we have the isroot property right there. I think I'd prefer allowing passing isroot=false as a keyword arg that would get passed down and then we'd have:

function lazy(buf::Union{AbstractVector{UInt8}, AbstractString}; isroot::Bool=true, kw...)
    if !applicable(pointer, buf, 1) || (buf isa AbstractVector{UInt8} && !isone(only(strides(buf))))
        if buf isa AbstractString
            buf = String(buf)
        else
            buf = Vector{UInt8}(buf)
        end
    end
    len = getlength(buf)
    if len == 0
        error = UnexpectedEOF
        pos = 0
        @goto invalid
    end
    pos = 1
    # detect and error on UTF-16LE BOM
    if len >= 2 && getbyte(buf, pos) == 0xff && getbyte(buf, pos + 1) == 0xfe
        error = InvalidUTF16
        @goto invalid
    end
    # detect and error on UTF-16BE BOM
    if len >= 2 && getbyte(buf, pos) == 0xfe && getbyte(buf, pos + 1) == 0xff
        error = InvalidUTF16
        @goto invalid
    end
    # detect and ignore UTF-8 BOM
    pos = (len >= 3 && getbyte(buf, pos) == 0xef && getbyte(buf, pos + 1) == 0xbb && getbyte(buf, pos + 2) == 0xbf) ? pos + 3 : pos
    @nextbyte
    return _lazy(buf, pos, len, b, LazyOptions(; kw...), isroot)

@label invalid
    invalid(error, buf, pos, Any)
end

the main difference being that we "capture" the isroot::Bool=true keyword arg (so it isn't forwarded with the rest of kw...) and then we construct the LazyValue with the user-provided isroot.
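The keyword "capture" described here is ordinary Julia behavior and can be sketched in isolation. The names `demo_options` and `demo_lazy` below are hypothetical stand-ins, not JSON.jl functions; the point is only that a keyword named explicitly in the signature does not appear in the `kw...` slurp.

```julia
# Hypothetical stand-in for an options constructor that only knows `strict`.
# Passing it an unknown keyword such as `isroot` would raise a MethodError.
demo_options(; strict::Bool=false) = (strict=strict,)

# Naming `isroot` in the signature captures it; `kw...` receives only the rest.
function demo_lazy(; isroot::Bool=true, kw...)
    opts = demo_options(; kw...)  # `isroot` is not in `kw`, so this call is safe
    return (isroot=isroot, opts=opts, forwarded=collect(keys(kw)))
end

result = demo_lazy(isroot=false, strict=true)
println(result.isroot)
println(result.forwarded)
```

Here `result.forwarded` contains only the keywords that were not named in the signature, which is why the options constructor never sees `isroot`.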

tanmaykm · 2026-03-17T04:03:47Z

Thanks @quinnj , this is much cleaner. I have updated the PR with your suggestion.

quinnj · 2026-03-17T04:35:02Z

Looking pretty good; will you also add this into the docs for JSON.parse in the parse.jl file? (we have several of the JSON.lazy keyword args repeated there). Then if you include a minor version bump, we can merge and release.

tanmaykm · 2026-03-17T07:52:54Z

Done. Thanks!

tanmaykm force-pushed the tan/allowtrailing branch from 8b6b362 to 3c7c03c March 10, 2026 03:16

tanmaykm requested a review from quinnj March 15, 2026 03:47

use isroot instead

6673cab

add docs, bump minor version

5b0ddcd

quinnj approved these changes Mar 17, 2026

quinnj merged commit 4286a94 into master Mar 17, 2026
11 checks passed

quinnj deleted the tan/allowtrailing branch March 17, 2026 12:58

quinnj merged 3 commits into master from tan/allowtrailing

2 participants
