encoding/json: Decoder.Token does not return an error for incomplete JSON #69449
It's not obvious to me that this is a bug. The doc comments state that Token returns (nil, EOF) at the end of the stream, which it does; and that Token guarantees that the delimiters it returns are properly nested and matched, which it does too, to the extent that you'll never see a mismatched or premature close bracket. The actual error in this case is that we ran out of input (for which EOF is appropriate), not that there was something wrong with the input we received, so it would be strange for it to report a syntax error, or to synthesize close bracket tokens that weren't present. What would you have it do?
The question comes down to: what is a stream? The docs for
This is unfortunately problematic. The term "value" according to RFC 8259 implies a complete object or array (i.e., including the paired terminating ']' or '}'), but the prose in the latter half of the sentence is describing what RFC 8259 would actually call a "token". Essentially:
The "github.com/go-json-experiment/json" module reports
The same example with XML returns the error: https://go.dev/play/p/wCvkSOlqeH0

    dec := xml.NewDecoder(strings.NewReader(`<values>123`))
    for {
        _, err := dec.Token()
        if err == io.EOF {
            break
        }
        if err != nil {
            panic(err)
        }
    }

Therefore, I notice an inconsistency in the behavior of I believe the correct behavior would be to return an
ErrUnexpectedEOF is appropriate for the case when the EOF occurs in the middle of a token, such as an unclosed string literal, or in the middle of "null", "true", or "false"; and indeed that's what the decoder does. But Token promises to deliver tokens, and a missing close bracket is not a token-decoding error.
The quoted sentence says "basic values", and immediately defines them as the elementary, single-token values; I don't think one can argue that value is meant here in its general sense. So Token returns a stream of "basic values and delimiters".
If this were true, then I would expect the parser to be handling the tokens as just a plain sequence of JSON tokens (i.e., only validating for the grammar for a JSON "token", but not the grammar for a JSON "value"). However, that's not quite how it behaves. Consider the following:

    dec := json.NewDecoder(strings.NewReader(`{ "hello" }`))
    for {
        tok, err := dec.Token()
        if err != nil {
            fmt.Println(err)
            return
        }
        fmt.Println(tok)
    }

which prints:
The fact that we see
In Go 1.5 and earlier, the behavior of Given that the
Hmm, good point. Both the (documented) no-mismatched-paren rule and the (undocumented) object grammar enforcement you just mentioned are evidence that Token actually intends to return only valid, complete sequences of tokens, and so returning ErrUnexpectedEOF if the sequence is incomplete seems reasonable.
When dealing with incomplete JSON, the (*Decoder).Token method returns io.EOF at the end of the input instead of providing an appropriate error. For example, the following code does not produce an error: https://go.dev/play/p/HHEwVkRCs7a
According to the documentation, Token guarantees that the delimiters [ ] { } it returns are properly nested and matched. However, in this example, [ is not properly nested and matched.