Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/json: use different error type for unknown field if they are disallowed #40982

Open
Segflow opened this issue Aug 22, 2020 · 9 comments · May be fixed by #40983
Open

encoding/json: use different error type for unknown field if they are disallowed #40982

Segflow opened this issue Aug 22, 2020 · 9 comments · May be fixed by #40983

Comments

@Segflow
Copy link
Contributor

@Segflow Segflow commented Aug 22, 2020

On Go 1.15 and below, json decoder returns a generic error when unknown fields are disallowed (via Decoder.DisallowUnknownFields. See https://github.com/golang/go/blob/master/src/encoding/json/decode.go#L737-L739

It may be useful to return a different error type so the caller can detect when it is the case.

The only way to detect that currently is to interpret the string returned by Error

@cagedmantis
Copy link
Contributor

@cagedmantis cagedmantis commented Aug 25, 2020

@Segflow
Copy link
Contributor Author

@Segflow Segflow commented Sep 24, 2020

I understand that this should go through the proposal process, but any idea to start that?

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Sep 24, 2020

@Segflow See https://golang.org/s/proposal. To make a proposal, please propose a specific change. Thanks.

@networkimprov
Copy link

@networkimprov networkimprov commented Sep 24, 2020

First I think we want to hear what @mvdan thinks of the idea...

@dsnet
Copy link
Member

@dsnet dsnet commented Sep 24, 2020

Given the existence of custom UnmarshalJSON functions that already don't guarantee any semantics regarding the errors they returns, I don't think we should expose semantically identifiable errors for unknown fields since it would only be a partial guarantee.

Furthermore, the original poster should adequately explain what they want to use this for. It's rare for any proposal to go through without some justification for why it's useful.

@Segflow
Copy link
Contributor Author

@Segflow Segflow commented Sep 29, 2020

@dsnet one of our services process messages from a queue, there could be multiple instances reading from the same queue. Once a service reads a message from the dequeue we decode it(json) and process it.

A message could be invalid, thus a service should decide whenever it should delete the message from the queue or keep it so that probably another instance can process it (probably a newer version of the service).

For instance a "SyntaxError" on a message means we can safely delete the message.

We currently check if error starts with json: unknown field .

@Segflow
Copy link
Contributor Author

@Segflow Segflow commented Sep 29, 2020

@dsnet
Copy link
Member

@dsnet dsnet commented Sep 29, 2020

If you want to keep the message if it is syntactically valid, is there a reason why json.Valid is not being used? It seems backwards trying to determine properties about the input after the fact by interpreting the error condition, rather than checking validity of the data up-front.

@dsnet
Copy link
Member

@dsnet dsnet commented Sep 29, 2020

Following up a bit more, a problem with relying on errors to determine attributes about the input is the difficulty in presenting that information well and the confusion that arises when scaling to multiple attributes that a user may want to distinguish.

For protocol buffers, we had this issue where we wanted to present errors to distinguish between inputs that were "missing required fields" and inputs that contained "invalid UTF-8". Let's suppose the previous error was distinguishable using errors.Is(err, proto.ErrMissingRequired) and the latter distinguishable using errors.Is(err, proto.ErrInvalidUTF8).

Now suppose an input had both missing required fields and invalid UTF-8, which error do you present? Do you arbitrarily present just one of the two errors or do you return a composite error where errors.(err, proto.ErrMissingRequired) and errors.Is(err, proto.ErrInvalidUTF8) both return true? If you take the previous approach, it breaks users who expect to also detect the other error attribute. If you take the latter approach, it is subtly confusing since users are typically only asking one of the following two questions:

  • Is this error exactly identical to proto.ErrInvalidUTF8 and definitely nothing else?
  • Does this error at least contain proto.ErrInvalidUTF8 and I don't care if there are other attributes present?

The errors.Is function does not provide guidance on which of the two questions are being answered. However, in common usage, we found that both questions were reasonable ones that the user wanted to know for some specific use case. Rather than presenting an API that is confusing and easy to use incorrectly, I believe it is better to validate up-front attributes about the input than to rely on the error.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

5 participants