Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When a serde(untagged) enum can't deser, show all the reasons why #2376

Closed

Conversation

cbeck88
Copy link

@cbeck88 cbeck88 commented Feb 18, 2023

I was writing some code recently that uses serde(untagged) for an enum when deserializing certain JSON schema, but I was unhappy with the quality of the error messages I was getting, because it doesn't really tell you why each possibility doesn't work.

I noticed that there is a TODO item in the serde_derive proc macro code that generates this deserialization.

I decided that trying to improve the error messages upstream in serde-derive is easier than trying to change how I'm using serde, so I took a stab at implementing this TODO, and updating the tests that test error messages, and writing some more tests.

I have tried to follow the patterns and conventions that I have seen elsewhere in the serde-derive source code, and I think that the code gen is good in that it uses format_args! like other parts of serde error handling and avoids making a new dynamic memory allocation. When untagged deserialization fails, the errors are collected on the stack rather than in a new dynamically-sized container.

Let me know if you think this is a good direction, I'm happy to iterate if this patch is interesting to you. Thanks for building serde, it's great.

I was writing some code recently that uses `serde(untagged)` for
an enum when deserializing certain JSON schema, but I was unhappy
with the quality of the error messages I was getting, because it
doesn't really tell you why each possibility doesn't work.

I noticed that there is a TODO item in the `serde_derive` proc
macro code that generates this deserialization.

I decided that trying to improve the error messages upstream in
serde-derive is easier than trying to change how I'm using serde,
so I took a stab at implementing this TODO, and updating the tests
that test error messages, and writing some more tests.

I have tried to follow the patterns and conventions that I have
seen elsewhere in the serde-derive source code, and I think that
the code gen is good in that it uses `format_args!` like other
parts of serde error handling and avoids making a new dynamic memory
allocation. When untagged deserialization fails, the errors are
collected on the stack rather than in a new dynamically-sized
container.

Let me know if you think this is a good direction, I'm happy to
iterate if this patch is interesting to you. Thanks for building
serde, it's great.
@appetrosyan

This comment was marked as spam.

jonasbb added a commit to jonasbb/serde_with that referenced this pull request Apr 2, 2023
Untagged enums do not provide good error messages and likely never will,
given that there are multiple PRs which are just completely ignored
([serde#2376](serde-rs/serde#2376) and
[serde#1544](serde-rs/serde#1544)).

Instead using `content::de` the untagged enums can be replaced by custom
buffering. The error messages for `OneOrMany` and `PickFirst` now look
like this, including the original failure for each variant.

```text
OneOrMany could not deserialize any variant:
  One: invalid type: map, expected u32
  Many: invalid type: map, expected a sequence
```

```text
PickFirst could not deserialize any variant:
  First: invalid type: string "Abc", expected u32
  Second: invalid digit found in string
```

The implementations of `VecSkipError` and `DefaultOnError` are updated
too, but should not result in any visible changes.
jonasbb added a commit to jonasbb/serde_with that referenced this pull request Apr 2, 2023
Untagged enums do not provide good error messages and likely never will,
given that there are multiple PRs which are just completely ignored
([serde#2376](serde-rs/serde#2376) and
[serde#1544](serde-rs/serde#1544)).

Instead using `content::de` the untagged enums can be replaced by custom
buffering. The error messages for `OneOrMany` and `PickFirst` now look
like this, including the original failure for each variant.

```text
OneOrMany could not deserialize any variant:
  One: invalid type: map, expected u32
  Many: invalid type: map, expected a sequence
```

```text
PickFirst could not deserialize any variant:
  First: invalid type: string "Abc", expected u32
  Second: invalid digit found in string
```

The implementations of `VecSkipError` and `DefaultOnError` are updated
too, but should not result in any visible changes.
bors bot added a commit to jonasbb/serde_with that referenced this pull request Apr 2, 2023
586: Improve error messaged by dropping untagged enums r=jonasbb a=jonasbb

Untagged enums do not provide good error messages and likely never will,
given that there are multiple PRs which are just completely ignored
([serde#2376](serde-rs/serde#2376) and
[serde#1544](serde-rs/serde#1544)).

Instead using `content::de` the untagged enums can be replaced by custom
buffering. The error messages for `OneOrMany` and `PickFirst` now look
like this, including the original failure for each variant.

```text
OneOrMany could not deserialize any variant:
  One: invalid type: map, expected u32
  Many: invalid type: map, expected a sequence
```

```text
PickFirst could not deserialize any variant:
  First: invalid type: string "Abc", expected u32
  Second: invalid digit found in string
```

The implementations of `VecSkipError` and `DefaultOnError` are updated
too, but should not result in any visible changes.

Co-authored-by: Jonas Bushart <jonas@bushart.org>
Copy link
Member

@oli-obk oli-obk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll discuss this PR with dtolnay next time we chat. May take a few weeks, but if I don't comment here again in the next month, please ping me.

Comment on lines +1696 to +1698
// We need two copies of this iterator
let err_identifiers1 = (0..num_variants).map(|idx| format_ident!("_err{}", idx));
let err_identifiers2 = err_identifiers1.clone();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While not strictly necessary, as you could probably do the iteration over the attempts directly in the format_args! arguments, this seems better to avoid any question about the order of evaluation.

Comment on lines +1685 to +1694
// The format string we are building will have the following structure:
// "data did not match any variant of untagged enum{}\nvar1: {}\nvar2: {}\nvar3: {}"
let mut err_format_string = fallthrough_msg.to_owned();
let mut num_variants = 0usize;
for var in variants.iter().filter(|variant| !variant.attrs.skip_deserializing()) {
err_format_string.push_str("\n");
err_format_string.push_str(&var.ident.to_string());
err_format_string.push_str(": {}");
num_variants += 1;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not confident this will render well in all the error libraries out there, but we can adjust it once we have seen some of these messages in the wild. The Display/Debug output isn't something we provide back compat guarantees for anyway.

@oli-obk
Copy link
Member

oli-obk commented Apr 23, 2023

We had a chat, and are both worried about the diagnostic's usefulness. It will mostly appear to users of your library that have given bad input, and whether it helps them actually solve the problem is not clear. We feel like while this solves the immediate issue you are seeing, we can do better than just patching a diagnostic.

Thus we have an alternate proposal:

Add new examples to https://github.com/serde-rs/serde-rs.github.io/tree/master/_src that explains how to use untagged enums.

  • e.g. via a custom visitor for a custom Deserialize impl to produce a single unified diagnostic instead of a generic line per variant.
  • You could also ask https://github.com/jonasbb/serde_with if they'd want to add helpers for automatically creating such visitors
  • Or create a helper crate for easily writing visitors (e.g. with a builder pattern)

Thanks for your work. I'm going to close this PR in my attempt to get a hold of our PR queue, but feel free to use this PR to ask follow up questions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants