Use serde for target spec json deserialize #144218

Open
Noratrieb wants to merge 1 commit into master from target-spec-json-de-jank

Conversation

Noratrieb
Member

@Noratrieb Noratrieb commented Jul 20, 2025

The previous manual parsing of serde_json::Value was a lot of complicated code and extremely error-prone. It was full of janky behavior like sometimes ignoring type errors, sometimes erroring for type errors, sometimes warning for type errors, and sometimes just ICEing for type errors (the icing on the top).

Additionally, many of the error messages about allowed values were out of date because they were in a completely different place than the FromStr impls. Overall, the system caused confusion for users.

I also found the old deserialization code annoying to read. Whenever a key! invocation was encountered, one first had to look for the right macro arm, and go-to-definition was of no help.

This PR replaces all this manual parsing with a two-step process involving serde.
First, the string is parsed into a TargetSpecJson struct. This struct is a 1:1 representation of the spec JSON. It already parses all the enums and is very simple to read and write.
Then, the fields from this struct are copied into the actual Target. There are a few reasons for this two-step process instead of deserializing directly into a Target (a rough sketch follows the list below):

  1. There are a few transformations performed between the two formats
  2. The default logic is implemented this way. Otherwise all the default field values would have to be spelled out again, which is suboptimal. With this logic, they fall out naturally, because everything in the json struct is an Option.
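
To make the two-step shape concrete, here is a rough sketch of the approach described above; the struct, field, and function names are illustrative, not the PR's actual definitions:

use serde::Deserialize;

// Step-1 struct: a 1:1 view of the spec JSON, with every field optional.
#[derive(Deserialize)]
#[serde(rename_all = "kebab-case", deny_unknown_fields)]
struct TargetSpecJsonSketch {
    llvm_target: Option<String>,
    max_atomic_width: Option<u64>,
    // ...one Option field per spec key
}

// Stand-in for the real Target, which carries the defaults.
#[derive(Default)]
struct TargetSketch {
    llvm_target: String,
    max_atomic_width: u64,
}

fn target_from_json(json: &str) -> Result<TargetSketch, String> {
    // Step 1: strict, typed parse; wrong types and unknown keys are hard errors.
    let spec: TargetSpecJsonSketch = serde_json::from_str(json).map_err(|e| e.to_string())?;

    // Step 2: start from the defaults and copy over whatever the JSON provided.
    let mut target = TargetSketch::default();
    if let Some(v) = spec.llvm_target {
        target.llvm_target = v;
    }
    if let Some(v) = spec.max_atomic_width {
        target.max_atomic_width = v;
    }
    Ok(target)
}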

Overall, the mapping is pretty simple, with the vast majority of fields just doing a 1:1 mapping that is captured by two macros. I have deliberately avoided making the macros generic to keep them simple.
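
A hypothetical forwarding macro in that spirit (not the PR's actual macro) could be as simple as:

// Copies each listed Option field from the parsed JSON struct into the target,
// leaving the default value in place when the key was absent.
macro_rules! forward_opt {
    ($target:ident, $json:ident, $($field:ident),* $(,)?) => {
        $(
            if let Some(value) = $json.$field {
                $target.$field = value;
            }
        )*
    };
}

// Example usage with the sketch types above:
// forward_opt!(target, spec, llvm_target, max_atomic_width);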

All the FromStr impls now have the error message right inside them, which increases the chance of it being up to date. Some "from_str" impls were turned into proper FromStr impls to support this.
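
For illustration only, such a FromStr impl might look like the following; the enum and its accepted values are made up, not one of the PR's actual types:

use std::str::FromStr;

enum EndianSketch {
    Little,
    Big,
}

impl FromStr for EndianSketch {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "little" => Ok(Self::Little),
            "big" => Ok(Self::Big),
            // The accepted values and the error text live side by side,
            // so the message cannot silently drift out of date.
            other => Err(format!("'{other}' is not a valid endianness; use 'little' or 'big'")),
        }
    }
}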

The new code is much less involved, delegating all the JSON parsing logic to serde, without any manual type matching.

This change introduces a few breaking changes for consumers. While it is possible to use this format on stable, it is very much subject to change, so breaking changes are expected. The hope is also that, because of the now stricter behavior, breaking changes are easier to deal with, as they come with clearer error messages.

  1. Invalid types now always error, everywhere. Previously, they would sometimes error, and sometimes just be ignored (which meant the user's JSON was still broken, just silently!)

  2. This now makes use of deny_unknown_fields instead of just warning on unused fields, which was done previously. Serde doesn't make it easy to get such warning behavior, which was the primary reason for this change. But I think the error behavior is very reasonable too (a small sketch follows this list). If someone has random stale fields in their JSON, it is likely because these fields did something at some point but no longer do, and the user likely wants to be informed of this so they can figure out what to do.

    This is also relevant for the future. If we remove a field but someone has it set, it probably makes sense for them to take a look whether they need this and should look for alternatives, or whether they can just delete it. Overall, the JSON is made more explicit.
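
As the small sketch below shows (with a made-up struct, not the real spec type), deny_unknown_fields turns a stale key into a hard parse error instead of silently dropping it:

use serde::Deserialize;

#[derive(Deserialize, Debug)]
#[serde(deny_unknown_fields)]
struct SpecSketch {
    arch: Option<String>,
}

fn main() {
    // "stale-field" is not a known key, so the whole parse is rejected.
    let result = serde_json::from_str::<SpecSketch>(r#"{ "arch": "x86_64", "stale-field": true }"#);
    assert!(result.is_err());
    println!("{}", result.unwrap_err());
}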

This is the only expected breakage, but there could also be small breakage from small mistakes. All targets roundtrip though, so it can't be anything too major.

fixes #144153

@rustbot
Collaborator

rustbot commented Jul 20, 2025

r? @fee1-dead

rustbot has assigned @fee1-dead.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added A-run-make Area: port run-make Makefiles to rmake.rs A-tidy Area: The tidy tool S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. labels Jul 20, 2025
@rustbot
Collaborator

rustbot commented Jul 20, 2025

This PR modifies run-make tests.

cc @jieyouxu

These commits modify the Cargo.lock file. Unintentional changes to Cargo.lock can be introduced when switching branches and rebasing PRs.

If this was unintentional then you should revert the changes before this PR is merged.
Otherwise, you can ignore this comment.

These commits modify compiler targets.
(See the Target Tier Policy.)

The list of allowed third-party dependencies may have been modified! You must ensure that any new dependencies have compatible licenses before merging.

cc @davidtwco, @wesleywiser

There are changes to the tidy tool.

cc @jieyouxu

@Noratrieb Noratrieb force-pushed the target-spec-json-de-jank branch from a829315 to 949f059 on July 20, 2025 13:37
github-merge-queue bot pushed a commit to rust-lang/cargo that referenced this pull request Jul 20, 2025
### What does this PR try to resolve?

This is not necessary, as 32 is the default, and it is also of the wrong type now, since the field is a number.
When planning to make these type mismatches error in rust-lang/rust#144218, cargo would fail here, so I just removed it.

This custom target test very much shows why Cargo should be a subtree. In this case it was fine because there is a compatible fix that I can push now; otherwise it would have been very annoying.

### How to test and review this PR?

If the test suite passes, it works
@Noratrieb Noratrieb force-pushed the target-spec-json-de-jank branch from 949f059 to 47312aa on July 20, 2025 16:29
@Noratrieb Noratrieb force-pushed the target-spec-json-de-jank branch from 47312aa to ae949f5 on July 20, 2025 18:41
@Noratrieb
Member Author

I am finding a lot of target JSON specs with values that no longer exist, even an entire test case built on the assumption that a value exists when it doesn't. This is giving me a lot more confidence in the error behavior.

@Noratrieb Noratrieb force-pushed the target-spec-json-de-jank branch from ae949f5 to babcbc3 on July 20, 2025 20:35
@aapoalas aapoalas left a comment

LGTM :)

fn from_str(s: &str) -> Result<Self, Self::Err> {
    match s {
        "coff" => Ok(Self::Coff),
        "elf" => Ok(Self::Elf),
        "mach-o" => Ok(Self::MachO),
        "wasm" => Ok(Self::Wasm),
        "xcoff" => Ok(Self::Xcoff),
        // previously: _ => Err(()),
        _ => Err(format!(

thought: The error formats are pretty repetitive; I wonder if a few helper methods wouldn't perhaps be an improvement? Something like

#[inline(never)]
pub(crate) fn s_is_not_a_valid_value_for(s: &str, field: &str, variants: &[&str]) -> String {
  format!("'{s}' is not a valid value for {field}. Use {}", format_to_list(variants))
}

and then the same for the "use command to print allowed values" etc.

This'd probably also reduce the slight drift between the error messages; some say "for field. Use 'a', 'b', or 'c'." while others say "for field, expected 'a', 'b', or 'c'" or such.

Member

I'm thinking of maybe using a macro that both generates the match with its arms and also formats the error message.

Member Author

that'd be a neat improvement in the future, but I wouldn't want to refactor the enum parsing in this PR

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 21, 2025
@Noratrieb Noratrieb force-pushed the target-spec-json-de-jank branch from babcbc3 to 95cdc61 on July 21, 2025 17:31
@Noratrieb Noratrieb force-pushed the target-spec-json-de-jank branch from 95cdc61 to dad96b1 on July 21, 2025 17:32
@Noratrieb
Member Author

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jul 21, 2025
@fee1-dead
Member

@bors r+

@bors
Collaborator

bors commented Jul 22, 2025

📌 Commit dad96b1 has been approved by fee1-dead

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 22, 2025
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Jul 22, 2025
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Jul 22, 2025
Labels
A-run-make Area: port run-make Makefiles to rmake.rs A-tidy Area: The tidy tool S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.
Development

Successfully merging this pull request may close these issues.

custom target JSON fields do not error if they have an incorrect type
8 participants