
Unsafe-Parser #370

Closed
DGonzalezVillal wants to merge 2 commits into virtee:main from DGonzalezVillal:unsafe-feature

Conversation

@DGonzalezVillal
Member

Introduce a new unsafe_parser feature that removes the skip_bytes
logic from extended read and write operations in the parser. This
forces structures to be read and written using the exact raw contents
present in the input, avoiding issues with unparsable reports when new
report versions are released.

This change removes behind-the-scenes version-specific parsing of
attestation reports and shifts responsibility to the user to validate
and interpret report contents during attestation and related workflows.
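The difference between the two modes can be sketched with a hypothetical minimal model (not the crate's actual reader/writer API): the default parser rejects nonzero reserved bytes, while the `unsafe_parser` behavior carries the raw contents through verbatim.

```rust
// Illustrative sketch only; function names and the 4-byte reserved area are
// made up for this example.

// Default ("safe") behavior: reserved areas must be all zeros, so a newer
// report layout that populates them fails to parse.
fn parse_reserved_checked(bytes: &[u8; 4]) -> Result<(), String> {
    if bytes.iter().all(|&b| b == 0) {
        Ok(())
    } else {
        Err("nonzero reserved bytes: unknown report layout".to_string())
    }
}

// unsafe_parser behavior: the raw contents are preserved verbatim and
// interpretation is left to the user.
fn parse_reserved_raw(bytes: &[u8; 4]) -> [u8; 4] {
    *bytes
}

fn main() {
    let newer = [0u8, 1, 0, 0]; // a newer firmware populated a reserved byte
    assert!(parse_reserved_checked(&newer).is_err());
    assert_eq!(parse_reserved_raw(&newer), newer);
    println!("ok");
}
```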

@DGonzalezVillal
Member Author

@cschramm please take a look at this PR, this is closer to what I was envisioning for the new parser feature.

@hyperfinitism Would also appreciate your thoughts

Introduce a new `unsafe_parser` feature that removes the `skip_bytes`
logic from extended read and write operations in the parser. This
forces structures to be read and written using the exact raw contents
present in the input, avoiding issues with unparsable reports when new
report versions are released.

This change removes behind-the-scenes version-specific parsing of
attestation reports and shifts responsibility to the user to validate
and interpret report contents during attestation and related workflows.

Signed-off-by: DGonzalezVillal <Diego.GonzalezVillalobos@amd.com>
Introduce lint checks and cargo unit tests in GitHub workflows to cover
the new unsafe-parser feature.

Signed-off-by: DGonzalezVillal <Diego.GonzalezVillalobos@amd.com>
@cschramm

👍 Makes sense to me. I'll give it a try in my use case, just to make sure.

One thought I had a couple of times along the discussion – probably out of scope for this PR, though: The safe parser should ultimately also check whether it knows the report version. A key rationale for the zero checks is to guarantee that parsed data is actually what it's parsed as. While the use of previously reserved areas is somewhat expected in new report versions, that only seems like a coincidence in general. Also, even if some previously reserved area gets a meaning in a new version, the value might still be zero.

@hyperfinitism
Contributor

Thanks for opening this PR and for asking for feedback.

Overall, I think the idea of being able to toggle parsing (and validation) via a feature flag is a good direction.

One point I noticed in the current implementation is that even with the unsafe_parser feature enabled, parsing of the TcbVersion structure itself (and truncation of its reserved fields) is still unavoidable. In practice this is probably fine most of the time — changes to TcbVersion tend to be much less frequent than top-level AttestationReport changes driven by firmware updates. Still, it means that unsafe_parser is not a fully raw blob mode.

As mentioned in the earlier PR, another possible approach would be to keep the entire report (including sub-structures) as a raw binary blob. This is the approach used in:

That said, I am not sure which approach is ultimately better here. The current PR takes a reasonable middle ground by still defining reserved fields explicitly while making validation optional, whereas a fully raw-blob model trades some type safety for long-term robustness and easier maintenance.

@DGonzalezVillal
Member Author

👍 Makes sense to me. I'll give it a try in my use case, just to make sure.

One thought I had a couple of times along the discussion – probably out of scope for this PR, though: The safe parser should ultimately also check whether it knows the report version. A key rationale for the zero checks is to guarantee that parsed data is actually what it's parsed as. While the use of previously reserved areas is somewhat expected in new report versions, that only seems like a coincidence in general. Also, even if some previously reserved area gets a meaning in a new version, the value might still be zero.

I'm not sure I fully understand your comment. The "safe parser" already checks the report version and fills out the contents of the report according to what should and should not be present in that version of the report.

Or do you mean that the safe parser should check to see if it already has support for the version being reported and then error out on that rather than erroring out on the skip bytes?

@DGonzalezVillal
Member Author

Thanks for opening this PR and for asking for feedback.

Overall, I think the idea of being able to toggle parsing (and validation) via a feature flag is a good direction.

One point I noticed in the current implementation is that even with the unsafe_parser feature enabled, parsing of the TcbVersion structure itself (and truncation of its reserved fields) is still unavoidable. In practice this is probably fine most of the time — changes to TcbVersion tend to be much less frequent than top-level AttestationReport changes driven by firmware updates. Still, it means that unsafe_parser is not a fully raw blob mode.

As mentioned in the earlier PR, another possible approach would be to keep the entire report (including sub-structures) as a raw binary blob. This is the approach used in:

That said, I am not sure which approach is ultimately better here. The current PR takes a reasonable middle ground by still defining reserved fields explicitly while making validation optional, whereas a fully raw-blob model trades some type safety for long-term robustness and easier maintenance.

Thanks for the thoughtful feedback — I think we’re largely aligned on the goals here.

The intent behind unsafe_parser was not to expose a fully untyped raw blob, but to enable lossless round-tripping of known report layouts while relaxing validation and preserving firmware-reserved top-level bytes.

A fully raw-blob model is attractive for long-term robustness, but it trades away type safety and discoverability for consumers. The current approach is a deliberate middle ground: sub-structures remain strongly typed, while firmware-defined expansion points are preserved verbatim.

That said, you’re absolutely right to point out that some structures we parse directly — such as TcbVersion and GuestPolicy — still canonicalize their contents and drop reserved bytes.

A reasonable evolution here may be to shift from a single unsafe_parser flag to something like a raw_entries model: enabling raw representations of individual report structures, which can either be used directly or explicitly converted into their typed equivalents.

This would allow consumers who care about full fidelity to retain and inspect the original bytes, while preserving the existing typed APIs for those who want a stable, ergonomic interface.


@markg-github markg-github left a comment


At this point, I can't really meaningfully review this code - lack of familiarity, still learning Rust, etc

Just the one comment on use of literals, and not Rust constants, for field sizes.

I will say, in general, that interpretation of attestation report bytes on the attesting platform isn't necessary "in production". It's only necessary on the verifier and it should be easier to update verifiers than attesting platforms since far fewer verifiers. Given my current lack of familiarity, I don't know how this code relates to production attesting platform or verifier code.

writer.skip_bytes::<368>()?;

#[cfg(feature = "unsafe_parser")]
writer.write_bytes([0u8; 368], ())?;

Looks like the 368 "magic number" shows up at least twice --> use a constant?
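The suggestion can be sketched with a toy writer type (the type and constant names here are illustrative, not the crate's actual API): name the reserved-area size once and reuse it in both the skip and write paths.

```rust
// Hypothetical named constant replacing the repeated 368 literal.
const SIGNATURE_RESERVED_LEN: usize = 368;

// Toy stand-in for the crate's writer; only demonstrates the constant's use.
struct Writer {
    buf: Vec<u8>,
}

impl Writer {
    // Write N zero bytes, mirroring a write_bytes([0u8; N], ()) call.
    fn write_reserved<const N: usize>(&mut self) {
        self.buf.extend_from_slice(&[0u8; N]);
    }
}

fn main() {
    let mut w = Writer { buf: Vec::new() };
    w.write_reserved::<SIGNATURE_RESERVED_LEN>();
    assert_eq!(w.buf.len(), SIGNATURE_RESERVED_LEN);
    println!("ok");
}
```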

@DGonzalezVillal
Member Author

Use of constants is a good idea for all the reserved fields. That's one actionable item I can take.

@cschramm

Or do you mean that the safe parser should check to see if it already has support for the version being reported and then error out on that rather than erroring out on the skip bytes?

Correct. The key argument for the zero checks seems to be to detect newly defined parts and avoid incorrect or incomplete interpretation. At the same time, the parser happily accepts any unknown version and interprets it as V5. There is a fair chance that the zero checks will fail if it is a newer version, but there is also a chance that they won't and the interpretation will be incorrect (the new version changed something for sure). Having "unsafe" and "safe" parser modes gives the chance to get that straight in the latter case.

@DGonzalezVillal
Member Author

At this point, I can't really meaningfully review this code - lack of familiarity, still learning Rust, etc

Just the one comment on use of literals, and not Rust constants, for field sizes.

I will say, in general, that interpretation of attestation report bytes on the attesting platform isn't necessary "in production". It's only necessary on the verifier and it should be easier to update verifiers than attesting platforms since far fewer verifiers. Given my current lack of familiarity, I don't know how this code relates to production attesting platform or verifier code.

You bring up a good point. Attesting platforms don't NEED to know what the report values are, but verifiers do need to be able to parse this report. So this goes back to what @hyperfinitism is saying about handling raw blobs and only converting when absolutely needed.

@DGonzalezVillal
Member Author

Or do you mean that the safe parser should check to see if it already has support for the version being reported and then error out on that rather than erroring out on the skip bytes?

Correct. The key argument for the zero checks seems to be to detect newly defined parts and avoid incorrect or incomplete interpretation. At the same time, the parser happily accepts any unknown version and interprets it as V5. There is a fair chance that the zero checks will fail if it is a newer version, but there is also a chance that they won't and the interpretation will be incorrect (the new version changed something for sure). Having "unsafe" and "safe" parser modes gives the chance to get that straight in the latter case.

I see what you're saying. Say V6 adds a new field but it's all zeros; the parser will report an incomplete V6 report.

@cschramm

Say V6 adds a new field but it's all zeros; the parser will report an incomplete V6 report.

Correct, and also if V6 just shifts things around. The safe parser could reject unknown versions to reliably catch that.
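The version gate described here could look roughly like the following; the constant and function names are hypothetical, and V5 is taken as the newest version the parser knows, per the discussion above.

```rust
// Hypothetical strict-mode version gate: reject report versions the parser
// does not know, instead of relying on zero checks in reserved areas.

const MAX_KNOWN_VERSION: u32 = 5; // newest version this parser understands

fn check_report_version(version: u32) -> Result<u32, String> {
    if version <= MAX_KNOWN_VERSION {
        Ok(version)
    } else {
        // A V6 report that shifts fields around or adds zeroed fields would
        // otherwise be silently misinterpreted as V5.
        Err(format!(
            "unknown attestation report version {version}; refusing to guess the layout"
        ))
    }
}

fn main() {
    assert_eq!(check_report_version(5), Ok(5));
    assert!(check_report_version(6).is_err());
    println!("ok");
}
```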

@DGonzalezVillal
Member Author

Hey @cschramm, @markg-github, and @hyperfinitism,

After going through the feedback and thinking more about the tradeoffs, I wanted to propose a concrete design direction before implementing anything.

To address the forward-compatibility concerns raised by @hyperfinitism while preserving the existing typed API, the idea would be to introduce a small wrapper type that can represent either a parsed structure or its original raw bytes:

pub enum RawEntry<T, const N: usize> {
    Typed(T),
    Raw([u8; N]),
}

During decoding, we would always read the raw bytes first. We then attempt to parse them into the typed representation; if parsing succeeds, we store Typed(T). If parsing fails (for example due to an unknown layout or newer firmware revision), decoding still succeeds, but the entry is preserved as Raw([u8; N]):

fn parse_tcb(bytes: [u8; 8], gen: Generation) -> RawEntry<TcbVersion, 8> {
    match TcbVersion::from_bytes(&bytes, gen) {
        Ok(tcb) => RawEntry::Typed(tcb),
        Err(_) => RawEntry::Raw(bytes),
    }
}

The AttestationReport can then opt into this behavior on a per-field basis behind a feature flag:

pub struct AttestationReport {
    ...
    #[cfg(feature = "raw_entries")]
    pub current_tcb: RawEntry<TcbVersion, 8>,

    #[cfg(not(feature = "raw_entries"))]
    pub current_tcb: TcbVersion,
    ...
}

Conceptually, this shifts the model to:

“If we don’t understand a field yet, we preserve it verbatim rather than rejecting or normalizing it.”

This allows consumers who care about full fidelity and forward compatibility to retain and inspect raw bytes, while preserving the existing strongly-typed API for users who prefer a stable, ergonomic interface. Validation can then be applied explicitly to the typed entries without implicitly discarding unknown data.

I’d appreciate feedback on whether this strikes the right balance before moving forward with an implementation.
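On the consumer side, the RawEntry model sketched above could be used like this. TcbVersion is simplified here to four one-byte fields purely for illustration; the real structure and field names differ.

```rust
// Illustrative consumer of the proposed RawEntry wrapper.

#[derive(Debug)]
pub enum RawEntry<T, const N: usize> {
    Typed(T),
    Raw([u8; N]),
}

// Simplified stand-in for the crate's TcbVersion; not the real layout.
#[derive(Debug)]
struct TcbVersion {
    bootloader: u8,
    tee: u8,
    snp: u8,
    microcode: u8,
}

fn describe(entry: &RawEntry<TcbVersion, 8>) -> String {
    match entry {
        // Known layout: use the typed view.
        RawEntry::Typed(tcb) => format!("known layout: {tcb:?}"),
        // Unknown layout: the original bytes remain available verbatim.
        RawEntry::Raw(bytes) => format!("unknown layout, raw bytes preserved: {bytes:?}"),
    }
}

fn main() {
    let raw: RawEntry<TcbVersion, 8> = RawEntry::Raw([1, 0, 0, 0, 0, 0, 0, 7]);
    assert!(describe(&raw).starts_with("unknown layout"));

    let typed = RawEntry::Typed(TcbVersion { bootloader: 3, tee: 0, snp: 8, microcode: 209 });
    assert!(describe(&typed).starts_with("known layout"));
    println!("ok");
}
```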

@cschramm

@hyperfinitism seems to have much more specific requirements in mind.

For my use case, any solution is fine that provides a parser that does not necessarily break on new report versions.

“If we don’t understand a field yet, we preserve it verbatim rather than rejecting or normalizing it.”

Perfectly fine.

Just one thought I had during the discussion of various solutions:

You are proposing another feature-based solution (raw_entries) that switches between two alternative implementations. One downside of that is that the whole dependency tree has to agree on whether or not to use the feature. If, e.g., one crate depends on raw_entries and thus expects AttestationReport::current_tcb to be a RawEntry<TcbVersion, 8>, another crate in the dependency tree that expects it to be TcbVersion breaks.

I think your unsafe_parser as well as my #366 do not lead to such breakage, but still, they do not allow an unsafe/lax and a safe/strict parser use case side by side: the latter will simply be unsafe/lax as well. #363 has the same issue and also shares the downside of raw_entries, as its lax parser disables API that another use case might require.

A fully flexible solution would allow both safe and unsafe parsing use cases, possibly using a feature to enable one or the other, but not breaking/overruling the other one. It is questionable if anyone needs such flexibility, though.
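The feature-unification concern can be made concrete with a hypothetical two-crate setup (paths and manifest layout are placeholders): Cargo builds each dependency once with the union of all features requested across the graph.

```toml
# crate_a/Cargo.toml — opts into the type-changing feature:
[dependencies]
sev = { path = "../sev", features = ["raw_entries"] }

# crate_b/Cargo.toml — expects the default typed API:
# [dependencies]
# sev = { path = "../sev" }

# When a binary depends on both, cargo compiles `sev` once with
# `raw_entries` enabled for everyone. crate_b's code that treats
# AttestationReport::current_tcb as a plain TcbVersion then fails to
# compile, even though crate_b never asked for the feature.
```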

@DGonzalezVillal
Member Author

Moving discussion to #373

A more immediate solution has been proposed in: #372
