Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch serde_cbor to fork to allow unsafe unchecked utf8 deserialization #745

Merged
merged 2 commits into from
Oct 14, 2020

Conversation

austinabell
Copy link
Contributor

@austinabell austinabell commented Oct 14, 2020

Summary of changes
Changes introduced in this pull request:

  • In the go-actors implementation, they serialize/deserialize an unchecked utf8 string for their DealProposal. Although it is unused, we need to be able to (de)serialize the bytes as a string.
    • This isn't as simple as just manually reading bytes for this structure, because of how Serde is architected. Since the deserialization goes through the serde_cbor deserializer and strings are automatically checked, this is impossible without replacing serde altogether for serialization (which isn't feasible right now)
    • Commit on serde_cbor fork used: ChainSafe/cbor@3e7bf81
    • This should be safe for us, because this is the only string that is Cbor serialized and it isn't used by the protocol at all (unless they plan to use in v2) but the negative is that it potentially allows non utf8 strings to be deserialized in the future.
    • Issue in specs-actors DealProposal's Label field serializes as non-UTF8 filecoin-project/specs-actors#1248
  • Unskips the utf8 failing tests (as well as others that weren't updated) and is now 522/592

This should absolutely be a temporary solution, and be removed after this bug is removed from the protocol.

The benefit to the fork, however, is that we can add the serialization length limits that they do in the go implementation (and therefore part of the spec)

There is an alternative to this that I was considering, which is on deserialization if the bytes are not utf8, deserialize the data as bytes instead. This is also a simple solution but to me seemed to give unintended side effects and also requires a custom type and deserialization to unsafely convert the bytes to str before serializing. Technically this way is more safe, because no non utf8 bytes are left as string outside of serialization, but is logically confusing and inconsistent.

Reference issue to close (if applicable)

Closes

Other information and links

@austinabell austinabell merged commit 7e7d79c into main Oct 14, 2020
@austinabell austinabell deleted the austin/nonutf8string branch October 14, 2020 16:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Interop Interop with Lotus/specs-actors and testing Status: Needs Review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants