fix: Fix JSON canonicalisation #247
Conversation
Hrm, is this also a python-tuf issue? @lukpueh
Ignore this comment, snapshot refers to file hashes, not metadata.
+10000!!! I have had this issue for a very long time! THANK YOU!
FYI we may want a `repo.Payload(rolename)` method to export the canonicalized bytes that should be signed, and maybe some private function that takes `data.Signed` -> canonicalized bytes that `repo.Payload` can call. @znewman01
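A rough sketch of what that could look like (the type shapes, the `SignedMeta` lookup, and the `cjson` canonical-JSON dependency are assumptions for illustration, not go-tuf's actual code):

```go
// Hypothetical sketch of the proposed API; names and shapes here are
// illustrative assumptions, not the actual go-tuf implementation.
package repo

import (
	"encoding/json"
	"fmt"

	cjson "github.com/tent/canonical-json-go"
)

// Signed mirrors the signed-metadata wrapper: the "signed" portion is
// kept as raw JSON, exactly as it will be serialised.
type Signed struct {
	Signed json.RawMessage `json:"signed"`
}

type Repo struct {
	meta map[string]*Signed // rolename -> parsed metadata (assumed field)
}

// canonicalBytes is the private data.Signed -> canonicalised-bytes
// helper discussed above.
func canonicalBytes(s *Signed) ([]byte, error) {
	var decoded interface{}
	if err := json.Unmarshal(s.Signed, &decoded); err != nil {
		return nil, err
	}
	return cjson.Marshal(decoded)
}

// Payload exports the canonicalised bytes that should be signed for
// the given role, e.g. "root.json".
func (r *Repo) Payload(rolename string) ([]byte, error) {
	s, ok := r.meta[rolename]
	if !ok {
		return nil, fmt.Errorf("unknown role %q", rolename)
	}
	return canonicalBytes(s)
}
```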
+1. Two points:

fyi @jku
I think I'd suggest neither, in fact. Keeping all serialised representations un-canonicalised means, first, that they're valid JSON, and second, that it's possible to modify the payload before doing sign-payload without having to mess around with canonicalisation. Fundamentally, the only things that ever need to consider canonicalisation are `sign.Sign` and `verifier.Verify`, and all these are really doing is canonicalising the data, passing it to the specific key implementation (which operates on raw `[]byte`), and then attaching the signature to the result. Following this, the CLI sign-payload command is really just the CLI analogue to `sign.Sign`, and so it should accept un-canonicalised data.
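To make that boundary concrete, here's a minimal sketch under those assumptions (the `Signer` interface and `Signature` type are simplified stand-ins, not go-tuf's actual definitions):

```go
// Minimal sketch of the boundary being described: canonicalisation
// happens only inside Sign (and, symmetrically, Verify); everything
// else passes around ordinary, valid JSON. Types are simplified.
package sign

import (
	"encoding/json"

	cjson "github.com/tent/canonical-json-go"
)

// Signer is any key implementation that signs raw bytes.
type Signer interface {
	SignMessage(message []byte) ([]byte, error)
}

type Signature struct {
	Sig []byte `json:"sig"`
}

// Sign canonicalises the un-canonicalised payload, hands the raw
// bytes to the key implementation, and returns the signature to be
// attached to the metadata.
func Sign(payload json.RawMessage, s Signer) (*Signature, error) {
	var decoded interface{}
	if err := json.Unmarshal(payload, &decoded); err != nil {
		return nil, err
	}
	canonical, err := cjson.Marshal(decoded)
	if err != nil {
		return nil, err
	}
	sig, err := s.SignMessage(canonical)
	if err != nil {
		return nil, err
	}
	return &Signature{Sig: sig}, nil
}
```

The point of this shape is that callers only ever see ordinary JSON; canonical bytes exist transiently inside `Sign` and `Verify`.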
The canonicalised format is specified precisely to ensure this doesn't happen. If a client canonicalises differently, then it's not following the specification and would be unable to verify signatures correctly.
Yep, certainly quite messy. I did look at how to do it better, and would perhaps question what these tests are trying to do vs using `sign.Sign` (they seem to just re-implement a lot of that logic in the test). With the changes in #214 to split things out separately into payload and sign, I think it becomes a bit easier to improve these tests.
Thanks for thinking hard about these questions! I think we're going to wind up with a better interface for users because of it.
The way I'm thinking about it, that counts as "on the way in, but not the way out". I'd actually lean towards "the way out but not necessarily the way in" to enable a use case where non-TUF tools (or a different TUF implementation) can be used to sign a blob. This workflow isn't available at the moment, but you could imagine one.

I guess the question is: "what is a payload?" If it's (1) a blob you can sign as-is, then we probably do want it to canonicalize. If it's just (2) a representation of the metadata that you need to sign, then I agree with your interpretation.
Maybe this is a case where the CLI and API shouldn't correspond one-to-one. The way I've been thinking about it:
This doesn't allow any flows where the metadata is modified before signing; perhaps we really want:
But it seems like we're really running up against the fact that the go-tuf CLI is more evolved than designed here :)
Strictly speaking, the specification defers format details to the POUF of the implementation. I can imagine wanting to sign payloads in very offline environments that don't have the full go-tuf implementation available (and consequently don't know how to canonicalize), but do have a different TUF implementation or just have the capacity to sign binary blobs in standard formats. This isn't a client (so there's no need to verify signatures), nor is it a repo (so there's no expectation of a full TUF implementation). Maybe there's no actual demand for this use case, though.
Because the signature needs to be in a custom format for the metadata, in this design having that tool do the canonicalisation seems optimal, as it means it cannot produce invalid signatures due to user error of providing invalid canonicalised JSON.
I'd misinterpreted this when reading the spec: I'd thought of this more like an HTTP Accept header, where a client could request the metadata in different formats and all signing information would remain intact regardless of the serialization... but yes, it seems this is not the case, which is a bit disappointing and makes the choice of canonical JSON a bit annoying, as it doesn't produce a valid JSON serialisation. I feel like the only way to really make the spec work coherently is to mandate that all clients use the same canonicalisation.
If you're right that we want a command like that, I'm not 100% convinced: I think there's a small enough number of tools that could plausibly be used to sign (and their signatures are usually in standard formats) that we can special-case them.

I don't want to block this PR on solving the whole problem, though. So I guess the question is: does moving canonicalization here commit us to anything? I think the answer is: only if we think that we might want to use the go-tuf client to sign payloads from other implementations. I'm okay with giving up on that hope, and then longer-run having a coherent strategy around where canonicalization should happen. WDYT?
I've filed an issue to figure out a coherent canonicalization strategy: #261. For now, this PR fixes a real issue and doesn't seem to commit us to any decisions we're not comfortable with, so I'm okay with merging it.
I agree, let's merge this. I will put out a fix for CI, since it seems broken on Go 1.17 and golangci.
Who else should review this?
@mnm678 @ethan-lowman-dd could one of you PTAL?
Could you please rebase?
Force-pushed from 5f10ab4 to e17e19a
rebased onto master
Rebasing again should fix the failed conventional commit check. |
Force-pushed from e17e19a to 78ea7c6
Bit confused by the failing security scan, this is not related to my changes.. should I just ignore it?
Yeah, I'll dismiss! Thanks |
Fixes #246
Release Notes:
Types of changes:
I don't believe this breaks any compatibility; a valid client should still canonicalise the message before hashing.
Description of the changes being introduced by the pull request:
The JSON canonicalisation format (specified here: https://wiki.laptop.org/go/Canonical_JSON) does not escape control characters in string values; however, the JSON standard (https://www.ecma-international.org/wp-content/uploads/ECMA-404_2nd_edition_december_2017.pdf) requires that all control characters (defined as U+0000 to U+001F) MUST be escaped in strings.

As a result, newline characters in strings are not escaped, which produces invalid JSON, and go-tuf will fail to marshal metadata containing these sequences. This is particularly problematic for RSA and ECDSA keys, where the specification states they are encoded in PEM format, which will inevitably contain newline characters.
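For illustration, a standalone sketch of the difference on a PEM-style string (assuming the `github.com/tent/canonical-json-go` package; this is a demo, not code from this PR):

```go
// Standalone illustration of the problem: encoding/json escapes the
// newline as required by ECMA-404, while canonical JSON emits it
// verbatim, producing output that standard JSON parsers reject.
package main

import (
	"encoding/json"
	"fmt"

	cjson "github.com/tent/canonical-json-go"
)

func main() {
	pem := "-----BEGIN PUBLIC KEY-----\nMFkw...\n-----END PUBLIC KEY-----"

	std, _ := json.Marshal(pem)
	fmt.Printf("%s\n", std) // newline escaped as \n: valid JSON

	canon, _ := cjson.Marshal(pem)
	fmt.Printf("%s\n", canon) // contains a literal newline: invalid per ECMA-404
}
```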
This change makes repo use the canonicalised JSON format only for calculating the signature of the message. The serialised message itself uses the standard library `encoding/json`, which correctly escapes all control characters and produces a valid JSON encoding.
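A simplified sketch of that split, signing over the canonical form while serialising with `encoding/json` (types and the signature encoding are glossed over; this is not the PR's exact diff):

```go
// Simplified sketch of the approach: the signature is computed over
// the canonical form, while the stored bytes come from encoding/json
// and are therefore valid JSON.
package repo

import (
	"encoding/json"

	cjson "github.com/tent/canonical-json-go"
)

type signer interface {
	SignMessage([]byte) ([]byte, error)
}

func marshalSigned(signed interface{}, s signer) ([]byte, error) {
	canonical, err := cjson.Marshal(signed)
	if err != nil {
		return nil, err
	}
	sig, err := s.SignMessage(canonical) // sign the canonical bytes
	if err != nil {
		return nil, err
	}
	// Serialise with the standard library so control characters in
	// strings (e.g. newlines in PEM keys) are properly escaped.
	// (Signature encoding is simplified here: json.Marshal base64s
	// the raw bytes, whereas real metadata uses its own encoding.)
	return json.Marshal(map[string]interface{}{
		"signed":     signed,
		"signatures": []map[string]interface{}{{"sig": sig}},
	})
}
```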
The changes to tests are required as they use the `github.com/theupdateframework/go-tuf/pkg/keys.Signer.SignMessage` method directly. This method accepts a `[]byte` instead of a `json.RawMessage`, which implies that it expects a canonicalised message and will not perform canonicalisation.
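So a test that calls `SignMessage` directly has to canonicalise first; a hedged sketch of that pattern (the helper name and fixtures are hypothetical, not the PR's actual test code):

```go
package repo_test

import (
	"encoding/json"
	"testing"

	cjson "github.com/tent/canonical-json-go"
	"github.com/theupdateframework/go-tuf/pkg/keys"
)

// signCanonical is a hypothetical test helper: SignMessage operates
// on raw []byte, so the caller canonicalises before signing.
func signCanonical(t *testing.T, signer keys.Signer, signed json.RawMessage) []byte {
	t.Helper()

	var decoded interface{}
	if err := json.Unmarshal(signed, &decoded); err != nil {
		t.Fatal(err)
	}
	canonical, err := cjson.Marshal(decoded)
	if err != nil {
		t.Fatal(err)
	}
	sig, err := signer.SignMessage(canonical)
	if err != nil {
		t.Fatal(err)
	}
	return sig
}
```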