-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
First Version #6
Conversation
This went through way more iterations than I expected going in. I should have known better: cryptography is super subtle! |
Let me try to throw a quick few examples on this: SIWELet's start by assuming we have landed on an IPLD format (e.g. CACAO / ucan-ipld) for SIWE. Signature params:
The DagJWSOK, so there is already infrastructure built to produce DagJWS compliant signatures. Here's how we could make them compatible with Varsig Signature params:
Plain JWSMaybe this is easier? Signature params:
Note that the protected header is automatically generated. This means that we can't sign a JWT with this alg because it needs to include a |
README.md
Outdated
es256k-varsig-header = 0xe7 ; secp256k1 multicodec prefix | ||
sig-bytes = 64(OCTET) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we need to specify the hash alg here?
For secp256k1
specifically, JOSE uses sha256, while ethereum wallets produce signatures with keccak256.
The same might be needed for other ECDSA algs as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other ECDSAs do make sense; we can add the hash parameter here. Do Ethereum wallets use e.g. EIP-191, or do they just sign the data directly?
Which opens the question about how we want to express things like EIP-191. Here's two options:
Per the 191 spec, I think that we need to support all 3 varints under EIP-191... otherwise we can break them out into their own codes.
eip-191-varsig-header = 0xd191 ; EIP-191
eip-191-version = unsigned-varint
sig-bytes = 64(OCTET)
We could also nest the Ethereum message
prefix from the signature type:
eip-191-signature = eip-191-header nested-varsig
eip-191-varsig-header = 0xd191 ; EIP-191
eip-191-version = unsigned-varint
nested-varsig = varsig
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went with the first one, but can change it if needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also updated it to 0xe191
. In multicodec we had used 0xd191
we wanted to namespace all signatures into 0xd_
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This actually raises a different question for me. If I have an IPLD struct, what do I hash?
Suppose I encounter an object with the following schema,
type Data struct {
field1 Bytes
field2 String
sig Varsig
}
and the varsig is varint(0x34) varint(0xed) sig-bytes
, e.g. an ED25519 signature, I know I should use sha256
, but over what?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a call with @Gozala (about some other stuff) and we chatted about the above a bit.
Irakli's Preferred Version
Basically a linear array of nested codecs
;; Irakli's Preference
varsig = multibase-prefix %x34 varsig-header varsig-body
;; Separate spec
content-info = 0xFF encoding-header encoding-body
Here, the signature bytes would be nested inside the content-encoding payload. For example:
rsa-varsig = rsa-varsig-header rsa-hash-algorithm signature-byte-length content-info
;; For example, using brackets to show grouping
0x34 (0x1205 0x12 0x0100 (0xFF 0x71 signature-bytes))
^ ^
| |
| nested-content-encoding
|
rsa-sig
The reason for this style is that you can endlessly nest these, and each segment acts as its own multiformat with its own number of segments. The downside is that if you want to update a segmant, you need to change some deeply nested structure.
Wrapped
Here's basically the opposite version which works by adding info as needed at the top level (which could lead to a lot of varsig types but keeps the "core" signature info together with the bytes)
0x34 (0xFF 0x71 (0x1205 0x12 0x0100 signature-bytes))
^ ^
| |
| rsa-sig
|
wrapped-content-encoding
Tree Structured
Arguably, we know that at minimum we want the encoding, so including at the top level rather than pushing it into every variation:
varsig = multibase-prefix %x34 encoding-info varsig-header varsig-body sig-bytes
0x34 (0x71) (0x1205 0x12 0x0100) (signature-bytes)
^ ^ ^
| | |
| rsa-sig raw-rsa-signature
|
wrapped-content-encoding
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My argument had been that sig-bytes should use their own multiformat to describe how to derive bytes from the source data. If not specified it is raw bytes.
How do I go from the IPLD object to raw
bytes if it's not specified? I think storing the encoding inside of the sig-bytes
seems fine, but the examples in the spec needs to demonstrate this if so.
@expede Thanks for the examples. I'm kind of fine either way as long as we specify the encoding.
I would also like to know if you see any specific use case that doesn't require encoding @Gozala ?
Finally a question around the content encoding itself. Assume we have the following object IPLD object:
type Foo struct {
bar String
sig Varsig
}
My assumption would be that we remove the sig
field (or whatever field contained the varisg, then do the encoding? Is this other peoples assumption as well, or is there a different approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this other peoples assumption as well, or is there a different approach?
This is also my assumpition, and makes my life easier in a few places (though it's not something that's a hard requirement). It's not fully clear how you would sign over just the bar
string (but not the key) — I would assume that this is by using the 0x55 raw data
encoding, or introduce a "forget about the keys just the values stringjoined" encoding prefix. Alternately pushing this decision into the schema layer frees up this choice (like the extended IPLD subtyping example earlier in this thread).
If this default behaviour sounds preferable to you two, I'll add a section about this to the spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@expede Thanks for the examples. I'm kind of fine either way as long as we specify the encoding.
Okay let's go with the one I labelled "Irakli's preference" then, since neither of us have super string feelings against 👍
...updating...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To keep things moving, I went ahead and updated the spec with encoding-info
. It was very awkawrd to expresss when encoding-info included the signature bytes, so I broke out the signature into its own segment — it seems to have worked out 👍
I embedded the encoding table right in this spec; we can always break this out at a later date. I also changed EIP-191 to be a secp256k1 signature over data encoded as EIP-191. @oed it's been a while since I worked with Ethereum signatures, can you confirm that I haven't missed any nuances and that this appraoch makes sense to you?
README.md
Outdated
encoding-info | ||
= %x55 ; Raw bytes multicodec prefix | ||
/ %x70 ; DAG-PB multicodec prefix | ||
/ %x71 ; DAG-CBOR multicodec prefix | ||
/ %x0129 ; DAG-JSON multicodec prefix | ||
/ %x6A77 ; JWT | ||
/ %xE191 eip-191-body encoding-info ; EIP-191 | ||
|
||
eip-191-body | ||
= %x00 eth-validator-address | ||
/ %x01 | ||
/ %x45 message-byte-length ; versions & payloads | ||
|
||
eth-validator-address = 20(OCTET) | ||
message-byte-length = unsigned-varint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should probably point you directly at it @oed 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should include the ethereum address that made the signature here. Will be redundant in many cases. The only version of eip191 used in practice is 0x45 so I would suggest that's what we use.
I think the encoding info for a eip191 msg should look like this:
encoding-info = varint(0xe191) varint(codec)
where codec could be dag-cbor, dag-json, or something custom, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only version of eip191 used in practice is 0x45 so I would suggest that's what we use.
Any objection to keeping the version in there, just dropping the other two?
encoding-info
= %x70 ; DAG-PB multicodec prefix
;; ...
/ %xE191 %x45 message-byte-length encoding-info ; EIP-191 version 0x45
; ^
; |
; Keeping this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, you're absolutley right that we probably won't see 0x00
, and 0x01
is also known as EIP-712, so it could have its own row. Okay that makes sense, I'll remove the %x45
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I think this seems great! Added some nitpicks.
README.md
Outdated
|
||
``` abnf | ||
encoding-info | ||
= %x55 ; Raw bytes multicodec prefix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Raw doens't make sense to me here because afaik there's no way to go from an IPLD object to Raw. By definition raw doesn't have any encode/decode functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right that it doesn't have an encode/decode function. I would assume that we should have some ability to sign over a bare sequence of bytes outside of an object, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess e.g. DAG-CBOR does have raw bytes, so that may be sufficient. Okay, removing!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually scratch that. Even CBOR has a length header on binary. This inclusion makes it impossible express signatrues that are not diretcly in IPLD and need to interoperate with other systems.
How would you feel if 0x55
was a special case for when there was a single sibling, which unwraps it and passes it as raw bytes to the signature?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that would be useful, but I'm not sure RAW is the right way to express this? I don't feel super strongly though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay sure, we can pick another tag. How about 0x5F
? It's _
in ASCII, and there's no namespace collision the multicodec table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure!
README.md
Outdated
/ %x71 ; DAG-CBOR multicodec prefix | ||
/ %x0129 ; DAG-JSON multicodec prefix | ||
/ %x6A77 ; JWT | ||
/ %xE191 eip-191-body encoding-info ; EIP-191 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, eip191 only works over a string. It would need to be paired with an IPLD codec, e.g. dag-cbor, to get the byte string to sign over first. For CACAO specifically we would want to define a new IPLD codec that takes a CACAO object and outputs an ascii SIWE string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was my intention with the recursive encoding-info
. For example:
; Ed25519 256-bytes
; | |
; v v
%x34 (%xed (%xE191 %x0100 %x71) <sig-bytes>)
; ^ ^
; | |
; EIP-191 DAG-CBOR
...but if down the road there another encoding-info
that has more fields that is nestable under EIP-191, then it could also be multi-segment.
Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only works over a string
Also to clarify: you mean a bytestring, not an ASCII/UTF8 string, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Technically the personal_sign
eth-rpc method accepts both utf8 and hex encoded byte strings. But ofc it's encoded as a bytestring before it gets prepended and hashed.
README.md
Outdated
encoding-info | ||
= %x55 ; Raw bytes multicodec prefix | ||
/ %x70 ; DAG-PB multicodec prefix | ||
/ %x71 ; DAG-CBOR multicodec prefix | ||
/ %x0129 ; DAG-JSON multicodec prefix | ||
/ %x6A77 ; JWT | ||
/ %xE191 eip-191-body encoding-info ; EIP-191 | ||
|
||
eip-191-body | ||
= %x00 eth-validator-address | ||
/ %x01 | ||
/ %x45 message-byte-length ; versions & payloads | ||
|
||
eth-validator-address = 20(OCTET) | ||
message-byte-length = unsigned-varint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should include the ethereum address that made the signature here. Will be redundant in many cases. The only version of eip191 used in practice is 0x45 so I would suggest that's what we use.
I think the encoding info for a eip191 msg should look like this:
encoding-info = varint(0xe191) varint(codec)
where codec could be dag-cbor, dag-json, or something custom, etc.
/ %x70 ; DAG-PB multicodec prefix | ||
/ %x71 ; DAG-CBOR multicodec prefix | ||
/ %x0129 ; DAG-JSON multicodec prefix | ||
/ %x6A77 ; JWT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side note: a learning from writing the CACAOv3 spec is that a pure JWT codec wouldn't be enough for UCANs.
Both of these doesn't really make sense for a JWT only codec:
- DIDs need to be converted to binary
- The
v
field needs to be set toucv
in the JWT header
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm interesting; can you tell me more? Is that to normalize the fields across CACAO types, or so that you know which fields to hoist from a flat structure into the header, or something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so that you know which fields to hoist from a flat structure into the header
This primarily.
As for the DIDs, in UCAN they are strings, for CACAO they need to be multidid encoded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that makes sense. Thinking about this, we're liekly going to need a bunch of these for a bunch of cases. IIRC a CACAO itself is encoding agnostic by design. How would you feel about something like this?
encoding-info
= %x55
/ %x70
/ %x71
/ %x0129
/ %x6A77
/ %xCAC0 cacao-encoding ; NEW!
cacao-encoding
= %x3C ; ReCap
/ %x51 ; SIWE
/ %xCA ; UCAN
Alternately we could put UCAN etc right in the table, but that may cause confusion between (e.g.) CACAO and ucan-ipld
Aside
I also vaguely remember discussing there being a difference between the CACAO version and the version of the token that it maps to, but I don't see that in the latest version of ChainAgnostic/CAIPs#196 — maybe not required, but I thought I'd check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to add CACAO specific encoding-info to this spec. These can be defined in the CACAOv3 CAIP only I think?
Old versions of CACAO doesn't use varsig, so doesn't really make sense to specify codes for them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, okay. Let's say for example that you cited where UCAN has ucv
in the header. ucv
is not a standard JWT field, but we do need to sign over the JWT-serialized format. Would the JWT serializer know that a CACAO v
should be serialized as ucv
in the header (I mean, we can also move this field to the claims/body, but it is basically header info since it describes the shape of the body).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes exactly, the JWT serializer would need to know this. Moving the field I think could be a good option unless we want to specify a custom UCAN codec.
README.md
Outdated
|
||
``` abnf | ||
encoding-info | ||
= %x55 ; Raw bytes multicodec prefix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that would be useful, but I'm not sure RAW is the right way to express this? I don't feel super strongly though.
Co-authored-by: Joel Thorstensson <oed@3box.io>
Co-authored-by: Joel Thorstensson <oed@3box.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lgtm!
Final nitpick but not a blocker.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Two nitpicks, but no blockers :)
@bumblefudge evidently I also have the power 💪 We had chatted with Irakli about the final changes in some length, so I'm sure it's fine (in order to keep stuff moving forward). If not, we can cut a 0.2 😉 |
📄 Preview
TODOs Post-PR
0x34
to multicodec to table https://github.com/expede/multicodec/blob/varsig/table.csv#L34