Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ECDSA: Add recovery ID to test cases #103

Open
webmaster128 opened this issue Jan 31, 2024 · 11 comments
Open

ECDSA: Add recovery ID to test cases #103

webmaster128 opened this issue Jan 31, 2024 · 11 comments

Comments

@webmaster128
Copy link

For testing an secp256k1_recover_pubkey implementation (recover pubkey from message and signature), it would be very helpful to have the recovery ID that is available during the signing process.

The recovery ID is described here:

Right now I have it working by

  1. Checking for comment "k*G has a large x-coordinate" and handle that separately
  2. Try secp256k1_recover_pubkey with recover ID 0 and 1 and see if one of them worked

It does not seem to to be exactly trivial to get it from the Java implementation due to the high level APIs. This was discussed here: https://stackoverflow.com/questions/49247397/how-to-extract-public-key-from-ecdsa-signature. But maybe you have some better idea how to approach this.

@bleichenbacher-daniel
Copy link

I looked that this issue just briefly. It seems that public key recovery has its own set of potential implementation issues, so that it would make sense to have its own test and test vector format. For example it is possible to generate invalid signatures with the property that public key recovery leads to a public key with a point at infinity. Obviously an implementation should properly handle such situations.

Generating such a set of test vectors wouldn't be the main issue here. The first step would be to determine a test vector format that has sufficiently many use cases. There are a number of somewhat different protocols that all require public key recovery and hence it would be useful to have good coverage. Another question is finding a good place for a potentially new set of test vectors. I.e., it would be useful to have test vectors for specialized protocols all in the same place.

The JCE interface does indeed not expose underlying EC arithmetic. BouncyCastle has some direct support, but having to rely on specific providers is of course not great. Test vector generation for the current test vectors does indeed try to generate a lot of edge cases. Hence the test vectors contain cases where recover ID 2 or 3 would be necessary.

@webmaster128
Copy link
Author

Having dedicated tests would be ideal indeed. In the meantime I tried generating the recover IDs but did not find an implementationation that is flexible enough which I can run with reasonable setup effort.

The use case I'm most familiar with is Ethereum (which I guess is mostly the same as Ethereum). There you only require recover ID 0 and 1 since other IDs are considered invalid. It would be great to not run everything through DER but have signature = (r, s), message_hash and recover_id as inputs.

Personally I'm interested to use them for ecdsa/secp256{k,r}1. But of course, the more use cases can be covered, the better.

@bleichenbacher-daniel
Copy link

I had some time to look into public key recovery and have written some initial code to generate test vector. Since ECDSA verification through public key recovery use a different algorithm than ordinary ECDSA verification there are different kinds of edge cases that should be checked.

One thing that needs to be done here is to define ECDSA variants, i.e., determine the following properties / parameters:

  • the curve
  • the hash function for hashing the message: i.e. SHA-256, SHA3-256, Keccak256 etc. Some protocols have additional formattings, such as constant prefixes to the message that is being hashed. I would consider such things to be a property of the protocol and not a property of the ECDSA variant.
  • the formatting of the signature: I would assume that the 65-byte vrs format is quite common. Not sure, if there are other formats.
  • valid range for v. Here, I'm assuming that the network id is a parameter of the algorithm, and that a key used for one network is never used for another network.
  • the format of the public keys, i.e., compressed, uncompressed points or other formats.

Concrete specifications (i.e. something resembling a standard) for ECDSA variants would be really helpful. I'm finding a number of small contradictions what is valid and it is often not clear if this is because of distinct protocols or misunderstandings.
Also, since Wycheproof is in limbo it might make sense to find better ways to distribute and test new test vectors.

@webmaster128
Copy link
Author

I would assume that the 65-byte vrs format is quite common [...] valid range for v.

The v variable is Ethereum specific and combines a network ID and a recovery ID in a single value: v = chain_id * 2 + 35 + recovery_id. Using a recovery ID of range 0, 1, 2, 3 is a generic representation.

@bleichenbacher-daniel
Copy link

I'm putting some experimental files here: https://github.com/bleichenbacher-daniel/Rooterberg/tree/main/ecdsa
Please note that these files are just experimental. The main goal is to simply determine the format of the test vectors.
There are essentially two options: the first one is to have test vectors with raw values, r,s and recovery_id. The second one is to have have specific encodings (such as RLP encoded message and signatures) for specific protocols. I don't have an opinion which one is more useful.
The test vector generation is nowhere near complete. The current version focuses on arithmetic errors that could occur during public key reconstruction. The construction of these test vectors depends on how EC points are represented internally. I currently use either Jacobian or affine coordinates during the construction. If there are other commonly used representations then I'll add them too.
I test my general ECDSA implementation against third party libraries and against itself, but I don't have an implementation for the Ethereum or bitcoin specific modifications. Hence it is possible that everything is completely wrong.

@webmaster128
Copy link
Author

That's great stuff, thanks a lot for the effort!

For my purpose the split r, s, revovery_id would be ideal. IMO the protocol specifics can be done by the user of the test vectors.

My experience working with wycheproof was a bit painful as I had to write a custom DER encoder with all sorts of edge case handling just to use the test vectors but do not need DER anywhere in my project.

The expected pubkey could be included in compressed and uncompressed form maybe? Not too hard to do this in the caller but if you include both that would make it more convenient. Ethereum users will need the uncompressed pubkey (65 bytes), my API needs the compressed representation (33 bytes).

@bleichenbacher-daniel
Copy link

I did push some new test vectors with a new format. These are still experimental.

I agree that tester should not have to parse DER encoding if their library is not using DER encoded signatures. Currently, there are test vectors that use P1363 encoding, which are easier to parse, but it would still make sense to add additional encodings. For standards that are well defined it would make sense to add additional encoding as well. Figuring out a convenient format is one of the goals here, I think.

@chipshort
Copy link

I implemented a prototype of some tests using these in here.
One small annoyance is that I currently have to check the comment for "(s not in range 1 .. n//2)" because all of them are marked as invalid, but we actually count high-s signatures as valid. Would be nice to add a flag for that.

@bleichenbacher-daniel
Copy link

Thanks a lot for the feedback.

One of the open issues is indeed to collect information about different variant of ECDSA in use, so that it is possible to generate test vector files for given use cases. If there is a standard, RFC, BIP, etc. that describes which signatures are valid then it should be possible to generate test vector files for that given document or application.

The test vectors do have a header describing the assumptions for the ECDSA test case:
E.g., here is one.

"testType": "EcdsaVerify",
"algorithm": {
"type": "Ecdsa",
"curve": "secp256k1",
"sha": "KECCAK256",
"encoding": "RAW",
"normalize": true,
"signature_generation": "Generic"
},

First testType describes the type of the test, which is verifying signatures here. I'll add other test vectors for deterministic signing (e.g with RFC 6979).
The algorithm itself is described below. Besides primitive and curve, other parameters are fixed too. Here "KECCAK256" is the SHA3-256 predecessor without the domain separation bits added by NIST.
Encoding is the encoding of the signature. Currently I have code for raw signatures (just a tuple of integers), DER encoding, IEEE P1363 encoding and the SSH format.
normalize determines if the signatures need to be normalized for applications that are otherwise susceptible to signature malleability attacks.
And the signature generation determines if a specified method (e.g. RFC 6979) was used to generate the signatures or not.
I'm also collecting and organizing references, which I plan to add to the test vector files too, so that hopefully it becomes more clear which test vectors belong to which application.

If the header doesn't fit your application, then please ask for a new file. Generating the test vectors is far from being the biggest issue here. Finding out what variants exist is far more time consuming.

I'm not too familiar with the crypto coin world and I easily mix up different proposals.

@chipshort
Copy link

chipshort commented Apr 24, 2024

I see, thanks for the extensive explanation. So for me the "normalize": false tests are the relevant ones. I have updated my PR.

@bleichenbacher-daniel
Copy link

I hope this works better now.

Eventually there should be some documentation listing protocols and corresponding parameters in order to simplify the selection of test vectors. Though I've just started collecting this kind of information.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants