Compress public key for chat #1937
for reference, since we use https://github.com/ethereum/go-ethereum/blob/master/crypto/secp256k1/secp256.go#L126 |
Some Code

A little update on this. I've spent a little bit of time experimenting with this and here is the play code I wrote:

```go
func Test(t *testing.T) {
	pk := "04261c55675e55ff25edb50b345cfb3a3f35f60712d251cbaaab97bd50054c6ebc3cd4e22200c68daf7493e1f8da6a190a68a671e2d3977809612424c7c3888bc6"

	// decode hex string to bytes
	pkb, err := hex.DecodeString(pk)
	if err != nil {
		t.Error(err)
	}

	// encode bytes into various base encodings
	b32pk := base32.StdEncoding.EncodeToString(pkb)
	b58pk := base58.Encode(pkb)
	b64pk := base64.StdEncoding.EncodeToString(pkb)

	// create crypto public key from decoded bytes
	x, y := elliptic.Unmarshal(secp256k1.S256(), pkb)

	// compress public key
	cpkb := secp256k1.CompressPubkey(x, y)

	// encode compressed bytes to a hex string
	cpk := hex.EncodeToString(cpkb)

	// encode compressed bytes into various base encodings
	b32cpk := base32.StdEncoding.EncodeToString(cpkb)
	b58cpk := base58.Encode(cpkb)
	b64cpk := base64.StdEncoding.EncodeToString(cpkb)

	// dump values
	spew.Dump(
		pk,
		b32pk,
		b58pk,
		b64pk,
		cpk,
		b32cpk,
		b58cpk,
		b64cpk,
	)

	// decode base58 compressed public key
	b58cpkb := base58.Decode(b58cpk)

	// decompress decoded base58 compressed public key
	dx, dy := secp256k1.DecompressPubkey(b58cpkb)

	// marshal decompressed public key
	mpk := elliptic.Marshal(secp256k1.S256(), dx, dy)

	// hex encode the marshalled decompressed public key
	dpk := hex.EncodeToString(mpk)

	spew.Dump(dpk)
}
```

The Results
QR Code Comparison

- Uncompressed hex encoded QR code with full https URL
- Compressed base58 encoded QR code with full https URL
- Compressed base58 encoded QR code, text-only public key

Summary

We can reduce the size of the key from 130 characters to 44 characters in this sample case. Meaning a reduction in size of 66%! |
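The character counts above follow directly from the byte lengths: hex doubles the bytes, base32 expands 5 bytes to 8 characters, and base64 expands 3 bytes to 4. A stdlib-only sketch (base58 is omitted as it needs a third-party package, and yields roughly 1.37 characters per byte):

```go
package main

import (
	"encoding/base32"
	"encoding/base64"
	"encoding/hex"
	"fmt"
)

// encodedLengths reports how long an n-byte key becomes under each
// standard-library encoding.
func encodedLengths(n int) (hexLen, b32Len, b64Len int) {
	b := make([]byte, n)
	hexLen = len(hex.EncodeToString(b))
	b32Len = len(base32.StdEncoding.EncodeToString(b))
	b64Len = len(base64.StdEncoding.EncodeToString(b))
	return
}

func main() {
	// 65-byte uncompressed vs 33-byte compressed secp256k1 key
	for _, n := range []int{65, 33} {
		h, b32, b64 := encodedLengths(n)
		fmt.Printf("%d bytes -> hex:%d base32:%d base64:%d\n", n, h, b32, b64)
	}
	// 65 bytes -> hex:130 base32:104 base64:88
	// 33 bytes -> hex:66 base32:56 base64:44
}
```
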
good job! I think the only one we need is the normal hex format, so it's basically the original without the |
I think displaying the compressed key as a hex encoded string would allow the key to keep its "Ethereumy" appearance, but if our aim is to reduce the size of the QR code as much as possible then the best choice in my mind is to use the compressed base58 encoded key. A compromise approach would be to display the hex encoding in plain text and show the base58 encoded QR code. Something like below: 0x02261c55675e55ff25edb50b345cfb3a3f35f60712d251cbaaab97bd50054c6ebc |
I think having the standard hex format is important for compatibility. for the qrcode we can ask @hesterbruikman to understand how compressed it should be. In general I think if we can keep it in hex it can be easily compatible with other clients. |
anyway good ideas and good job! I think we can decide this thing and implement it easily! |
Compatibility is important.... Especially if we are to ever sync with other identity solutions (afaik there's no standards there yet really. Not sure what's going on with Iden3, but I imagine we want to keep some doors open). Let me check what this image compression can get us and follow up. This is already a huge step and improvement to legibility and url sharing |
To help with decisions see the side by side
These map to:
|
FYI. We can do a lot of things to make the QR less complex looking, but will require additional specs. But once the specs are in place there should be very little compatibility problems. Decoding hex vs decoding base58 is a fairly minor issue, and I don't see any developers struggling with the concept. But we can have that discussion. In the meantime here is a brief overview of my ideas for reducing the QR code complexity.
|
Sorry for the delay on this @Samyoul! @andremedeiros can you please take a look as well, for review of the compression approach and planning based on the effort required to implement? Your brief looks good to me @Samyoul, with the most compressed version, given that this is documented well for compatibility with any applications or interfaces requiring the full key. I think we should leave the full url as we don't know what application will scan the QR and there's logic in there to drive Status installs |
No problem. I just wondered if there were any blockers I wasn't aware of.

API

There needs to be some clarity on the API signature. In my mind the functions should be:

```go
func DecompressPublicKey(compressedPublicKey string) (publicKey string) { ... }

func CompressPublicKey(publicKey string) (compressedPublicKey string) { ... }
```

This would allow the API to compress and decompress any given public key and not just the key(s) belonging to the user / client. I'd estimate implementing the code for this in

Specification

If we want to compress the QR code as much as possible, and retain the URL, then I'd first implement a compressed base58 encoded public key with a modified URL.

From: https://join.status.im/u/e2QHwp5qjYj6i3jTCfzKVdB1k1dy7NDuoRngzTrARkpT
To: https://j.status.im/u/e2QHwp5qjYj6i3jTCfzKVdB1k1dy7NDuoRngzTrARkpT

Removing an additional (3 * 8 =) 24 data blocks from the QR code. Set the QR code error correction to

I'd estimate that writing the initial specs for this would take about 1 hour, although I don't know how long the review would take. I suppose it depends on how well the specs are written and how much consensus there is around the new specs.

Infra

We'd need to register a new

Additionally we'd need to make a change to the website so that it could interpret the new compressed public key. I do not think this would take much time, but @jakubgs would be the best person to estimate how long it would take. |
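For illustration, here is a stdlib-only sketch of that string-in/string-out shape. It does the curve math with math/big instead of a secp256k1 library (which real code should use); the minimal validation and error messages are my own assumptions:

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"math/big"
)

// secp256k1 field prime p = 2^256 - 2^32 - 977
var secpP, _ = new(big.Int).SetString(
	"fffffffffffffffffffffffffffffffffffffffffffffffffffffffefffffc2f", 16)

func stripHexPrefix(s string) string {
	if len(s) >= 2 && s[:2] == "0x" {
		return s[2:]
	}
	return s
}

// CompressPublicKey turns a 130-char "04..." hex key into the
// 66-char "02/03..." compressed form.
func CompressPublicKey(publicKey string) (string, error) {
	b, err := hex.DecodeString(stripHexPrefix(publicKey))
	if err != nil || len(b) != 65 || b[0] != 0x04 {
		return "", errors.New("not an uncompressed secp256k1 key")
	}
	prefix := byte(0x02)
	if b[64]&1 == 1 { // the parity of Y selects the prefix byte
		prefix = 0x03
	}
	return hex.EncodeToString(append([]byte{prefix}, b[1:33]...)), nil
}

// DecompressPublicKey recovers Y from X via y^2 = x^3 + 7 (mod p);
// since p = 3 (mod 4), a square root of v is v^((p+1)/4) mod p.
func DecompressPublicKey(compressedPublicKey string) (string, error) {
	b, err := hex.DecodeString(stripHexPrefix(compressedPublicKey))
	if err != nil || len(b) != 33 || (b[0] != 0x02 && b[0] != 0x03) {
		return "", errors.New("not a compressed secp256k1 key")
	}
	x := new(big.Int).SetBytes(b[1:])
	y2 := new(big.Int).Exp(x, big.NewInt(3), secpP)
	y2.Add(y2, big.NewInt(7)).Mod(y2, secpP)
	e := new(big.Int).Add(secpP, big.NewInt(1))
	e.Rsh(e, 2) // (p+1)/4
	y := new(big.Int).Exp(y2, e, secpP)
	if uint(b[0]&1) != y.Bit(0) { // pick the root matching the prefix
		y.Sub(secpP, y)
	}
	out := make([]byte, 65)
	out[0] = 0x04
	x.FillBytes(out[1:33])
	y.FillBytes(out[33:])
	return hex.EncodeToString(out), nil
}

func main() {
	// the sample key from earlier in the thread
	cpk, _ := CompressPublicKey("04261c55675e55ff25edb50b345cfb3a3f35f60712d251cbaaab97bd50054c6ebc3cd4e22200c68daf7493e1f8da6a190a68a671e2d3977809612424c7c3888bc6")
	fmt.Println(cpk)
}
```

Note the sketch does not verify that the decompressed point is actually on the curve; a production implementation should (go-ethereum's `secp256k1.DecompressPubkey` does).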
Domain redirect is trivial. Parsing of compressed keys also seems trivial if all we're doing is just making it into a |
Mis-clicks, sorry. But there's one tiny clash we would have, which I don't think matters. Right now the Oh, and if the compressed key is mis-copied as, for example, 1 character short, it will be treated as an ENS name too. |
Consistent key length is an interesting consideration. From my experimenting the compressed keys are 44 chars, but I will generate 100,000+ keys and see if any produce a length that is not 44 chars. I will publish the results here after I've done this.
Interesting point. I think that if we use the following process flow we should have few problems.
By parsing as an ENS first we should reduce the chances of collision significantly. Although there is still the chance that someone's ENS name is exactly the same as a compressed public key.
This should hopefully not be a problem as the compressed key would only be used with QR codes. But if in some future application the compressed key isn't only used with QR codes, then perhaps we can formulate a checksum for the compressed keys. |
We don't do ENS lookups on the universal links backend. |
Would it be a problem to introduce ENS lookups? If we want to implement a version of ID parsing that doesn't require outbound calls, we could do the following:
|
Yes it would be. Because right now the universal links service doesn't depend on any other service. It's stateless, which keeps it simple, robust, and stable. By introducing something like ENS lookup you introduce complexity, delays due to slow lookups which result in delays in rendering the page, and so on. Sounds like a bad idea to me. And the flow you proposed makes sense. Except we attempt to parse only if it's 44 chars as you said. |
Ok, I will probably have the tests on the public keys done some time tomorrow, so I can see if there is any deviation from 44 chars; in that case we can implement some additional code to handle it. Hypothetically, if the variation is +-2 chars we can do something like |
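A sketch of that length-based dispatch for the /u/ path (my own illustration; the 44-45 character range is the observed base58 length from the tests below, and the exact handling of full hex keys is an assumption):

```go
package main

import "fmt"

type linkKind int

const (
	kindENSName linkKind = iota
	kindCompressedKey
	kindFullKey
)

// classifyChatLink decides how to parse the value after /u/ in a
// universal link: 132-char "0x04..." strings are full hex keys,
// 44-45 char strings are treated as base58 compressed keys, and
// everything else falls through to ENS-name handling.
func classifyChatLink(s string) linkKind {
	switch {
	case len(s) == 132 && s[:4] == "0x04":
		return kindFullKey
	case len(s) == 44 || len(s) == 45:
		return kindCompressedKey
	default:
		return kindENSName
	}
}

func main() {
	fmt.Println(classifyChatLink("e2QHwp5qjYj6i3jTCfzKVdB1k1dy7NDuoRngzTrARkpT") == kindCompressedKey)
	fmt.Println(classifyChatLink("vitalik.eth") == kindENSName)
}
```

An ENS name that happens to be exactly 44-45 characters would still collide, which is the residual ambiguity discussed above.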
Test results: I ran a test on 10 million random public keys and the summary of the tests was as follows:

So 68% of the time keys will be 44 chars long and 32% of the time keys will be 45 chars long. There were no other compressed key lengths.

The code

```go
func Test(t *testing.T) {
	tot := map[int]int{}
	for j := 0; j < 100; j++ {
		fmt.Printf("test number : %d\n", j+1)
		out := map[int]int{}
		for i := 0; i < 100000; i++ {
			_, x, y, _ := elliptic.GenerateKey(secp256k1.S256(), rand.Reader)
			cpk := compress(x, y)
			l := len(cpk)
			out[l]++
			tot[l]++
		}
		spew.Dump(out)
	}
	spew.Dump(tot)
}

func compress(x, y *big.Int) string {
	// compress public key
	cpkb := secp256k1.CompressPubkey(x, y)
	// encode compressed bytes into base58
	return base58.Encode(cpkb)
}
```

Full test results
|
That sounds solid. @Samyoul @jakubgs @gravityblast what do you think? Seems to me like this is the proposal:
Result: https://join.status.im/u/e2QHwp5qjYj6i3jTCfzKVdB1k1dy7NDuoRngzTrARkpT

@yenda This issue looks at compression of the chat key to shorten the url and decrease QR code size. Do you foresee any issues requesting compressed keys in status-react? status-im/status-mobile#10325 |
also @jakubgs can join.status.im and j.status.im be used in parallel? As Keycard is looking at printing QR codes |
Using |
Ok, sounds like it's better to leave out that change as we already have enough challenges getting universal link behavior in line across platforms |
I hate iOS |
@status-im/status-core-ui Does anyone know if it is possible to register more than one app url in an iOS app? |
I've just been working on the status specs and I noticed the 3 word pseudonym. We'd have to ensure that parsing the compressed public key also derives the 3 word pseudonym. Presumably not an issue. |
There's https://github.com/status-im/js-status-chat-name which does that for normal public keys, it would be trivial to add a method for doing the same for base58 encoded version. |
@status-im/status-core Could you confirm the following:
Yesterday @andremedeiros said he'd be surprised if this functionality wasn't in place but could I get a solid confirmation from anyone that knows for certain? Thank you 😄 |
The first point is obviously true. We already have that working. |
I did a little bit of digging and it seems possible to have multiple

In the current code we have:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
	<key>aps-environment</key>
	<string>development</string>
	<key>com.apple.developer.associated-domains</key>
	<array>
		<string>applinks:join.status.im</string>
	</array>
	<key>keychain-access-groups</key>
	<array>
		<string>$(AppIdentifierPrefix)im.status.ethereum</string>
	</array>
</dict>
</plist>
```

Which, according to https://stackoverflow.com/questions/39959347/mutiple-associated-domains-with-one-ios-app , allows us to add multiple |
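If that's right, the entitlements change would presumably just extend the existing array with the new domain (a sketch only, untested; `j.status.im` is the shortened domain proposed earlier in the thread):

```xml
<key>com.apple.developer.associated-domains</key>
<array>
	<string>applinks:join.status.im</string>
	<string>applinks:j.status.im</string>
</array>
```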
@hesterbruikman has let me know @Samyoul @cammellos had raised the question of whether introducing compression is worth introducing a 'breaking change'. The meta-problem here is we don't know how to handle breaking changes. This could be a great opportunity for the team to learn how to do this, which is going to make life easier in the future.

Without having the breaking change concerns specified I can only assume it's a matter of backwards compatibility. Without thinking about the solution deeply, it seems we could introduce client read support in an earlier release, and introduce user-facing support in a future release once the majority of clients have upgraded, thereby minimizing backwards compatibility issues. Since I posted last, I'm unsure if version info is useful; rather, the read function should handle all forms of input, trying newest to oldest.

Also this feature isn't purely aesthetic: we encode public keys on-chain. Every on-chain byte costs money and we have an opportunity to halve those costs. That's reason enough, no?

This is a miniature version of the same issue we faced introducing the last breaking change in beta, where we had the opportunity to, but didn't, solve the migration path and rationalized the problem away by saying we're in beta - which did come back and bite us, having to offer support to users who had difficulties accessing their funds. Until we address that issue and think deeply about how we handle migrations we will continually run into the same problem, until we're forced to handle it, or make a big mistake and lose reputation with our users.

@arnetheduck has experience and even communicated solutions to this concern last time this issue came up, but the team wasn't in a position to listen. I'd definitely hit him up to get deeper on the topic. |
I am not sure that this is accurate; it is often repeated, but I think we have gathered a fair amount of experience in handling breaking changes, and repeating this point is just ignoring (or not knowing) all the effort and thought that went into this topic.

As a note, we broke compatibility twice, once when the "new protocol" was introduced (just a week after I joined in spring 2018), and with

We made huge amounts of changes to the protocol without really breaking it, or at the very least handling them in a graceful way: PFS, waku, negotiated topic, the most recent images, are all examples where we didn't break compatibility, and those are fundamental changes to the protocol.

One way we do this is how we technically introduce a new change, and the other is thinking long and hard before making a change. It's not only a question of how to handle changes, but also which changes we want to push for. By now we also understand what kinds of changes are breaking and which ones are technically impossible to make non-breaking. This is one of those cases. The best we can do, and that's what we also suggested, is to add the parsing now, and delay going live with the feature until we are fairly certain that old clients are not around anymore. Nonetheless this is a breaking change, but we can minimize the impact.
There seems to be a mix of concerns here: from the tasks created (here and status-im/status-mobile#10325 ), nothing is related to the blockchain; all the impact is purely UI/UX. If we want to save bytes on the blockchain, we should address that specific problem, and not necessarily change the UI (which is the concern we raised). First we should understand the problem, which no one has done here, before jumping to a solution, and then see how to address it, and whether it needs addressing at all.

How we store keys on-chain is not necessarily related to how we display them to the user, and changing it might not be a breaking change, or it might have a different impact. As a side note, we widely use compressed keys in the protocol already, so it might be that those are also used when writing to the blockchain.

If on-chain space is an issue, I take it this is related to ENS management; as status-go doesn't deal with the blockchain for the most part, we should involve the people familiar with those parts (neither Samuel nor I are probably very familiar with this). So my suggestion would be:
I think we focused on 5, and really haven't spent enough time on 1-4, so I would spend more time on the requirements phase. |
I have little of substance that I can add to Andrea's comment, apart from emphasising a few points.
As Andrea has stated, we discussed strategies for mitigating compatibility issues and coincidentally this very strategy was discussed. The breaking change element is not the real problem, we can implement sensible steps to mitigate the impact of changes we introduce.
I don't have any problem with aesthetics being the only reason for changing the public key format, and if there are also functional reasons that is great too. The core of our questions were centred around "Why is this change needed?". If we know the why we can work out everything else from there. |
if we're going in the direction of multiple key formats, or above all, self-describing key formats, the problem is somewhat addressed in https://github.com/multiformats/multihash / multicodec - maybe we can extend that one? oldest-to-newest schemes tend to cleverly work up to the point where there's no ambiguity - it's often a nice save in case you didn't consider the problem of upgradeability in the first implementation - one would have to "price" the extra type bytes in terms of UX degradation to make a good call here.
would we run an A/B test on things like this? it has the additional advantage that the feature implicitly gets developed behind a toggle, meaning the upgrade/downgrade path is baked in from the start, additionally de-risking the final decision, and crucially, not disabling support for the old behaviour until it's comfortable to do so.
the key conversion algorithm is deterministically bounded - I would not rely on specific lengths until the code that generates the key has been analysed - in particular, I would not build in any assumptions about this into code that gets distributed to users based on an exploratory test alone.
has the performance angle of decompressing keys been explored? for a QR code it won't matter, but for a group chat or your address book it might |
Edit: This research may be redundant in the case we implement Multiformat public keys. 😭 At least the maths is interesting 😎
I've done a little research on this and I've found that deterministically the number of characters of a base58 encoded compressed secp256k1 public key will only be 44 to 46 characters long.

Basically, the compression algorithm will always give a result of 33 bytes for a given secp256k1 pk. The variation is introduced by the novel way that base58 encoding works. Base58 converts a byte array into a decimal representation and then repeatedly divides that value by 58. After each round, the remainder is used as an index into the base58 character table, and the quotient is used as the input for the next round. The final round is reached when the quotient is 0.

The theoretical maximum number of rounds, and therefore characters, is ceil(8n / log2(58)) for an n byte array. For a 33 byte array we get ceil(264 / 5.858) = 46.

The theoretical minimum number of characters in a base58 encoded string is the number of bytes in the input array. If any leading bytes have a value of 0, each one maps to a single '1' character. A compressed public key, however, always begins with 0x02 or 0x03, so it has no leading zero bytes. Meaning the minimum value would be 44, as even the lowest possible 33 byte value with a non-zero leading byte, 256^32, i.e.

```go
[]byte{
	1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
}
```

would give a 44 character string. |
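These bounds can be checked directly with a stdlib-only base58 encoder (a minimal sketch; real code would use a base58 package):

```go
package main

import (
	"fmt"
	"math/big"
)

const b58Alphabet = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

// base58Encode is a minimal big.Int based base58 encoder; each
// leading zero byte becomes a literal '1' character.
func base58Encode(b []byte) string {
	zeros := 0
	for zeros < len(b) && b[zeros] == 0 {
		zeros++
	}
	n := new(big.Int).SetBytes(b)
	var digits []byte
	mod, base := new(big.Int), big.NewInt(58)
	for n.Sign() > 0 {
		n.DivMod(n, base, mod)
		digits = append(digits, b58Alphabet[mod.Int64()])
	}
	out := make([]byte, 0, zeros+len(digits))
	for i := 0; i < zeros; i++ {
		out = append(out, '1')
	}
	for i := len(digits) - 1; i >= 0; i-- { // digits were built least-significant first
		out = append(out, digits[i])
	}
	return string(out)
}

func main() {
	low := make([]byte, 33) // 256^32: 0x01 followed by 32 zero bytes
	low[0] = 1
	high := make([]byte, 33) // largest 33-byte value: all 0xff
	for i := range high {
		high[i] = 0xff
	}
	fmt.Println(len(base58Encode(low)), len(base58Encode(high))) // 44 and 46
}
```
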
Multiformat keys
I've explored multiformats and I think that we can make use of:

A Secp256k1 public key can be described with a prepended byte code

Implementation

My current key

Hex encoded, secp256k1 public key
In raw bytes
Compressed bytes
Compressed bytes encoded into base58 (44 characters)
Multikey version

Hex encoded uncompressed secp256k1 public key
In raw bytes
Compressed bytes
Compressed bytes encoded into base58 (48 characters)
Analysis

Notice the difference between the two keys

Current hex encoded uncompressed pk
Multiformat hex encoded uncompressed pk
Current base58 encoded compressed pk
Multiformat base58 encoded compressed pk
Summary
|
After speaking with @jarradh today we've settled that the main reason for making these changes is for UX. Any functional benefit is coincidental and is not the focus of this change. All we want to do is:
|
With regards to on-chain formats, it should generally be less necessary to use something like multiformats or self-identifying keys - as long as the contract can be uniquely identified, that is enough to disambiguate the data within - among other things, this is why ethereum itself for example strips the compression type from the secp signatures it stores - it's just waste - a similar argument can be applied here. |
FYI, I've got a working unspeced version of this #1990 . This supports
Uses multicodec identifiers for key types, and multibase encoding of bytes. |
Spec is now submitted for peer review status-im/specs#137 |
## What has changed?

I've introduced to the public binding functionality that will compress and decompress public keys of a variety of encodings and key types. This functionality supports all major byte encoding formats and the following EC public key types:

- `secp256k1` pks
- `bls12-381 g1` pks
- `bls12-381 g2` pks

## Why make the change?

We want shorter public (chat) keys and we want to be future proof and encoding agnostic. See the issue here #1937

---

* Added basic signature for compresspk and uncompresspk
* Added basic encoding information
* make vendor
* formatted imports for the linter
* Reformatted imports hoping linter likes it
* This linter is capricious
* Added check that the secp256k1 key is valid
* Added test for valid key
* Added multiformat/go-varint dep
* Added public key type handling
* Added key decompression with key type handling
* Added handling for '0x' type indentifying
* Added more robust testing
* Less lint for the linting gods
* make vendor for bls12_381
* Added bls12-381 compression tests
* Added decompress key expected results
* Refactor of typed and untyped keys in tests
* Lint god appeasment
* Refactor of sample public keys
* Implemented bls12-381 decompression
* gofmt
* Renamed decode/encode funcs to be more descriptive
* Added binary bindings for key de/compression
* Refactor of func parameters: gomobile is a bit tempermental using raw bytes as a parameter, so I've decided to use string only inputs and outputs
* gofmt
* Added function documentation
* Moved multiformat de/compression into api/multiformat ns
* Moved multiformat de/compression into api/multiformat ns
* Changed compress to serialize on API
#1990 Has been merged. |
## What has changed?

I've added detailed specs for the implementation of public key compression and decompression. The specifications detail the use of the following `multiformat` features:

- `multibase`
- `multicodec`
- `unsigned-varint`

`multiformat` is used to ensure that the implementation has as much flexibility and robustness as feasible.

## Why make the change?

The usage of key de/compression is outside the typical usage of public keys and requires a degree of background knowledge to correctly implement. The purpose of this specification change is to provide this needed background knowledge.

Please also see

- status-im/status-go#1937
- status-im/status-go#1990

---

* Added Public key compression specs
* Added recommendation for encoding type of compressed keys
* Added unrecognised words to wordlist
* Add multibase to the wordlist
* Added a basic example of the multiformat EC key compression concept
* Added parsable to wordlist
* Hex is the only Lingua Franca we need
* Language to make pk de/compression SHOULD implement
* Added terms to glossary explaining key de/compression
* Change terminology from compress to serialise
* Added rationale for public key compression
* Added deserialization to the wordlist
* Concise sentence
* Added url to the wordlist
status-im/specs#137 Has been merged. Now readable at https://specs.status.im/spec/2#public-key-serialization |
As far as
|
Problem
The chat key is currently shown (and used) as the full public key. This is unnecessarily daunting. Compressing the public key used for chat makes it less daunting and more space efficient in the UI, as well as reducing the file size and visual complexity of the QR code.
Implementation
Implement two functions:
Use the `02` and `03` prefixes for the compressed key and maintain `04` for the decompressed key
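Those prefixes mean the key form can be detected from the first byte alone. A small sketch (the function name and the "invalid" fallback are my own illustration):

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// keyForm reports whether a hex encoded secp256k1 key is compressed
// (02/03 prefix, 33 bytes) or decompressed (04 prefix, 65 bytes).
func keyForm(hexKey string) string {
	if len(hexKey) >= 2 && hexKey[:2] == "0x" {
		hexKey = hexKey[2:]
	}
	b, err := hex.DecodeString(hexKey)
	if err != nil || len(b) == 0 {
		return "invalid"
	}
	switch {
	case (b[0] == 0x02 || b[0] == 0x03) && len(b) == 33:
		return "compressed"
	case b[0] == 0x04 && len(b) == 65:
		return "decompressed"
	}
	return "invalid"
}

func main() {
	// the sample compressed key from this thread
	fmt.Println(keyForm("0x02261c55675e55ff25edb50b345cfb3a3f35f60712d251cbaaab97bd50054c6ebc"))
}
```
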
status-react should always receive the decompressed key (no changes to current implementation) unless the compressed key is requested. Changes for this will be documented in a separate issue and include:
- `Profile` > Share icon
- `Invite friends` button and `+` button > `Invite friends`
- `Profile details` and `Profile details` > Share icon
Acceptance Criteria
Documentation is updated if needed: https://specs.status.im/spec/2
Notes
for guidance:
https://stackoverflow.com/questions/43629265/deriving-an-ecdsa-uncompressed-public-key-from-a-compressed-one
https://bitcoin.stackexchange.com/questions/69315/how-are-compressed-pubkeys-generated
https://crypto.stackexchange.com/questions/8914/ecdsa-compressed-public-key-point-back-to-uncompressed-public-key-point
Future Steps
Implementation in status-react to request compressed key