Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for monero generate_key_derivation and subaddress/output … #3

Open
wants to merge 1 commit into
base: monero
from

Conversation

@vtnerd
Copy link
Contributor

vtnerd commented Jan 12, 2020

Reviewers, please look at updated README.md first which explains the goals of this PR in particular.


On ryzen 3900x the speedup for crypto::generate_key_derivation is ~153% and the speedup for crypto::derive_subaddress_public_key is ~141%. When looking at the total crypto operations for a standard 2-output transaction, this results in a ~147% speedup on this same CPU. On a recentish Intel laptop the speedup for 2-output transactions is ~140%.

I have yet to do a speed comparison in wallet2 directly. I would not expect an exact ~140%-192% speedup because there are other factors limiting the scanning. The crypto operations are fairly dominant in the profile graphs though, so there will be decent gains. I will publish numbers at some point, unless it appears fairly quickly that this PR is going to be rejected.


@kenshi84 @hyc @stoffu @xiphon @moneromooo-monero there is somewhat advanced ASM in this PR. The ASM included here is adapted from the existing ASM in crypto_sign/ed25519/<variant>/choose_t.s. That assembly was written where z == 1 since the scalarmult tables could be pre-generated. The ASM here took the existing ASM and added support for the z value in the group element with an unused x86-64 register. Diffing against the originals will show this.

The other options: (1) drop constant-time nature possibly leaking viewkey in some contexts or (2) 7 additional field inversions when generating the table for scalarmult. This table is needed when doing the ECDH for every tx (ed25519 typically does not have ECDH against arbitrary points, only G). Pinging @SarangNoether @surae-noether too.


@moneromooo-monero @moneroexamples This was specifically designed to be usable in outside projects. I believe openmonero and monero-light-wallet-server can drop the dependency on internal monero-project/monero libraries if they use ZMQ-JSON-RPC and add: (1) keccak, the (2) custom varint serialization, and (3) support for parsing tx_extra. The first two are required for crypto::derivation_to_scalar which is not provided by these sets of changes. The other big blocker for those projects is portability - @moneroexamples please let me know if you are interested and we can gauge support for adding ref10 to this repo. This will not be used by the default monero-wallet, but would provide portability for the wallet servers.

The other advantage to outside projects is all of this code should be easier to audit than the internal monero crypto library - the ref10 library should require zero changes and these amd64 variants only had position-independent changes to the ASM.

@SarangNoether

This comment has been minimized.

Copy link

SarangNoether commented Jan 12, 2020

^ Surae Noether is @b-g-goodell here.

@vtnerd

This comment has been minimized.

Copy link
Contributor Author

vtnerd commented Jan 12, 2020

I mistakenly ran the benchmarks with randomx mining on my 3900x whoops. The numbers are ~141%, ~153% and ~147% above which is closer to the laptop in percentage increase (overall the timing is much faster). Updated the comment above.

@moneroexamples

This comment has been minimized.

Copy link

moneroexamples commented Jan 13, 2020

@vtnerd

I don't have anything against this. But I don't know when I would be able to adapt openmonero to use it.

p.s.
Awesome job, btw.

@vtnerd

This comment has been minimized.

Copy link
Contributor Author

vtnerd commented Jan 13, 2020

@moneroexamples I'm glad your initial reaction was positive. I primarily wanted to make you aware of this effort because there are significant performance advantages and ease-of-use if ref10 is added. Let me know if there is something that can accommodate your usage.

Another thing I didn't mention - I can add support for amd64-64-24k, amd64-51-30k, and ref10 ECDH "precomp" (if reviewers were willing). There isn't any benefit to the default monero wallet, but these servers can amortize the cost of the first ECDH step by pre-computing the table once for N viewkeys. The time savings is x% per-viewkey - 1 (I forget the exact amount at the moment).

@vtnerd

This comment has been minimized.

Copy link
Contributor Author

vtnerd commented Jan 13, 2020

I was able to implement the two functions described in the README.md, using tweetnacl as the base in 301 lines of C code.

@b-g-goodell @SarangNoether @moneromooo-monero would be it easier to drop the amd64 stuff in this PR, and provide just the CMake + tweetnacl version for now? It would limit the review to just the library concept with a minimal C base implementation that can be reviewed by the MRL.

@vtnerd

This comment has been minimized.

Copy link
Contributor Author

vtnerd commented Jan 21, 2020

Pinging @knaccc , who may also be interested ...

@knaccc

This comment has been minimized.

Copy link

knaccc commented Jan 21, 2020

@vtnerd Thanks, this sounds awesome.

(1) drop constant-time nature possibly leaking viewkey in some contexts

I'm trying to think of ways in which this leak could happen, it seems extremely low risk to me to drop constant-time ECDH blockchain scanning if it significantly improves performance

or (2) 7 additional field inversions when generating the table for scalarmult.

Please could you elaborate on what you mean here? What is it that is creating 7 additional inversions?

I can add support for amd64-64-24k, amd64-51-30k, and ref10 ECDH "precomp" (if reviewers were willing)

I'd assume that being able to do precomp is very useful for doing a scalarmult with e.g. H, right?

@vtnerd

This comment has been minimized.

Copy link
Contributor Author

vtnerd commented Jan 21, 2020

I'm trying to think of ways in which this leak could happen, it seems extremely low risk to me to drop constant-time ECDH blockchain scanning if it significantly improves performance

The monero daemon and viewkey are not necessarily on the same machine or owner. Monero now provides a mechanism for finding publicly available RPC ports for instance. The usage is clearly less common, but I think its preferable to "do the same thing" here and keep the behavior consistent with the existing Monero crypto code. This does have to be weighed against the complexity of having to review/maintain ASM code that is close but slightly different to the upstream project.

Please could you elaborate on what you mean here? What is it that is creating 7 additional inversions?

The arbitrary point scalarmult code (for use by the public-key in a transaction) generates a 3-bit "table" for performance purposes - it reduces the total point doublings needed. The size was chosen because (1) existing monero ref10 code creates a table of this size and (2) sc25519_window4 provided by the amd64 implementations generates a "bit-array" for this size table.

The problem is the existing constant-time table selection ASM does not use projection (z) coordinates because only pre-generated tables for G are used by the ASM. The existing ASM can be used if each element in the runtime-generated table have the projection z coordinate normalized to 1. This requires a field inversion and 3 field multiplications for 7 elements (the first element is always normalized to z == 1).

I'd assume that being able to do precomp is very useful for doing a scalarmult with e.g. H, right?

Probably, but I haven't looked into that. We should be able to put a table in static-memory too, like with point-G.

It is also useful for doing scalarmult against a transaction public key when 2+ viewkeys have to be scanned against it. This occurs primarily in "light wallet" servers or any wallet tracking multiple primary accounts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.