-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
VanityGPG is slow #2
Comments
Thank you for the suggestion. If I did not understood it wrong, what you are talking about is similar to Scallion's approach. I'm not familiar with neither OpenPGP nor cryptography, so I took the simplest approach. I'll try to do that, or if you are willing to help, PRs are extremely welcome. By the way, I'm so glad to see suggestions like what you said here. To be honest, I started this as a meme project as I was joking about it with my friends, never thought I could receive such useful suggestion. I just want to say thank you again! |
I just wrote a working PoC, the updated version should come soon. |
No problem!
How scallion works is described here. If I understand correctly, scallion increments the public key's exponent. I don't know how safe that is... generating RSA keys is a tricky business. Letting a cryptographic library generate the key pair, and then varying the creation time seems safer. But, I'm not a cryptographer. Perhaps the approach is sound.
No worries!
:) |
@nwalfield Please checkout the 0.3 branch! On my Jetson Nano I got an average speed of over 850,000 fingerprints per seconds (was ~1000/s before), which is, just as you described, several magnitudes faster than my original approach. Thank you! |
Very cool! A few more ideas for improving performance:
Some general comments:
FWIW, on my Xeon E3-1275 v6 (4 cores, 8 threads), I see about 800k tries per second using the following configuration: |
I just realized I didn't compile with --release. With that, I get about 8M tries per second. |
Thanks!
I cannot find a proper way to do that. The main loop is controlled by Rayon for parallel processing and I cannot come up with a clean enough solution to do that.
The progress bar is controlled by a separate thread, and the counter uses atomic operations, so I don't think this is a big problem...
Will do. I always use Ed25519 personally. but GPG defaults to RSA so I just used RSA here.
I choose to generate the full general-purpose key is because some older versions of GPG that does not support adding ECC subkeys are still in use. That function is also executed only when a match has been found. Maybe I can add an option for generating subkeys or not.
Will do.
That's probably because the status thread was spawned before the initialization finishes, and the "tries per second" is simply the result of dividing the total number of fingerprints by the number of seconds passed after the thread was spawned. I'll fix that. |
And list possible values. Signed-off-by: Kay Lin <i@v2bv.net>
Take a look at https://github.com/nwalfield/VanityGPG/tree/ideas for some ideas of how to speed it up in the way that I discussed. |
atomic operations create a memory fence, and lock the cache. They are better than a lock, but they are still expensive, especially when they are contended. |
Now it shows a reading of over 2 million tries per second on my Jetson Nano 🤣 |
Nice! I suspect that most people will want to choose a short key id. It might make sense to provide matching against a fixed suffix (perhaps even a list taken either from the command line or a file) in addition to matching a regex. |
That may be over-complicating things. Regex it self can be used to achieve this and the regex crate has a decent performance. I'll write a brief description about how to do thatin the README.
…On Oct 4, 2020, 02:02 -0700, nwalfield ***@***.***>, wrote:
> Now it shows a reading of over 2 million tries per second on my Jetson Nano rofl
Nice! I suspect that most people will want to choose a short key id. It might make sense to provide matching against a fixed suffix (perhaps even a list taken either from the command line or a file) in addition to matching a regex.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub, or unsubscribe.
|
I think it would be interesting to benchmark it. When I replaced the regex with a fixed string comparison, I saw a 5-10% improvement. |
I bet that you could get a factor 4-5 improvement in throughput if you computed the fingerprint manually, and adjusted the timestamp in place. I'd serialize the key into a buffer:
To compute the hash, you could use the sha1 crate. |
FWIW, there is also https://crates.io/crates/sha-1 :D |
Of course it would be faster with
Checkout the newest commit on the 0.3 branch, I introduced a new rPGP backend. It is purely implemented in rust (so Edit: The performance has only been doubled on my Jetson Nano, but I don't know why. |
Could you provide some concrete feedback about that? sequoia-openpgp provides a low-level API, but we've tried to make it as easy to use as possible with lots of documentation and examples. What were the issues that you ran into? What did you find to be too complicated? |
Hmmm... Let me do some clarafication. The sequoia-openpgp crate itself is user-friendly enough and is full of features, but for this project I only need the key generation functionality; As I mentioned above, nettle-sys requires libclang to be present in the system, but it could be a pain to achieve in certain situation (e.g. FreeBSD does not offer binary for it, and compiling LLVM+clang is kind of painful). Don't get me wrong, it does not mean I don't like sequoia, it's just heavier than ideal for this project. If I'm going to start any new project that needs anything related to OpenPGP, sequoia is definitely my first choice. It is easier to use and may be faster than rPGP. Back to VanityGPG, I will maintain these two backends in parallel since rPGP lacks security audit and does not support the ECC curves from NIST, while sequoia requires FFI. These two backends are feature-gated and can be switched on compilation. |
Thanks for the feedback. To be clear, I'm not offended, if you don't like something about Sequoia. In fact, I'd appreciate any type of feedback.
What APIs were you thinking about here? FWIW, sequoia should be in the FreeBSD ports tree these days. I'm not a FreeBSD user, so I don't know how one would go about using it. |
I hope sequoia can get better and better. Actually I have another small question. I see there is a struct named KeyBlueprint but it is private and the actual generation work is left to the CertBuilder struct's impl blocks. Is it a good idea to move the key generation part to KeyBlueprint and make it public? Also, is it a good idea to make the generate method of CipherSuite public? In VanityGPG, I somehow have to recreate these two methods since they are private.
Actually no... My brain got confused back then and sequoia's documentations are good, seems I've messed up my memory. Sorry for that.
Yes it's in the FreeBSD ports, but as you know Rust does not have a stable ABI, and install nettle-sys from cargo requires libclang, so... Actually the people here to be blamed should be someone from the maintainers of FreeBSD. From a message in the mailing list, they say they don't do binary builds for libclang is because they think building that .so would consume too much resources, and I think that's kind of ridiculous. |
I've just pushed a commit to let the sequoia backend use this strategy. |
Do you think this issue can be closed now? |
Looking good! FWIW, my hacked up version was at 29.7 Mhashes/sec on my computer. This version is at 21.0 Mhashes/sec. These are the things that could yields some more improvement, I suspect:
|
Since the performance of the program has been increased dramatically (thanks to #2), the reporting rate can be further redused to improve a little bit of performance (idealy). Signed-off-by: Kay Lin <i@v2bv.net>
Sorry, I was kind of busy in the last month and did not reply in time. I've just switched to For the second part, I'm actually not sure about how to improve on that, because we need the string anyway. Do you mean to pass a For For regex, I wrote a simple "benchmark", and the result is, when there is a relatively large number of suffixes, regex can be a lot faster than using My "benchmark" code: https://termbin.com/6tnlv And the result on my Ryzen 3 2200G is:
The unit is microseconds. Another advantage of regex is that, it is just easier to write complex search patterns. |
@nwalfield I've implemented custom binary-to-hex conversion with AVX2 and SSE4.1, based on code from Also, I've added a feature to use I've just gathered some perf data and generated this flamegraph: So... it seems there's nothing more that can be easily optimized. And I guess it may be time to consider closing this issue...? |
Sounds cool! I haven't taken a look at the changes, but they make sense. |
An OpenPGP fingerprint is a hash not only over the public key material, but also the key's meta data. Thus, if you vary the key's creation time, you'll get a different fingerprint. Given a certificate, that should be pretty easy to do: extract the primary key (
let pk = cert.primary_key().key().clone()
), get the v4 variant (let mut pk = match pk { Key::V4(k) => k, _ => unreachable!() }
, modify the creation time (k.set_creation_time(k.creation_time() - delta)
), and check the resulting fingerprint (k.fingerprint()
). When you find a match, you'll need to generate new self-signatures. SeeCertBuilder
's code for how to do that. Depending on your requirements, you can vary from a few hours in the past to weeks or months in the past. This should speed up the search by multiple orders of magnitude.The text was updated successfully, but these errors were encountered: