aes: soft hazmat backend #268
As an initial attempt I tried to implement the It's not quite working but it seems close. I'm not quite sure what I'm missing:
(left is actual, right is expected from the test vector) The deltas in bits look like this (broken down by AES word):
Unfortunately I can't directly compare to FIPS 197 step-by-step due to the key schedule being bitsliced and reordered.
@peterdettman I don't suppose you have any insights here? For context the intended use case here is Deoxys, or any other construction built on the raw AES round function.
@tarcieri Maybe I can take a closer look tomorrow, but if it's using the existing key schedule I guess the `sub_bytes_nots` is the problem (remove it). Although a more natural way to write the round function would be `sub_bytes`/`shift_rows_1`/`mix_columns_0`/`add_round_key` (this will also be the fastest). Edit: Oh I see it's a supplied round key. Then keep `sub_bytes_nots` and just use the method order above, I think. (The way you have it currently, the round key hasn't been prepared with an `inv_shift_rows_1` call.)
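For reference, the operation order suggested above (SubBytes, ShiftRows, MixColumns, AddRoundKey) can be sketched in plain byte-oriented Rust. This is only an illustration of the round's semantics with a supplied FIPS 197-style round key: it is not the fixsliced code from this PR, the function names are hypothetical, and the S-box is computed on the fly in a way that is not constant-time.

```rust
/// Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1.
fn gmul(mut a: u8, mut b: u8) -> u8 {
    let mut p = 0u8;
    for _ in 0..8 {
        if b & 1 != 0 {
            p ^= a;
        }
        let hi = a & 0x80;
        a <<= 1;
        if hi != 0 {
            a ^= 0x1b;
        }
        b >>= 1;
    }
    p
}

/// AES S-box: field inverse (x^254) followed by the affine transform.
/// Illustrative only: NOT constant-time.
fn sbox(x: u8) -> u8 {
    let mut inv = 1u8;
    for _ in 0..254 {
        inv = gmul(inv, x);
    }
    inv ^ inv.rotate_left(1) ^ inv.rotate_left(2) ^ inv.rotate_left(3) ^ inv.rotate_left(4) ^ 0x63
}

/// One AES cipher round on a column-major 16-byte state (byte index = 4*col + row),
/// in the order SubBytes -> ShiftRows -> MixColumns -> AddRoundKey.
/// Hypothetical helper, not the crate's API.
fn cipher_round(block: &mut [u8; 16], round_key: &[u8; 16]) {
    // SubBytes
    let mut s = [0u8; 16];
    for i in 0..16 {
        s[i] = sbox(block[i]);
    }

    // ShiftRows: row r is rotated left by r columns.
    let mut t = [0u8; 16];
    for c in 0..4 {
        for r in 0..4 {
            t[4 * c + r] = s[4 * ((c + r) % 4) + r];
        }
    }

    // MixColumns followed by AddRoundKey.
    for c in 0..4 {
        let a = [t[4 * c], t[4 * c + 1], t[4 * c + 2], t[4 * c + 3]];
        let m = [
            gmul(a[0], 2) ^ gmul(a[1], 3) ^ a[2] ^ a[3],
            a[0] ^ gmul(a[1], 2) ^ gmul(a[2], 3) ^ a[3],
            a[0] ^ a[1] ^ gmul(a[2], 2) ^ gmul(a[3], 3),
            gmul(a[0], 3) ^ a[1] ^ a[2] ^ gmul(a[3], 2),
        ];
        for r in 0..4 {
            block[4 * c + r] = m[r] ^ round_key[4 * c + r];
        }
    }
}
```

Feeding it the round 1 input state and round key from the FIPS 197 Appendix B example should reproduce the round 2 input state, which gives a reference that is independent of any bitsliced key-schedule ordering.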
It's not. The goal is to support an AES-NI like API which can work with the standard FIPS 197-style key schedule (edit: or more specifically in the immediate intended use case, Deoxys's key schedule). We have backends working on AES-NI and the ARMv8 Cryptography Extensions. That said...
I swear I tried this before, but if I do that with the inclusion of
...it works! 🎉
5cfe35f to f831fbe
Update: I now have all 4 operations (cipher, equiv inverse cipher, mix columns, and inv mix columns) working on the 64-bit backend. Gonna do the 32-bit one. Something else we should definitely consider, especially for performance, is a |
f831fbe to 38439ac
You mean an API that can do multiple rounds instead of a single one? I think that would help the compiler a lot with auto-vectorization, especially if it's able to unroll the loop and reuse the SIMD registers instead of reloading/saving them at each iteration. Another thing we might need to consider is dynamic AES-NI detection, although I'm not sure of the best way to do it.
Yep. Each invocation to the soft backend is actually computing 4 blocks in parallel on 64-bit archs (2 blocks on 32-bit ones), so it's pretty wasteful to shoehorn a single block API on top of it. I can take a crack at adding a parallel API after I get an initial PoC working.
It's already implemented, and works portably across x86(-64) and ARMv8: https://github.com/RustCrypto/block-ciphers/blob/master/aes/src/hazmat.rs#L47-L55
38439ac to 4cbc4f0
The `hazmat` API provides access to the raw AES cipher round, equivalent inverse cipher round, mix columns, and inverse mix column operations. This commit wires up support in the "soft" backend (or more specifically, both the 32-bit and 64-bit fixsliced backends). It would benefit from a parallel API instead of what's currently provided, however that's left for future work.
4cbc4f0 to 29b1bb6
Not sure how using 4 parallel blocks would be useful for Deoxys, as each block uses a different set of round keys (the block number is used in the key schedule).
In a prospective API for this, you'd pass in an array of round keys and an array of blocks, and the parallel API could apply a particular round key to a particular block. I can open a PR for it and we can discuss.
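A rough sketch of what such a prospective API might look like, assuming one round key per block. Every name here (`Block`, `round_par`, `xor_key`) is hypothetical and not taken from the crate, and the stand-in round only XORs the key in so the example stays short.

```rust
/// A 16-byte AES block (hypothetical alias for this sketch).
type Block = [u8; 16];

/// Hypothetical parallel entry point: applies `round` to every block,
/// pairing block `i` with round key `i`.
fn round_par<const N: usize>(
    blocks: &mut [Block; N],
    round_keys: &[Block; N],
    round: impl Fn(&mut Block, &Block),
) {
    for (block, key) in blocks.iter_mut().zip(round_keys.iter()) {
        round(block, key);
    }
}

/// Stand-in for a real round function: AddRoundKey only.
fn xor_key(block: &mut Block, key: &Block) {
    for (b, k) in block.iter_mut().zip(key) {
        *b ^= *k;
    }
}
```

A caller with per-block key schedules, such as the Deoxys case mentioned above, would fill `round_keys` with a different key for each block. A real backend would presumably process all `N` blocks in a single bitsliced pass rather than looping over them.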
@tarcieri Note that there's no real reason to bitslice the round key here, instead it could be applied after the inv_bitslice of the state (and then only needed for the single output block).
@tarcieri (inv_)mix_columns(_0) could also get non-bitsliced implementations for these, if/when it matters.
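The reason this works is that AddRoundKey is a plain XOR, and XOR commutes with any fixed rearrangement of the state bits. A toy demonstration, assuming a made-up bit permutation in place of the real fixsliced layout (all function names below are hypothetical):

```rust
/// Toy stand-in for a bitslice transform: a fixed permutation of the 128 state bits.
/// This is NOT the fixsliced layout, just some invertible bit shuffle.
fn toy_bitslice(x: u128) -> u128 {
    x.reverse_bits().rotate_left(37)
}

/// Inverse of `toy_bitslice`.
fn toy_inv_bitslice(x: u128) -> u128 {
    x.rotate_right(37).reverse_bits()
}

/// Arbitrary stand-in for the keyless part of a round in the sliced domain.
fn toy_round_without_key(sliced: u128) -> u128 {
    sliced.rotate_left(9) ^ (sliced >> 3)
}

/// Variant A: bitslice the round key and XOR it in the sliced domain.
fn round_key_sliced(state: u128, key: u128) -> u128 {
    let s = toy_round_without_key(toy_bitslice(state));
    toy_inv_bitslice(s ^ toy_bitslice(key))
}

/// Variant B: XOR the plain round key after un-slicing the state.
fn round_key_after(state: u128, key: u128) -> u128 {
    let s = toy_round_without_key(toy_bitslice(state));
    toy_inv_bitslice(s) ^ key
}
```

Both variants return the same value for any state and key, so the second one saves the work of bitslicing the key.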
I'm just about to open a follow-up which adds parallelism and does a bit of cleanup including avoiding bitslicing the round keys. Edit: opened #269. Not terribly worried about putting too much effort into this API for now. I'd just like to get it PoC'd and working. In the future, however, it might be interesting to try to use an API like this as the core of the overall implementation, which would get rid of a lot of redundant boilerplate that presently exists in the |
The `hazmat` API provides access to the raw AES cipher round, equivalent inverse cipher round, mix columns, and inverse mix column operations.
This PR wires up support for these operations in the "soft" backend (or more specifically, both the 32-bit and 64-bit fixsliced backends).
It would benefit from a parallel API instead of what's currently provided, however that's left for future work.