initial rc5 impl #346

antonio-dropulic · 2022-11-23T19:52:20Z

Implemented RC5 based on the revised paper. Got test vector at: https://www.ietf.org/archive/id/draft-krovetz-rc6-rc5-vectors-00.txt

Core impl features a generic impl using typenum and generic array. Const generic implementation
without generic expressions doesn't seem appropriate.

I was not sure what api would work best with the cipher crate so i decided to propagate the InOut ptr.
Put all the core fn behind a trait to avoid rewriting generic bounds.

For now only 32, 20, 20 parametrization made public.

rc5/README.md

lumag · 2022-11-24T20:21:58Z

rc5/src/block_cipher.rs

+use cipher::{impl_simple_block_encdec, AlgorithmName, KeyInit};
+use cipher::{inout::InOut, Block, BlockCipher, KeySizeUser};
+
+pub struct RC5_32_20_16 {


The choice of 20 rounds is not obvious. Both OpenSSL and RFC 2040 default to 12 rounds.

from the: https://www.ietf.org/archive/id/draft-krovetz-rc6-rc5-vectors-00.txt

" The original RC5 publication suggested using 12 rounds when w=32 and
16 rounds when w=64. In response to cryptanalysis, the authors
changed the recommendation when w=32 to 16 rounds [RC5sec]."

I'm not sure what should be the defaults. I have no problem changing it to 12. But where should I find the test vectors?

I'm not sure how to proceed. Should i expose RC5 32, 12, 16 or 32, 16, 16? I implemented RC5 32, 20, 16 since that is the only w=32 test vector i found already generated in the krovetz draft. I suppose it is fine to use the code given there to generate others.

A possible solution would be to implement cipher api for all posible w/r/b. Then we expose some recomended / common configurations through type aliases. The issue I see is covering all the possible configurations with tests. Would it could be okay to test only the recomended configurations?

Exposing them as generic parameters makes sense to me

I'll tweak the code to do that. I don't see how to escape repeating the bounds for every impl. The code wont be pretty :D And I still have to add bounds to limit the range of rounds and key bytes.

I've added the generic implementation, but the compiler seems to strugle with optimizing it. I notice a 3x penalty in performance compared to the previous version.

name control ns/iter generic ns/iter diff ns/iter diff % speedup rc5_32_12_16_decrypt_block 14,611 (560 MB/s) 63,711 (128 MB/s) 49,100 336.05% x 0.23 rc5_32_12_16_decrypt_blocks 14,659 (558 MB/s) 58,842 (139 MB/s) 44,183 301.41% x 0.25 rc5_32_12_16_encrypt_block 11,242 (728 MB/s) 51,916 (157 MB/s) 40,674 361.80% x 0.22 rc5_32_12_16_encrypt_blocks 11,467 (714 MB/s) 51,360 (159 MB/s) 39,893 347.89% x 0.22 rc5_32_16_16_decrypt_block 19,621 (417 MB/s) 77,881 (105 MB/s) 58,260 296.93% x 0.25 rc5_32_16_16_decrypt_blocks 19,800 (413 MB/s) 76,284 (107 MB/s) 56,484 285.27% x 0.26 rc5_32_16_16_encrypt_block 17,708 (462 MB/s) 81,863 (100 MB/s) 64,155 362.29% x 0.22 rc5_32_16_16_encrypt_blocks 17,704 (462 MB/s) 75,664 (108 MB/s) 57,960 327.38% x 0.23

Any idea why this might be?

I've used cargo-show-asm to generate the asm of the bench functions in order to provide some clarity on the performance regression.

I've replaced the bench with something easier to generate asm from and looked at the asm from fn encrypt_

#![feature(test)] extern crate test; use cipher::generic_array::GenericArray; use cipher::typenum::U8; use cipher::BlockEncryptMut; use cipher::{block_decryptor_bench, block_encryptor_bench}; use rc5::{RC5_32_12_16, RC5_32_16_16}; #[bench] pub fn encrypt(bh: &mut test::Bencher) { use cipher::KeyInit; let key = test::black_box(Default::default()); let mut cipher = RC5_32_12_16::new(&key); let mut block: GenericArray<u8, U8> = test::black_box(Default::default()); bh.iter(|| { encrypt_(&mut cipher, &mut block); }); bh.bytes = (block.len()) as u64; } #[inline(never)] pub fn encrypt_(cipher: &mut RC5_32_12_16, block: &mut GenericArray<u8, U8>) { cipher.encrypt_block_mut(block); }

The asm for the non generic code just calls the fn encrypt_block. This function contains the call to
the generic version of fn encrypt, but with fixed generic parameters. The asm for the less performant generic version inlines the entire encrypt algorithm.

I've managed to reproduce the same performance regresion in the non generic code by adding #[inline(always)] to the fn encrypt_block.

It is unclear for me why this happens or how to fix it. Should this pr be turned into a draft until I find a solution or should the non generic version be submitted in the meantime?

Upon closer inspection i found that the generic version would not inline calls to word functions, while the non generic one would. After adding inline annotations to Word impl the performance is once again comparable.

rc5/src/block_cipher.rs

after 5b84ad4: generic rc5 pefromance regression was noticed. after analizing the generated asm I noticed that while the generic top level generic impl does get inlined the actual algorithm impl differs in the fact that it does not inline Word functions. After adding inline annotations performance is once again compareable.

axos88 · 2023-02-10T18:31:14Z

Can we get this merged and released? Is it usable @antonio-dropulic? I'd very much like to use it, and would be happy to do from your branch until it gets merged.

antonio-dropulic added 2 commits November 23, 2022 20:43

initial rc5 impl

92346d7

fix no std

1b86811

lumag reviewed Nov 24, 2022

View reviewed changes

antonio-dropulic added 8 commits November 25, 2022 17:21

fix typos

530a905

add macro for cipher impl

6537488

add common u32 impls

02579f4

fix macro imports

2c9c474

fix benchmarks

91d6507

add range bounds on rounds and key

294ad4d

generic rc5

5b84ad4

antonio-dropulic force-pushed the rc5 branch from 05893bb to 649f26a Compare December 5, 2022 20:48

antonio-dropulic added 3 commits December 5, 2022 22:18

clean up code organization

cd7cd32

seal Word

345590b

add u16 and u64 word impls

65edb3d

antonio-dropulic requested a review from tarcieri December 15, 2022 23:01

tarcieri approved these changes Feb 10, 2023

View reviewed changes

tarcieri merged commit cbcebe1 into RustCrypto:master Feb 10, 2023

newpavlov mentioned this pull request Jun 23, 2023

Missing block ciphers #1

Open

19 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

initial rc5 impl #346

initial rc5 impl #346

antonio-dropulic commented Nov 23, 2022

lumag Nov 24, 2022

antonio-dropulic Nov 25, 2022 •

edited

antonio-dropulic Nov 30, 2022 •

edited

tarcieri Dec 1, 2022

antonio-dropulic Dec 2, 2022

antonio-dropulic Dec 2, 2022

antonio-dropulic Dec 5, 2022

antonio-dropulic Dec 5, 2022

axos88 commented Feb 10, 2023

initial rc5 impl #346

initial rc5 impl #346

Conversation

antonio-dropulic commented Nov 23, 2022

lumag Nov 24, 2022

Choose a reason for hiding this comment

antonio-dropulic Nov 25, 2022 • edited

Choose a reason for hiding this comment

antonio-dropulic Nov 30, 2022 • edited

Choose a reason for hiding this comment

tarcieri Dec 1, 2022

Choose a reason for hiding this comment

antonio-dropulic Dec 2, 2022

Choose a reason for hiding this comment

antonio-dropulic Dec 2, 2022

Choose a reason for hiding this comment

antonio-dropulic Dec 5, 2022

Choose a reason for hiding this comment

antonio-dropulic Dec 5, 2022

Choose a reason for hiding this comment

axos88 commented Feb 10, 2023

antonio-dropulic Nov 25, 2022 •

edited

antonio-dropulic Nov 30, 2022 •

edited