Commit

review 1 feedback
eschorn1 committed Apr 13, 2024
1 parent 1d9ff8a commit 39bf4f3
Showing 23 changed files with 317 additions and 348 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,10 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+## 0.1.5 (2024-04-14)
+
+- Significant performance optimizations and internal revisions based upon review feedback
+
 ## 0.1.4 (2024-04-01)

 - Constant-time fixes and measurement
6 changes: 3 additions & 3 deletions Cargo.toml
@@ -2,15 +2,15 @@ workspace = { members = ['ffi'], exclude = ["ct_cm4", "dudect", "fuzz", "wasm"]

 [package]
 name = "fips203"
-version = "0.1.4"
+version = "0.1.5"
 edition = "2021"
 license = "MIT OR Apache-2.0"
 description = "FIPS 203 (draft): Module-Lattice-Based Key-Encapsulation Mechanism"
 authors = ["Eric Schorn <eschorn@integritychain.com>"]
 documentation = "https://docs.rs/fips203"
 categories = ["cryptography", "no-std"]
 repository = "https://github.com/integritychain/fips203"
-keywords = ["FIPS", "FIPS203", "lattice", "key", "encapsulation"]
+keywords = ["FIPS", "FIPS203", "lattice", "kyber", "encapsulation"]
 rust-version = "1.70" # Requires several marginally outdated dependencies


@@ -26,7 +26,7 @@ ml-kem-1024 = []
 zeroize = { version = "1.6.0", default-features = false, features = ["zeroize_derive"] }
 rand_core = { version = "0.6.4", default-features = false }
 sha3 = { version = "0.10.2", default-features = false }
-subtle = { version = "2.5.0", default-features = false }
+subtle = { version = "2.5.0", default-features = false, features = ['const-generics'] }

 [dev-dependencies] # Some are marginally outdated to retain MSRV 1.70
 rand = "0.8.5"
17 changes: 9 additions & 8 deletions README.md
@@ -8,14 +8,15 @@

 [FIPS 203] (Initial Public Draft) Module-Lattice-Based Key-Encapsulation Mechanism Standard written in pure Rust for
 server, desktop, browser and embedded applications. The source repository includes examples demonstrating benchmarking,
-an embedded target, constant-time statistics, fuzzing, WASM execution, C FFI and Python bindings.
+an embedded target, constant-time statistical measurements, fuzzing, WASM execution, C FFI and Python bindings.

-This crate implements the FIPS 203 **draft** standard in pure Rust with minimal and mainstream dependencies. All three
-security parameter sets are fully functional. The implementation operates in constant-time, does not require the
-standard library, e.g. `#[no_std]`, has no heap allocations, e.g. no `alloc` needed, and optionally exposes the `RNG`
-so it is suitable for the full range of applications down to the bare-metal. The API is stabilized and the code is
-heavily biased towards safety and correctness; further performance optimizations will be implemented as the standard
-matures. This crate will quickly follow any changes to FIPS 203 as they become available.
+This crate implements the FIPS 203 **draft** standard in pure Rust with minimal and mainstream dependencies **and
+without any unsafe code**. All three security parameter sets are fully functional. The implementation operates in
+constant-time (outside of rho), does not require the standard library, e.g. `#[no_std]`, has no heap allocations,
+e.g. no `alloc` needed, and optionally exposes the `RNG` so it is suitable for the full range of applications down to
+the bare-metal. The API is stabilized and the code is heavily biased towards safety and correctness; further
+performance optimizations will be implemented as the standard matures. This crate will quickly follow any changes to
+FIPS 203 as they become available.

 See <https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.203.ipd.pdf> for a full description of the target functionality.

@@ -57,7 +58,7 @@ The Rust [Documentation][docs-link] lives under each **Module** corresponding to

 * This crate is fully functional and corresponds to the first initial public draft of FIPS 203.
 * Constant-time operation targets the source-code level only, with confirmation via the embedded target and
-the `dudect` dynamic tests. While the API has a suffix of `_vt`, this will change to `_ct` in version 0.2.0.
+the `dudect` dynamic tests. While the API uses a suffix of `_vt`, this will be changed in version 0.2.0.
 * Note that FIPS 203 places specific requirements on randomness per section 3.3, hence the exposed `RNG`.
 * Requires Rust **1.70** or higher. The minimum supported Rust version may be changed in the future, but
 it will be done with a minor version bump (when the major version is larger than 0).
28 changes: 14 additions & 14 deletions benches/README.md
@@ -1,25 +1,25 @@
 Figure-of-merit only; no particular care has been taken to disable turbo-boost etc.
-Note that constant-time restrictions significantly impact performance.
+Note that constant-time restrictions on the implementation do impact performance.

-Performance optimizations will quickly follow the next update to FIPS 203.
-Near-obvious uplift can be had with more careful modular multiplication & addition,
-then fewer reductions. Also, 'u16' arithmetic has a performance penalty.
+Additional performance optimizations will follow the next update to FIPS 203.
+Near-obvious uplift can be had with more careful modular multiplication & addition
+using fewer reductions. Also, 'u16' arithmetic has a performance penalty.

 ~~~
-April 1, 2024
+April 13, 2024
 Intel® Core™ i7-7700K CPU @ 4.20GHz × 8
 $ RUSTFLAGS="-C target-cpu=native" cargo bench
-ml_kem_512 KeyGen time: [52.346 µs 52.370 µs 52.386 µs]
-ml_kem_768 KeyGen time: [90.119 µs 90.518 µs 91.110 µs]
-ml_kem_1024 KeyGen time: [140.77 µs 140.96 µs 141.13 µs]
+ml_kem_512 KeyGen time: [28.157 µs 28.164 µs 28.172 µs]
+ml_kem_768 KeyGen time: [47.946 µs 47.963 µs 47.985 µs]
+ml_kem_1024 KeyGen time: [74.143 µs 74.152 µs 74.162 µs]
-ml_kem_512 Encaps time: [66.662 µs 66.938 µs 67.355 µs]
-ml_kem_768 Encaps time: [107.48 µs 107.54 µs 107.62 µs]
-ml_kem_1024 Encaps time: [156.71 µs 156.93 µs 157.19 µs]
+ml_kem_512 Encaps time: [28.580 µs 28.584 µs 28.588 µs]
+ml_kem_768 Encaps time: [45.487 µs 45.512 µs 45.542 µs]
+ml_kem_1024 Encaps time: [67.062 µs 67.144 µs 67.252 µs]
-ml_kem_512 Decaps time: [94.679 µs 94.720 µs 94.749 µs]
-ml_kem_768 Decaps time: [150.44 µs 151.12 µs 152.29 µs]
-ml_kem_1024 Decaps time: [213.44 µs 214.01 µs 214.65 µs]
+ml_kem_512 Decaps time: [40.099 µs 40.111 µs 40.123 µs]
+ml_kem_768 Decaps time: [61.509 µs 61.532 µs 61.558 µs]
+ml_kem_1024 Decaps time: [91.470 µs 91.606 µs 91.744 µs]
 ~~~
4 changes: 2 additions & 2 deletions benches/benchmark.rs
@@ -16,12 +16,12 @@ impl RngCore for TestRng {

 fn fill_bytes(&mut self, out: &mut [u8]) {
 out.iter_mut().for_each(|b| *b = 0);
-out[0..4].copy_from_slice(&self.value.to_be_bytes())
+out[0..4].copy_from_slice(&self.value.to_be_bytes());
+self.value = self.value.wrapping_add(1);
 }

 fn try_fill_bytes(&mut self, out: &mut [u8]) -> Result<(), rand_core::Error> {
 self.fill_bytes(out);
-self.value = self.value.wrapping_add(1);
 Ok(())
 }
 }
9 changes: 5 additions & 4 deletions ct_cm4/Cargo.toml
@@ -1,11 +1,12 @@
 [package]
 name = "fips203-ct_cm4"
-version = "0.1.4"
+version = "0.1.5"
 license = "MIT OR Apache-2.0"
 description = "Cortex-M4 testbench for FIPS 203 (draft) ML-KEM"
 authors = ["Eric Schorn <eschorn@integritychain.com>"]
 publish = false
 edition = "2021"
+rust-version = "1.70"


 [dependencies]
@@ -16,8 +17,7 @@ panic-rtt-target = { version = "0.1.2", features = ["cortex-m"] }
 microbit-v2 = "0.13.0"
 rtt-target = { version = "0.5.0" }
 rand_core = { version = "0.6.4", default-features = false }
-hex-literal = "0.4.1"
-rand_chacha = { version = "0.3.1", default-features = false }
+subtle = { version = "2.5.0", default-features = false }


 [profile.dev]
@@ -29,4 +29,5 @@ opt-level = 3
 codegen-units = 1


-# cargo update -p fixed@1.26.0 --precise 1.23.1
+# If cargo complains about 'fixed' on MSRV 1.70, use version 1.23.1
+# cargo update -p fixed@1.27.0 --precise 1.23.1
2 changes: 1 addition & 1 deletion ct_cm4/README.md
@@ -2,7 +2,7 @@ An example for the Microbit v2 Board -- <https://docs.rust-embedded.org/discover

 This example demonstrates the full loop of keygen, encaps, decaps and then shared
 secret equivalency. Cycle counts are measured, displayed, and operation confirmed
-to be constant-time. See the link above for tooling setup.
+to be constant-time (outside of rho). See the link above for tooling setup.

 ~~~
 $ cd ct_cm4 # <here>
33 changes: 23 additions & 10 deletions ct_cm4/src/main.rs
@@ -12,9 +12,13 @@ use microbit::{
 use panic_rtt_target as _;
 use rand_core::{CryptoRng, RngCore};
 use rtt_target::{rprintln, rtt_init_print};
+use subtle::{ConditionallySelectable, ConstantTimeEq};

-// Simplistic RNG to regurgitate incremented values when 'asked'
+
+// Simplistic RNG to regurgitate incremented values when 'asked' except rho every i mod 4 == 1
 #[derive(Clone)]
 struct TestRng {
+rho: u32,
 value: u32,
 }

@@ -25,12 +29,14 @@ impl RngCore for TestRng {

 fn fill_bytes(&mut self, out: &mut [u8]) {
 out.iter_mut().for_each(|b| *b = 0);
-out[0..4].copy_from_slice(&self.value.to_be_bytes())
+let supply_rho = (self.value & 0x03).ct_eq(&1);
+let target = u32::conditional_select(&self.value, &self.rho, supply_rho);
+out[0..4].copy_from_slice(&target.to_be_bytes());
+self.value = self.value.wrapping_add(1);
 }

 fn try_fill_bytes(&mut self, out: &mut [u8]) -> Result<(), rand_core::Error> {
 self.fill_bytes(out);
-self.value = self.value.wrapping_add(1);
 Ok(())
 }
 }
@@ -46,7 +52,8 @@ fn main() -> ! {
 board.display_pins.col1.set_low().unwrap();
 rtt_init_print!();

-let mut rng = TestRng { value: 1 };
+let mut rng = TestRng { rho: 999, value: 4 }; // arbitrary choice (value must be mult of 4)
+let mut spare_draw = [0u8; 32];
 let mut expected_cycles = 0;
 let mut i = 0u32;

@@ -71,16 +78,22 @@ fn main() -> ! {
 asm::isb();
 let finish = DWT::cycle_count();
 asm::isb();
+let _ = rng.try_fill_bytes(&mut spare_draw).unwrap(); // ease our lives; multiple of 4

 let count = finish - start;
-if i == 5 {
+
+if (i % 1000) == 0 {
+rng.rho += 1
+}; // each rho should have a fixed cycle count
+if (i % 1000) == 2 {
 expected_cycles = count
+}; // capture the cycle count
+if ((i % 1000) > 2) & (count != expected_cycles) {
+// make sure it is constant
+panic!("Non constant-time operation!! iteration:{} cycles:{}", i, count)
 };
-if (i > 5) & (count != expected_cycles) {
-panic!("Non constant-time operation!! {}", i)
+if i % 100 == 0 {
+rprintln!("Iteration {} cycle count: {}", i, count)
 };
-if i % 10 == 0 {
-rprintln!("Iteration {} cycle count: {}", i, count);
-}
 }
 }
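The acceptance logic in the new loop can be modeled on a host machine. The sketch below is a hypothetical stand-in: `CycleCheck` and `observe` are not names from the commit, and it assumes `i` advances by one per measured iteration, which the visible hunks do not show. Each block of 1000 iterations shares one `rho`; the count seen at phase 2 becomes the baseline, and every later count in that block must match it.

```rust
/// Host-side model of the per-rho cycle check (hypothetical names).
struct CycleCheck {
    expected_cycles: u32,
}

impl CycleCheck {
    fn new() -> Self {
        CycleCheck { expected_cycles: 0 }
    }

    /// Feed one measurement; returns an error where the firmware would panic.
    fn observe(&mut self, i: u32, count: u32) -> Result<(), String> {
        let phase = i % 1000;
        if phase == 2 {
            // Capture the baseline for the current rho.
            self.expected_cycles = count;
        }
        if phase > 2 && count != self.expected_cycles {
            return Err(format!(
                "Non constant-time operation!! iteration:{} cycles:{}",
                i, count
            ));
        }
        Ok(())
    }
}

fn main() {
    let mut check = CycleCheck::new();
    // Phases 0 and 1 are never compared; phase 2 resets the baseline.
    for (i, count) in [(2, 500), (3, 500), (1000, 777), (1002, 600), (1003, 600)] {
        assert!(check.observe(i, count).is_ok());
    }
    // A deviation after the baseline is reported.
    assert!(check.observe(1004, 601).is_err());
    println!("cycle check model behaves as described");
}
```

Because phases 0 and 1 are skipped, the first iteration that runs with a fresh `rho` is not held to the previous baseline.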
4 changes: 3 additions & 1 deletion dudect/Cargo.toml
@@ -1,16 +1,18 @@
 [package]
 name = "fips203-dudect"
-version = "0.1.4"
+version = "0.1.5"
 authors = ["Eric Schorn <eschorn@integritychain.com>"]
 publish = false
 edition = "2021"
 license = "MIT OR Apache-2.0"
+rust-version = "1.70"


 [dependencies]
 fips203 = { path = "..", default-features = false, features = ["ml-kem-512"] }
 dudect-bencher = "0.6"
 rand_core = { version = "0.6.4", default-features = false }
+subtle = { version = "2.5.0", default-features = false, features = ['const-generics'] }


 [profile.bench]
27 changes: 16 additions & 11 deletions dudect/README.md
@@ -4,19 +4,24 @@ not entirely definitive. A work in progress.
 See <https://docs.rs/dudect-bencher/latest/dudect_bencher/>

 ~~~
+April 13, 2024
+Intel® Core™ i7-7700K CPU @ 4.20GHz × 8
 $ cd dudect # this directory
 $ RUSTFLAGS="-C target-cpu=native" cargo run --release -- --continuous full_flow
 running 1 benchmark continuously
-bench full_flow seeded with 0xfe84d163d7c04e60
-bench full_flow ... : n == +0.199M, max t = -1.44267, max tau = -0.00323, (5/tau)^2 = 2396165
-bench full_flow ... : n == +0.329M, max t = -1.36960, max tau = -0.00239, (5/tau)^2 = 4390143
-bench full_flow ... : n == +0.045M, max t = +1.34098, max tau = +0.00631, (5/tau)^2 = 628087
-bench full_flow ... : n == +0.059M, max t = +1.42716, max tau = +0.00587, (5/tau)^2 = 725853
-bench full_flow ... : n == +0.073M, max t = +1.32158, max tau = +0.00488, (5/tau)^2 = 1051265
-bench full_flow ... : n == +0.094M, max t = +1.27343, max tau = +0.00416, (5/tau)^2 = 1447412
-bench full_flow ... : n == +0.102M, max t = +1.37215, max tau = +0.00430, (5/tau)^2 = 1349570
-bench full_flow ... : n == +1.558M, max t = -1.84884, max tau = -0.00148, (5/tau)^2 = 11395685
-bench full_flow ... : n == +1.744M, max t = -2.01548, max tau = -0.00153, (5/tau)^2 = 10731802
+bench full_flow seeded with 0xea4abed34ed3db26
+bench full_flow ... : n == +0.026M, max t = +1.01335, max tau = +0.00630, (5/tau)^2 = 629412
+bench full_flow ... : n == +0.337M, max t = -1.75137, max tau = -0.00302, (5/tau)^2 = 2747082
+bench full_flow ... : n == +0.497M, max t = -1.51764, max tau = -0.00215, (5/tau)^2 = 5396932
+bench full_flow ... : n == +0.740M, max t = -1.61671, max tau = -0.00188, (5/tau)^2 = 7074543
+bench full_flow ... : n == +0.786M, max t = -1.32078, max tau = -0.00149, (5/tau)^2 = 11271287
+bench full_flow ... : n == +1.064M, max t = -1.33199, max tau = -0.00129, (5/tau)^2 = 14999502
+bench full_flow ... : n == +1.193M, max t = -1.67507, max tau = -0.00153, (5/tau)^2 = 10632445
+bench full_flow ... : n == +1.383M, max t = -1.77745, max tau = -0.00151, (5/tau)^2 = 10940561
+bench full_flow ... : n == +1.437M, max t = -1.46110, max tau = -0.00122, (5/tau)^2 = 16823530
+bench full_flow ... : n == +1.746M, max t = -1.52752, max tau = -0.00116, (5/tau)^2 = 18704795
+bench full_flow ... : n == +1.937M, max t = -1.54864, max tau = -0.00111, (5/tau)^2 = 20194492
+bench full_flow ... : n == +2.125M, max t = -1.50879, max tau = -0.00104, (5/tau)^2 = 23332048
 ...
 ~~~
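The columns in this log are related by simple arithmetic. As the `dudect-bencher` documentation describes it, `max tau` is `max t` scaled by `1/sqrt(n)`, and `(5/tau)^2` estimates how many measurements would be needed to push `|t|` past 5. The sketch below reproduces one of the new log lines under that reading. The function names are illustrative, and because the log prints `n` to only three digits the results agree with the log only approximately.

```rust
/// Scale a t statistic by the sample size: tau = t / sqrt(n).
fn tau(max_t: f64, n: f64) -> f64 {
    max_t / n.sqrt()
}

/// Estimated number of measurements needed to reach |t| = 5.
fn samples_for_t5(tau: f64) -> f64 {
    (5.0 / tau).powi(2)
}

fn main() {
    // Taken from the log line: n == +0.337M, max t = -1.75137
    let n = 0.337e6;
    let max_t = -1.75137;
    let tau_est = tau(max_t, n);
    println!("tau       = {:+.5}", tau_est);
    println!("(5/tau)^2 = {:.0}", samples_for_t5(tau_est));
}
```

A `|t|` that stays well below 5 while `(5/tau)^2` keeps growing is consistent with the two classes being indistinguishable at the current sample size. It is not a proof of constant-time behavior.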
21 changes: 14 additions & 7 deletions dudect/src/main.rs
@@ -2,10 +2,13 @@ use dudect_bencher::{ctbench_main, BenchRng, Class, CtRunner};
 use fips203::ml_kem_512; // Could also be ml_kem_768 or ml_kem_1024.
 use fips203::traits::{Decaps, Encaps, KeyGen};
 use rand_core::{CryptoRng, RngCore};
+use subtle::{ConditionallySelectable, ConstantTimeEq};

-// Simplistic RNG to regurgitate incremented values when 'asked'
+
+// Simplistic RNG to regurgitate incremented values when 'asked' except rho every 4th time
 #[derive(Clone)]
 struct TestRng {
+rho: u32,
 value: u32,
 }

@@ -16,12 +19,14 @@ impl RngCore for TestRng {

 fn fill_bytes(&mut self, out: &mut [u8]) {
 out.iter_mut().for_each(|b| *b = 0);
-out[0..4].copy_from_slice(&self.value.to_be_bytes())
+let supply_rho = (self.value & 0x03u32).ct_eq(&1_u32);
+let target = u32::conditional_select(&self.value, &self.rho, supply_rho);
+out[0..4].copy_from_slice(&target.to_be_bytes());
+self.value = self.value.wrapping_add(1);
 }

 fn try_fill_bytes(&mut self, out: &mut [u8]) -> Result<(), rand_core::Error> {
 self.fill_bytes(out);
-self.value = self.value.wrapping_add(1);
 Ok(())
 }
 }
@@ -33,8 +38,8 @@ fn full_flow(runner: &mut CtRunner, mut _rng: &mut BenchRng) {
 const ITERATIONS_INNER: usize = 5;
 const ITERATIONS_OUTER: usize = 200_000;

-let rng_left = TestRng { value: 111 };
-let rng_right = TestRng { value: 222 };
+let rng_left = TestRng { rho: 9, value: 111 * 4 };
+let rng_right = TestRng { rho: 9, value: 222 * 4 };

 let mut classes = [Class::Right; ITERATIONS_OUTER];
 let mut rng_refs = [&rng_right; ITERATIONS_OUTER];
@@ -48,11 +53,13 @@ fn full_flow(runner: &mut CtRunner, mut _rng: &mut BenchRng) {
 for (class, &rng_r) in classes.into_iter().zip(rng_refs.iter()) {
 runner.run_one(class, || {
 let mut rng = rng_r.clone();
+let mut spare_draw = [0u8; 32];
 for _ in 0..ITERATIONS_INNER {
-let (ek, dk) = ml_kem_512::KG::try_keygen_with_rng_vt(&mut rng).unwrap();
-let (ssk1, ct) = ek.try_encaps_with_rng_vt(&mut rng).unwrap();
+let (ek, dk) = ml_kem_512::KG::try_keygen_with_rng_vt(&mut rng).unwrap(); // uses 2 rng
+let (ssk1, ct) = ek.try_encaps_with_rng_vt(&mut rng).unwrap(); // uses 1 rng
 let ssk2 = dk.try_decaps_vt(&ct).unwrap();
 assert_eq!(ssk1, ssk2);
+let _ = rng.try_fill_bytes(&mut spare_draw).unwrap(); // ease our lives; multiple of 4
 }
 })
 }
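The draw schedule implied by the comments above (two draws for keygen, one for encaps, one spare) can be checked with a dependency-free model. This is a sketch under stated assumptions: `SketchRng` and `draw_sequence` are hypothetical names, and a plain bit mask stands in for `subtle`'s `ct_eq` and `conditional_select`, so it illustrates the selection logic only and carries no constant-time guarantee.

```rust
/// Std-only stand-in for the diff's `TestRng` (hypothetical name).
#[derive(Clone)]
struct SketchRng {
    rho: u32,
    value: u32,
}

impl SketchRng {
    fn fill_bytes(&mut self, out: &mut [u8]) {
        out.iter_mut().for_each(|b| *b = 0);
        // All-ones mask when value % 4 == 1, all-zeros otherwise.
        let mask = (((self.value & 0x03) == 1) as u32).wrapping_neg();
        let target = (self.rho & mask) | (self.value & !mask);
        out[0..4].copy_from_slice(&target.to_be_bytes());
        self.value = self.value.wrapping_add(1);
    }
}

/// Return the leading u32 of each of `draws` consecutive 32-byte draws.
fn draw_sequence(rho: u32, start: u32, draws: usize) -> Vec<u32> {
    let mut rng = SketchRng { rho, value: start };
    (0..draws)
        .map(|_| {
            let mut buf = [0u8; 32];
            rng.fill_bytes(&mut buf);
            u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]])
        })
        .collect()
}

fn main() {
    // Same seeds as the diff: shared rho, class-specific counters.
    println!("left:  {:?}", draw_sequence(9, 111 * 4, 4)); // [444, 9, 446, 447]
    println!("right: {:?}", draw_sequence(9, 222 * 4, 4)); // [888, 9, 890, 891]
}
```

With a starting counter that is a multiple of 4 and four draws per iteration, the second draw of every iteration returns the shared `rho` value, while the other draws differ between the two classes.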
2 changes: 1 addition & 1 deletion ffi/Cargo.toml
@@ -1,6 +1,6 @@
 [package]
 name = "fips203-ffi"
-version = "0.1.4"
+version = "0.1.5"
 edition = "2021"
 license = "MIT OR Apache-2.0"
 description = "C shared library exposing FIPS 203 (draft): Module-Lattice-Based Key-Encapsulation Mechanism"
5 changes: 4 additions & 1 deletion fuzz/Cargo.toml
@@ -1,12 +1,15 @@
 [package]
 name = "fips203-fuzz"
-version = "0.1.4"
+version = "0.1.5"
 publish = false
 edition = "2021"
+rust-version = "1.70"
+

 [package.metadata]
 cargo-fuzz = true

+
 [dependencies]
 libfuzzer-sys = "0.4"
 rand_core = "0.6.4"
2 changes: 1 addition & 1 deletion fuzz/README.md
@@ -19,7 +19,7 @@ $ cargo fuzz run fuzz_all -j 4
 Coverage status of ml_kem_512 is robust, see:

 ~~~
-#3543: cov: 6550 ft: 4187 corp: 31 exec/s 5 oom/timeout/crash: 0/0/0 time: 170s job: 33 dft_time: 0
+#129515: cov: 6282 ft: 4708 corp: 80 exec/s 9 oom/timeout/crash: 0/0/0 time: 3628s job: 167 dft_time: 0
 # Warning: the following tools are tricky to install/configure
 $ cargo install cargo-cov
2 changes: 1 addition & 1 deletion src/byte_fns.rs
@@ -53,7 +53,7 @@ pub(crate) fn byte_encode(d: u32, integers_f: &[Z; 256], bytes_b: &mut [u8]) {
 while bit_index > 7 {
 //
 // Drop the byte
-bytes_b[byte_index] = temp.to_le_bytes()[0];
+bytes_b[byte_index] = temp.to_le_bytes()[0]; // avoids u8 cast

 // Update the indices
 temp >>= 8;
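The visible fragment is the flush step of a bit-packing loop. The rest of `byte_encode` is not shown in this hunk, so the following is a reconstruction of the general technique and not the crate's code. It assumes plain `u16` inputs in place of the crate's `Z` type, a `u64` accumulator, and a hypothetical function name `pack_bits`.

```rust
/// Pack `d`-bit integers into bytes, least-significant bits first.
/// Hypothetical helper; the crate's `byte_encode` may differ in detail.
fn pack_bits(d: u32, integers: &[u16], bytes: &mut [u8]) {
    assert!((1..=12).contains(&d));
    assert_eq!(bytes.len() * 8, integers.len() * d as usize);

    let mut temp: u64 = 0; // bit accumulator
    let mut bit_index: u32 = 0; // number of valid bits held in `temp`
    let mut byte_index = 0;

    for &int in integers {
        // Append the low `d` bits of this integer above the bits already held.
        let masked = u64::from(int) & ((1u64 << d) - 1);
        temp |= masked << bit_index;
        bit_index += d;

        // Flush whole bytes, lowest first.
        while bit_index > 7 {
            bytes[byte_index] = temp.to_le_bytes()[0]; // low byte, no `as u8` cast
            temp >>= 8;
            bit_index -= 8;
            byte_index += 1;
        }
    }
}

fn main() {
    let mut out = [0u8; 3];
    pack_bits(12, &[0xABC, 0x123], &mut out);
    println!("{:02X?}", out); // [BC, 3A, 12]
}
```

Taking element 0 of `to_le_bytes()` yields the same low byte as a truncating cast, while making the truncation explicit and keeping lints against lossy casts quiet.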