Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Migrate to memsec #224

Merged
merged 7 commits into from
Jan 3, 2024
Merged

feat: Migrate to memsec #224

merged 7 commits into from
Jan 3, 2024

Conversation

koraa
Copy link
Member

@koraa koraa commented Dec 27, 2023

No description provided.

@koraa koraa requested a review from wucke13 December 27, 2023 15:59
@koraa
Copy link
Member Author

koraa commented Dec 27, 2023

Looks like the fuzzer might actually have found something? https://github.com/rosenpass/rosenpass/actions/runs/7339888800/job/19984940094?pr=224

@thaodt
Copy link

thaodt commented Dec 28, 2023

Looks like the fuzzer might actually have found something? https://github.com/rosenpass/rosenpass/actions/runs/7339888800/job/19984940094?pr=224

interesting... could you please share your experiment with this fuzz test? what will you do in this case? how do you debug and fix it? when i look at the error, it only displayed the fuzz fn fuzz_handle_msg, I expected it could show us which line number caused crash..

@wucke13
Copy link

wucke13 commented Dec 28, 2023

I can reproduce the CI failure with this:

cargo fuzz run fuzz_mceliece_encaps -- -max_total_time=60
# AND
cargo fuzz run fuzz_handle_msg -- -max_total_time=60
$ cargo fuzz run fuzz_mceliece_encaps -- -max_total_time=60
    Finished release [optimized] target(s) in 0.13s
    Finished release [optimized] target(s) in 0.12s
     Running `target/x86_64-unknown-linux-gnu/release/fuzz_mceliece_encaps -artifact_prefix=/home/wucke13/desktop/rosenpass/fuzz/artifacts/fuzz_mceliece_encaps/ -max_total_time=60 /home/wucke13/desktop/rosenpass/fuzz/corpus/fuzz_mceliece_encaps`
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 1869984084
INFO: Loaded 1 modules   (6943 inline 8-bit counters): 6943 [0x5555558dc3d0, 0x5555558ddeef), 
INFO: Loaded 1 PC tables (6943 PCs): 6943 [0x5555558ddef0,0x5555558f90e0), 
INFO:        0 files found in /home/wucke13/desktop/rosenpass/fuzz/corpus/fuzz_mceliece_encaps
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
AddressSanitizer:DEADLYSIGNAL
=================================================================
==118376==ERROR: AddressSanitizer: stack-overflow on address 0x7ffffe576e78 (pc 0x555555688628 bp 0x7fffffff5dd0 sp 0x7ffffe576e40 T0)
    #0 0x555555688628 in rust_fuzzer_test_input (/home/wucke13/desktop/rosenpass/target/x86_64-unknown-linux-gnu/release/fuzz_mceliece_encaps+0x134628)
    #1 0x55555568b1c8 in std::panicking::try::do_call::h6f5a87d5990fab5b (/home/wucke13/desktop/rosenpass/target/x86_64-unknown-linux-gnu/release/fuzz_mceliece_encaps+0x1371c8)
    #2 0x555555690ae7 in __rust_try libfuzzer_sys.235000fa83432df9-cgu.0
    #3 0x55555568fb91 in LLVMFuzzerTestOneInput (/home/wucke13/desktop/rosenpass/target/x86_64-unknown-linux-gnu/release/fuzz_mceliece_encaps+0x13bb91)
    #4 0x555555696bb9 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/home/wucke13/desktop/rosenpass/target/x86_64-unknown-linux-gnu/release/fuzz_mceliece_encaps+0x142bb9)
    #5 0x55555569f792 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) (/home/wucke13/desktop/rosenpass/target/x86_64-unknown-linux-gnu/release/fuzz_mceliece_encaps+0x14b792)
    #6 0x55555569fdb6 in fuzzer::Fuzzer::Loop(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) (/home/wucke13/desktop/rosenpass/target/x86_64-unknown-linux-gnu/release/fuzz_mceliece_encaps+0x14bdb6)
    #7 0x5555556c0061 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/home/wucke13/desktop/rosenpass/target/x86_64-unknown-linux-gnu/release/fuzz_mceliece_encaps+0x16c061)
    #8 0x5555555a7162 in main (/home/wucke13/desktop/rosenpass/target/x86_64-unknown-linux-gnu/release/fuzz_mceliece_encaps+0x53162)
    #9 0x7ffff7a3cb0d in __libc_start_call_main (/nix/store/whypqfa83z4bsn43n4byvmw80n4mg3r8-glibc-2.37-45/lib/libc.so.6+0x23b0d) (BuildId: 2b9ebcc534a497a5e424c017f310e087ec14b7b6)
    #10 0x7ffff7a3cbc8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/whypqfa83z4bsn43n4byvmw80n4mg3r8-glibc-2.37-45/lib/libc.so.6+0x23bc8) (BuildId: 2b9ebcc534a497a5e424c017f310e087ec14b7b6)
    #11 0x5555555a72c4 in _start (/home/wucke13/desktop/rosenpass/target/x86_64-unknown-linux-gnu/release/fuzz_mceliece_encaps+0x532c4)

SUMMARY: AddressSanitizer: stack-overflow (/home/wucke13/desktop/rosenpass/target/x86_64-unknown-linux-gnu/release/fuzz_mceliece_encaps+0x134628) in rust_fuzzer_test_input
==118376==ABORTING
MS: 0 ; base unit: 0000000000000000000000000000000000000000


artifact_prefix='/home/wucke13/desktop/rosenpass/fuzz/artifacts/fuzz_mceliece_encaps/'; Test unit written to /home/wucke13/desktop/rosenpass/fuzz/artifacts/fuzz_mceliece_encaps/crash-da39a3ee5e6b4b0d3255bfef95601890afd80709
Base64: 

────────────────────────────────────────────────────────────────────────────────

Failing input:

	fuzz/artifacts/fuzz_mceliece_encaps/crash-da39a3ee5e6b4b0d3255bfef95601890afd80709

Reproduce with:

	cargo fuzz run fuzz_mceliece_encaps fuzz/artifacts/fuzz_mceliece_encaps/crash-da39a3ee5e6b4b0d3255bfef95601890afd80709

Minimize test case with:

	cargo fuzz tmin fuzz_mceliece_encaps fuzz/artifacts/fuzz_mceliece_encaps/crash-da39a3ee5e6b4b0d3255bfef95601890afd80709

────────────────────────────────────────────────────────────────────────────────

Error: Fuzz target exited with exit status: 1

So it seems that classic mcelliece fails due to a stack overflow. My natural instinct was to resize the default stack size for new threads using export RUST_MIN_STACK=$((16*(2**20))), but that did not solve the issue.

@wucke13
Copy link

wucke13 commented Dec 28, 2023

Ok, I found the first problem that causes fuzz failure:

https://github.com/rosenpass/rosenpass/blob/a477d2d791e4b542b6469e1bf1e38f4fafa9fea3/fuzz/fuzz_targets/mceliece_encaps.rs fails, because StaticKem::encaps is passed an empty slice as sk. In particular, this assert_eq! is the culprit. Now I'm wondering, if we should either

  • statically ensure that a slice of the correct length is passed (by making the arg a &[u8; Self::SK_LEN]
  • handle the error dynamically using Result, instead of panicking on wrong buffer size
  • limit the fuzzer to only fuzz with inputs of correct length

Anyhow I added a test case for the first of the two failures to this branch.

@wucke13
Copy link

wucke13 commented Dec 28, 2023

The second error I do not understand yet. It appears the failure occurs for an empty input, at least the unit file is empty and this output is generated by cargo fuzz:

Failing input:

	fuzz/artifacts/fuzz_handle_msg/crash-da39a3ee5e6b4b0d3255bfef95601890afd80709

Output of `std::fmt::Debug`:

	[]

Despite that, my minimal unit test that provides empty output does not fail:

    #[test]
    fn fuzz_handle_msg_error_2023_12_28() {
        sodium_init().unwrap();

        let sk = Secret::from_slice(&[0; 13568]);
        let pk = Secret::from_slice(&[0; 524160]);

        let mut cs = CryptoServer::new(sk, pk);
        let mut tx_buf = [0; 10240];

        let _ = cs.handle_msg(&[], &mut tx_buf);
    }

@thaodt
Copy link

thaodt commented Dec 29, 2023

The second error I do not understand yet. It appears the failure occurs for an empty input, at least the unit file is empty and this output is generated by cargo fuzz:

Failing input:

	fuzz/artifacts/fuzz_handle_msg/crash-da39a3ee5e6b4b0d3255bfef95601890afd80709

Output of `std::fmt::Debug`:

	[]

Despite that, my minimal unit test that provides empty output does not fail:

    #[test]
    fn fuzz_handle_msg_error_2023_12_28() {
        sodium_init().unwrap();

        let sk = Secret::from_slice(&[0; 13568]);
        let pk = Secret::from_slice(&[0; 524160]);

        let mut cs = CryptoServer::new(sk, pk);
        let mut tx_buf = [0; 10240];

        let _ = cs.handle_msg(&[], &mut tx_buf);
    }

same here. I tried to replicate and really didn't understand the failure caused by an empty input.

@wucke13
Copy link

wucke13 commented Dec 29, 2023

@thaodt may it bee that this really is a stack overflow? Does the fuzzing framework allocate the values on stack? We had trouble with stack overflows before, always when touching Classic McEliece. I only tried the env var which changes the stack size for new threads, the main thread still has the rust default. We should try changing that as well.

@wucke13
Copy link

wucke13 commented Dec 29, 2023

@thaodt Yup, I can not reproduce the other failure when I remove stack size limitations:

$  ulimit -s unlimited
$ cargo fuzz run fuzz_mceliece_encaps -- -max_total_time=60

Edit:

# does not suffice:
$ ulimit -s $((26* 2**10))

# does suffice
$ ulimit -s $((28* 2**10))

@wucke13
Copy link

wucke13 commented Dec 29, 2023

In particular, the fuzzer function basically asks for a stack overflow:

fuzz_target!(|input: [u8; StaticKem::PK_LEN]| {
    let mut ciphertext = [0u8; 188];
    let mut shared_secret = [0u8; 32];

    // We expect errors while fuzzing therefore we do not check the result.
    let _ = StaticKem::encaps(&mut shared_secret, &mut ciphertext, &input);
});

IIRC, StaticKem::PK_LEN is somewhere around 500 kilobytes. So only putting the function call's arguments (which are passed by value (!) to the stack already adds 0.5 megabytes of stack consumption.

@koraa
Copy link
Member Author

koraa commented Dec 29, 2023

Ok, I found the first problem that causes fuzz failure:

https://github.com/rosenpass/rosenpass/blob/a477d2d791e4b542b6469e1bf1e38f4fafa9fea3/fuzz/fuzz_targets/mceliece_encaps.rs fails, because StaticKem::encaps is passed an empty slice as sk. In particular, this assert_eq! is the culprit. Now I'm wondering, if we should either

* statically ensure that a slice of the correct length is passed (by making the arg a `&[u8; Self::SK_LEN]`

* handle the error dynamically using Result, instead of panicking on wrong buffer size

* limit the fuzzer to only fuzz with inputs of correct length

Anyhow I added a test case for the first of the two failures to this branch.

I would say this is correct behavior, so the fuzzing code needs adjusting…

@koraa
Copy link
Member Author

koraa commented Dec 29, 2023

I am knee-deep into GDB at this point due to the handle_msg failure.

I can see calls such as the following:

#0  GetTLSFakeStack () at /rustc/llvm/src/llvm-project/compiler-rt/lib/asan/asan_fake_stack.cpp:176
#1  GetFakeStackFast () at /rustc/llvm/src/llvm-project/compiler-rt/lib/asan/asan_fake_stack.cpp:193
#2  OnMalloc () at /rustc/llvm/src/llvm-project/compiler-rt/lib/asan/asan_fake_stack.cpp:207
#3  __asan_stack_malloc_6 () at /rustc/llvm/src/llvm-project/compiler-rt/lib/asan/asan_fake_stack.cpp:261
#4  0x0000555556286de0 in rosenpass::protocol::CryptoServer::handle_msg (self=0x3e80, rx_buf=..., tx_buf=...)
    at rosenpass/src/protocol.rs:785

I think somehow the address sanitizer or the LeakSanitizer might be interfering with memsec. Memsec uses (standard rust) malloc and mlock whereas sodium calls mmap directly (I think), so this may be the reason for the issue not having turned up in sodium.

grim_1703857943

@koraa
Copy link
Member Author

koraa commented Jan 2, 2024

Issue found and fixed. Check commit log.

@koraa koraa force-pushed the dev/karo/nosodium branch 2 times, most recently from bb06c05 to 9f2a890 Compare January 2, 2024 19:13
@koraa
Copy link
Member Author

koraa commented Jan 2, 2024

@wucke13 Please review and merge

@wucke13
Copy link

wucke13 commented Jan 3, 2024

@koraa Impressive work. This looks mostly good to me. I have two remaining suggestions:

  1. Shall we close Crash (segfault) with AddressSanitizer, LeakSanitizer, or libFuzzer? quininer/memsec#14 again (with an explanation), now that we know this was an issue with address sanitizer + lazy_static?
  2. There is a bunch of small clippy lints, I would like to push one additional commit that fixes them.

Just give me thumbs if you agree, or comment if you have other suggestions.

@thaodt
Copy link

thaodt commented Jan 3, 2024

  1. (with an explanation), now that we know this was an issue with address sanitizer + lazy_static?

can you elaborate more on it @wucke13 ?

koraa and others added 7 commits January 3, 2024 18:36
Had to fix the tests in util/src/result.rs.
The new secret memory pool was causing CI failures in the fuzzing code,
due to the fuzzer compiling its binaries with memory sanitizer support.

https://doc.rust-lang.org/beta/unstable-book/compiler-flags/sanitizer.html

Using lazy_static was – intentionally – introducing a memory leak, but the
LeakSanitizer detected this and raised an error.

Now by using thread_local we are calling the destructors and so – while still being a
memory leak in practice – the LeakSanitizer no longer detects this behaviour as an error.

Alternatively we could have used a known-leaks list with the leak-sanitizer, but this would have increased the complexity of the build setup.

Finally, this was likely triggered with the migration to memsec, because libsodium circumvents the malloc/free calls,
relying on direct calls to MMAP.
Somehow in the past while splitting into many crates, we broke the bench
setup. This commit both fixes it, and adds a CI job that ensures it is
still working to avoid such silent failure in the future. The benchmarks
are not actually run, they would take forever on the slow GitHub Actions
runners, but they are at least compiled.
Clippy would not automatically apply these fixes, so they were applied
by hand.
@wucke13 wucke13 merged commit 62aa9b4 into main Jan 3, 2024
31 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants