Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: Fuzzing siphash against reference implementation [Request for feedback] #19920

Closed
wants to merge 2 commits into from

Conversation

elichai
Copy link
Contributor

@elichai elichai commented Sep 8, 2020

Hash functions are complex, and even though the implementations look simple they can easily go wrong, usually in the buffer handling around the hash function itself(the "Writer") examples:
Rewriting the last byte that was written, processing the buffer too "early" when Write was called with exactly a full buffer(different hash functions require different behavior in these cases), the wrong amount of zeros were written in the case the buffer was exactly full when finalizing, and more.

I've personally found bugs in an implementation of a hash function via fuzzing against a reference implementation (the bug was actually in the ref impl), and there's a lot of precedent (See https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=924038 for the amount of bugs just in the reference implementations submitted to NIST's SHA3 competition)

This can also make PRs like #18014 and future SHA256 optimizations easier to review and be confident in.

I started with siphash specifically because it's small and simple so I can get feedback on this before continuing to SHA256 etc.
The downside of this method is that it means committing another implementation of the same thing separately.

About the implementation itself:
I've used a constant seperator so it would write each time different sizes, and sometimes even empty writes. and it's constant so coverage based fuzzers can easily figure this out and make the inputs cover all the branches.

Any feedback is welcome :)

@sipa
Copy link
Member

sipa commented Sep 8, 2020

I think including a full other reference implementation is overkill.

Every byte array as input has exactly one hash, regardless of how it is split up in chucks and hashed. You can verify the "hash everything at once" code path using test vectors. These are usually published by the same body that standardized the function, and are picked to give good coverage. By the nature of hashing, it tends to be sufficient to test things with a random inputs for a variety of lengths, so that the padding code gets exercised - but beyond that, the actual hashed data is of little relevance, as tiny errors in the actual hashing logic should mean a completely different hash anyway.

As you point out, the buffering logic is often far more error-prone, and harder to test. I think it's a useful target for fuzzing, but all you need for that is a fuzzer that takes as input a number of chunks, and then hashes them both concatenated at once, and split up as separate writes, and compares the result.

@practicalswift
Copy link
Contributor

practicalswift commented Sep 8, 2020

Concept ACK

This is excellent!

BTW, if you're interested in differential cryptography fuzzing you should check out @guidovranken's really cool Cryptofuzz project.

The list of bug trophies for Cryptofuzz is amazing! Cryptofuzz has found bugs in OpenSSL, LibreSSL, BoringSSL, libgcrypt, Crypto++, NSS, Botan, wolfCrypt, sjcl, SymCrypt, mbed TLS, etc -- more or less everything under test :)

@catenacyber's elliptic-curve-differential-fuzzer project is really cool too.

@elichai
Copy link
Contributor Author

elichai commented Sep 8, 2020

As you point out, the buffering logic is often far more error-prone, and harder to test. I think it's a useful target for fuzzing, but all you need for that is a fuzzer that takes as input a number of chunks, and then hashes them both concatenated at once, and split up as separate writes, and compares the result.

I thought about this, but even if you test chuncks vs single write, you still go through the same writer with the same logic, so it might not find an off by one bug in the memcpy of one of the branches in Write/Finalize.

@sipa
Copy link
Member

sipa commented Sep 8, 2020

I thought about this, but even if you test chuncks vs single write, you still go through the same writer with the same logic, so it might not find an off by one bug in the memcpy of one of the branches in Write/Finalize.

That should be caught using test vectors, as the code path is identical for all inputs of the same length

@elichai
Copy link
Contributor Author

elichai commented Sep 8, 2020

That should be caught using test vectors, as the code path is identical for all inputs of the same length

You can see in the paper I've linked that some times the test vectors don't cover all cases (a bug that returned the same value everytime a message of size BLOCK_SIZE was written, no matter the content, they had only 1 vector with a message of that length, so no one noticed).
Also different implementations implement the buffering differently and thus might have different branching logic, so test vectors written to cover all branches of one implementation might not cover all branches of other implementations.

but I do agree that most bugs will be covered by test vectors + fuzzing hashing by chunks vs hashing in one shot

@practicalswift
Copy link
Contributor

practicalswift commented Sep 8, 2020

The nice thing about differential fuzzing as suggested by @elichai is that we don't have to make any assumptions about the implementation details of the function being fuzzed. Personally I think that benefit is worth the cost (in the form of an extra file in src/test/fuzz/).

@sipa
Copy link
Member

sipa commented Sep 8, 2020

You can see in the paper I've linked that some times the test vectors don't cover all cases (a bug that returned the same value everytime a message of size BLOCK_SIZE was written, no matter the content, they had only 1 vector with a message of that length, so no one noticed).

So include an test vector for every length, up to some multiple of the block size.

That's still far less effort than including/integrating a whole separate implementation and maintain it.

@sipa
Copy link
Member

sipa commented Sep 8, 2020

@practicalswift I think that spending time on blindly writing fuzzers without making any assumptions about the code you're testing is often a waste of time. You can be far more effective by focussing on things that aren't guaranteed by the source code, or are likely to trigger edge cases. And this isn't just your time, but also the time of people reviewing and maintaining the codebase later.

Fuzzing is good at finding combinations of inputs that trigger various code paths, so use it for that.

@practicalswift
Copy link
Contributor

@sipa

I think that spending time on blindly writing fuzzers without making any assumptions about the code you're testing is often a waste of time.

FWIW the fuzzers that found sipa/miniscript#7, sipa/miniscript#12, sipa/miniscript#13, sipa/miniscript#14, sipa/miniscript#15 and sipa/miniscript#25 were all written without making any assumptions about the code being tested :)

Don't trust: fuzz! :)

@sipa
Copy link
Member

sipa commented Sep 8, 2020

Don't put words in my mouth.

I'm not saying that you can't find issues that way. I'm saying you can be more effective at it if you understand the code better.

@guidovranken
Copy link
Contributor

Thanks for the heads up @practicalswift . I've implemented differential testing of Bitcoin's Siphash. No bugs found.

@sipa Untested but the use of int count may cause it to produce incorrect results or worse with inputs >= 2GB.

@maflcko maflcko added Tests and removed Build system labels Sep 9, 2020
@maflcko
Copy link
Member

maflcko commented Sep 9, 2020

I don't have any background in implementing hash functions, so I have no opinion on whether to only test that different chunk sizes on the same input reveal the same output or to test against a full reference impl. Maintaining a reference impl for hash functions should hopefully be of little overhead because we can assume they are well tested, bug free and stable, right?

Though, a requirement for merging this is that all tests pass. Right now they fail:

SUMMARY: UndefinedBehaviorSanitizer: null-pointer-use /usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/stl_iterator.h:820:16 in 

@practicalswift
Copy link
Contributor

practicalswift commented Sep 9, 2020

@guidovranken

@sipa Untested but the use of int count may cause it to produce incorrect results or worse with inputs >= 2GB.

Yes, a signed integer overflow takes place in src/crypto/siphash.cpp:

$ src/test/fuzz/simplest_possible_siphash_fuzzer -rss_limit_mb=8000 crash-061a172add013c03beedf57eb2a121a8289696af
crypto/siphash.cpp:56:10: runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int'
$ cat src/test/fuzz/simplest_possible_siphash_fuzzer.cpp
#include <cstdint>
#include <vector>

#include <crypto/siphash.h>

void test_one_input(const std::vector<uint8_t>& buffer)
{
    CSipHasher(0, 0).Write(buffer.data(), buffer.size()).Finalize();
}

@elichai
Copy link
Contributor Author

elichai commented Sep 9, 2020

I don't have any background in implementing hash functions, so I have no opinion on whether to only test that different chunk sizes on the same input reveal the same output or to test against a full reference impl. Maintaining a reference impl for hash functions should hopefully be of little overhead because we can assume they are well tested, bug free and stable, right?

Though, a requirement for merging this is that all tests pass. Right now they fail:

SUMMARY: UndefinedBehaviorSanitizer: null-pointer-use /usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/stl_iterator.h:820:16 in 

Thanks! fixed.

@practicalswift
Copy link
Contributor

practicalswift commented Sep 9, 2020

Tested ACK 7f80381 -- the fuzzing harness is written in such a way that it is able to trigger a signed integer overflow:

$ src/test/fuzz/crypto_siphash -rss_limit_mb=16000 crash-b992191b7160e637042d9918c3dffc562bc8bb14
crypto/siphash.cpp:56:10: runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior crypto/siphash.cpp:56:10 in

Thanks for doing this @elichai. Don't trust: fuzz! :)

@Crypt-iQ
Copy link
Contributor

Concept ACK - really cool to see differential fuzzing in action.

@elichai
Copy link
Contributor Author

elichai commented Oct 21, 2020

Fuzzing is good at finding combinations of inputs that trigger various code paths, so use it for that.

Using the fuzzer to find new test vectors and add them manually is a good idea. the problem is that every change in the logic (ie #18014) will require to re-run and generate new test vectors (and obviously test those against a reference implementation and against single full writes)

@laanwj
Copy link
Member

laanwj commented Nov 19, 2020

I think including a full other reference implementation is overkill.

Well, I'd normally agree, but looking at the actual code it's really small and placed in the test directory in any case. So, assuming there are no other problems such as licensing, I have no problems with it in that regard.

@DrahtBot
Copy link
Contributor

DrahtBot commented Dec 4, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

No conflicts as of last run.

@practicalswift
Copy link
Contributor

Needs rebase @elichai :)

@practicalswift
Copy link
Contributor

practicalswift commented Mar 19, 2021

cr ACK e0d8090: patch still looks correct

Personally I think this fuzzing harness has already proven itself since it was able to trigger a now fixed signed integer overflow. No need to theorize when we have empirical evidence: practical is better than theoretical :)

Thanks @elichai for raising the quality bar!

@maflcko
Copy link
Member

maflcko commented May 6, 2021

How is this different from google/oss-fuzz#5717 ?

@fanquake
Copy link
Member

I think we can close this for now. I'm also ~0 on adding another implementation to this repository.

@fanquake fanquake closed this Mar 24, 2022
@maflcko
Copy link
Member

maflcko commented Mar 24, 2022

There is differential fuzzing in ./src/test/fuzz/crypto_diff_fuzz_chacha20.cpp IIRC, so I think it is generally ok to do differential fuzzing.

@bitcoin bitcoin locked and limited conversation to collaborators Mar 24, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants