Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fuzz: replace cfg(fuzzing) weth cfg(bitcoin-hashes-fuzz). #1821

Merged
merged 3 commits into from May 2, 2023

Conversation

apoelstra
Copy link
Member

A custom cfg flag can be turned on or off by the user. Our current use of cfg(fuzzing) is impossible to turn off when using honggfuzz, which makes it impossible to fuzz without the broken crypto. This causes trouble for some downstream crates and also makes it hard for us to fuzz our own library.

Companion to rust-secp PR (TODO open this) which does effectively the same thing.

Fixes #1587.

@TheBlueMatt
Copy link
Member

Hmm, shouldn't the broken crypto be improved? Fuzzing with real crypto is very rarely what you want, at a minimum it's pretty slow!

@apoelstra
Copy link
Member Author

@TheBlueMatt yes, but doing this is probably a masters' thesis worth of work. Not crazy, but crazy enough that it's caused me to avoid doing it for multiple years. And meanwhile I need some sort of working crypto now to get my downstream crates to work.

@apoelstra
Copy link
Member Author

Maybe not a masters thesis, but certainly an honors thesis ;P.

Though the result would be pretty high-impact so I'd bet you could get a masters out of it..

@apoelstra
Copy link
Member Author

apoelstra commented Apr 28, 2023

I also think that we'd still need something like this PR since

  • If you need collision-resistance (e.g. for merkle structures) this is inherently at odds with wanting hashes that can be broken so the fuzzer can find successful hash challenges. In theory you could have a hash that's preimage-broken but not collision-broken but it's not obvious to me how you'd do this and I suspect the result would be still be pretty slow[*].
  • If you need data that matches the output of another library, or any source of truth, you need your keys and hashes to be real

[*] The ideas that come to mind involve using a real hash to get collision resistance, but directly encoding preimage data alongside it to make it easy to extract a preimage.

@sanket1729
Copy link
Member

I think the underlying issue in trying to fix fuzzing crypto is that we cannot really know how the fuzzing is used and what properties about the primitives we need to emulate. It is impossible to determine what properties of primitives users might use.

  • For hashes, a user might expect any random subset of bytes from hash output to be random. For example, first 8 bytes of sha256 as short id for small collision resistance.
  • For keys, user might have a special hard coded key with properties that can be computed via some other EC operations. Maybe some additive properties of an exotic holomorphic commitment scheme/key-tweaking etc.

at a minimum it's pretty slow!

computing sha256 is not that slow. It's very CPU friendly. And code being fuzzed can easily rely on it for collision resistance. For example, a taproot descrpitor. The fuzzer has been useful in rust-miniscript catching several bugs in round-tripping keys and descriptors.

While we are not checking the crypto part in the fuzzing, it is difficult/impossible to separate out code cleanly and fuzz the part without the crypto part.

The annoying thing is that this has caused some abrupt failures that are difficult to figure out. One example is where people have hit rust-bitcoin/rust-miniscript#245 Cryptographically unreachable code when deriving descriptor. We could not reproduce cleanly and only can assume it is because of broken crypto here. This the unreachable is because we were "somehow" able to obtain a scalar from some fuzzing code here.

To summarize,

  • I think writing improving broken crypto is a lost battle. There are many properties of fuzzing primitives that we cannot cleanly emulate.
  • When things break, it is really hard to debug things because we really need understand what's going on in broken crypto world.
  • Lastly, there is no option for users who actually want to use the crypto part in fuzzing but rely on this broken crypto. I believe at least giving us the option would be good start. cfg(fuzzing) basically forces us to use it.

@TheBlueMatt
Copy link
Member

I'm certainly open to making the broken crypto optional, and while I'm quite dubious that's the Right fix, it sounds like we agree it's not. The broken crypto as it stands was always a hack, having fuzzers with less coverage is equally broken so.... sure, as long as it's still optional cause we desperately need the performance of it.

For the rust-miniscript case specifically, is there really any need to swap out the secp broken crypto? I'd have thought it's roughly fine as-is.

computing sha256 is not that slow. It's very CPU friendly.

The definition of "slow" when fuzzing is quite different from normal code :p

@sanket1729
Copy link
Member

is there really any need to swap out the secp broken crypto?

As such we are not using any cryptography in there, just parsing keys/points etc. Some other downstream users use it for testing addresses, tweaking etc. I agree that we can improve help improve performance of it.

But given my low confidence in crypto fuzz code and some issues in the past, in the short term horizon having an option for fuzzing against non-fuzzing provides us with more benefits.

while I'm quite dubious that's the Right fix,

Agreed, that the correct fix should be to improve the code, but I it should still be optional and not mandated via cfg(fuzzing).

Kixunil
Kixunil previously approved these changes Apr 29, 2023
Copy link
Collaborator

@Kixunil Kixunil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 393187b

fuzz/README.md Outdated
# Fuzzing with weak cryptography

You may wish to replace the hashing and signing code with broken crypto,
which will be faster and enable the fuzzer to do other-wise impossible
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

other-wise

Never seen it written like this, always "otherwise", is this correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what I was thinking here :). Yeah, the dash shouldn't be there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ROFL, a whole bunch of times over the last few years when reading your writing @apoelstra I've thought "damn this guy knows his English", this slip up "other-wise" made me hold my stomach laughing - so glad its not only me that does shit like this :)

fuzz/README.md Show resolved Hide resolved
fuzz/README.md Show resolved Hide resolved
@apoelstra
Copy link
Member Author

I'm certainly open to making the broken crypto optional,

Thanks Matt! I will take this as a concept ACK from you on this PR (and an equivalent one on rust-secp). Definitely agreed that we should improve the broken crypto....but as Sanket says the requirements are not obvious and probably different for different users. A proper fix might even result in multiple flags and a multi-page manual explaining what they do.

Copy link
Member

@tcharding tcharding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whilst technically correct as is, I rekon it would make sense to change the line in bitcoin/lib.rs

#![cfg_attr(fuzzing, allow(dead_code, unused_imports))]

to not use fuzzing.

A mild point; I'm confused with the name bitcoin_hashes_fuzz - a dev may ask:

  • does this mean we are only fuzzing bitcoin_hashes, why no bitcoin_fuzz?
  • does this mean 'bitcoin' and 'hashes' _fuzz since we are really fuzzing both?

Perhaps we should use hashes_fuzz and comment in the readme that currently there is no special code required when fuzzing bitcoin so there is no need for --cfg=bitcoin_fuzz?

fuzz/README.md Outdated
Meanwhile, to use the broken crypto, simply compile (and run the fuzzing
scripts) with

RUSTFLAGS="--cfg=bitcoin_hashes_fuzz --cfg=secp256k1_fuzz"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is wrong mate, https://github.com/rust-bitcoin/rust-secp256k1/pull/608/files uses sec256k1_fuzzing, we should pick either _fuzz or _fuzzing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, oops :) I'll change the other PR to use _fuzz.

fuzz/README.md Outdated
Comment on lines 29 to 30
Doing so may result in spurious bug reports since the broken crypto does
not respect the encoding or algebraic invariants as the real crypto. We
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sentence reads a bit clunky, perhaps

Suggested change
Doing so may result in spurious bug reports since the broken crypto does
not respect the encoding or algebraic invariants as the real crypto. We
Doing so may result in spurious bug reports since the broken crypto does
not respect the encoding or algebraic invariants upheld by the real crypto. We

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

Copy link
Member

@sanket1729 sanket1729 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our CI should be using the new cfg, right? I think rust-bitcoin could use bitcoin_hashes_fuzz ? miniscript and others need the crypto to be disabled.

@apoelstra
Copy link
Member Author

Yeah, good idea. hashes itself should not use the cfg, but bitcoin should. And then our CI would cover both cases even.

apoelstra added a commit to rust-bitcoin/rust-secp256k1 that referenced this pull request May 1, 2023
9bdab89 change --cfg=fuzzing to --cfg=secp256k1_fuzz (Andrew Poelstra)

Pull request description:

  Companion PR to rust-bitcoin/rust-bitcoin#1821

ACKs for top commit:
  sanket1729:
    ACK 9bdab89

Tree-SHA512: 9ff5ce4cae99089f85a73a845cd5dca7b7d0ad9e73c0ee180e73fd9a55c6b92f21ad0192c8c0976e2e590be9d5d899b113b8b2006a3c53e0146a3ce5ba1450ec
@apoelstra
Copy link
Member Author

Updated to change cfg name to hashes_fuzz, to fix the README, and to change CI to use the new cfg parameters for fuzzing bitcoin. (But don't use them when fuzzing hashes, since we want to actually fuzz the hash library when we're fuzzing the hash library.)

We could re-introduce the "fuzz against RustCrypto" tests at some point, though I'm a little unenthusiastic about them because there's so little value in checking a hash library against another for more than 1 or 2 examples.

Copy link
Member

@tcharding tcharding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK ab4a48c

@Kixunil
Copy link
Collaborator

Kixunil commented May 2, 2023

I'm a little unenthusiastic about them because there's so little value in checking a hash library against another for more than 1 or 2 examples.

I agree, the code is probably too complex for a fuzzer to identify any but trivial differences.

Copy link
Collaborator

@Kixunil Kixunil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK ab4a48c

@@ -53,6 +53,7 @@ hashes_sha1,
override: true
profile: minimal
- name: fuzz
run: if [[ "${{ matrix.fuzz_target }}" =~ ^bitcoin ]]; then export RUSTFLAGS='--cfg=hashes_fuzz --cfg=secp256k1_fuzz'; fi
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure the env var is kept across multiple run: statements? We should debug print it I guess.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will go ahead and merge. We can add a debug print later on. But it would really surprise me if env vars were not preserved.

@apoelstra apoelstra merged commit c1c7675 into master May 2, 2023
10 checks passed
@Kixunil Kixunil deleted the 2023-04--rename-fuzzing branch May 2, 2023 19:05
@apoelstra apoelstra mentioned this pull request May 8, 2023
apoelstra added a commit that referenced this pull request May 10, 2023
a1aaf5f ci: fix run syntax in fuzz job (Andrew Poelstra)

Pull request description:

  I think our CI failures started with #1821 when we added an extra `run:` line to `fuzz.yml` without prefixing it with `-`.

  Fix the syntax to use a multi-line `run` instead.

ACKs for top commit:
  tcharding:
    ACK a1aaf5f
  Kixunil:
    ACK a1aaf5f

Tree-SHA512: 7214291dc629b6417f8dfaf4a68bfcdc23078a57759717cef92344e1873d475f597fe447e037d15c568a1d5b04ec764ebf8477d18274f52a4b93c50d8a1615f5
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cfg(fuzzing) issues with custom implementations of data structures
5 participants