Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fuzz: add a fuzz target for Miniscript decoding #78

Closed
wants to merge 2 commits into from

Conversation

darosior
Copy link
Contributor

This isn't compiled by default in this repo since i didn't want to pull the whole fuzzing stuff from Bitcoin Core for something that should be upstreamed soon.

A minified fuzzing corpus for this target can be found at https://github.com/darosior/qa-assets/tree/miniscript_decode_corpus/fuzz_seed_corpus/miniscript_decode .

Note that there is already a fuzz target for parsing descriptors so there is no need to add one specific to Miniscript. I'll grow a Miniscript corpus for this when i'm done with the Miniscript integration in the descriptor wallet.

/** A class encapulating conversion routines for CPubKey. Since we can't have access to a
* hash->pubkey mapping, and since fuzzing the conversion itself here would be useless
* computation we use a static dummy public key and its hash. */
struct MockedKeyConverter {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it's worth having 2 or 3 of each (keys/hashes/...) so that it can test not only script structure, but also that used keys/etc match? This may not matter now, but I expect we may want to extend this test later (post integration into Core) so that the produced satisfactions actually get verified against the script interpreter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After implementing the fuzz target checking against the script interpreter, i modified this using a few dozens precomputed keys, hashes, and dummy signatures. Pretty close to what you did in the unit test setup.

For what it's worth, the commit implementing the satisfaction check against the interpreter on top of this and the signing support is at darosior/bitcoin@ef7c3f0 .

bitcoin/test/fuzz/miniscript_decode.cpp Outdated Show resolved Hide resolved
bitcoin/test/fuzz/miniscript_decode.cpp Outdated Show resolved Hide resolved
bitcoin/test/fuzz/miniscript_decode.cpp Outdated Show resolved Hide resolved
@darosior
Copy link
Contributor Author

Updated to remove assertSameOps (assert the script are equal instead) since we don't use mocks anymore. Also removed an unnecessary copy.

Comment on lines 155 to 157
// The Script representation must roundtrip.
const CScript ms_script = ms->ToScript(SATISFIER);
assert(ms_script == *script);
Copy link
Contributor Author

@darosior darosior Jan 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's wrong: i had noticed it's not actually the case when coding the Python implem, but forgot about it. It must however be pretty rare as i ran this target a decent amount of time, found a number of other bugs but this assert never failed.

EDIT: Actually it isn't, just the fuzzing target didn't iter over such a script. It's good to know..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aaand in the end i actually think it is right. It's because this target works by first decoding from Script. I think:

  • This doesn't hold: "encoding a Ms to Script and then decoding it must give the same representation"
  • But this does: "decoding a Script to Ms and the re-encoding it must give the same script"

Comment on lines 19 to 22
struct Satisfier: BaseSignatureChecker {
typedef CPubKey Key;

// Precomputed public keys, and a dummy signature for each of them.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This uses a single object as holder of the precomputed data, converter ctx, satisfier ctx, and checker ctx. I guess the approach in #94 is cleaner..

@darosior
Copy link
Contributor Author

darosior commented Feb 10, 2022

I'm confused. The satisfier here checks against the stored keys and dummy sigs but those are never added to the Miniscript. They previously were with the MockedKeyConverter, though.

Yeah, i don't think the satisfaction testing logic here is really useful as it is.

Signed-off-by: Antoine Poinsot <darosior@protonmail.com>
The miniscript_decode fuzz target demonstrated this could lead to
crashes.
@darosior
Copy link
Contributor Author

I've significantly simplified this. It's now a dead simple target, parsing scripts from the fuzzer output and making sure we don't crash in decoding it and if it's valid that we can roundtrip. I think fuzzing the satisfaction logic here would be useless (short of replacing all pubkeys in a Script by mocked keys, and that would be very redundant (and less efficient) than the targets generating random fragments that we are currently working on), and that it's useful on its own (it found a missing bound check already). And we have a lot of seeds for scripts in qa-assets, so it makes use of them.

@sipa
Copy link
Owner

sipa commented Feb 16, 2022

Concept ACK on just testing for decode/roundtrip in this test.

I wonder if maybe we want it to be a bit smarter in constructing the input (e.g. use a loop that chooses script opcodes/pushes among the ones relevant to miniscript). Or do you think this is already constructing good coverage?

@darosior
Copy link
Contributor Author

It already found few bugs, so i think it is useful as is anyhow. Maybe it could be optimized, but i'm not sure how yet and could be left to post integration into Core.

Regarding the suggestion of using a loop that chooses the opcodes/pushes i think it would just be redundant: the current code cuts-through when it encounters an unknown opcode (exactly what we'd do in the loop):

} else {
// Unrecognised expression
return {};

Still i ran a (source-based) coverage report on this target with the corpus i currently have, and i think it's good. You can visualize it here.

@sipa
Copy link
Owner

sipa commented Feb 17, 2022

@darosior My primary concern is that this approach slows things down by consuming relatively many bytes for pubkeys/hashes, whose contents doesn't actually contribute to the usefulness of the test.

But fair enough, this can be addressed later.

@darosior darosior mentioned this pull request Feb 18, 2022
4 tasks
@sipa
Copy link
Owner

sipa commented Feb 18, 2022

Needs rebase.

@darosior
Copy link
Contributor Author

Superseded by #106, which is much nicer.

@darosior darosior closed this Mar 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants