Add fuzz test to ensure venial parses everything that syn does #4

loiclec · 2022-03-18T12:01:56Z

Hello :)

This is the PR adding fuzzing to venial. It is not perfect, but on my machine it works well enough to find some differences. I will continue to improve it if you are happy with the general idea/implementation.

Please let me know if anything fails to compile on your machine. I have had a few problems recently due to new rust toolchains and different OSes handling code coverage instrumentation differently. Usually it can be fixed somewhat easily though.

How to use it

The fuzzer uses an unreleased version of fuzzcheck. It is recommended to first install the cargo-fuzzcheck command line tool, as follows:

cargo install --git "https://github.com/loiclec/fuzzcheck-rs"

And it is also necessary to use the nightly version of the Rust compiler.

 # install the nightly toolchain, then:
rustup override set nightly

Then you can run:

cargo fuzzcheck fuzz::fuzz_parse --profile fuzzing

Updates about the progress are printed in the terminal, for example:

5972ms  242234 simplest_cov(609 cov: 2720/4456 cplx: 63.19) failures(1) iter/s 40438

Some files will be printed inside the fuzz/ folder. If a test failure is found, it will be located inside fuzz/artifacts/. You can stop the fuzzer at any point by pressing Ctrl+C.

If you find a test failure but you think it is too large to really understand what is going on, you can run the following to minify the test failure:

cargo fuzzcheck fuzz::fuzz_parse --command minify --input-file "path/to/artifact.rs"

That will run the fuzzer repeatedly, each time with a reduced upper bound on the complexity of the generated token streams. The minified artifacts will be located in path/to/artifact.rs.minified/{cplx}-{hash}.rs. When it seems like no progress is being made anymore, press Ctrl+C and look for the simplest artifact in the folder.

About the code

There is only one property being tested so far, which checks that venial can parse everything that syn does. This code is located in fn test_parse_declaration(..). It also only parses DeriveInput and not ItemFn, because as far as I understand function parsing isn't as polished yet in venial.

The code that actually launches the fuzz test is inside #[test] fn fuzz_parse(). I have commented it lightly, but let me know if you have any questions about it. You can also read about how fuzzcheck works in general at fuzzcheck.neocities.org . The main difficulty was to observe the code coverage of both venial and syn. So instead of using the convenience functions offered by fuzzcheck::builder, I have created a “sensor” and a “pool” manually. In the future, I will improve fuzzcheck’s APIs so that it isn't necessary.

Limitations of the fuzzer

There are a few limitations of the fuzzer, some of which can be fixed easily, and some of which are just the results of a compromise between “diversity of generated token streams” and performance. The limitations are:

Just a few hardcoded identifiers are generated. These include all Rust keywords as well as some simples ones like a, b, c. The complexity of all identifiers is the same. That is, fuzzcheck considers auto and a to be equally complex. So it won't minify e.g. struct auto {} into struct a {}
No raw identifiers are generated.
Similarly, just a few hardcoded literals are generated as well. That is because I don't think venial cares about the content of the literals. If that is not true, let me know.
No groups with Delimiter::None are generated. That is because I don't yet have a grasp on what the purpose of Delimiter::None is. Furthermore, allowing Delimiter::None to be generated means that we have to use a more verbose serialisation format for the artifacts. That is because there is no visible difference between e.g. an identifier: ident and an identifier in a group without delimiters: ident.
Almost no awareness of the Rust grammar is used (yet). The fuzzer doesn't try to generate valid Rust syntax. That is partly by design, and partly because it is additional work. It would be nice to prioritise combination of punctuations that we know have a meaning in Rust (e.g. ->, ..=, ..., ::, etc.). I will work on that at some point later.
Some of the more advanced functionality of fuzzcheck isn't used. It is possible to optimise for different goals, such as finding a set of N test cases that cover the most code, or finding the test case that allocates the most, or has the highest time complexity, etc.

That's it, I think?

Let me know if it works for you. If you try it now, you should quickly (i.e. in 2 minutes at most) find a failing test case:

enum a { a = () > () }

PoignardAzur · 2022-03-18T14:29:13Z

Let me know if it works for you. If you try it now, you should quickly (i.e. in 2 minutes at most) find a failing test case:
enum a { a = () > () }

I need to kill whoever thought using angle brackets for generics in Rust was a good idea.

loiclec · 2022-03-18T14:33:13Z

For what it's worth, I think it would be completely fine to only allow literals, identifiers, or expressions within braces as the discriminant of an enum variant, since wrapping an expression inside curly braces is always possible and never changes the meaning of the code. So the type above can always be rewritten as:

enum a { a = { () > () } }

This would also get rid of the closure expression problem.

PoignardAzur · 2022-03-18T15:06:49Z

I don't think that's a satisfying solution. It breaks the "accept any code that compiles" invariant.

In any case, I want venial to parse expressions at some point, so I'll have to handle these cases anyway.

PoignardAzur · 2022-03-18T15:21:13Z

Codes LGTM.

Quick question: instead of generating variations of TokenStream and checking if syn parses them, have you considered generating variations of syn::DeriveInput instead? Although I guess orphan rules mean you can't implement your trait on syn types.

loiclec · 2022-03-18T15:55:06Z

Quick question: instead of generating variations of TokenStream and checking if syn parses them, have you considered generating variations of syn::DeriveInput instead? Although I guess orphan rules mean you can't implement your trait on syn types.

Yes, that's also possible and I have considered it. The orphan rule wouldn't be a problem since in order to fuzz-test a type T, we need a separate type that implements Mutator<T>, which can be done outside the original crate.

But it's more work and would only produce syntactically correct Rust code. For this property test in venial, it seems like this is indeed what we want, but I thought that was not the case in general. For example, the infinite loop bug may not have been found by generating a valid syn::DeriveInput and then converting it to a token stream.

If I find the time and motivation, I will also write a Mutator<syn::DeriveInput>. Then we can have two fuzz-tests: one for syntactically correct Rust code, and one for quasi-random token streams.

PoignardAzur · 2022-03-18T16:23:57Z

Thanks, that's helpful =)

Quick warning: I've been working non-stop on venial for two weeks, a lot more than I expected for something that was supposed to be a side project. I'm probably not going to have much time to read PRs or fix bugs in the coming weeks. They're still very much appreciated.

loiclec · 2022-03-18T16:42:09Z

Oh, no worries at all! I meant to write most of this code anyway, and venial was a very convenient target to test the fuzzer, it even helped me fix some bugs in fuzzcheck. So don't feel any pressure to review/fix/merge anything :-)

PoignardAzur · 2022-03-30T13:02:25Z

Cargo.toml

+inherits = "release"
+codegen-units = 1
+lto = "thin"
+debug = false


Shouldn't the fuzzing profile have debug = true instead, to catch more potential panics?

Yes, that's correct. Not sure why it's set to false, I must have been experimenting with different values and then forgot to turn it back to true.

PoignardAzur · 2022-03-30T13:23:07Z

I've rebased and merged the PR.

PoignardAzur · 2022-03-30T13:30:25Z

I may have been too hasty.

When trying to compile fuzz tests, I get this error:

test fuzz::fuzz_parse ... thread 'main' panicked at 'failed to properly link the different LLVM coverage sections: CannotFindFunctionRecordAssociatedWithPrfData("This can sometimes happen when the crate is incrementally compiled with more than one codegen-unit. Try adding codegen-units = 1 or incremental = false to the appropriate profile (release or fuzzing) in Cargo.toml")', /home/olivier-faure/.cargo/git/checkouts/fuzzcheck-rs-531eb3f95f61a5dd/b65b30b/fuzzcheck/src/code_coverage_sensor/mod.rs:71:14

loiclec · 2022-03-30T15:08:21Z

oh, sorry about that. I will have a look at it later this week

…

On 30 Mar 2022, at 15:30, Olivier FAURE ***@***.***> wrote: I may have been too hasty. When trying to run fuzz tests, I get this error: test fuzz::fuzz_parse ... thread 'main' panicked at 'failed to properly link the different LLVM coverage sections: CannotFindFunctionRecordAssociatedWithPrfData("This can sometimes happen when the crate is incrementally compiled with more than one codegen-unit. Try adding codegen-units = 1 or incremental = false to the appropriate profile (release or fuzzing) in Cargo.toml")', /home/olivier-faure/.cargo/git/checkouts/fuzzcheck-rs-531eb3f95f61a5dd/b65b30b/fuzzcheck/src/code_coverage_sensor/mod.rs:71:14 — Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you authored the thread.

PoignardAzur · 2022-04-11T15:08:15Z

Any luck?

loiclec · 2022-04-13T10:19:23Z

Ah, yes, sorry for not getting back to you. I couldn't reproduce the bug unfortunately :(

There are a couple things to try:

did you try running cargo clean and then re-build from scratch again?
are you using the fuzzing profile? (i.e. run cargo fuzzcheck fuzz::parse --profile fuzzing) ?
Could you provide more information on your setup? (OS, architecture, rust version)?

PoignardAzur · 2022-04-13T12:54:25Z

Running the command cargo fuzzcheck fuzz::fuzz_parse --profile fuzzing after a full clean gives me the same panic as above (CannotFindFunctionRecordAssociatedWithPrfData).

A good first step might be to include more info in that panic message.

Current target: nightly-x86_64-unknown-linux-gnu
Compiler: rustc 1.62.0-nightly (1f7fb6413 2022-04-10)
OS: Pop_OS 21.10

PoignardAzur · 2022-04-25T09:30:07Z

@loiclec

I couldn't reproduce the bug unfortunately :(

To be clear, did you try to reproduce it on MacOS, Linux, or both?

PoignardAzur · 2022-05-18T09:29:23Z

Hi, any progress?

PoignardAzur · 2022-05-25T14:58:56Z

Alright, since I want to get done with venial, I'm going to close this PR and remove the fuzz code for now. I may re-add the code if it's shown to work on Linux.

Add fuzz test to ensure venial parses everything that syn does

508c3d8

PoignardAzur reviewed Mar 30, 2022

View reviewed changes

PoignardAzur closed this Mar 30, 2022

PoignardAzur reopened this Mar 30, 2022

PoignardAzur mentioned this pull request Apr 21, 2022

Fuzzcheck can't run on NixOS loiclec/fuzzcheck-rs#29

Open

PoignardAzur closed this May 25, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add fuzz test to ensure venial parses everything that syn does #4

Add fuzz test to ensure venial parses everything that syn does #4

loiclec commented Mar 18, 2022 •

edited

PoignardAzur commented Mar 18, 2022

loiclec commented Mar 18, 2022 •

edited

PoignardAzur commented Mar 18, 2022

PoignardAzur commented Mar 18, 2022

loiclec commented Mar 18, 2022

PoignardAzur commented Mar 18, 2022

loiclec commented Mar 18, 2022

PoignardAzur Mar 30, 2022

loiclec Apr 2, 2022

PoignardAzur commented Mar 30, 2022

PoignardAzur commented Mar 30, 2022 •

edited

loiclec commented Mar 30, 2022 via email

PoignardAzur commented Apr 11, 2022

loiclec commented Apr 13, 2022

PoignardAzur commented Apr 13, 2022

PoignardAzur commented Apr 25, 2022

PoignardAzur commented May 18, 2022

PoignardAzur commented May 25, 2022

Add fuzz test to ensure venial parses everything that syn does #4

Add fuzz test to ensure venial parses everything that syn does #4

Conversation

loiclec commented Mar 18, 2022 • edited

How to use it

About the code

Limitations of the fuzzer

That's it, I think?

PoignardAzur commented Mar 18, 2022

loiclec commented Mar 18, 2022 • edited

PoignardAzur commented Mar 18, 2022

PoignardAzur commented Mar 18, 2022

loiclec commented Mar 18, 2022

PoignardAzur commented Mar 18, 2022

loiclec commented Mar 18, 2022

PoignardAzur Mar 30, 2022

Choose a reason for hiding this comment

loiclec Apr 2, 2022

Choose a reason for hiding this comment

PoignardAzur commented Mar 30, 2022

PoignardAzur commented Mar 30, 2022 • edited

loiclec commented Mar 30, 2022 via email

PoignardAzur commented Apr 11, 2022

loiclec commented Apr 13, 2022

PoignardAzur commented Apr 13, 2022

PoignardAzur commented Apr 25, 2022

PoignardAzur commented May 18, 2022

PoignardAzur commented May 25, 2022

loiclec commented Mar 18, 2022 •

edited

loiclec commented Mar 18, 2022 •

edited

PoignardAzur commented Mar 30, 2022 •

edited