Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.
Sign upProperty testing with quickcheck #1159
Conversation
highfive
assigned
fitzgen
Nov 23, 2017
This comment has been minimized.
This comment has been minimized.
|
Thanks for the pull request, and welcome! The Servo team is excited to review your changes, and you should hear from @fitzgen (or someone else) soon. |
highfive
added
the
S-awaiting-review
label
Nov 23, 2017
This comment has been minimized.
This comment has been minimized.
|
@snewt thanks very much for this PR! And thanks for your patience -- I've been on vacation the last week, which is why you haven't heard back from me. I'll try to take a look at the code and answer your open questions today, but it might get pushed back to tomorrow. |
fitzgen
requested changes
Nov 28, 2017
|
Overall, this looks awesome -- thank you! Very close to what I was hoping for. I left a lot of nitpick-y comments on the PR, this is not intended as harsh I am very excited for this!
I think instead of generating types named We should avoid blacklisting (at least for now) because it intentionally creates We should however also randomly mark types as opaque. As far as tracking the names that we've generated, we can either do that:
We can do all of this whitelisting and scope tracking in a follow up PR, so we
Perfect!
Its fine for this to be a separate crate that is invoked separately. To get We should add a new CI job that checks that this crate continues to build, at To add the new CI job:
- LLVM_VERSION="4.0.0" BINDGEN_JOB="quickchecking"
"quickchecking")
cd ./tests/quickchecking
# TODO: Actually run quickchecks once `bindgen` is reliable enough.
cargo check
;;
This is probably because Backing up a bit: it would be kind of cool if this whole quickchecking crate was
Sounds good, we just need to make sure we don't clobber any existing failing
This is the pragmatic approach. It would be kind of nice to have cargo features control this, and by default we
Maybe eventually?
The big next piece is whitelisting and marking types opaque. Also bitfields. Generating some C++ and templates and inheritance further down the line. In general, if you go through the issue tracker (particularly
Thank you! Very excited to see the next iteration of this PR! |
| 1 => DeclarationC::FunctionPtrDecl(FunctionPointerDeclarationC::arbitrary(g)), | ||
| 2 => DeclarationC::StructDecl(StructDeclarationC::arbitrary(g)), | ||
| 3 => DeclarationC::UnionDecl(UnionDeclarationC::arbitrary(g)), | ||
| _ => DeclarationC::VariableDecl(BasicTypeDeclarationC::arbitrary(g)), |
This comment has been minimized.
This comment has been minimized.
fitzgen
Nov 28, 2017
Member
To be more clear of our intent with this match, lets do:
4 => DeclarationC::VariableDecl(...),
_ => unreachable!(),| } | ||
|
|
||
| trait MakeUnique { | ||
| fn make_unique(&mut self, stamp: usize); |
This comment has been minimized.
This comment has been minimized.
fitzgen
Nov 28, 2017
Member
I'm not clear on what abstraction this trait represents. In general, some doc comments in this file would be super useful. I like to turn on #![deny(missing_docs)] as a forcing function for myself, and I think that would be beneficial here as well.
| impl MakeUnique for DeclarationC { | ||
| fn make_unique(&mut self, stamp: usize) { | ||
| match self { | ||
| &mut DeclarationC::FunctionDecl(ref mut d) => d.make_unique(stamp), |
This comment has been minimized.
This comment has been minimized.
fitzgen
Nov 28, 2017
Member
It is slightly more idiomatic (and slightly shorter) to do
match *self {
DeclarationC::FunctionDecl(ref mut d) => ...,
...
}|
|
||
| impl Arbitrary for DeclarationC { | ||
| fn arbitrary<G: Gen>(g: &mut G) -> DeclarationC { | ||
| match usize::arbitrary(g) % 5 { |
This comment has been minimized.
This comment has been minimized.
fitzgen
Nov 28, 2017
Member
https://docs.rs/quickcheck/0.4.2/quickcheck/trait.Rng.html#method.gen_range is going to have a proper uniform distribution, where mod won't unless n is a divisor of usize::MAX, which it isn't in this case (5).
| "whitelistable", | ||
| "blacklistable", | ||
| ]; | ||
| match base_type.iter().nth(usize::arbitrary(g) % base_type.len()) { |
This comment has been minimized.
This comment has been minimized.
fitzgen
Nov 28, 2017
Member
https://docs.rs/quickcheck/0.4.2/quickcheck/trait.Rng.html#method.choose
g.choose(&base_type).cloned().unwrap()| for _ in 1..dimensions { | ||
| def += &format!("[{}]", (usize::arbitrary(g) % 15) + 1); | ||
| } | ||
| ArrayDimensionC { def: def } |
This comment has been minimized.
This comment has been minimized.
fitzgen
Nov 28, 2017
Member
Any time we have foo: foo inside a struct literal, we can use this shorthand:
ArrayDimensionC { def }|
|
||
| impl MakeUnique for BasicTypeDeclarationC { | ||
| fn make_unique(&mut self, stamp: usize) { | ||
| self.ident_id += &format!("_{}", stamp); |
This comment has been minimized.
This comment has been minimized.
fitzgen
Nov 28, 2017
Member
Ahhh ok so this trait is to make unique identifiers! Yeah, doc comments on the trait's definition would have been super helpful to me as a reader.
| impl Arbitrary for BasicTypeDeclarationC { | ||
| fn arbitrary<G: Gen>(g: &mut G) -> BasicTypeDeclarationC { | ||
| BasicTypeDeclarationC { | ||
| type_qualifier: TypeQualifierC::arbitrary(g).def, |
This comment has been minimized.
This comment has been minimized.
fitzgen
Nov 28, 2017
Member
Is there some intention behind pulling out the strings early, rather than making BasicTypeDeclarationC's type_qualifier be a TypeQualifierC, etc? Was it just easiest? Perhaps to break type recursion?
What I would naively expect would be having structured types instead of strings, well placed Boxes to break type recursion, and the Display implementations handling all stringification. For example, I'd expect a StructDeclarationC to have a Vec of StructFieldC rather than a string concatenation of its fields, and with something like PointerLevelC, I wouldn't expect it to contain a String, but instead just a number, and then its Display would do
for _ in 0..self.level {
write!(f, "*")?;
}In general, leveraging types and the type system as far as it will go and as long as we can before devolving into string concatenation (bash programming amirite?) is a good rule of thumb. It gets the compiler to double check that we don't do things like provide a PointerLevelC where we expect a TypeQualifierC; with strings the compiler can't help us here.
So, unless there is some underlying reason why we can't abide by this rule in this case, I think we should move to more structured types instead of strings for all of the various BlahC definitions.
| fn arbitrary<G: Gen>(g: &mut G) -> StructDeclarationC { | ||
| let mut fields_string = String::new(); | ||
| // reduce generator size as a method of putting a bound on recursion. | ||
| // when size < 1 the empty list is generated. |
This comment has been minimized.
This comment has been minimized.
| .output()?) | ||
|
|
||
| // omit close, from tempdir crate's docs: | ||
| // "Closing the directory is actually optional, as it would be done on drop." |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
@fitzgen thanks for the awesome and thorough feedback, those are exactly sort of things I hoped / needed to hear. I'll keep you posted on progress or questions. Ha also I'm all about "nitpick-y" comments, I really appreciate getting familiar with and having the opportunity to live up to this project's code ideals! |
This comment has been minimized.
This comment has been minimized.
|
Address requested changes to quickchecking crate.
Note on making sure we don't lose test cases we're interested in preserving:
I'm not sure that Tasks not addressed by this PR:
Figured this would be a good point to update the PR but if any of the above TODO Thanks for taking another look! r? @fitzgen |
This comment has been minimized.
This comment has been minimized.
|
For reference, code that was generated after PR change requests were addressed is here. |
shnewto
force-pushed the
shnewto:feat/970-quickcheck-fuzzing
branch
from
edc7062
to
2aa9b1d
Nov 30, 2017
This comment has been minimized.
This comment has been minimized.
This seems good enough to me. |
fitzgen
approved these changes
Nov 30, 2017
|
This is awesome!! Care to file follow up issues for the remaining TODO items? Thanks |
This comment has been minimized.
This comment has been minimized.
|
@bors-servo r+ |
This comment has been minimized.
This comment has been minimized.
|
|
highfive
added
S-awaiting-merge
and removed
S-awaiting-review
labels
Nov 30, 2017
This comment has been minimized.
This comment has been minimized.
bors-servo
added a commit
that referenced
this pull request
Nov 30, 2017
This comment has been minimized.
This comment has been minimized.
|
@fitzgen ah awesome! yeah I can definitely file the follow up todos. Thanks!! |
This comment has been minimized.
This comment has been minimized.
|
|
shnewto commentedNov 23, 2017
•
edited
This PR represents an attempt to address issue #970. It also represents a portion of the meta issue for fuzzing #972.
The code base reflected here uses quickcheck to generate C headers that
include a variety of types including basic types, structs, unions,
function prototypes and function pointers. The headers generated by quickcheck
are passed to the
csmith-fuzzing/predicate.pyscript. Examples of headersgenerated by this iteration of the tooling can be viewed
here.
At the top of each header are two simple struct definitions,
whitelistableandblacklistable. Those types are present in the vector thatrepresents otherwise primitive types used to generate. They represent a naive
approach to exposing custom types without having to intuit generated type names like
struct_21_8though any actual whitelisting logic isn't implementedhere.
Test success is measured by the success of the
csmith-fuzzing/predicate.pyscript. This means that for a test to pass the following must be true:
Usage
Some things I'm unsure of:
Where should this feature live?
At the moment it lives in
tests/property_testbut isn't run whencargo testis invoked from bindgen's cargo manifest directory.What's an acceptable ammount of time for these tests to take?
At this point, the source is genereated in ~1 second but the files are
large enough that it takes the
predicate.pyscript ~30 seconds to runthrough each one. In order for the tests to run in under a minute only 2 are
generated by quickcheck by default. This can be changed in the
test_bindgenfunction of the
tests/property_test/tests/fuzzed-c-headers.rsfile.How do we expose the generated code for easy inspection?
For now the
run_predicate_scriptfunction in thetests/property_test/tests/fuzzed-c-headers.rsfile contains acommented block that will copy generated source in the
tests/property_test/testsdirectory. Should it be easier?
Special casing
There is some logic in the fuzzer that disallows 0 sized arrays because
tests will regulary fail due to issues documented in #684 and #1153. Should
this be special casing?
Does the fuzzer warrant its own crate?
After any iterations the reviewers are interested in required to make
this a functional testing tool, should/could the fuzzing library be made into
its own crate? I didn't move in that direction yet because having it all in one
place seemed like the best way to figure out what works an doesn't but I'm
interested in whether it might be useful as a standalone library.
What does it look like to expose more useful functionality?
I'm looking forward to feedback on how to make this a more useful tool
and one that provides the right configurability.
Thanks!
r? @fitzgen