implement a generator based on Arbitrary for compatibility #108

Ekleog · 2023-01-07T02:32:21Z

Fixes #106
Fixes #31

bolero-generator/Cargo.toml

bolero-generator/src/arbitrary.rs

camshaft · 2023-01-09T18:32:56Z

bolero-generator/src/arbitrary.rs

+        let size = T::size_hint(0);
+        let mut data = match T::size_hint(0) {
+            (min, Some(max)) if max < ABUSIVE_SIZE => gen_with::<Vec<u8>>()
+                .len(min..=max)
+                .generate(&mut *driver)?,
+            (min, _) => gen_with::<Vec<u8>>().len(min).generate(&mut *driver)?,
+        };


While I appreciate not having to change anything, I'm thinking it might be more efficient to move this logic into the driver itself, since a lot of them are going to be byte-based anyway and we can potentially eliminate a copy. I'm thinking something like:

trait Driver { ... fn gen_from_bytes<F: FnOnce(&[u8]) -> Option<R>, R>(&mut self, len: Range<usize>, f: F) -> Option<R>; }

I'm also wanting to support existing proptest and quickcheck generators so moving this into a shared location would make that easier.
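
For illustration, here is a minimal sketch of how a byte-backed driver could implement such a method without an intermediate copy. The trimmed-down `Driver` trait and the `SliceDriver` type are hypothetical stand-ins, not bolero's actual definitions; only the method signature follows the suggestion above.

```rust
use std::ops::Range;

// Hypothetical, trimmed-down stand-in for bolero's Driver trait; only the
// proposed method is shown.
trait Driver {
    fn gen_from_bytes<F, R>(&mut self, len: Range<usize>, f: F) -> Option<R>
    where
        F: FnOnce(&[u8]) -> Option<R>;
}

// Hypothetical byte-slice driver: it lends out its own input, so the
// generator closure reads the bytes in place instead of from a fresh Vec<u8>.
struct SliceDriver<'a> {
    input: &'a [u8],
}

impl<'a> Driver for SliceDriver<'a> {
    fn gen_from_bytes<F, R>(&mut self, len: Range<usize>, f: F) -> Option<R>
    where
        F: FnOnce(&[u8]) -> Option<R>,
    {
        // An empty range admits no valid length.
        if len.is_empty() {
            return None;
        }
        // Fail when fewer than the minimum number of bytes remain.
        if self.input.len() < len.start {
            return None;
        }
        // Hand out as many bytes as allowed; `len.end` is exclusive.
        let take = self.input.len().min(len.end - 1);
        let (head, tail) = self.input.split_at(take);
        // The input is only consumed if the closure succeeds.
        let value = f(head)?;
        self.input = tail;
        Some(value)
    }
}
```

An RNG-backed driver would still have to fill a temporary buffer, so the saved copy only applies to drivers that already hold their input as bytes.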

To be completely honest, I don't understand the DriverMode abstraction yet. I do understand that it's so that fuzzers can use Direct and cargo test can use Forced, but… why is it kept alongside the input for shrunk inputs, which indicates that byteslice drivers in forced mode do happen in certain circumstances? Why is it exposed from FillBytes and not from Driver? Why is there both DirectRng and ForcedRng despite the fact that a rng should have an infinite amount of data available anyway? (Ok for this one I actually know, it's because mode is also used for eg. ranged-int-generation) And, maybe even more than that, why is it even a property related to a Driver?

I'm thinking that semantically generating a value would require a ValueGenerator that shows how to generate a value, a Driver that gives a stream of data, and a Mode that says how to interpret invalid data. And right now, the Mode is merged inside the Driver.

But we could have instead:

ValueGenerator as currently

Driver generates a possibly-finite stream of data. ExtendWithZeros(driver) extends a driver with zeroes to emulate current Forced mode for byteslice and rng.

enum GenerationMode { FailIfInvalid, RepairIfInvalid } is the current DriverMode, and is not a property of the driver nor is it a property of the generator.

Then, when generating a value, it just needs additionally setting the driver mode. cargo test could still default to RepairIfInvalid & ExtendWithZeros and cargo bolero test to FailIfInvalid.
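
To make the proposal concrete, here is a rough sketch of the three pieces kept separate. Every name is hypothetical and follows the wording above rather than any existing bolero API; the `read` method and the modulo-based repair are assumptions chosen only to keep the example small.

```rust
use std::ops::RangeInclusive;

/// How to react when the drawn data does not describe a valid value.
/// (Hypothetical replacement for DriverMode.)
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum GenerationMode {
    FailIfInvalid,
    RepairIfInvalid,
}

/// A possibly-finite stream of bytes. (Hypothetical, simplified trait.)
trait Driver {
    /// Copies up to `buf.len()` bytes into `buf`; returns how many were written.
    fn read(&mut self, buf: &mut [u8]) -> usize;
}

/// Finite driver backed by a byte slice.
struct SliceDriver<'a>(&'a [u8]);

impl Driver for SliceDriver<'_> {
    fn read(&mut self, buf: &mut [u8]) -> usize {
        let n = self.0.len().min(buf.len());
        let (head, tail) = self.0.split_at(n);
        buf[..n].copy_from_slice(head);
        self.0 = tail;
        n
    }
}

/// Wrapper that pads an exhausted driver with zeroes, so reads never come up short.
struct ExtendWithZeros<D>(D);

impl<D: Driver> Driver for ExtendWithZeros<D> {
    fn read(&mut self, buf: &mut [u8]) -> usize {
        let n = self.0.read(buf);
        buf[n..].fill(0);
        buf.len()
    }
}

/// Generates a u8 inside `range`; the mode is passed next to the driver,
/// not stored inside it.
fn gen_u8_in<D: Driver>(
    driver: &mut D,
    mode: GenerationMode,
    range: RangeInclusive<u8>,
) -> Option<u8> {
    if range.is_empty() {
        return None;
    }
    let mut buf = [0u8; 1];
    if driver.read(&mut buf) != 1 {
        return None; // the stream ran dry
    }
    let raw = buf[0];
    if range.contains(&raw) {
        return Some(raw);
    }
    match mode {
        GenerationMode::FailIfInvalid => None,
        // Assumed repair strategy: wrap the raw byte into the range.
        GenerationMode::RepairIfInvalid => {
            let span = (*range.end() - *range.start()) as u16 + 1;
            Some(*range.start() + (raw as u16 % span) as u8)
        }
    }
}
```

With this split, `gen_u8_in(&mut ExtendWithZeros(driver), GenerationMode::RepairIfInvalid, ..)` would correspond to today's Forced mode, and a bare driver with `FailIfInvalid` to Direct mode.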

Does that make sense?

Anyway, I've implemented what you suggested in this PR, but it ended up requiring quite a few other refactorings. WDYT? (I've also significantly reduced ABUSIVE_SIZE, so that tests pass quickly without generating megabytes worth of data for a Vec. Anyway, arbitrary and similar were designed to work well with fuzzers, that rarely generate more than 4-8k worth of data)
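
As a reference for the `ABUSIVE_SIZE` discussion, the length selection in the original hunk boils down to the mapping sketched below. It works on a plain `(min, Option<max>)` tuple like the one `size_hint` returns; the cap value here is an assumption, since the actual constant is not shown in this thread.

```rust
use std::ops::RangeInclusive;

// Assumed cap for illustration; not the value used in the PR.
const ABUSIVE_SIZE: usize = 1024;

/// Maps an Arbitrary-style size hint to the byte-length range to request
/// from the driver, mirroring the two match arms of the reviewed hunk.
fn len_range(hint: (usize, Option<usize>)) -> RangeInclusive<usize> {
    match hint {
        // Bounded and reasonably small: request anything in min..=max.
        (min, Some(max)) if max < ABUSIVE_SIZE => min..=max,
        // Unbounded or too large: request exactly the minimum.
        (min, _) => min..=min,
    }
}
```

A smaller cap pushes more types into the second arm, which is what keeps `Vec`-like types from requesting megabytes in tests.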


bolero-generator/src/arbitrary.rs

Ekleog · 2023-01-22T23:44:58Z

I've also just implemented the API suggested in #31, so that this PR could solve it alongside #106 :)

bolero/src/lib.rs

camshaft · 2023-01-24T20:29:18Z

Yeah there may be some conflation of the drivers and modes that could be simplified. But I don't want to conflate those changes with this PR. I think it looks great as is!

camshaft · 2023-01-25T17:37:36Z

bolero-kani/src/lib.rs

+        fn gen_from_bytes<Gen, T>(
+            &mut self,
+            _len: std::ops::RangeInclusive<usize>,
+            mut gen: Gen,
+        ) -> Option<T>
+        where
+            Gen: FnMut(&[u8]) -> Option<(usize, T)>,
+        {
+            let value = gen(kani::any()).map(|v| v.1);
+            kani::assume(value.is_some());
+            value
+        }


I don't think this will work with kani, since it doesn't implement any for &[u8] (I'm actually surprised it compiled - I will need to look into that separately).

Let's do this instead:

Suggested change

-        fn gen_from_bytes<Gen, T>(
-            &mut self,
-            _len: std::ops::RangeInclusive<usize>,
-            mut gen: Gen,
-        ) -> Option<T>
-        where
-            Gen: FnMut(&[u8]) -> Option<(usize, T)>,
-        {
-            let value = gen(kani::any()).map(|v| v.1);
-            kani::assume(value.is_some());
-            value
-        }
+        fn gen_from_bytes<Gen, T>(
+            &mut self,
+            len: std::ops::RangeInclusive<usize>,
+            mut gen: Gen,
+        ) -> Option<T>
+        where
+            Gen: FnMut(&[u8]) -> Option<(usize, T)>,
+        {
+            let bytes = kani::vec::any_vec::<u8, 256>();
+            kani::assume(len.contains(&bytes.len()));
+            let value = gen(&bytes).map(|v| v.1);
+            kani::assume(value.is_some());
+            value
+        }

It's probably due to the #[cfg(not(kani))] stubs at the top of the file. TBH I have literally no idea how to use kani, so I wrote something that seemed to work and tests passed 😅 I guess the CI also runs without kani enabled?

Anyway, I've updated the code to your version and just added a stub above so that tests pass :)

rust-toolchain

bolero-generator/Cargo.toml


Ekleog · 2023-01-28T21:23:27Z

Turns out cargo fails to build arbitrary on rustc versions older than 1.63: it doesn't fall back to an arbitrary release older than the latest one. IIRC that's a bug that's already been reported to cargo.

I think bumping the MSRV makes sense: there's no reason why bolero should be more stable than arbitrary, seeing how arbitrary is the current go-to standard 😅 and the next release must be a major one anyway, given that it changed the API of Driver.

That said if keeping the current MSRV at 1.57 is important to you, I can try to wiggle things around so that tests run with 1.57 for non-arbitrary and 1.63 for arbitrary :)

Ekleog · 2023-01-28T21:36:05Z

The failing clippy checks appear to be unrelated to this PR again. FWIW, this is probably due to just taking in the latest clippy rather than pinning a specific version and updating it from time to time. I personally use nix to handle this in my projects, and can submit a PR using it if you want, but lots of people don't like nix so I won't by default :)

As for the unit tests failing, they're due to the fact that I changed the behavior to always request from the arbitrary-requested range rather than limiting to ABUSIVE_SIZE when using DriverMode::Direct; which in turn completely breaks DirectRng-based arbitrary generation. I think the current code makes much more sense because it's actually in line with what DriverMode::Direct means. But I'm not yet sure just removing DirectRng is a possible option? So depending on the results of discussion at #117 I guess I may have to roll back these changes and limit the size given to arbitrary on direct-mode too.

Ekleog · 2023-01-28T23:30:13Z

Oh, or maybe the clippy lints are due to the MSRV bump? Anyway, I've just fixed what should be all of the test errors, so hopefully this is good to go if you agree with #108 (comment) :)

camshaft · 2023-02-01T00:58:18Z

.github/workflows/main.yml

@@ -44,7 +44,7 @@ jobs:
    strategy:
      fail-fast: false
      matrix:
-        rust: [1.57.0, stable, beta, nightly]
+        rust: [1.63.0, stable, beta, nightly]


I don't see a compelling reason to bump MSRV here, especially considering arbitrary is an optional dependency. My preference would be to just not enable it for the MSRV tests.

Bolero is a bit different than cargo-fuzz/arbitrary in this area since we allow writing basic unit tests mixed in with your application code. Cargo-fuzz, OTOH, forces you to put them in completely separate crates that don't have the same compatibility requirements as your main product.

While what you're saying is definitely right for cargo-fuzz vs. bolero, I think bolero-generator and arbitrary have basically the same goal: have all crates expose implementation of them, which means they need to be as stable as possible.

Anyway, I'll look into figuring out how to get cargo test --features arbitrary run with 1.63 and stay on 1.57 for the other tests when I look into it next, I'll just need to revert the changes made for clippy in addition to that :)

have all crates expose implementation of them, which means they need to be as stable as possible.

Yeah that's another great point to consider here!

Sorry for the extra work with the clippy stuff! But I think other than reverting it, we should be good to merge!

Actually, it looks like miniz_oxide just broke compatibility with rust versions up to 1.59.0... And we indirectly depend on that through the backtrace crate. I think we can just bump MSRV to 1.60.0 and call it good (the formatting feature should be available there).

Edit: although I guess we still have the problem of arbitrary's MSRV being 1.63... ☹️

TBH I personally don't believe in MSRV 😅 If rust-lang/cargo#9930 were implemented then I'd believe in it, but until it's done I don't think it makes sense to actually worry about it (and even once done I still don't think it'd actually be useful, but… I mean sure people want stability for the user, but the ones actually doing the program building can stomach keeping their toolchains up-to-date, and I say that as maintainer for the nixos distribution 😅 )

IMO having the MSRV being older than whatever is in debian stable is pointless, and debian stable currently has 1.63 — I guess that's also the reason why arbitrary is also living with a 1.63 MSRV.

Anyway, I've reverted the clippy changes and instead fixed the CI to handle two different MSRVs depending on the feature set, so this should be ready to go! But let me know if you change your mind, I still have the tip of the branch around! :)

camshaft · 2023-02-01T01:03:57Z

The clippy issues should be addressed by setting our MSRV in the clippy config

https://doc.rust-lang.org/clippy/configuration.html#specifying-the-minimum-supported-rust-version


Ekleog · 2023-02-19T20:31:25Z

@camshaft I think I handled all your review comments, WDYT about the current state of this PR? :)

Ekleog force-pushed the gen-arbitrary branch from 0831c0f to fa05456 on January 7, 2023 02:35

Ekleog mentioned this pull request Jan 7, 2023

impl TypeGenerator for chrono::DateTime, chrono_tz::Tz, etc? #106

Closed

camshaft reviewed Jan 9, 2023


Ekleog force-pushed the gen-arbitrary branch from 105a264 to 55323d9 on January 22, 2023 23:53

camshaft reviewed Jan 24, 2023


camshaft reviewed Jan 25, 2023


Ekleog added 5 commits January 28, 2023 21:33

implement a generator based on Arbitrary for compatibility

0d89008

make the arbitrary feature opt-in

72f0448

hoist generation from a byte slice into Driver

779d911

also implement with_arbitrary

32257b2

avoid being too slow when the len range is too wide in forced mode

fd6e10a

Ekleog force-pushed the gen-arbitrary branch from 7bc23a0 to fd6e10a on January 28, 2023 20:41

Ekleog mentioned this pull request Jan 28, 2023

Detangling Driver and DriverMode #117

Open

Ekleog added 4 commits January 28, 2023 22:08

fix kani gen_from_bytes implementation

54e0146

keep msrv as it was

3bf9e07

add unit tests for arbitrary in makefile

c0283d6

bump msrv for tests to pass

6d55ef6

cargo doesn't think of taking old arbitrary versions and otherwise fails with error: package `arbitrary v1.2.3` cannot be built because it requires rustc 1.63.0 or newer, while the currently active rustc version is 1.57.0

Ekleog force-pushed the gen-arbitrary branch from f7ab4d4 to 6d55ef6 on January 28, 2023 21:20

do not oom trying to allocate big lots of memory

2af269d

Ekleog force-pushed the gen-arbitrary branch 3 times, most recently from 0dd6d83 to e5c1f2e on January 28, 2023 22:51

Ekleog added 2 commits January 28, 2023 23:58

do not even try to allocate too much at once, for asan

fe75abd

handle clippy lints

154db5e

Ekleog force-pushed the gen-arbitrary branch from e5c1f2e to 154db5e on January 28, 2023 22:58

cargo fmt

25f9d92

Ekleog force-pushed the gen-arbitrary branch from 182cda2 to 25f9d92 on January 30, 2023 16:07

camshaft reviewed Feb 1, 2023


Ekleog added 3 commits February 5, 2023 16:23

Undo cargo lints change

83d6a81

Revert "bump msrv for tests to pass"

923ccee

This reverts commit 6d55ef6.

Make CI work properly with two different MSRVs

66684c1

Ekleog force-pushed the gen-arbitrary branch from e08d670 to 66684c1 on February 5, 2023 16:05

Ekleog and others added 6 commits February 5, 2023 17:11

Set MSRV in clippy config

deaa389

Fix syntax in makefile

e47271b

Check version of rustc rather than cargo

cf82d60

Move condition to github action

f4ff0a3

Use matrix exclude

939cb0f

fix typo

51e78c7

Co-authored-by: Cameron Bytheway <bytheway.cameron@gmail.com>

camshaft approved these changes Feb 20, 2023


camshaft merged commit a58800b into camshaft:master Feb 20, 2023

camshaft mentioned this pull request Mar 30, 2023

fix: branch gen_from_bytes macro in no_std mode #135

Merged

mwhicks1 mentioned this pull request May 17, 2023

How to use Arbitrary generators? #150

Open
