a ppx_deriving plugin would be super-useful #7
Comments
The Crowbar interface seems to map nicely onto a PPX similar to ppx_deriving_random. In particular, there are no distribution-related parameters, which simplifies things. I think what most limits the usefulness of a PPX for random generation is the need to control the distributions. Generation of some ADTs diverges without the
So @yomimono has a first prototype of a ppx_deriving plugin at https://github.com/yomimono/ppx_deriving_crowbar, but as I understand it there are issues with stack overflows caused by deep non-tail calls when generating deep values at recursive types.
It's true! (Both things.) The stack issue isn't specific to users of ppx_deriving_crowbar but it's much easier to trigger without understanding why when automatically deriving a generator from a type. In addition to the repository itself, there's a proof-of-concept which automatically derives the types in
(cc @Armael who was looking at the stack-overflow thing tonight, and would need a repro case.)
Yes. #16 isn't sufficient for very branchy and recursive structures like the generators we get from the parsetree types.
What about writing the generators in CPS? This would not fix the distribution issues, but it would at least ensure we do not stack overflow (at the price of probably slower generators).
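To make the CPS suggestion concrete, here is a minimal stdlib-only sketch. It does not use the Crowbar API; the `tree` type and all function names are hypothetical and exist only for illustration. It builds a value in direct style and in continuation-passing style, where the pending work lives in heap-allocated closures rather than on the call stack.

```ocaml
(* Hypothetical example type; not taken from Crowbar or ppx_deriving_crowbar. *)
type tree = Leaf | Node of tree * tree

(* Direct style: the recursive call is not a tail call, so stack use grows
   linearly with [depth]. *)
let rec right_spine_direct depth =
  if depth = 0 then Leaf
  else Node (Leaf, right_spine_direct (depth - 1))

(* CPS: every call is a tail call; the work still to be done is stored in
   the continuation [k], which is a heap-allocated closure. *)
let rec right_spine_cps depth k =
  if depth = 0 then k Leaf
  else right_spine_cps (depth - 1) (fun t -> k (Node (Leaf, t)))

(* Depth is also computed in CPS, so inspecting a deep value does not rely
   on the call stack either. *)
let rec depth_cps t k =
  match t with
  | Leaf -> k 0
  | Node (l, r) ->
    depth_cps l (fun dl -> depth_cps r (fun dr -> k (1 + max dl dr)))

let () =
  (* Both styles agree on small inputs. *)
  assert (right_spine_direct 10 = right_spine_cps 10 (fun t -> t));
  (* The CPS version handles a very deep value with constant stack use. *)
  let deep = right_spine_cps 1_000_000 (fun t -> t) in
  Printf.printf "depth = %d\n" (depth_cps deep (fun d -> d))
```

The trade-off mentioned above is visible here: each step allocates a closure, which is the likely source of the slowdown.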
I don't think CPS-style is the answer. Generators should never generate test cases deep enough to stack overflow - bugs tend to be reproducible with small examples, and smaller examples test faster. The bug is that Crowbar does not bound the size of test cases well, because I wrote awful sizing logic. (Once again, crowbar proves to be an excellent tool for finding bugs in crowbar). #16 fixed the worst one, and #18 fixes more. With both, the o-m-p tests now run without stack overflows (on my machine...), but I wouldn't be surprised if there were other sizing bugs.
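For readers unfamiliar with size bounding, the general idea can be sketched as follows. This is a stdlib-only illustration under my own assumptions, not Crowbar's actual sizing logic: the `expr` type, `gen_expr`, and the budget-splitting rule are all hypothetical. Each generated constructor consumes one unit of a size budget, and the remainder is split among the children, so neither the size nor the recursion depth can exceed the budget.

```ocaml
(* Hypothetical example type; not part of Crowbar. *)
type expr =
  | Lit of int
  | Neg of expr
  | Add of expr * expr

(* [gen_expr st size] returns an expression with at most [size] constructors
   (for [size >= 1]); recursion depth is therefore bounded by [size] too. *)
let rec gen_expr st size =
  if size <= 1 then Lit (Random.State.int st 10)
  else
    (* Budget left for children after this node; here [rest >= 1]. *)
    let rest = size - 1 in
    match Random.State.int st 3 with
    | 0 -> Lit (Random.State.int st 10)
    | 1 when rest >= 2 ->
      (* Split the budget: [left] is in 1 .. rest - 1, the other child gets
         the remainder, which is also at least 1. *)
      let left = 1 + Random.State.int st (rest - 1) in
      Add (gen_expr st left, gen_expr st (rest - left))
    | _ -> Neg (gen_expr st rest)

(* Number of constructors in an expression. *)
let rec size_of = function
  | Lit _ -> 1
  | Neg e -> 1 + size_of e
  | Add (a, b) -> 1 + size_of a + size_of b

let () =
  let st = Random.State.make [| 42 |] in
  for _ = 1 to 1000 do
    assert (size_of (gen_expr st 50) <= 50)
  done;
  print_endline "all generated expressions stayed within the size budget"
```

With a bound like this, a stack overflow would require a budget large enough to exhaust the stack, which is independent of how branchy or recursive the type is.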
I agree hard limits may be needed to guard against stack overflow, but if there is a distribution issue then the coverage will be very poor, so I think it needs to be addressed as well. I didn't solve this in ppx_deriving regexp, but pushed the issue onto the user by adding the

If we recall linear algebra class, we can consider what it takes for random sampling of a recursive data type to converge. Given a set of mutually recursive algebraic types

where

That system is not always solvable, and also if any of the sizes comes out negative, I gather the distribution must be divergent. E.g. the just-divergent

In the case where manual tweaking of weights is acceptable, a convergence check could be done at compile time. But it may also be possible to automatically find suitable weights. If we expand the co-recursion and collect same-order terms into a single

But please do check my calculations, as I was just sketching this out while writing.

Clarification: By
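As a small worked instance of this kind of convergence argument, consider a single recursive type `type t = Leaf | Node of t * t` where `Leaf` is chosen with probability `p`. This is my own simplified illustration, not the formula from the comment above: the expected number of constructors `s` then satisfies `s = 1 + 2 (1 - p) s`, so `s = 1 / (2 p - 1)`. That is finite and positive only for `p > 1/2`, singular at `p = 1/2`, and negative below it. The sketch below computes the closed form and compares it against a simulation; all names are hypothetical.

```ocaml
(* Closed-form solution of  s = 1 + 2 (1 - p) s  for the hypothetical type
   [type t = Leaf | Node of t * t] with [Leaf] chosen with probability [p].
   Returns [None] when the equation has no positive solution, i.e. when the
   expected size does not converge. *)
let expected_size p =
  let denom = (2. *. p) -. 1. in
  if denom <= 0. then None else Some (1. /. denom)

(* Sample the size of one tree without building it. [pending] counts the
   subtrees that still have to be generated. *)
let sample_size st p =
  let rec go pending size =
    if pending = 0 then size
    else if Random.State.float st 1.0 < p then
      go (pending - 1) (size + 1) (* Leaf: one subtree done *)
    else go (pending + 1) (size + 1) (* Node: one done, two more to do *)
  in
  go 1 0

(* Empirical mean size over [n] samples. *)
let mean_size st p n =
  let total = ref 0 in
  for _ = 1 to n do
    total := !total + sample_size st p
  done;
  float_of_int !total /. float_of_int n

let () =
  let st = Random.State.make [| 2024 |] in
  let p = 0.75 in
  (match expected_size p with
   | Some s -> Printf.printf "predicted mean size: %.3f\n" s
   | None -> print_endline "predicted: divergent");
  Printf.printf "simulated mean size: %.3f\n" (mean_size st p 100_000)
```

A check like `expected_size` is the sort of thing that could run at compile time when weights are supplied by hand; the simulation is only there to sanity-check the algebra.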
I've now released ppx_deriving_crowbar into opam, so I'll close this issue.
When I look at @yomimono's heroic code, it seems clear that without a metaprogramming way to derive Crowbar generators for algebraic datatypes in "the obvious way" (with annotations to override special cases, etc.), deploying Crowbar for a library can currently be very painful.
Do some people here have plans for such a plugin?
I looked around a bit and there already exists at least one ppx_deriving plugin for random generation, @paurkedal's ppx_deriving_random. Unfortunately, because of the differences in the type signatures of generators and in the generator language in general, I don't think it would be possible to reuse it directly. It should be a very good starting point for hacking on a Crowbar generator, though.
(In general the distance between Crowbar-style and Quickcheck-style generation interfaces is an uncanny zero, so it sounds like a nice factorization should be within reach, but right now I'm tempted to try Crowbar on stuff and I would really like even a separate/specialized generator.)