Generate all members of a grammar? #20
i have some code that "refines" a grammar, ie Prod, (or rather, see below)
(we could add a "OneTerminal :: String -> (String -> a) -> Prod a" case, (or, maybe some dsl for defining predicates that can be "show"n? this seems anyway, thats why in my project i have a type (call it My.Prod) with a unfortunately, but it (1) may lose some efficiency, where distinct calls On Monday, February 22, 2016, Daniel Peebles notifications@github.com
(this message was composed with dictation: charitably interpret typos)Sam |
I've been thinking about doing something like this for testing, to basically have tests that first generate arbitrary grammars and then test that against random (or all) strings in the language. I used to do something like that in my old Grempa library. If the |
I guess |
or optionally have |
Use case: test the grammar for ambiguities. |
In general, whether a CFG is ambiguous is undecidable, but you could indeed QuickCheck it. Yet Earley's strength is that it can deal with ambiguous grammars. |
I think we could also do an exhaustive test up to some fixed input length and get some more confidence than random tests would give us. |
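A minimal sketch of what such an exhaustive check could look like, using only `base` and not the Earley API. The toy grammar `S ::= S S | 'a'` and the names `allStrings`, `countParses`, and `ambiguousUpTo` are illustrative assumptions; with Earley one would presumably pass in a function that counts the results of the real parser instead.

```haskell
import Control.Monad (replicateM)

-- All strings over the alphabet with length 0..n, shortest first.
allStrings :: Int -> [t] -> [[t]]
allStrings n alphabet = concatMap (\k -> replicateM k alphabet) [0 .. n]

-- Hypothetical stand-in for "number of parse results": counts the
-- derivations of a string in the toy grammar S ::= S S | 'a'.
countParses :: String -> Int
countParses "a" = 1
countParses s =
  sum [ countParses l * countParses r
      | i <- [1 .. length s - 1]        -- every split into two non-empty parts
      , let (l, r) = splitAt i s ]

-- Every string up to the given length that has more than one parse.
ambiguousUpTo :: Int -> [t] -> ([t] -> Int) -> [[t]]
ambiguousUpTo n alphabet count =
  [ s | s <- allStrings n alphabet, count s > 1 ]
```

Here `ambiguousUpTo 3 "ab" countParses` evaluates to `["aaa"]`, the shortest string with two distinct derivations. The search space grows exponentially with the length bound, so this only works for small bounds and alphabets.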
There's some code for this here: https://github.com/ollef/Earley/tree/Generator Does anyone want to volunteer some testing and/or tests? |
language romanNumeralsGrammar "IVX"
= [(0,""),(1,"I"),(5,"V"),(10,"X"),(20,"XX"),(11,"XI"),(15,"XV"),(6,"VI"),(9,"IX"),(4,"IV"),(2,"II"),(3,"III"),(19,"XIX"),(16,"XVI"),(14,"XIV"),(12,"XII"),(7,"VII"),(21,"XXI"),(25,"XXV"),(30,"XXX"),(31,"XXXI"),(35,"XXXV"),(8,"VIII"),(13,"XIII"),(17,"XVII"),(26,"XXVI"),(29,"XXIX"),(24,"XXIV"),(22,"XXII"),(18,"XVIII"),(36,"XXXVI"),(39,"XXXIX"),(34,"XXXIV"),(32,"XXXII"),(23,"XXIII"),(27,"XXVII"),(33,"XXXIII"),(28,"XXVIII"),(37,"XXXVII"),(38,"XXXVIII")] Pretty fun stuff! |
@ollef is the "IVX" above the set of terminals to explore the grammar over, due to the |
@copumpkin: Yeah, that's it! |
Oh that's super cool, so it's actually finite in this case |
It's finite unless you give it some more tokens to work with (apparently). TODO:
|
@ollef have you tried it with nasty grammars? left-recursive, right-recursive, ridiculously ambiguous? |
@phadej yeah, for termination?
what i did before was just to first validate the grammar as finite (by checking recursively for "non-finite" cases like many), and then return a Maybe [t].
also, known-infinite (or unknown-to-be-finite) grammars could be represented by an opaque symbol (possibly the (<?>) if present), if it's needed. like a list of "Either (Maybe e) t" or something.
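A rough sketch of that "validate as finite, then return a Maybe" idea. The `Gram` type below is a hypothetical toy grammar, not Earley's `Prod`, and treating every `Many` as non-finite is a deliberately conservative assumption.

```haskell
-- Hypothetical toy grammar type, used only for illustration.
data Gram
  = Eps             -- the empty string
  | Tok Char        -- a single terminal
  | Seq Gram Gram   -- sequencing
  | Alt Gram Gram   -- alternation
  | Many Gram       -- zero or more repetitions

-- Just the complete language if the grammar is known to be finite,
-- Nothing as soon as a (possibly) non-finite construct is found.
finiteLanguage :: Gram -> Maybe [String]
finiteLanguage Eps       = Just [""]
finiteLanguage (Tok c)   = Just [[c]]
finiteLanguage (Seq a b) =
  (\xs ys -> [ x ++ y | x <- xs, y <- ys ])
    <$> finiteLanguage a <*> finiteLanguage b
finiteLanguage (Alt a b) = (++) <$> finiteLanguage a <*> finiteLanguage b
finiteLanguage (Many _)  = Nothing  -- conservative: not known to be finite
```

For example, `finiteLanguage (Seq (Alt (Tok 'a') (Tok 'b')) (Tok 'c'))` gives `Just ["ac","bc"]`, while any grammar containing `Many` gives `Nothing`.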
|
@phadej: I haven't done any principled testing, just tried a few examples. Those are good starting points, thanks! 👍 @sboosali: Hmm, I don't quite understand what you mean. Don't we want this to work even if a grammar generates an infinite language? What we're after, I think, is that the language generation is productive, such that you can It'll likely have roughly the same restrictions as the parser by the way, so degenerate grammars like https://github.com/ollef/Earley/blob/master/examples/VeryAmbiguous.hs will loop because they produce circular results. |
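To illustrate what productivity means here, the following is a small sketch that enumerates a possibly infinite language lazily, one input length at a time. It uses a hypothetical toy `Gram` type rather than Earley's grammars, and `exactly` and `language` are illustrative names, not necessarily the library's API.

```haskell
-- Hypothetical toy grammar type, used only for illustration.
data Gram
  = Eps             -- the empty string
  | Tok Char        -- a single terminal
  | Seq Gram Gram   -- sequencing
  | Alt Gram Gram   -- alternation
  | Many Gram       -- zero or more repetitions

-- All strings of exactly the given length, one entry per derivation found.
exactly :: Int -> Gram -> [String]
exactly n Eps       = [ "" | n == 0 ]
exactly n (Tok c)   = [ [c] | n == 1 ]
exactly n (Seq a b) =
  [ x ++ y | k <- [0 .. n], x <- exactly k a, y <- exactly (n - k) b ]
exactly n (Alt a b) = exactly n a ++ exactly n b
exactly 0 (Many _)  = [ "" ]
exactly n (Many g)  =
  -- The first repetition must consume at least one token, so the
  -- recursive call is always on a strictly shorter length.
  [ x ++ rest | k <- [1 .. n], x <- exactly k g, rest <- exactly (n - k) (Many g) ]

-- The whole language as a lazy list, shortest strings first.
language :: Gram -> [String]
language g = concatMap (\n -> exactly n g) [0 ..]
```

With this, `take 4 (language (Many (Tok 'a')))` yields `["","a","aa","aaa"]` even though the language is infinite. One limitation of this sketch: for a finite language, asking for more elements than exist never terminates, because the enumeration keeps trying longer lengths.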
@sboosali for productivity. For finite languages you could use regex-applicative (and even regular languages aren't always finite). |
@ollef (right, productivity makes sense. I was just thinking about finite subgrammars.)
are the results "breadth first" (or broadly, not too biased in some way)? otherwise, they might not be representative of the language.
also, what use cases did you have in mind? maybe we could give a basic sanity check for ambiguity by running the parser on the "language"? or even check the stream for duplicates, depending on how it's generated?
(not sure about any of this, haven't studied CFGs that much).
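The duplicate check could be sketched roughly as follows, assuming the generator emits one entry per derivation, so that a string with two derivations shows up twice. `duplicates` and `sample` are hypothetical names, and `sample` is made-up data standing in for a finite prefix of a generated language.

```haskell
import Data.List (group, sort)

-- Elements that occur more than once in a finite list, each reported once.
duplicates :: Ord a => [a] -> [a]
duplicates xs = [ x | (x : _ : _) <- group (sort xs) ]

-- Made-up finite prefix of a generated language; "aaa" appears twice,
-- as it would if the grammar had two derivations for it.
sample :: [String]
sample = ["", "a", "aa", "aaa", "aaa"]
```

Here `duplicates sample` evaluates to `["aaa"]`. Since the input is sorted first, this only works on a finite prefix (for example, after `take n`), and an empty result says nothing about longer strings.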
|
The results are roughly in the same order as from the parser, so ordered by input length, and within the same input length shuffled around a bit (governed by the internals of the implementation). I'm sure people can find more uses, but here are some that I can think of:
|
This is included in the 0.12.0.0 release. Thanks for reminding me! |
Intuitively, it feels like it should be possible for your library to generate all strings (as a potentially infinite list) that a grammar would parse, as well as all parsed values that those strings would generate. Is that something you've considered or would consider adding?