Marpa R2 parity #2

Closed
dginev opened this issue Jan 15, 2019 · 8 comments

Comments

@dginev
Contributor

dginev commented Jan 15, 2019

I am highly interested in getting to Marpa::R2-like ergonomics in Rust, and am wondering if this crate could be the place to contribute towards that end.

@jrobsonchase are you still interested in maintaining a Rust Marpa crate? I see the documentation link now returns a 404, for example.

But for this issue, I wanted to draft an example of what I'm after. In the official Perl synopsis the example snippet is:

use Marpa::R2;
 
my $dsl = <<'END_OF_DSL';
:default ::= action => [name,values]
lexeme default = latm => 1
 
Calculator ::= Expression action => ::first
 
Factor ::= Number action => ::first
Term ::=
    Term '*' Factor action => do_multiply
    | Factor action => ::first
Expression ::=
    Expression '+' Term action => do_add
    | Term action => ::first
Number ~ digits
digits ~ [\d]+
:discard ~ whitespace
whitespace ~ [\s]+
END_OF_DSL
 
my $grammar = Marpa::R2::Scanless::G->new( { source => \$dsl } );
my $input = '42 * 1 + 7';
my $value_ref = $grammar->parse( \$input, 'My_Actions' );
 
sub My_Actions::do_add {
    my ( undef, $t1, undef, $t2 ) = @_;
    return $t1 + $t2;
}
 
sub My_Actions::do_multiply {
    my ( undef, $t1, undef, $t2 ) = @_;
    return $t1 * $t2;
}

I would love to make available an equivalent in Rust which performs the DSL parse at compile time and preserves the convenience of this syntax. As a first stab at an example, I am thinking:

#[macro_use]
extern crate marpa_derives;
#[macro_use]
extern crate marpa_scanless;

mod my_actions; // define the equivalent actions in the mod dependency

fn main() {
    // The SLIF source would be parsed by grammar! at compile time
    let grammar = grammar!(r###"
:default ::= action => [name,values]
lexeme default = latm => 1

Calculator ::= Expression action => ::first

Factor ::= Number action => ::first
Term ::=
    Term '*' Factor action => do_multiply
    | Factor action => ::first
Expression ::=
    Expression '+' Term action => do_add
    | Term action => ::first
Number ~ digits
digits ~ [\d]+
:discard ~ whitespace
whitespace ~ [\s]+
"###);

    let input = "42 * 1 + 7";
    let value_ref = grammar.parse(input, my_actions);
}

I have some experience with working on "custom derive" sub-packages, so I can see this realized via a marpa_derives subcrate in this repository. @jrobsonchase is this something you'd be interested in? Trying to find a viable plan of attack... Thanks!
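To make the proposal a little more concrete, here is a rough sketch of what the `my_actions` module might contain. Everything in it is an assumption for illustration: the `i64` value type, the two-argument signatures (the Perl actions also receive a per-parse object and the operator token, which this sketch drops), and the `tokenize`, `evaluate`, `expression`, `term` and `factor` helpers. The hand-written evaluator only stands in for the proposed `grammar.parse` so the actions can be exercised; it is not Marpa and does not use an Earley parser.

```rust
// Hypothetical contents of the `my_actions` module from the proposal.
// The signatures are an assumption: the per-parse object and the operator
// token that the Perl actions receive are omitted here.
mod my_actions {
    pub fn do_add(t1: i64, t2: i64) -> i64 {
        t1 + t2
    }

    pub fn do_multiply(t1: i64, t2: i64) -> i64 {
        t1 * t2
    }
}

// Lexemes of the calculator DSL: `Number` plus the two quoted literals.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Token {
    Number(i64),
    Star,
    Plus,
}

// Split the input into tokens, discarding whitespace (the `:discard` rule).
// Returns None on an unknown character or on integer overflow.
fn tokenize(input: &str) -> Option<Vec<Token>> {
    let mut tokens = Vec::new();
    let mut chars = input.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
        } else if c.is_ascii_digit() {
            let mut n: i64 = 0;
            while let Some(d) = chars.peek().and_then(|ch| ch.to_digit(10)) {
                n = n.checked_mul(10)?.checked_add(i64::from(d))?;
                chars.next();
            }
            tokens.push(Token::Number(n));
        } else if c == '*' {
            chars.next();
            tokens.push(Token::Star);
        } else if c == '+' {
            chars.next();
            tokens.push(Token::Plus);
        } else {
            return None;
        }
    }
    Some(tokens)
}

// Factor ::= Number
fn factor(tokens: &[Token], pos: &mut usize) -> Option<i64> {
    match tokens.get(*pos) {
        Some(Token::Number(n)) => {
            *pos += 1;
            Some(*n)
        }
        _ => None,
    }
}

// Term ::= Term '*' Factor | Factor  (left recursion unrolled into a loop)
fn term(tokens: &[Token], pos: &mut usize) -> Option<i64> {
    let mut value = factor(tokens, pos)?;
    while tokens.get(*pos) == Some(&Token::Star) {
        *pos += 1;
        value = my_actions::do_multiply(value, factor(tokens, pos)?);
    }
    Some(value)
}

// Expression ::= Expression '+' Term | Term
fn expression(tokens: &[Token], pos: &mut usize) -> Option<i64> {
    let mut value = term(tokens, pos)?;
    while tokens.get(*pos) == Some(&Token::Plus) {
        *pos += 1;
        value = my_actions::do_add(value, term(tokens, pos)?);
    }
    Some(value)
}

// Calculator ::= Expression; the whole input must be consumed.
fn evaluate(input: &str) -> Option<i64> {
    let tokens = tokenize(input)?;
    let mut pos = 0;
    let value = expression(&tokens, &mut pos)?;
    if pos == tokens.len() {
        Some(value)
    } else {
        None
    }
}
```

With these definitions, `evaluate("42 * 1 + 7")` returns `Some(49)`, which is the value the Perl synopsis computes for the same input.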

@jrobsonchase
Owner

Wow, literally haven't thought about this in years!

I had something along the same lines as your proposed grammar in mind way back when, but I guess I got distracted and forgot about this project.

I'd be happy to get all of the metadata back up to date and add you as a contributor!

@dginev
Contributor Author

dginev commented Jan 16, 2019

Excellent, happy to see you're still around and are interested in extending the crate a bit further! Will try to see if I can make some progress on my suggestion above and get back to you.

@daxim

daxim commented Feb 1, 2019

See also https://crates.io/crates/frithu

@dginev
Contributor Author

dginev commented Feb 12, 2019

Let me leave a quick note on progress here:

  • I have successfully ported the pre-compilation of MetaG.pm which contains the SLIF grammar (aka the G "meta grammar") to an equally impressive auto-generated metag.rs
  • I have in place the boilerplate for doing the grammar! compile-time precompilation of the extended SLIF BNF strings, via Rust's "proc-macro" subcrates.
  • I'm now slowly+gradually rewriting the rather sizeable SLG.pm, SLR.pm and MetaAST.pm to a point where the basic grammar build + basic parse works, at which point I'm planning to open a demo PR for feedback.

Note on compile-time speedups - I believe we can execute everything down to the lowest level libmarpa calls during compile-time, so that a SLIF grammar will only have minimal runtime overhead for initializing the C data structures. But the grammar will be pre-parsed, defaults will be instantiated where needed, etc, during the main rustc call. Can't wait to have something to benchmark.

link to working branch here
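The compile-time idea in the note above could be pictured roughly as follows. This is only a sketch under stated assumptions, not the code on the working branch: the `Rule`, `SYMBOLS`, `RULES`, `Engine`, `init_engine` and `describe` names are hypothetical, the table is written by hand where a `grammar!` expansion would generate it, and `Engine` is a plain `Vec`-backed stand-in for the C data structures that libmarpa would own.

```rust
// Hypothetical precompiled form of one BNF rule: symbol ids only,
// so nothing has to be parsed from a string at runtime.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Rule {
    lhs: usize,
    rhs: &'static [usize],
}

// Symbol table for the calculator grammar. In the proposal this would be
// emitted by the macro; here it is written out by hand.
static SYMBOLS: &[&str] = &[
    "Calculator", // 0
    "Expression", // 1
    "Term",       // 2
    "Factor",     // 3
    "Number",     // 4
    "'*'",        // 5
    "'+'",        // 6
];

// The structural (G1) rules of the calculator grammar as static data.
static RULES: &[Rule] = &[
    Rule { lhs: 0, rhs: &[1] },       // Calculator ::= Expression
    Rule { lhs: 3, rhs: &[4] },       // Factor ::= Number
    Rule { lhs: 2, rhs: &[2, 5, 3] }, // Term ::= Term '*' Factor
    Rule { lhs: 2, rhs: &[3] },       // Term ::= Factor
    Rule { lhs: 1, rhs: &[1, 6, 2] }, // Expression ::= Expression '+' Term
    Rule { lhs: 1, rhs: &[2] },       // Expression ::= Term
];

// Stand-in for the runtime engine. A real port would issue the
// corresponding libmarpa calls here instead of pushing to a Vec.
#[derive(Debug, Default)]
struct Engine {
    rules: Vec<(usize, Vec<usize>)>,
}

impl Engine {
    // Register one rule and return its id (its index in the rule list).
    fn add_rule(&mut self, lhs: usize, rhs: &[usize]) -> usize {
        self.rules.push((lhs, rhs.to_vec()));
        self.rules.len() - 1
    }
}

// The only work left for runtime: walk the static table once.
fn init_engine() -> Engine {
    let mut engine = Engine::default();
    for rule in RULES {
        engine.add_rule(rule.lhs, rule.rhs);
    }
    engine
}

// Render a precompiled rule back into BNF-like text, for debugging.
fn describe(rule: &Rule) -> String {
    let rhs: Vec<&str> = rule.rhs.iter().map(|&s| SYMBOLS[s]).collect();
    format!("{} ::= {}", SYMBOLS[rule.lhs], rhs.join(" "))
}
```

In this picture the runtime cost is one pass over a table that already sits in the binary, which is the kind of overhead the note hopes to benchmark.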

@jeffreykegler

A couple of hints:

The code in SLG.pm and SLR.pm contains too many layers. When I revisited them in Marpa::R3 I eliminated IIRC 4 full layers. Looking back, it seems like in R2 every time I added a major new feature, I did it by adding a layer.

If you haven't glanced at R3, you might consider doing so. It is alpha, but does everything R2 does and has the test suite to prove it. And its internal architecture is something I am considerably less ashamed of. R3's major fault is that, since I do continuous integration, it represents a transition toward where I was going, and isn't there yet -- there's a lot of half-converted code. Because of that it may look ugly on the surface, but its actual "architecture" represents a lot more thinking and experience than R2.

@dginev
Contributor Author

dginev commented Feb 14, 2019

Thanks for the pointers @jeffreykegler! I took a quick glance and, on a very surface level, it's not particularly encouraging from a rewrite perspective: SLG is nearly twice the size and SLR is only about 16% smaller (counting lines via wc -l).

1080 Marpa--R2/cpan/lib/Marpa/R2/SLG.pm
1927 Marpa--R2/cpan/lib/Marpa/R2/SLR.pm

2079 Marpa--R3/cpan/lib/Marpa/R3/SLG.pm
1627 Marpa--R3/cpan/lib/Marpa/R3/SLR.pm

That said, I've ported a couple of Perl projects to Rust already and I've always seen the best results when I take minimal liberties with redesigning the ported code, only adjusting very direct idiom translations between Perl and Rust, adding the type interfaces etc. I've also only used Marpa::R2 personally so far, so it may be best if I kept closer to that architecture.

My usual hope is that if a port proceeds successfully, and the test suite passes, one can then proceed to perform any large-scale redesigns that make the code both Rust-friendly and with less boilerplate... But it looks like there's no way around doing the work and hammering down the full SLG and SLR in Rust before I can reach a demo.

@jeffreykegler

You'll find in studying R2 that a lot is done in XS code, an approach which I reconsidered for R3. I don't know if you're taking that into account in assessing the complexity of the R2 code. The XS in R3 is nothing but a Lua interface, so it can be regarded as a "black box". In fact, if you did a little to make the initialization configurable, it could be broken out as a separate module for interfacing with Lua.

In converting the Marpa logic from XS, I used continuous integration, and first converted data, then logic. Some of the logic remains unconverted, meaning that the code has to call down to Lua every time it reads or writes a piece of data, which is kind of crazy. The increased size of SLG probably reflects this, and the smaller size of SLR is because IIRC it is already largely converted.

From your POV, in R3 you have to recognize the Lua interface boilerplate and realize that all that is going on, often, is a read or a write of a single variable.

In any case, I'll do my best to help whatever your choice.

@dginev
Contributor Author

dginev commented Aug 21, 2019

Sadly, after a long idle pause, I have to come back and say I feel a bit too overwhelmed by the complexity to try and approach the port myself. Maybe someone else will get ambitious enough to try; I would still love to be able to use the same convenient SLIF from native Rust.
