Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't parse exponentiation with multiple unary minuses #2

Closed
rns opened this issue Jul 23, 2015 · 18 comments
Closed

Can't parse exponentiation with multiple unary minuses #2

rns opened this issue Jul 23, 2015 · 18 comments

Comments

@rns
Copy link
Contributor

rns commented Jul 23, 2015

ok 1 - use MarpaX::Languages::Lua::Parser;
Error in SLIF parse: No lexeme found at line 1, column 27
* String before error: assert(2^-2 == 1/4 and -2^
* The error was at line 1, column 27, and at character 0x002d '-', ...
* here: - -2 == - - -4);\n
Marpa::R2 exception at ../lib/MarpaX/Languages/Lua/Parser.pm line 293.
# Tests were run but no plan was declared and done_testing() was not seen.
# Looks like your test exited with 255 just after 1.

test case: https://gist.github.com/rns/a877e2942376569154c8

Update: can be made to work with the below patch (not sure if it's ok to change precedence this way, though)

      | '...'
      | <tableconstructor>
      | <function>
+    || '-' <exp>
     || <exp> '^' <exp> assoc => right
     || <keyword not> <exp>
      | '#' <exp>
-     | '-' <exp>
     || <exp> '*' <exp>
      | <exp> '/' <exp>
      | <exp> '%' <exp>
@jeffreykegler
Copy link
Contributor

I took the precedence from section 2.5.6 of the Lua manual. One of the disadvantages of recursive descent parsing is that it's hard to know, from reading the code, which syntax it implements. This could be Roberto's mistake.

I think we should change the precedence to match Lua's behavior, perhaps adding a comment to the code.

So, yes, add the patch. One thing, though -- as above it has unary minus in there twice. I'd remove the alternative at the looser precedence level.

@jeffreykegler
Copy link
Contributor

Argh! Experimenting, it seems to make a difference whether it's the right arg or the left arg of the exponentiation.

If worst comes to worst, it may turn out Lua's behavior does not follow a normal precedence scheme, and unary terms may have to be pulled out of the precedenced statement.

@rns
Copy link
Contributor Author

rns commented Jul 30, 2015

I noticed that <C90 strtod decimal> has <optional sign> ~ [-+]. Can that clash with unary minus, lexing, e.g. -2 as <Number> rather than lexing 2 as Number and then using '-' <exp>?

@jeffreykegler
Copy link
Contributor

I think the solution may be to treat exponents as special -- perhaps with
their own precedenced statement.

I think it might work if you make exponentiation a kind of postfix unary --

::= '^'

where is then another precedenced statement for the things
allowed in an exponent -- terms and prefix unary expressions, but also,
recursively, an if parenthesized.

On Thu, Jul 30, 2015 at 1:11 PM, rns notifications@github.com wrote:

I noticed that has ~ [-+]. Can that clash with unary minus, lexing, e.g.
-2 as rather than lexing 2 as Number and then using '-' ?


Reply to this email directly or view it on GitHub
#2 (comment)
.

@jeffreykegler
Copy link
Contributor

Re the optional sign in strtod -- I think I recall, from looking at the Lua code, that it strips off leading minus signs before passing a string on to strtod()

@rns
Copy link
Contributor Author

rns commented Jul 30, 2015

First cut at exponent as a separate statement:

assert(2^-2 == 1/4 and -2^- -2 == - - -4)

parses with this patch (needs ranks to avoid ambiguity)

@@ -163,6 +163,7 @@ sub BUILD
                Marpa::R2::Scanless::R -> new
                ({
                        grammar => $self -> grammar,
+            ranking_method => 'high_rule_only',
                })
        );
        $self -> renderer
@@ -735,10 +736,10 @@ lexeme default = latm => 1 action => [name,values]
      | '...'
      | <tableconstructor>
      | <function>
-    || <exp> '^' <exp> assoc => right
-    || <keyword not> <exp>
+    || <exp> '^' <exponent exp> assoc => right
+    || '-' <exp>
+     | <keyword not> <exp>
      | '#' <exp>
-     | '-' <exp>
     || <exp> '*' <exp>
      | <exp> '/' <exp>
      | <exp> '%' <exp>
@@ -749,11 +750,13 @@ lexeme default = latm => 1 action => [name,values]
      | <exp> '<=' <exp>
      | <exp> '>' <exp>
      | <exp> '>=' <exp>
-     | <exp> '==' <exp>
+     | <exp> '==' <exp> rank => 1
      | <exp> '~=' <exp>
-    || <exp> <keyword and> <exp>
+    || <exp> <keyword and> <exp> rank => 1
     || <exp> <keyword or> <exp>

+<exponent exp> ::=
+       <var>
+     | '(' <exp> ')'
+     | <exp> <args>
+     | <exp> ':' <Name> <args>
+     | <Number>
+     | '#' <exp>
+     | '-' <exp>
+
 <prefixexp> ::= <var>
 <prefixexp> ::= <functioncall>

@rns
Copy link
Contributor Author

rns commented Jul 30, 2015

I'll try that approach in MarpaX-Languages-Lua-AST on lua 5.1 test suite files. Thanks!

@jeffreykegler
Copy link
Contributor

In the <exponent exp> statements, all the <exp> references will start again at the top. What we need is a special precedence hierarchy, which we create by

  • copying every alternative below exponentiation in the <exp> hierarchy; and adding
  • the unary operators.

My take at it would be like this:

<exponent> ::=
       <var>
     | '(' <exp> ')'
    || <exponent> <args>
    || <exponent> ':' <Name> <args>
     | <keyword nil>
     | <keyword false>
     | <keyword true>
     | <Number>
     | <String>
     | '...'
     | <tableconstructor>
     | <function>
    || <keyword not> <exponent>
     | '#' <exponent>
     | '-' <exponent>

Notes: I changed <exponent exp> to <exponent>. The right and left associations I eliminated because they are no-ops in unary alternatives.

The group association I eliminated because it is a no-op in a nullary alternative. (In the <exponent> precedenced statement, the RHS '(' <exp> ')' is nullary.) The meaning of group association is precedence starts over again from the loosest precedence. In the <exponent> precedenced statement, we want to start over again the loosest precedence of the <exp> hierarchy, not the <exponent> hierarchy, so I keep the <exp> where ordinarily I would substitute <exponent>

@ronsavage
Copy link
Owner

Are the current source code files sufficient to exercise this? If not, someone will have to write 1 or more for me to add to the distro.

@rns
Copy link
Contributor Author

rns commented Jul 31, 2015

@jeffreykegler -- very cool! Now,

assert(2^-2 == 1/4 and -2^- -2 == - - -4)

parses with this patch (against the current repo)

--- a/lib/MarpaX/Languages/Lua/Parser.pm
+++ b/lib/MarpaX/Languages/Lua/Parser.pm
@@ -825,7 +825,7 @@ lexeme default = latm => 1 action => [name,values]
      | '...'
      | <tableconstructor>
      | <function>
-    || <exp> '^' <exp> assoc => right
+    || <exp> '^' <exponent> assoc => right
     || <keyword not> <exp>
      | '#' <exp>
      | '-' <exp>
@@ -844,6 +844,23 @@ lexeme default = latm => 1 action => [name,values]
     || <exp> <keyword and> <exp>
     || <exp> <keyword or> <exp>

+<exponent> ::=
+       <var>
+     | '(' <exp> ')'
+    || <exponent> <args>
+    || <exponent> ':' <Name> <args>
+     | <keyword nil>
+     | <keyword false>
+     | <keyword true>
+     | <Number>
+     | <String>
+     | '...'
+     | <tableconstructor>
+     | <function>
+    || <keyword not> <exponent>
+     | '#' <exponent>
+     | '-' <exponent>
+
 <prefixexp> ::= <var>
 <prefixexp> ::= <functioncall>
 <prefixexp> ::= '(' <exp> ')'

@ronsavage -- the line is from https://github.com/rns/MarpaX-Languages-Lua-AST/blob/master/t/lua5.1-tests/constructs.lua -- lua 5.1 test suite -- hope you can incorporate it into the module’s test suite, fully or partially.

@ronsavage
Copy link
Owner

Hi rns

On 31/07/15 11:34, rns wrote:
[snip]

@ronsavage https://github.com/ronsavage -- the line is from
https://github.com/rns/MarpaX-Languages-Lua-AST/blob/master/t/lua5.1-tests/constructs.lua
-- lua 5.1 test suite -- hope you can incorporate it into the module’s
test suite, filly or partially.

$many x $thanx;

Ron Savage - savage.net.au

@jeffreykegler
Copy link
Contributor

If we want to show this to the Lua community as their BNF, we may want to tweak the "presentation". @rns - would it be more elegant if we broken the <exp> precedence into <tight exp>, below <loose exp>, plus <exponent> which would be <tight exp> plus the unary alternatives from the <loose exp> hierarchy? And is it clear what I'm suggesting?

@rns
Copy link
Contributor Author

rns commented Jul 31, 2015

@jeffreykegler -- good point about presentation, a bit busy now, will peek at it later.

@rns
Copy link
Contributor Author

rns commented Jul 31, 2015

With the above fix, lua 5.1 test suite is parsed and reproduced correctly in MarpaX-Languages-Lua-AST. Thanks a lot.

@jeffreykegler -- on <tight/loose exp> -- I'm really split on this -- on the one hand, would be good as an illustration of what Marpa can do; on the other, the lua manual doesn't have this concept, so we probably shouldn't.

@jeffreykegler
Copy link
Contributor

@rns: re <tight/loose> -- I'm happy to go with your judgement on this, particularly if you're the one who'll be introducing it to the Lua folks.

I would note that the Lua manual does not have the distinction between right and left precedence for the '^' operator either. We should mention that to them.

@ronsavage
Copy link
Owner

Hi

On 31/07/15 19:35, rns wrote:

With the above fix, lua 5.1 test suite is parsed and reproduced
correctly in MarpaX-Languages-Lua-AST.

@jeffreykegler https://github.com/jeffreykegler -- on |<tight/loose
exp>| -- I'm really split on this -- on the one hand, would be good as
an illustration of what Marpa can do; on the other, the lua manual
doesn't have this concept, so we probably shouldn't.

I think we should go with it:

  1. Then we can show the Lua people what we think and ask what they think.

  2. The BNF is not forever, so if they vote against the idea, we can
    (possibly :-) re-write the BNF.

Ron Savage - savage.net.au

@rns
Copy link
Contributor Author

rns commented Aug 1, 2015

@jeffreykegler, @ronsavage: Well, on a second thought, the exponent solution is now used in Lua::AST, so going for <tight/loose> would make sense in Lua::Parser in terms of '2 independent implementations' (of Lua EBNF grammar from the manual, that is).

@jeffreykegler: good point about left/right precedence of exponentiation.

Other differences re Lua::AST vs. Lua::Parser which come to mind: external lexing vs. events, left recursion vs. sequences, manual traversing/MarpaX::AST vs. Tree::DAG_Node, nameless nullables vs. explicit <optional ...> symbols

ronsavage added a commit that referenced this issue Aug 2, 2015
@rns
Copy link
Contributor Author

rns commented Aug 3, 2015

Fixed in just released 1.04

@rns rns closed this as completed Aug 3, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants