Is the use of a single quote instead of a back quote in the C11 parser intentional? #3

Closed
mingodad opened this issue Dec 8, 2022 · 15 comments

Comments

@mingodad

mingodad commented Dec 8, 2022

While trying to create a parser using lpegrex and looking at the C11 parser for usage patterns, I found the one shown below, where most of the binary operators are captured with {`op`} but the AND operator is captured with {'&&'}. Is this intentional or a mistake ?

logical-AND-expression <--
  (inclusive-OR-expression op-logical-AND*) ~> foldleft
op-logical-AND:binary-op <==
  {'&&'} SKIP @inclusive-OR-expression

logical-OR-expression <--
  (logical-AND-expression op-logical-OR*) ~> foldleft
op-logical-OR:binary-op <==
  {`||`} @logical-AND-expression
@mingodad
Author

mingodad commented Dec 8, 2022

It seems to be a mistake because parsing this example:

int main() {
    int n, x;
    if(n && x)
        return 1;
    else if(n || x)
        return 1;
    return 0;
}

Gives this AST:

translation-unit
| function-definition
| | declaration-specifiers
| | | type-specifier
| | | | "int"
| | declarator
| | | declarator-parameters
| | | | identifier
| | | | | "main"
| | declaration-list
| | compound-statement
| | | declaration
| | | | type-declaration
| | | | | declaration-specifiers
| | | | | | type-specifier
| | | | | | | "int"
| | | | | init-declarator-list
| | | | | | init-declarator
| | | | | | | declarator
| | | | | | | | identifier
| | | | | | | | | "n"
| | | | | | init-declarator
| | | | | | | declarator
| | | | | | | | identifier
| | | | | | | | | "x"
| | | if-statement
| | | | expression
| | | | | binary-op
| | | | | | identifier
| | | | | | | "n"
| | | | | | "&"
| | | | | | unary-op
| | | | | | | "&"
| | | | | | | identifier
| | | | | | | | "x"
| | | | return-statement
| | | | | expression
| | | | | | integer-constant
| | | | | | | "1"
| | | | if-statement
| | | | | expression
| | | | | | binary-op
| | | | | | | identifier
| | | | | | | | "n"
| | | | | | | "||"
| | | | | | | identifier
| | | | | | | | "x"
| | | | | return-statement
| | | | | | expression
| | | | | | | integer-constant
| | | | | | | | "1"
| | | return-statement
| | | | expression
| | | | | integer-constant
| | | | | | "0"

But changing it to {`&&`} gives the expected AST:

translation-unit
| function-definition
| | declaration-specifiers
| | | type-specifier
| | | | "int"
| | declarator
| | | declarator-parameters
| | | | identifier
| | | | | "main"
| | declaration-list
| | compound-statement
| | | declaration
| | | | type-declaration
| | | | | declaration-specifiers
| | | | | | type-specifier
| | | | | | | "int"
| | | | | init-declarator-list
| | | | | | init-declarator
| | | | | | | declarator
| | | | | | | | identifier
| | | | | | | | | "n"
| | | | | | init-declarator
| | | | | | | declarator
| | | | | | | | identifier
| | | | | | | | | "x"
| | | if-statement
| | | | expression
| | | | | binary-op
| | | | | | identifier
| | | | | | | "n"
| | | | | | "&&"
| | | | | | identifier
| | | | | | | "x"
| | | | return-statement
| | | | | expression
| | | | | | integer-constant
| | | | | | | "1"
| | | | if-statement
| | | | | expression
| | | | | | binary-op
| | | | | | | identifier
| | | | | | | | "n"
| | | | | | | "||"
| | | | | | | identifier
| | | | | | | | "x"
| | | | | return-statement
| | | | | | expression
| | | | | | | integer-constant
| | | | | | | | "1"
| | | return-statement
| | | | expression
| | | | | integer-constant
| | | | | | "0"
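
One plausible reading of the two ASTs: with {'&&'} the string && is never registered as a token, so the `&` token of the lower-precedence AND-expression rule is free to consume the first & and the second & is then parsed as a unary address-of. The sketch below illustrates that idea in plain Lua. It assumes that backquoted tokens are guarded against longer registered tokens sharing the same prefix; `match_token` and the token lists are hypothetical and are not lpegrex's actual implementation.

```lua
-- Hypothetical helper, not part of lpegrex: match `tok` at `pos` unless a
-- longer registered token with the same prefix also matches there.
local function match_token(subject, pos, tok, registered)
  if subject:sub(pos, pos + #tok - 1) ~= tok then
    return nil
  end
  for _, other in ipairs(registered) do
    local longer_with_same_prefix = #other > #tok and other:sub(1, #tok) == tok
    if longer_with_same_prefix and subject:sub(pos, pos + #other - 1) == other then
      return nil -- leave the input for the longer token
    end
  end
  return pos + #tok -- position just after the token
end

local src = "n && x"
-- '&&' written as a plain literal: only '&' and '||' are known tokens
print(match_token(src, 3, "&", { "&", "||" }))       -- 4
-- '&&' written as a backquoted token: '&' no longer matches a prefix of it
print(match_token(src, 3, "&", { "&", "&&", "||" })) -- nil
```

In the first call the single `&` wins at the position of `&&`, which mirrors the unexpected binary-op "&" followed by unary-op "&" in the first AST.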

@mingodad
Author

mingodad commented Dec 8, 2022

Also, I just made an online playground for Lua 5.4.4, lpeglabel and lpegrex here: https://github.com/mingodad/lua-wasm-playground

@edubart
Owner

edubart commented Dec 8, 2022

That was indeed a mistake. I've updated the C11 parser to fix that, along with some other issues I've found over the last few months while putting the C11 AST to real use. Thanks for reporting.

Also, I just made an online playground for Lua 5.4.4, lpeglabel and lpegrex here

Nice! Very useful for quickly prototyping grammars! Can I add a link to it in the LPegRex readme?

@edubart edubart closed this as completed Dec 8, 2022
@mingodad
Author

mingodad commented Dec 8, 2022

Thank you !
Yes of course !

@mingodad
Author

mingodad commented Dec 9, 2022

How do you debug lpegrex grammars ?
I'm looking for something like the debug function in https://nmap.org/nsedoc/lib/lpeg-utility.html

@mingodad
Author

mingodad commented Dec 9, 2022

@edubart
Owner

edubart commented Dec 9, 2022

How do you debug lpegrex grammars ?

Well, usually I don't need to, because I prototype grammars incrementally, creating small tests to cover each grammar rule change or addition, as seen in https://github.com/edubart/nelua-lang/blob/master/spec/syntaxdefs_spec.lua . When some change does not work out, I can usually reason about why.

Also like this https://lua-users.org/lists/lua-l/2009-10/msg00774.html

Something like that is probably doable with some patches to lpegrex.lua; adding debug support like that would be an enhancement for the library.

@mingodad
Author

mingodad commented Dec 9, 2022

Could you help me with a grammar I'm trying to use with lpegrex ?
It's an already working grammar (by working I mean it parses several input sources) that works with https://chrishixon.github.io/chpeg/playground/ and https://yhirose.github.io/cpp-peglib/ but with lpegrex I'm getting:

lua gmpl-ast.lua /home/mingo/dev/lp/GLPK-dad-master-if/examples/diet.mod
nil	fail	3630
lua: gmpl-ast.lua:731: syntax error: /home/mingo/dev/lp/GLPK-dad-master-if/examples/diet.mod:100:0: fail

^
stack traceback:
	[C]: in function 'error'
	gmpl-ast.lua:731: in local 'parse_gmpl'
	gmpl-ast.lua:762: in main chunk
	[C]: in ?

The grammar seems to compile fine, and the input text diet.mod is parsed to the end, but the parse still fails.

I'm not ready to publish the grammar yet, but if you could lend a hand with it I could email it to you.

@edubart
Owner

edubart commented Dec 9, 2022

Instead of debugging for you, I've added experimental support for a simple debug mode, inspired by the link you shared. I've also added instructions to the README on how to use it. Let me know if that helps you with debugging.

When you see the fail error label, I think it indicates a misuse of lpeglabel; checking its sources might give a hint as to why.

@mingodad
Author

mingodad commented Dec 9, 2022

Thank you !
I'm looking at it now.

@mingodad
Author

mingodad commented Dec 9, 2022

Thank you again !
With the debug info I could see that I was using this definition of LINEBREAK, which works in other PEG parsers but not with lpeg and its derivatives:

LINEBREAK <- ('\n' '\r'? / '\r' '\n'?)

Changing it to:

LINEBREAK <- %nl %cr? / %cr %nl?

Now it parses fine, and I can start figuring out how to create the desired AST.
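
A likely reason, assuming the grammar compiler does not interpret backslash escapes inside quoted literals (LPeg's re module does not): in a grammar written inside a Lua long string, '\n' is the two characters backslash and n, so it can never match a real line feed, while %nl and %cr name the characters themselves. The plain-Lua sketch below only demonstrates the string-level difference; the variable names are illustrative.

```lua
-- A grammar is usually written in a long string, where Lua leaves
-- backslashes untouched.
local grammar = [[LINEBREAK <- ('\n' '\r'? / '\r' '\n'?)]]

-- Text of the first quoted literal as the grammar compiler receives it.
local literal = grammar:match("'(.-)'")
print(#literal, literal:byte(1, -1)) -- 2  92  110  (backslash, 'n')

-- Searched for literally, that text is not found in a real line break...
print(("a\nb"):find(literal, 1, true)) -- nil
-- ...while the actual line feed character (byte 10) is.
print(("a\nb"):find("\n", 1, true))    -- 2  2
```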

@mingodad
Author

mingodad commented Dec 9, 2022

I also just updated https://mingodad.github.io/lua-wasm-playground/ to use the latest lpegrex version.
The lpegrex.debug mode can be used there.

@mingodad
Author

mingodad commented Dec 9, 2022

I was trying to improve the debug info a bit with this:

    if lpegrex.debug then
      for k, patt in pairs(G) do
        if k ~= 1 then
          local enter = lpeg.Cmt(lpeg.P(true), function(s, p)
            local lineno, colno = lpegrex.calcline(s, p)
            io.stderr:write(string.format('ENTER %s (%d:%d)\n', k, lineno, colno))
            return true
          end)
          local succeded = lpeg.Cmt(lpeg.P(true), function(s, p)
            local lineno, colno = lpegrex.calcline(s, p)
            io.stderr:write(string.format('OK %s (%d:%d)\n', k, lineno, colno))
            return true
          end)
          local failed = lpeg.Cmt(lpeg.P(true), function(s, p)
            local lineno, colno = lpegrex.calcline(s, p)
            io.stderr:write(string.format('FAILED %s (%d:%d)\n', k, lineno, colno))
            return false
          end)
          G[k] = enter * (patt + failed) * succeded
        end
      end
    end

But I'm getting this error:

lua gmpl-ast.lua /home/mingo/dev/lp/GLPK-dad-master-if/examples/diet.mod
>lua gmpl-ast.lua /home/mingo/dev/lp/GLPK-dad-master-if/examples/diet.mod
lua: gmpl-ast.lua:765: rule 'Primary_expression' may be left recursive
stack traceback:
	[C]: in function 'error'
	gmpl-ast.lua:521: in field 'compile'
	gmpl-ast.lua:732: in function 'parserGmpl'
	gmpl-ast.lua:765: in main chunk
	[C]: in ?

I understand the consequences of left recursion, but I can't see why it's being detected here.
Any idea for a workaround ?
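
A possible explanation and workaround, sketched under the assumption that LPeg's static left-recursion check only looks at the pattern inside a match-time capture and ignores what its function returns: `patt + failed` then looks as if it can succeed without consuming input, so every wrapped rule appears nullable and an ordinary recursive grammar is reported as possibly left recursive. Appending a pattern that statically always fails, such as lpeg.P(false), to the `failed` branch avoids this. The toy grammar and the `wrap` helper below are made up for illustration; the block needs the lpeg module.

```lua
local ok, lpeg = pcall(require, "lpeg")
if not ok then
  print("lpeg is not installed; nothing to demonstrate")
  return
end
local P, R, V, Cmt = lpeg.P, lpeg.R, lpeg.V, lpeg.Cmt

-- Toy grammar (illustrative only): Expr <- Term (Op Expr)?
local function toy_grammar()
  return {
    "Expr",
    Expr = V"Term" * (V"Op" * V"Expr")^-1,
    Term = R"09"^1,
    Op = P"+",
  }
end

-- Add a "failed" trace branch to every rule; with `guard` the branch ends
-- in P(false), so the static check can see that it never succeeds.
local function wrap(G, guard)
  local out = {}
  for k, patt in pairs(G) do
    if k == 1 then
      out[k] = patt -- name of the initial rule, not a pattern
    else
      local failed = Cmt(P(true), function()
        io.stderr:write("FAILED ", k, "\n")
        return false
      end)
      if guard then failed = failed * P(false) end
      out[k] = patt + failed
    end
  end
  return out
end

local compiled, err = pcall(P, wrap(toy_grammar(), false))
print(compiled, err) -- expected to fail with a "may be left recursive" error

local guarded = P(wrap(toy_grammar(), true))
print(guarded:match("1+2")) -- 4 (position after the whole input)
```

The same guard can be applied to the `failed` pattern in the snippet above, leaving `enter` and `succeded` unchanged.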

@mingodad
Author

mingodad commented Dec 9, 2022

With the suggestion of @roberto-ieru I extended the debug output a bit, and it can be seen at https://mingodad.github.io/lua-wasm-playground/ with the Json parser, where I inlined lpegrex.lua to test and show it.

@mingodad
Author

mingodad commented Dec 9, 2022

And here is the output, with ENTER replaced by -->, FAILED by <-- and OK by == :

--> Json (1:1)
--> SKIP (1:1)
==  SKIP (1:1)
--> Object (1:1)
--> { (1:1)
<-- { (1:1)
<-- Object (1:1)
--> Array (1:1)
--> [ (1:1)
--> SKIP (1:2)
==  SKIP (1:2)
==  [ (1:2)
--> Value (1:2)
--> String (1:2)
<-- String (1:2)
--> Number (1:2)
<-- Number (1:2)
--> Object (1:2)
--> { (1:2)
--> SKIP (2:0)
==  SKIP (2:5)
==  { (2:5)
--> Member (2:5)
--> String (2:5)
--> SKIP (2:13)
==  SKIP (2:13)
==  String (2:13)
--> : (2:13)
--> SKIP (2:14)
==  SKIP (2:15)
==  : (2:15)
--> Value (2:15)
--> String (2:15)
--> ESCAPE (2:21)
==  ESCAPE (2:22)
--> SKIP (2:27)
==  SKIP (2:27)
==  String (2:27)
==  Value (2:27)
==  Member (2:27)
--> , (2:27)
--> SKIP (3:0)
==  SKIP (3:5)
==  , (3:5)
--> Member (3:5)
--> String (3:5)
--> SKIP (3:14)
==  SKIP (3:14)
==  String (3:14)
--> : (3:14)
--> SKIP (3:15)
==  SKIP (3:16)
==  : (3:16)
--> Value (3:16)
--> String (3:16)
<-- String (3:16)
--> Number (3:16)
<-- Number (3:16)
--> Object (3:16)
--> { (3:16)
<-- { (3:16)
<-- Object (3:16)
--> Array (3:16)
--> [ (3:16)
<-- [ (3:16)
<-- Array (3:16)
--> Boolean (3:16)
--> NAME_SUFFIX (3:20)
<-- NAME_SUFFIX (3:20)
--> SKIP (3:20)
==  SKIP (3:20)
==  Boolean (3:20)
==  Value (3:20)
==  Member (3:20)
--> , (3:20)
--> SKIP (4:0)
==  SKIP (4:5)
==  , (4:5)
--> Member (4:5)
--> String (4:5)
--> SKIP (4:13)
==  SKIP (4:13)
==  String (4:13)
--> : (4:13)
--> SKIP (4:14)
==  SKIP (4:15)
==  : (4:15)
--> Value (4:15)
--> String (4:15)
<-- String (4:15)
--> Number (4:15)
--> SKIP (4:22)
==  SKIP (4:22)
==  Number (4:22)
==  Value (4:22)
==  Member (4:22)
--> , (4:22)
--> SKIP (5:0)
==  SKIP (5:5)
==  , (5:5)
--> Member (5:5)
--> String (5:5)
--> SKIP (5:11)
==  SKIP (5:11)
==  String (5:11)
--> : (5:11)
--> SKIP (5:12)
==  SKIP (5:13)
==  : (5:13)
--> Value (5:13)
--> String (5:13)
<-- String (5:13)
--> Number (5:13)
<-- Number (5:13)
--> Object (5:13)
--> { (5:13)
<-- { (5:13)
<-- Object (5:13)
--> Array (5:13)
--> [ (5:13)
<-- [ (5:13)
<-- Array (5:13)
--> Boolean (5:13)
<-- Boolean (5:13)
--> Null (5:13)
--> NAME_SUFFIX (6:0)
<-- NAME_SUFFIX (6:0)
--> SKIP (6:0)
==  SKIP (6:1)
==  Null (6:1)
==  Value (6:1)
==  Member (6:1)
--> , (6:1)
<-- , (6:1)
--> } (6:1)
--> SKIP (6:2)
==  SKIP (6:2)
==  } (6:2)
==  Object (6:2)
==  Value (6:2)
--> , (6:2)
<-- , (6:2)
--> ] (6:2)
--> SKIP (7:0)
==  SKIP (7:0)
==  ] (7:0)
==  Array (7:0)
==  Json (7:0)
= Rule counters:
	Total	Succed	%	Failed	%	Rule
	74	52	70.27	22	29.73	all rules
	20	20	100.00	0	0.00	SKIP
	9	5	55.56	4	44.44	String
	5	3	60.00	2	40.00	,
	5	5	100.00	0	0.00	Value
	4	4	100.00	0	0.00	Member
	4	1	25.00	3	75.00	Object
	4	1	25.00	3	75.00	{
	4	4	100.00	0	0.00	:
	4	1	25.00	3	75.00	Number
	3	1	33.33	2	66.67	Array
	3	1	33.33	2	66.67	[
	2	0	0.00	2	100.00	NAME_SUFFIX
	2	1	50.00	1	50.00	Boolean
	1	1	100.00	0	0.00	ESCAPE
	1	1	100.00	0	0.00	}
	1	1	100.00	0	0.00	Null
	1	1	100.00	0	0.00	]
	1	1	100.00	0	0.00	Json
Array
| Object
| | Member
| | | "string"
| | | "some\ntext"
| | Member
| | | "boolean"
| | | true
| | Member
| | | "number"
| | | -150.0
| | Member
| | | "null"
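
The rule counters above can be recomputed from the trace itself. A minimal sketch in plain Lua, assuming each trace line has the shape shown above (marker, rule name, position) and that Total is the sum of succeeded and failed attempts; `tally` is a hypothetical helper, not part of lpegrex.

```lua
-- Count succeeded (==) and failed (<--) attempts per rule in a trace.
local function tally(trace)
  local counts = {}
  for line in trace:gmatch("[^\n]+") do
    local marker, rule = line:match("^(%S+)%s+(.-)%s+%(%d+:%d+%)$")
    if marker == "==" or marker == "<--" then
      local c = counts[rule] or { ok = 0, failed = 0 }
      counts[rule] = c
      if marker == "==" then c.ok = c.ok + 1 else c.failed = c.failed + 1 end
    end
  end
  return counts
end

-- First lines of the trace above.
local sample = [[
--> Json (1:1)
--> SKIP (1:1)
==  SKIP (1:1)
--> Object (1:1)
--> { (1:1)
<-- { (1:1)
<-- Object (1:1)
]]

-- Columns: total, succeeded, %, failed, %, rule (iteration order is unspecified).
for rule, c in pairs(tally(sample)) do
  local total = c.ok + c.failed
  print(string.format("%d\t%d\t%.2f\t%d\t%.2f\t%s", total, c.ok,
    100 * c.ok / total, c.failed, 100 * c.failed / total, rule))
end
```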

mingodad referenced this issue in sqmedeiros/lpeglabel Dec 24, 2022