Segfault when setting precedence on a rule with empty rhs #22

mingodad · 2021-12-01T16:49:04Z

Making comparisons of 'LALR(1)' grammars between parsers with a modified version of byacc and lemon to convert the grammars (see https://github.com/mingodad/lalr-parser-test) I converted the sqlite3 grammar to unicc with byacc -C sqlite3-parser.y and when running the resulting converted grammar we get segfault because there is a rule with an empty rhs and a precedence (postgresql-13-3 also have several of then), to facilitate see the problem I'm posting a dummy small grammar that also segfault due to the empty rhs and precedence, also with the converted postgresql-13.3 unicc takes a long time to process it (unicc -> 100s, byacc -> 1.2s)

//%start list


@UMINUS "-" ;
@error "error" ;

@LETTER         'A-Za-z_' 'A-Za-z0-9_'*;
@number            '0-9'+;

whitespace          :       ' \r\n\t'+
                    |       "//" !'\n'* '\n'
                    ;

#left '|' ;
#left '&' ;
#left '+' '-' ;
#left '*' '/' '%' ;
#left @UMINUS  ; /* supplies precedence for unary minus */

//%% /* beginning of rules section */

list$  :  /* empty */
      |  list stat '\n'
      |  list @error '\n'
      ;

stat  :  expr
      |  @LETTER '=' expr
      ;

expr  :  '(' expr ')'
      |  expr '+' expr
      |  expr '-' expr
      |  expr '*' expr
      |  expr '/' expr
      |  expr '%' expr
      |  expr '&' expr
      |  expr '|' expr
      |  '-' expr #precedence @UMINUS
      |  @LETTER
      |  @number
      | /*empty*/ #precedence '&'
      ;

The text was updated successfully, but these errors were encountered:

mingodad · 2021-12-01T16:56:08Z

Here I¡m attaching the postgresql-13.3 and sqlite3 converted grammars with the rules with empty rhs and precedence where the precedence were commented out to be able to process with unicc,

Just converted:

on_opt : @ON /*13N*/ expr #precedence @ON /*13N*/ ;
on_opt : /*empty*/ #precedence @OR /*1L*/ ;

Commented to allow process with unicc:

on_opt : @ON /*13N*/ expr #precedence @ON /*13N*/ ;
on_opt : /*empty*/ ; //#precedence @OR /*1L*/ ;

grammars.zip

mingodad · 2021-12-01T17:03:58Z

On my previous messages where I wrote "empty lhs" in reality it's "empty rhs".

mingodad · 2021-12-02T09:23:09Z

After reading almost all manual I've changed the conversion from yacc to unicc grammar to output quoted strings for terminals instead of @regex and also add #!mode insensitive; but even then the time spent on postgresql converted grammar is still high (100s).

mingodad · 2021-12-02T10:02:11Z

It segfaults here:

prod_directives<int>: 	'#' "precedence" terminal

						[*	current_prod->prec = @terminal->prec; *]

					;

Where current_prod is NULL, also it doesn't seem to be a good choice here:

production<PROD*>	:	line_number rhs_opt:rhs code_opt_dup:act
							ast_node
							prod_directives* //<<< Wouldn't prod_directives? be better

Do we accept multiple prod_directives ?

mingodad · 2021-12-02T11:30:59Z

This seems to fix the segfault problem:

@@ -712,7 +712,7 @@ production<PROD*>   :       line_number rhs_opt:rhs code_opt_dup:act
 
 rhs_opt<PROD*>         :       rhs
 
-                                       |       [*      @@ = create_production( parser,
+                                       |       [*      @@ = current_prod = create_production( parser,
                                                                (SYMBOL*)NULL );
                                                *]
                                        ;

mingodad · 2021-12-02T11:41:46Z

With the previous fix now we can process the converted grammar for postgresql-13.3 and cql (https://github.com/facebookincubator/CG-SQL/blob/main/sources/cql.y) without any warning (see atthached).
sql-grammars.zip

phorward · 2021-12-04T13:16:56Z

Hello @mingodad, thank you very much for your diving into this issue and finding a fix for the problem.

Indeed, the production-directives could be parsed by changing the grammar to accept just ? one of them, instead of an optional list *, but this won't work anymore in case a unicc grammar should accept more than one directive for productions.

Unfortunately, I don't spend much time on unicc and its abandoned successor unicc2 currently, as my focus moved to other projects. Anyway, I think your pull request is helpful and makes unicc a little bit more stable. I still like the idea of having one grammar and turning it into several target languages, optionally without change. Maybe this feature could be part of another, more actively developed program, for example Lark.

phorward · 2022-07-21T09:34:15Z

Segfault fixed by #24

mingodad changed the title ~~Segfault when setting precedence on a rule with empty lhs~~ Segfault when setting precedence on a rule with empty rhs Dec 1, 2021

phorward added the bug label Dec 4, 2021

phorward mentioned this issue Dec 4, 2021

Fix segfault that occurs when trying to add precedence to a productio… #24

Merged

mingodad mentioned this issue Jul 16, 2022

Performance issues #27

Open

phorward closed this as completed Jul 21, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Segfault when setting precedence on a rule with empty rhs #22

Segfault when setting precedence on a rule with empty rhs #22

mingodad commented Dec 1, 2021 •

edited

Loading

mingodad commented Dec 1, 2021 •

edited

Loading

mingodad commented Dec 1, 2021

mingodad commented Dec 2, 2021

mingodad commented Dec 2, 2021

mingodad commented Dec 2, 2021

mingodad commented Dec 2, 2021 •

edited

Loading

phorward commented Dec 4, 2021

phorward commented Jul 21, 2022

Segfault when setting precedence on a rule with empty rhs #22

Segfault when setting precedence on a rule with empty rhs #22

Comments

mingodad commented Dec 1, 2021 • edited Loading

mingodad commented Dec 1, 2021 • edited Loading

mingodad commented Dec 1, 2021

mingodad commented Dec 2, 2021

mingodad commented Dec 2, 2021

mingodad commented Dec 2, 2021

mingodad commented Dec 2, 2021 • edited Loading

phorward commented Dec 4, 2021

phorward commented Jul 21, 2022

mingodad commented Dec 1, 2021 •

edited

Loading

mingodad commented Dec 1, 2021 •

edited

Loading

mingodad commented Dec 2, 2021 •

edited

Loading