Wrong python code generated #66

bendrissou · 2022-12-02T16:32:18Z

Grammarinator generates incorrect Python code for some grammars, in particular when a non-terminal (parser rule) appears in the definition of a token (lexer rule). No error is thrown when the code is generated with grammarinator-process; the error appears only when inputs are generated with grammarinator-generate.

The bug can be reproduced with the following grammar:

json
   : token
   ;

token
   : NUMBER   ;

NUMBER
   : '-'? INT ('.' [0-9] +)? token?
   ;

fragment INT
   : '0' | [1-9] [0-9]*
   ;

After running grammarinator-process and then grammarinator-generate, the following error is thrown:

Traceback (most recent call last):
  File "/home/bachir/.local/bin/grammarinator-generate", line 8, in <module>
    sys.exit(execute())
  File "/home/bachir/.local/lib/python3.8/site-packages/grammarinator/generate.py", line 281, in execute
    with Generator(unlexer_path=args.unlexer, unparser_path=args.unparser, rule=args.rule, out_format=args.out,
  File "/home/bachir/.local/lib/python3.8/site-packages/grammarinator/generate.py", line 71, in __init__
    self.unlexer_cls = import_entity('.'.join([unlexer, unlexer]))
  File "/home/bachir/.local/lib/python3.8/site-packages/grammarinator/generate.py", line 59, in import_entity
    return getattr(importlib.import_module('.'.join(steps[0:-1])), steps[-1])
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 844, in exec_module
  File "<frozen importlib._bootstrap_external>", line 981, in get_code
  File "<frozen importlib._bootstrap_external>", line 911, in source_to_code
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/bachir/grammar-mutation/json/grammarinator-bugs/13/generate/JSONUnlexer.py", line 42
    return current
    ^
IndentationError: expected an indented block

When looking at the file JSONUnlexer.py, we see the following invalid code:

        if self.unlexer.max_depth >= 0:
            for _ in self.zero_or_one():

        return current
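The failure can be reproduced without running grammarinator-generate by compiling the fragment directly. The following is a minimal sketch: the enclosing function `NUMBER` and the `current = None` line are assumptions added only so the fragment can be compiled on its own, and `syntax_error_of` is a hypothetical helper, not part of Grammarinator.

```python
import textwrap

# Fragment from the generated JSONUnlexer.py, wrapped in an assumed function
# definition so that it forms a complete module.
snippet = textwrap.dedent("""\
    def NUMBER(self):
        current = None
        if self.unlexer.max_depth >= 0:
            for _ in self.zero_or_one():

        return current
""")


def syntax_error_of(source):
    """Return the SyntaxError raised while compiling source, or None."""
    try:
        compile(source, "<generated>", "exec")
    except SyntaxError as exc:  # IndentationError is a subclass of SyntaxError
        return exc
    return None


error = syntax_error_of(snippet)
print(type(error).__name__)
```

The `for` loop has no body, so compilation fails before any code runs, which matches the IndentationError in the traceback above.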


renatahodovan · 2022-12-02T21:29:35Z

Hi @bendrissou

Thanks for the report! The generated Python code is incorrect because the input grammar is itself invalid. Grammarinator follows the syntax of ANTLR, where terminal (lexer) rules must not refer to non-terminals (parser rules). If you generate a parser from the grammar above with ANTLR, you get the following error message:

error(160): /work/dir/json.g4:11:29: reference to parser rule token in lexer rule NUMBER
warning(125): json.g4:8:5: implicit definition of token NUMBER in parser

If you transform token into a terminal rule (rename it to Token) then it will work as expected (both with Grammarinator and ANTLR).

You are right, though, that it would be helpful for grammarinator-process to throw an error if the input grammar is not correct. I'll think about how to handle all the possible input errors (without explicitly validating the grammar with ANTLR).
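As an illustration of the kind of check discussed here, the sketch below flags parser-rule references inside lexer rules using regular expressions. It is not how Grammarinator or ANTLR implement validation: the function name `parser_refs_in_lexer_rules` is hypothetical, and the approach assumes a simple grammar without comments, actions, or other constructs that a real grammar parser would handle.

```python
import re

GRAMMAR = """
json
   : token
   ;

token
   : NUMBER   ;

NUMBER
   : '-'? INT ('.' [0-9] +)? token?
   ;

fragment INT
   : '0' | [1-9] [0-9]*
   ;
"""

# String literals and character sets may contain ':' or ';', so blank them out first.
LITERAL_OR_SET = re.compile(r"'(?:\\.|[^'\\])*'|\[(?:\\.|[^\]\\])*\]")
# A rule is "name : body ;", optionally preceded by the 'fragment' keyword.
RULE = re.compile(r"(?:fragment\s+)?([A-Za-z_]\w*)\s*:(.*?);", re.S)
IDENTIFIER = re.compile(r"[A-Za-z_]\w*")


def parser_refs_in_lexer_rules(grammar):
    """Return (lexer_rule, parser_rule) pairs for each invalid reference."""
    stripped = LITERAL_OR_SET.sub(" ", grammar)
    problems = []
    for name, body in RULE.findall(stripped):
        if not name[0].isupper():
            continue  # lowercase name: parser rule, may reference anything
        body = body.split("->")[0]  # ignore lexer commands such as '-> skip'
        for ref in IDENTIFIER.findall(body):
            if ref[0].islower():
                problems.append((name, ref))
    return problems


print(parser_refs_in_lexer_rules(GRAMMAR))
```

For the grammar from the report, this flags the reference to `token` inside `NUMBER`, the same reference that ANTLR rejects with error(160).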

renatahodovan · 2022-12-13T22:29:27Z

@bendrissou Did you manage to fix the grammar? Can we close this issue?

bendrissou · 2022-12-14T12:19:09Z

Yes, I have fixed my grammar. I only wanted to report the bug, which I found with my fuzzer.

bendrissou changed the title ~~Bug: Wrong python code generated~~ Wrong python code generated Dec 2, 2022

renatahodovan closed this as completed Dec 15, 2022
