Wrong python code generated #66

Closed
bendrissou opened this issue Dec 2, 2022 · 3 comments

bendrissou commented Dec 2, 2022

Grammarinator generates incorrect Python code for some grammars, in particular when a non-terminal appears in the definition of a token. Running grammarinator-process on such a grammar reports no error; the error only appears later, when generating inputs with grammarinator-generate.

The bug can be reproduced with the following grammar:

json
   : token
   ;

token
   : NUMBER   ;

NUMBER
   : '-'? INT ('.' [0-9] +)? token?
   ;

fragment INT
   : '0' | [1-9] [0-9]*
   ;

After running grammarinator-process and then grammarinator-generate, the following error is thrown:

Traceback (most recent call last):
  File "/home/bachir/.local/bin/grammarinator-generate", line 8, in <module>
    sys.exit(execute())
  File "/home/bachir/.local/lib/python3.8/site-packages/grammarinator/generate.py", line 281, in execute
    with Generator(unlexer_path=args.unlexer, unparser_path=args.unparser, rule=args.rule, out_format=args.out,
  File "/home/bachir/.local/lib/python3.8/site-packages/grammarinator/generate.py", line 71, in __init__
    self.unlexer_cls = import_entity('.'.join([unlexer, unlexer]))
  File "/home/bachir/.local/lib/python3.8/site-packages/grammarinator/generate.py", line 59, in import_entity
    return getattr(importlib.import_module('.'.join(steps[0:-1])), steps[-1])
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 844, in exec_module
  File "<frozen importlib._bootstrap_external>", line 981, in get_code
  File "<frozen importlib._bootstrap_external>", line 911, in source_to_code
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/bachir/grammar-mutation/json/grammarinator-bugs/13/generate/JSONUnlexer.py", line 42
    return current
    ^
IndentationError: expected an indented block

When looking at the file JSONUnlexer.py, we see the following invalid code:

        if self.unlexer.max_depth >= 0:
            for _ in self.zero_or_one():

        return current
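
The problem in the fragment above can be reproduced without Grammarinator by passing the text to Python's built-in `compile()`. This is a minimal sketch: the wrapping `def NUMBER(self)` is an assumption added only so the fragment compiles as a standalone unit, and `syntax_problem` is a hypothetical helper, not part of Grammarinator.

```python
# The fragment from JSONUnlexer.py, wrapped in a hypothetical method
# (the name NUMBER is an assumption) so it can be compiled on its own.
GENERATED = (
    "def NUMBER(self):\n"
    "    if self.unlexer.max_depth >= 0:\n"
    "        for _ in self.zero_or_one():\n"
    "\n"
    "    return current\n"
)


def syntax_problem(source, filename="<generated>"):
    """Return a short description of the first syntax error, or None."""
    try:
        # compile() only parses the source; nothing is executed.
        compile(source, filename, "exec")
    except SyntaxError as exc:  # IndentationError is a subclass of SyntaxError
        return f"{type(exc).__name__} at line {exc.lineno}: {exc.msg}"
    return None


print(syntax_problem(GENERATED))
```

A check like this could be run on the generated unlexer right after grammarinator-process, so that the failure shows up before grammarinator-generate tries to import the module.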
@bendrissou changed the title from "Bug: Wrong python code generated" to "Wrong python code generated" on Dec 2, 2022
@renatahodovan
Owner

Hi @bendrissou

Thanks for the report! The generated Python code is incorrect because the input grammar is itself invalid. Grammarinator follows the syntax of ANTLR, where terminal (lexer) rules must not refer to non-terminal (parser) rules. If you generate a parser from the grammar above with ANTLR, you get the following error message:

error(160): /work/dir/json.g4:11:29: reference to parser rule token in lexer rule NUMBER
warning(125): json.g4:8:5: implicit definition of token NUMBER in parser

If you turn token into a terminal rule (rename it to Token, since ANTLR lexer rule names start with an uppercase letter), it will work as expected (both with Grammarinator and ANTLR).

You are right, though, that it would be helpful for grammarinator-process to throw an error if the input grammar is not correct. I'll think about how to handle all the possible input errors (without explicitly validating the grammar with ANTLR).
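
For illustration only, the specific error in this issue (a lexer rule referring to a parser rule) can be detected with a rough text scan. The sketch below is not how Grammarinator or ANTLR validate grammars; `parser_refs_in_lexer_rules` and both regular expressions are hypothetical, and the scan ignores comments, actions, modes and other ANTLR features.

```python
import re

# Hypothetical helper, not part of Grammarinator or ANTLR.
# Matches "name : body ;" with an optional leading "fragment" keyword.
RULE_RE = re.compile(r"\b(?:fragment\s+)?([A-Za-z_]\w*)\s*:(.*?);", re.S)
# String literals and character sets; their contents are not rule references.
LITERAL_RE = re.compile(r"'(?:\\.|[^'\\])*'|\[(?:\\.|[^\]\\])*\]")


def parser_refs_in_lexer_rules(grammar):
    """Return (lexer_rule, parser_rule) pairs for each invalid reference."""
    # Blank out literals first so a ';' or a letter inside them is ignored.
    cleaned = LITERAL_RE.sub(" ", grammar)
    problems = []
    for match in RULE_RE.finditer(cleaned):
        name, body = match.group(1), match.group(2)
        if not name[0].isupper():
            continue  # parser rules may reference both kinds of rule
        body = body.split("->")[0]  # drop lexer commands such as "-> skip"
        for ref in re.findall(r"[A-Za-z_]\w*", body):
            if ref[0].islower():
                problems.append((name, ref))
    return problems


# The grammar from the report above.
BROKEN = """
json
   : token
   ;

token
   : NUMBER   ;

NUMBER
   : '-'? INT ('.' [0-9] +)? token?
   ;

fragment INT
   : '0' | [1-9] [0-9]*
   ;
"""

# The suggested fix: rename token to Token everywhere.
FIXED = BROKEN.replace("token", "Token")

print(parser_refs_in_lexer_rules(BROKEN))
print(parser_refs_in_lexer_rules(FIXED))
```

On the grammar from the report, the scan flags the reference to token inside NUMBER; after the rename to Token it reports nothing.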

@renatahodovan
Owner

@bendrissou Did you manage to fix the grammar? Can we close this issue?

@bendrissou
Author

Yes, I have fixed my grammar. I only wanted to report the bug, which I found with my fuzzer.
