I think I found a bug with parse table caching and the start keyword to
yacc.yacc().
This script illustrates the problem:

""" Nasty behavior for start=
"""

tokens = ['FOO', 'BAR']

t_FOO = r'foo'
t_BAR = r'bar'

def p_foo_bar(p):
    ' foo_bar : FOO BAR'
    p[0] = 'have foobar'

def p_bar(p):
    ' bar : BAR '
    p[0] = 'have bar'

if __name__ == '__main__':
    import os
    from ply import lex, yacc
    lex.lex()

    # Remove previously written parse tables
    if os.path.exists('parsetab.py'):
        os.unlink('parsetab.py')
    if os.path.exists('parsetab.pyc'):
        os.unlink('parsetab.pyc')

    # Generate a parser with a non-default start rule
    parser = yacc.yacc(start='bar')           # no error if commenting
    assert parser.parse('bar') == 'have bar'  # out these two lines

    # Generate a parser with the default start rule and another tabmodule
    parser = yacc.yacc(start='foo_bar', tabmodule='another')
    # This works
    assert parser.parse('foobar') == 'have foobar'

    # Generate a parser with the default start rule and the default tabmodule
    parser = yacc.yacc(start='foo_bar')
    # The following fails with "yacc: Syntax error at line 1, token=FOO"
    assert parser.parse('foobar') == 'have foobar'
Investigating further, I think what is happening is that the changes to the
start symbol around line 3129 of yacc.py get written to the parsetab module,
but they do not change the signature of the parsetab module. When yacc.yacc()
gets called with another start symbol (or the default), it reads the lex /
yacc symbols from the relevant module or class, checks the signature, detects
that the signature matches the cached parsetab signature, and uses the cached
parsetab, even though the specified (or default) start symbol differs from the
start symbol in the previously written parsetab. This can be very confusing,
because the start symbol actually used depends on which one got written first.
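To make the suspected sequence concrete, here is a toy model of a signature-checked table cache. It is only a sketch of my reading of the behavior, not PLY's actual code: `signature`, `build_parser` and the `cache` dict are hypothetical stand-ins, and the digest deliberately covers the grammar rules only.

```python
import hashlib

# The same two rules as in the script above.
RULES = ["foo_bar : FOO BAR", "bar : BAR"]

def signature(rules):
    # Hypothetical digest over the grammar rules only.
    # The start symbol is deliberately NOT part of it.
    h = hashlib.sha256()
    for rule in rules:
        h.update(rule.encode("utf-8"))
    return h.hexdigest()

# Stands in for the table modules on disk:
# tabmodule name -> (signature, start symbol baked into the tables)
cache = {}

def build_parser(rules, start, tabmodule="parsetab"):
    """Return the start symbol the resulting parser would really use."""
    sig = signature(rules)
    cached = cache.get(tabmodule)
    if cached is not None and cached[0] == sig:
        # Signature matches, so the cached tables are reused,
        # including whatever start symbol they were built with.
        return cached[1]
    cache[tabmodule] = (sig, start)
    return start

first = build_parser(RULES, start="bar")
second = build_parser(RULES, start="foo_bar", tabmodule="another")
third = build_parser(RULES, start="foo_bar")
print(first, second, third)
```

In this model the third call asks for foo_bar but gets the tables built for bar, which matches the syntax error on FOO that the script runs into.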
It wasn't clear to me what the right fix for this is. I wonder whether
yacc() should add the start symbol to the lexer / grammar symbols before
checking the signature, something like:
diff --git a/ply/yacc.py b/ply/yacc.py
index f70439e..e50d81c 100644
--- a/ply/yacc.py
+++ b/ply/yacc.py
@@ -3054,6 +3054,10 @@ def yacc(method='LALR', debug=yaccdebug, module=None, tabmodule=tab_module, star
     else:
         pdict = get_caller_module_dict(2)
 
+    # Set start symbol if specified
+    if start is not None:
+        pdict['start'] = start
+
     # Collect parser information from the dictionary
     pinfo = ParserReflect(pdict,log=errorlog)
     pinfo.get_all()
This does change the signature returned by pinfo.signature(), so it will force
yacc() to regenerate the parsetab module unless the explicit start symbol is
the same as before.
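The effect I am relying on can be sketched in isolation. This is a toy digest, not pinfo.signature() itself: `signature`, `with_start` and the `base` dict are made up for illustration, and I am assuming the real signature covers a 'start' entry in the dictionary in a similar way.

```python
import hashlib

def signature(pdict):
    # Hypothetical digest over everything in the symbol dictionary,
    # including a 'start' entry when one is present.
    h = hashlib.sha256()
    for key in sorted(pdict):
        h.update(("%s=%s;" % (key, pdict[key])).encode("utf-8"))
    return h.hexdigest()

# Stand-in for the grammar symbols collected from the module.
base = {"p_foo_bar": "foo_bar : FOO BAR", "p_bar": "bar : BAR"}

def with_start(pdict, start=None):
    # Mirrors the idea of the patch: record the start symbol before signing.
    pdict = dict(pdict)  # copy, so the caller's dict is left alone
    if start is not None:
        pdict["start"] = start
    return pdict

sig_default = signature(with_start(base))
sig_bar = signature(with_start(base, "bar"))
sig_bar_again = signature(with_start(base, "bar"))
sig_foo_bar = signature(with_start(base, "foo_bar"))

print(sig_bar == sig_bar_again)  # same start symbol: cache stays valid
print(sig_bar != sig_foo_bar)    # different start symbol: regenerate
print(sig_bar != sig_default)    # explicit vs. no start: regenerate
```

One side effect visible in this toy: passing the default start symbol explicitly also changes the digest, so it would cost one regeneration that is not strictly needed, but it would not produce a wrong parse.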
Thanks a lot for Ply, I have gotten good use out of it.