cannot mix strings and tokens #22

Closed
GoogleCodeExporter opened this issue Apr 17, 2015 · 5 comments
Comments

@GoogleCodeExporter

The code below

-----
from lepl import *

v = Token('[a-z]+') & Token(' +') & String()
v.parse('aaa "aaa"')
-----

gives the error

------
lepl.lexer.support.LexerError: The grammar contains a mix of Tokens and 
non-Token matchers at the top level.  If Tokens are used then non-token 
matchers that consume input must only appear "inside" Tokens.  The non-Token 
matchers include: Any(None); Literal('"'); Lookahead(Literal, True); 
Literal('"'); Literal('"'); Literal('\\').
------

Trying to tokenize the string fails as well,

-------
from lepl import *

v = Token('[a-z]+') & Token(' +') & Token(String())
v.parse('aaa "aaa"')
-------

as the code above gives

-------
lepl.lexer.support.LexerError: A Token was specified with a matcher, but the 
matcher could not be converted to a regular expression: And(NfaRegexp, 
Transform, NfaRegexp)
-------


Original issue reported on code.google.com by wrob...@gmail.com on 26 Dec 2011 at 6:10

@GoogleCodeExporter

hi.  the main issue here isn't a bug - it's how lepl works.  you can either 
work in tokens, or not, but not both.  that's what the error message says.

trying to tokenize String fails because String is too complex for lepl to 
tokenize automatically.  it might be possible to write String so that it can be 
converted automatically, and i will add that to the list of things to do, but 
meantime you can simply define your own regular expression:

  myString = Regexp("'[^']*'")

or similar.
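Since lepl's regexp syntax follows Python's, the suggested pattern can be sanity-checked with the standard `re` module (a minimal sketch that does not use lepl itself):

```python
import re

# The suggested pattern for a simple single-quoted string.
# It deliberately has no escape handling.
my_string = re.compile(r"'[^']*'")

assert my_string.fullmatch("'hello'") is not None
# A backslash-escaped quote is not supported by this simple form:
assert my_string.fullmatch("'it\\'s'") is None
```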

andrew

Original comment by acooke....@gmail.com on 1 Jan 2012 at 10:55

  • Changed state: WontFix

@GoogleCodeExporter

[deleted comment]

@GoogleCodeExporter

Well, I have indeed defined my own string

    string = Token('"[^"]+"') | Token("'[^']+'")

However, I have no idea how to allow escaping of apostrophe or quotation-mark
characters with a backslash, i.e. "test\"string" or 'test\'string'.

Using Python regular expressions that would be

     r""""(([^"]|\")+)"|'(([^']|\')+)'"""

w

Original comment by wrob...@gmail.com on 1 Jan 2012 at 11:48

@GoogleCodeExporter

the syntax for regexps should be the same as python, except that capturing 
groups are not supported.  so you need to replace each (...) with (?:...)
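That group rewrite can be illustrated with plain `re` (a hypothetical example pattern; lepl itself would reject the capturing form, while `re` accepts both):

```python
import re

# The same escaped-string pattern written with capturing groups (...)
# and with non-capturing groups (?:...) -- the only form lepl accepts.
capturing = r'"(([^"]|\\")+)"'
non_capturing = r'"(?:(?:[^"]|\\")+)"'

text = '"test\\"string"'  # i.e. "test\"string"
assert re.fullmatch(capturing, text)
assert re.fullmatch(non_capturing, text)
```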

andrew

Original comment by acooke....@gmail.com on 2 Jan 2012 at 12:11

@GoogleCodeExporter

Thanks for the tip.

The definition is as follows

    string = Token(r'"(?:[^"]|\\")+"') | Token(r"'(?:[^']|\\')+'")

But I wonder how the above differs from SingleLineString?

If it is not different, then I would like to report that the following code 
fails

"""
from lepl import *

v = Token('[a-z]+') & Token(' +') & Token(SingleLineString())
v.parse('aaa "aaa"')
"""

w

Original comment by wrob...@gmail.com on 5 Jan 2012 at 6:53
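The two token regexes from the comment above can be sanity-checked with Python's `re` module (a sketch independent of lepl, whose regexp syntax follows Python's):

```python
import re

# The escaped-string patterns from the final comment.
dquoted = re.compile(r'"(?:[^"]|\\")+"')
squoted = re.compile(r"'(?:[^']|\\')+'")

assert dquoted.fullmatch('"test\\"string"')   # "test\"string"
assert squoted.fullmatch("'test\\'string'")   # 'test\'string'
# The + requires at least one character between the quotes:
assert dquoted.fullmatch('""') is None
```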
