zero width negative lookahead #23
I started to write a long post about how I need to implement what Scala parser combinators calls

from parsita import *

keywords = {"def", "class", "in", "out"}

class NonkeywordParsers(TextParsers):
    variable = pred(reg(r'[a-zA-Z_][a-zA-Z0-9_]*'), lambda x: x not in keywords, 'nonkeyword')

assert NonkeywordParsers.variable.parse('foo') == Success('foo')
NonkeywordParsers.variable.parse('class').or_die()
# parsita.state.ParseError: Expected nonkeyword but found 'class'
# Line 1, character 1
# class
# ^

I probably still need to implement
Thanks. I'm trying to generalize it though, so that I can condition on a parser instance. I'd expect the following to work as well, but it quietly succeeds:

from parsita import *

class NonkeywordParsers(TextParsers):
    keyword = lit("class")
    variable = (keyword & failure("Unexpected keyword")) | reg(r'[a-zA-Z_][a-zA-Z0-9_]*')

NonkeywordParsers.variable.parse('foo') == Success('foo')
NonkeywordParsers.variable.parse('class').or_die()

I find that a little surprising.
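For anyone else puzzled by the quiet success, here is a minimal sketch of why an always-failing parser inside one branch of a choice does not stop the parse. This is a toy written from scratch in plain Python, not Parsita's implementation: `lit`, `reg`, `fail`, `seq`, and `alt` are hypothetical stand-ins for `lit`, `reg`, `failure`, `&`, and `|`, and the toy assumes a simple ordered choice.

```python
import re

# Toy model: a parser is a function (text, pos) -> end position, or None on failure.

def lit(word):
    def parse(text, pos):
        return pos + len(word) if text.startswith(word, pos) else None
    return parse

def reg(pattern):
    compiled = re.compile(pattern)
    def parse(text, pos):
        match = compiled.match(text, pos)
        return match.end() if match else None
    return parse

def fail():
    # Always fails; it is an ordinary failure, not a fatal error.
    return lambda text, pos: None

def seq(first, second):
    def parse(text, pos):
        mid = first(text, pos)
        return None if mid is None else second(text, mid)
    return parse

def alt(left, right):
    # If the left branch fails for any reason, the right branch is
    # tried again from the same starting position.
    def parse(text, pos):
        end = left(text, pos)
        return end if end is not None else right(text, pos)
    return parse

variable = alt(seq(lit("class"), fail()), reg(r"[a-zA-Z_][a-zA-Z0-9_]*"))

print(variable("class", 0))  # 5: the left branch failed, the regex branch matched
```

In this toy the failure in the left branch is indistinguishable from the keyword simply not matching, so the identifier regex still gets its turn and accepts `class`.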
What does work is this:

from parsita import *

keywords = {"def", "class", "in", "out"}

class NonkeywordParsers(TextParsers):
    keyword = lit("def", "class", "in", "out")
    variable = pred(reg(r'[a-zA-Z_][a-zA-Z0-9_]*'), lambda x: not isinstance(NonkeywordParsers.keyword.parse(x), Success), 'nonkeyword')

assert NonkeywordParsers.variable.parse('foo') == Success('foo')
NonkeywordParsers.variable.parse('class').or_die()

It's a bit ugly though, referencing the
In this case,
so EDIT: I get it, so
No, it is a
Parsita does not really have a concept of "This is not just a failure, but getting here is a catastrophic failure that should cause all alternatives to be ignored". I guess you could do this if you wanted to blow up the parser:

from parsita import *

keywords = {"def", "class", "in", "out"}

def abort(message):
    raise ParseError(message)

class NonkeywordParsers(TextParsers):
    keyword = lit("class")
    variable = (keyword > (lambda x: abort('Unexpected keyword'))) | reg(r'[a-zA-Z_][a-zA-Z0-9_]*')

assert NonkeywordParsers.variable.parse('foo') == Success('foo')
NonkeywordParsers.variable.parse('class').or_die()
I see now why you want

from typing import Generic

from parsita import *
from parsita.state import Input, Output, Convert, Reader, Continue, Backtrack

class NegationParser(Generic[Input, Output], Parser[Input, Output]):
    def __init__(self, parser: Parser[Input, Output]):
        super().__init__()
        self.parser = parser

    def consume(self, reader: Reader[Input]):
        status = self.parser.consume(reader)
        if isinstance(status, Backtrack):
            return Continue(reader, None)
        else:
            return Backtrack(reader, lambda: self.name_or_nothing())

    def name_or_nothing(self):
        return 'not ' + self.parser.name_or_nothing()

    def __repr__(self):
        return '-' + self.name_or_nothing()

def neg(parser):
    return NegationParser(parser)

class NonkeywordParsers(TextParsers):
    keyword = lit("def", "class", "in", "out")
    variable = neg(keyword) >> reg(r'[a-zA-Z_][a-zA-Z0-9_]*')

assert NonkeywordParsers.variable.parse('foo') == Success('foo')
NonkeywordParsers.variable.parse('class').or_die()

Alternatively, in the EDIT: hmm,
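To make the zero-width behaviour of a negation parser concrete without depending on Parsita internals, here is a rough standalone sketch. It is a toy in plain Python, not the library's code: the parser representation (a function returning an end position or `None`) and the helpers `lit`, `reg`, `neg`, and `seq` are all assumptions made for illustration.

```python
import re

# Toy model: a parser is a function (text, pos) -> end position, or None on failure.

def lit(*words):
    def parse(text, pos):
        for word in words:
            if text.startswith(word, pos):
                return pos + len(word)
        return None
    return parse

def reg(pattern):
    compiled = re.compile(pattern)
    def parse(text, pos):
        match = compiled.match(text, pos)
        return match.end() if match else None
    return parse

def neg(parser):
    # Zero-width negative lookahead: succeed at the SAME position when the
    # inner parser fails, and fail when the inner parser succeeds.
    def parse(text, pos):
        return pos if parser(text, pos) is None else None
    return parse

def seq(first, second):
    def parse(text, pos):
        mid = first(text, pos)
        return None if mid is None else second(text, mid)
    return parse

identifier = reg(r"[a-zA-Z_][a-zA-Z0-9_]*")

# Prefix-based keyword check: also rejects identifiers that merely start with a keyword.
variable = seq(neg(lit("def", "class", "in", "out")), identifier)

# Keyword check that requires a word boundary after the keyword.
strict_variable = seq(neg(reg(r"(?:def|class|in|out)\b")), identifier)

print(variable("foo", 0))           # 3
print(variable("class", 0))         # None
print(variable("inner", 0))         # None, because "in" matches as a prefix
print(strict_variable("inner", 0))  # 5
```

One thing the toy shows is that a lookahead on a bare literal rejects any identifier beginning with a keyword, such as `inner`. Whether that matters for the `neg(keyword)` version above depends on how the keyword parser treats word boundaries, which I have not checked.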
Could we have a way to do a zero width negative lookahead?
The first thing I tried to do was define a set of reserved words, and then define identifiers as "words but not keywords".
In terms of syntax you could use unary minus, as it binds tightly. Something like:
EDIT: I realise that I could do it using just regexes, but then word boundaries get a bit clunky, as I'd need to embed the whitespace handling in the regex, which isn't ideal.
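For reference, the regex-only route could look roughly like the sketch below, using the standard library `re` module rather than Parsita. The names `KEYWORDS`, `IDENTIFIER`, and `is_identifier` are made up for this example, and it covers only the keyword exclusion, not the whitespace handling mentioned in the edit.

```python
import re

KEYWORDS = ("def", "class", "in", "out")

# (?!...) is a zero-width negative lookahead: it consumes nothing and fails
# the match if a keyword followed by a word boundary starts at this position.
IDENTIFIER = re.compile(
    r"(?!(?:%s)\b)[a-zA-Z_][a-zA-Z0-9_]*" % "|".join(KEYWORDS)
)

def is_identifier(word):
    """Return True if the whole word is an identifier and not a keyword."""
    return IDENTIFIER.fullmatch(word) is not None

print(is_identifier("foo"))     # True
print(is_identifier("class"))   # False
print(is_identifier("classy"))  # True
```

The `\b` inside the lookahead is what lets `classy` and `input` through while still rejecting the bare keywords.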