Add support for bracket expressions #12
Merged
This PR adds basic support for POSIX bracket expressions such as:
[a-zA-Z0-9]
[^\n\r\t]
[aceghi-z]
Both parsing (parsing a bracket expression as part of a regular expression) and unparsing (turning a set of `char` into a bracket expression) are supported. Both directions involve some care to correctly handle characters that have a special meaning either in regular expressions (e.g. `.`, `|`, `*`) or in bracket expressions (`]`, `^`, `-`). For example:

- `^` has a special meaning (complement) at the beginning of a bracket expression, but not elsewhere
- `-` has a special meaning (range) except at the beginning or end of a bracket expression
- `]` has a special meaning (terminate the bracket expression) except at the beginning of a bracket expression (i.e. after the opening `[` or `[^`)

Unparsing prefers shorter output expressions: for example, for the set of all characters except `e`, unparsing returns `[^e]`, not `[\000-df-\255]`.
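
The positional rules and the "prefer the shorter form" behaviour described above can be sketched roughly as follows. This is an illustration in Python, not the code from this PR: the names `parse_bracket`, `unparse_bracket`, `_body` and `ALPHABET` are hypothetical, the 8-bit alphabet is an assumption based on the `\000` to `\255` example, and backslash escapes and classes such as `[:alpha:]` are deliberately left out.

```python
# Assumption: characters are 8-bit, matching the \000..\255 range in the example.
ALPHABET = frozenset(chr(i) for i in range(256))
SPECIAL = "]^-"


def parse_bracket(s, i=0):
    """Parse a bracket expression starting at s[i] == '['.

    Returns (frozenset of chars, index just past the closing ']')."""
    if i >= len(s) or s[i] != "[":
        raise ValueError("expected '['")
    i += 1
    negate = i < len(s) and s[i] == "^"  # '^' complements only right after '['
    if negate:
        i += 1
    chars, first = set(), True
    while True:
        if i >= len(s):
            raise ValueError("unterminated bracket expression")
        c = s[i]
        if c == "]" and not first:  # ']' is literal only in first position
            i += 1
            break
        first = False
        # '-' forms a range unless it is first or directly before the closing ']'
        if i + 2 < len(s) and s[i + 1] == "-" and s[i + 2] != "]":
            hi = s[i + 2]
            if ord(hi) < ord(c):
                raise ValueError("reversed range")
            chars.update(chr(k) for k in range(ord(c), ord(hi) + 1))
            i += 3
        else:
            chars.add(c)
            i += 1
    return (ALPHABET - chars if negate else frozenset(chars)), i


def _body(chars, after_caret):
    """Body text for `chars`, or None if this sketch cannot express it."""
    codes = sorted(ord(c) for c in chars)
    singles, ranges = [], []
    k = 0
    while k < len(codes):
        j = k
        while j + 1 < len(codes) and codes[j + 1] == codes[j] + 1:
            j += 1
        lo, hi = codes[k], codes[j]
        k = j + 1
        # Peel special characters off both ends so they are never range endpoints.
        while lo <= hi and chr(lo) in SPECIAL:
            singles.append(chr(lo))
            lo += 1
        while lo <= hi and chr(hi) in SPECIAL:
            singles.append(chr(hi))
            hi -= 1
        if hi - lo >= 2:
            ranges.append(chr(lo) + "-" + chr(hi))
        else:
            singles.extend(chr(c) for c in range(lo, hi + 1))
    plain = "".join(ranges) + "".join(c for c in singles if c not in SPECIAL)
    # Literal ']' goes first, literal '^' after the plain part, literal '-' last.
    body = ("]" if "]" in singles else "") + plain
    body += "^" if "^" in singles else ""
    body += "-" if "-" in singles else ""
    if body.startswith("^") and not after_caret:
        if "-" not in singles:
            return None  # a lone '^' cannot open a non-complemented expression
        body = "-^"      # a leading '-' is literal, so '^' is no longer first
    return body or None  # an empty body is not a valid bracket expression


def unparse_bracket(chars):
    """Return the shorter of the plain and the complemented form."""
    chars = frozenset(chars)
    candidates = []
    pos = _body(chars, after_caret=False)
    if pos is not None:
        candidates.append("[" + pos + "]")
    neg = _body(ALPHABET - chars, after_caret=True)
    if neg is not None:
        candidates.append("[^" + neg + "]")
    return min(candidates, key=len)
```

With these definitions, `unparse_bracket(ALPHABET - {"e"})` returns `"[^e]"`, and `parse_bracket("[]a]")` treats the leading `]` as a literal. Peeling special characters off the ends of each run keeps the output unambiguous, at the cost of occasionally producing a slightly longer expression than strictly necessary.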