@@ -892,6 +892,63 @@ that their regular expression needs to match every piece of data in the line,
 including what they want to match. Write just enough to match the data you're
 looking for, no more, no less.

+=head1 X<Tilde for nesting structures|tilde,regex;~,regex>
+
+The ~ operator is a helper for matching nested subrules with a
+specific terminator as the goal. It is designed to be placed between
+an opening and closing bracket, like so:
+
+    / '(' ~ ')' <expression> /
+
+However, it mostly ignores the left argument, and operates on the next
+two atoms (which may be quantified). Its operation on those next two
+atoms is to "twiddle" them so that they are actually matched in
+reverse order. Hence the expression above, at first blush, is merely
+shorthand for:
+
+    / '(' <expression> ')' /
+
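+As a minimal sketch of that equivalence, the two forms can be compared on
+well-formed input. The inner pattern \d+ and the subject strings are
+illustrative assumptions standing in for <expression>:
+
+=begin code
+# Illustrative only: \d+ stands in for <expression>.
+my $twiddled = / '(' ~ ')' \d+ /;   # goal ')' written before the inner atom
+my $plain    = / '(' \d+ ')' /;     # the same atoms, written in match order
+
+for '(42)', 'x(7)y' -> $subject {
+    # On well-formed input both forms should match the same substring.
+    say ($subject ~~ $twiddled).Str eq ($subject ~~ $plain).Str;
+}
+=end code
+
+The two forms are only expected to differ when the closing atom is missing.
+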
+But beyond that, when it rewrites the atoms it also inserts the
+apparatus that will set up the inner expression to recognize the
+terminator, and to produce an appropriate error message if the inner
+expression does not terminate on the required closing atom. So it
+really does pay attention to the left bracket as well, and it actually
+rewrites our example to something more like:
+
+=begin code :skip-test
+$<OPEN> = '(' <SETGOAL: ')'> <expression> [ $GOAL || <FAILGOAL> ]
+=end code
+
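+The failure branch of that rewrite can be sketched by supplying a FAILGOAL
+method in a grammar. This assumes an implementation that calls FAILGOAL on
+the grammar with the goal text when the closing atom is not found. The
+names Parens, TOP and number are hypothetical, and the message wording is
+made up for the example:
+
+=begin code
+# Hypothetical grammar, for illustration only.
+grammar Parens {
+    token TOP    { '(' ~ ')' <number> }
+    token number { \d+ }
+
+    # Assumed hook: invoked when <number> matched but ')' did not follow.
+    method FAILGOAL($goal) {
+        die "Cannot find $goal near position { self.pos }";
+    }
+}
+
+say Parens.parse('(42)').defined;   # closing paren present
+try Parens.parse('(42');            # goal missing, so FAILGOAL dies
+say "parse failed: $!" if $!;
+=end code
+
+Dying inside the hook is one design choice. A grammar could instead record
+the problem and let the match fail quietly.
+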
+Note that you can use this construct to set up expectations for a
+closing construct even when there's no opening bracket:
+
+    / <?> ~ ')' \d+ /
+
+Here <?> returns true on the first null string.
+
+By default the error message uses the name of the current rule as an
+indicator of the abstract goal of the parser at that point. However,
+often this is not informative, especially when rules are
+named according to an internal scheme that will not make sense to the
+user. The :dba ("doing business as") adverb may be used to set up a
+more informative name for what the following code is trying to parse:
+
+    token postfix:sym<[ ]> { :dba('array subscript') '[' ~ ']' <expression> }
+
+Then instead of getting a message like:
+
+=begin code :skip-test
+Unable to parse expression in postfix:sym<[ ]>; couldn't find
+final ']'
+=end code
+
+you'll get a message like:
+
+=begin code :skip-test
+Unable to parse expression in array subscript; couldn't find final
+']'
+=end code
+
 =head1 X<Subrules|declarator,regex>

 Just like you can put pieces of code into subroutines, you can also put