Index the various bits of regex syntax

Mouq · Mouq · commit a3701a0da98c · 2015-01-26T22:44:47.000-05:00
diff --git a/lib/Language/regexes.pod b/lib/Language/regexes.pod
@@ -79,7 +79,7 @@ Otherwise it is L<Nil>.
 
 =head1 Wildcards and character classes
 
-=head2 Dot to match any character
+=head2 X<Dot to match any character|regex syntax,.>
 
 An unescaped dot C<.> in a regex matches any single character.
 
@@ -101,7 +101,7 @@ because there is no character to match before C<per> in the target string.
 There are predefined character classes of the form C<\w>. Its negation is
 written with an upper-case letter, C<\W>.
 
-=item \d and \D
+=item X<\d and \D|regex syntax,\d;regex syntax,\D>
 
 C<\d> matches a single digit (Unicode property C<N>), and C<\D> matches a
 single character that is not a digit.
@@ -119,7 +119,7 @@ Examples for digits are
     U+0E53 ๓ THAI DIGIT THREE
     U+1B56 ᭖ BALINESE DIGIT SIX
 
-=item \h and \H
+=item X<\h and \H|regex syntax,\h;regex syntax,\H>
 
 C<\h> matches a single horizontal whitespace character. C<\H> matches a
 single character that is not a horizontal whitespace character.
@@ -134,27 +134,27 @@ Examples for horizontal whitespace characters are
 Vertical whitespaces like newline characters are explicitly excluded; those
 can be matched with C<\v>, and C<\s> matches any kind of whitespace.
 
-=item \n and \N
+=item X<\n and \N|regex syntax,\n;regex syntax,\N>
 
 C<\n> matches a single, logical newline character. C<\n> is supposed to also
 match a Windows CR LF codepoint pair; though it is unclear whether the magic
 happens at the time that external data is read, or at regex match time. C<\N>
 matches a single character that's not a logical newline.
 
-=item \s and \S
+=item X<\s and \S|regex syntax,\s;regex syntax,\S>
 
 C<\s> matches a single whitespace character. C<\S> matches a single
 character that is not a whitespace character.
 
 TODO: examples
 
-=item \t and \T
+=item X<\t and \T|regex syntax,\t;regex syntax,\T>
 
 C<\t> matches a single tab/tabulation character, C<U+0009>. (Note that
 exotic tabs like the C<U+000B VERTICAL TABULATION> character are not included
 here). C<\T> matches a single character that is not a tab.
 
-=item \v and \V
+=item X<\v and \V|regex syntax,\v;regex syntax,\V>
 
 C<\v> matches a single vertical whitespace character. C<\V> matches a single
 character that is not a vertical whitespace character.
@@ -169,7 +169,7 @@ Examples for vertical whitespace characters:
 
 Use C<\s> to match any kind of whitespace, not just vertical whitespace.
 
-=item \w and \W
+=item X<\w and \W|regex syntax,\w;regex syntax,\W>
 
 C<\w> matches a single word character, that is, a letter (Unicode category L),
 a digit or an underscore. C<\W> matches a single character that isn't a word
@@ -183,7 +183,7 @@ Examples of word characters:
     03F3 ϳ GREEK LETTER YOT
     0409 Љ CYRILLIC CAPITAL LETTER LJE
 
-=head2 Unicode properties
+=head2 X«Unicode properties|regex syntax,<:property>»
 
 The character classes so far are mostly for convenience; a more systematic
 approach is the use of Unicode properties. They are called in the form
@@ -265,7 +265,7 @@ C<< <:Ll+:N> >> or C<< <:Ll+:Number> >> or C<< <+ :Lowercase_Letter + :Number> >
 (Grouping of set operations with round parens inside character classes is
 supposed to work, but not supported by Rakudo at the time of writing).
 
-=head2 Enumerated character classes and ranges
+=head2 X«Enumerated character classes and ranges|regex syntax,<[ ]>;regex assertion,<-[ ]>»
 
 Sometimes the pre-existing wildcards and character classes are just not
 enough. Fortunately, defining your own is simple enough. Between C<< <[ ]> >>,
@@ -312,7 +312,7 @@ Quantifiers bind tighter than concatenation, so C<ab+> matches one C<a>
 followed by one or more C<b>s. This is different for quotes, so C<'ab'+>
 matches the strings C<ab>, C<abab>, C<ababab> etc.
 
-=head2 One or more: +
+=head2 X<One or more: +|regex syntax,+>
 
 The C<+> quantifier makes the preceding atom match one or more times, with
 no upper limit.
@@ -322,7 +322,7 @@ like this:
 
     / \w+ '=' \w+ /
 
-=head2 Zero or more: *
+=head2 X<Zero or more: *|regex syntax,*>
 
 The C<*> quantifier makes the preceding atom match zero or more times, with
 no upper limit.
@@ -331,19 +331,19 @@ For example, to allow optional whitespace between C<a> and C<b> you can write
 
     / a \s* b /
 
-=head2 Zero or one match: ?
+=head2 X<Zero or one match: ?|regex syntax,?>
 
 The C<?> quantifier makes the preceding atom match zero or one time.
 
-=head2 General quantifier: ** min..max
+=head2 X<General quantifier: ** min..max|regex quantifier,**>
 
 To quantify an atom an arbitrary number of times, you can say for example
 C<a ** 2..5> to match the character C<a> at least twice and at most five times.
 
 If the minimal and maximal number of matches are the same, a single integer
 suffices: C<a ** 5> matches C<a> exactly five times.
 
-=head1 Alternation
+=head1 X<Alternation|regex syntax,||>
 
 To match one of several possible alternatives, separate them by C<||>; the
 first matching alternative wins.
@@ -379,7 +379,7 @@ match.
 Anchors need to match successfully in order for the whole regex to match, but
 they do not use up characters while matching.
 
-=head2 C<^>, Start of String
+=head2 X«C<^>, Start of String|regex syntax,^»
 
 The C<^> assertion only matches at the start of the string.
 
@@ -388,7 +388,7 @@ The C<^> assertion only matches at the start of the string.
     say so 'perly'    ~~ /^ perl/;      # True
     say so 'perl'     ~~ /^ perl/;      # True
 
-=head2 C<^^>, Start of Line and C<$$>, End of Line
+=head2 X«C<^^>, Start of Line and C<$$>, End of Line|regex syntax,^^;regex syntax,$$»
 
 The C<^^> assertion matches at the start of a logical line. That is, either at
 the start of the string, or after a newline character.
@@ -420,7 +420,7 @@ leading space, and the third and fourth lines have two leading spaces each).
                                         #        and the end of line)
     say so $str ~~ / '."' $$/;          # True  (at the last line)
 
-=head2 C<<< << >>> and C<<< >> >>>, left and right word boundary
+=head2 X<<<<C<<< << >>> and C<<< >> >>>, left and right word boundary|regex syntax,<<;regex syntax,>>;regex syntax,«;regex syntax,»>>>>
 
 C<<< << >>> matches a left word boundary: a position where to the left there is
 a non-word character (or the start of the string), and to the right there is a
@@ -438,7 +438,7 @@ the end of the string.
     say so $str ~~ /<< own/;            # False
     say so $str ~~ /own >>/;            # True
 
-=head1 Grouping and Capturing
+=head1 X«Grouping and Capturing|regex syntax,( );regex syntax,[ ];regex syntax,$<capture> =»
 
 In regular (non-regex) Perl 6, you can use parentheses to group things
 together, usually to override operator precedence:
@@ -561,7 +561,7 @@ named captures:
 But there is a more convenient way to get named captures, discussed in the
 next section.
 
-=head1 Subrules
+=head1 X<Subrules|declarator,regex>
 
 Just as you can put pieces of code into subroutines, you can also put
 pieces of regex into named rules.
@@ -649,7 +649,7 @@ like C<:overlap> go along with the matching:
     #     aA
 
 
-=head2 Regex Adverbs
+=head2 X<Regex Adverbs|regex adverb,:ignorecase;regex adverb,:i>
 
 Adverbs that appear at the time of a regex declaration are part of the actual regex,
 and influence how the Perl 6 compiler translates the regex into binary code.
@@ -677,7 +677,7 @@ Brackets and parentheses limit the scope of an adverb:
     / (:i a b) c /          # matches 'ABc' but not 'ABC'
     / [:i a b] c /          # matches 'ABc' but not 'ABC'
 
-=head3 Ratchet
+=head3 X<Ratchet|regex adverb,:ratchet;regex adverb,:r>
 
 The C<:ratchet> or C<:r> adverb causes the regex engine not to backtrack.
 
@@ -710,7 +710,7 @@ to declaring ratcheting regex:
     # short for
     my regex thing { :r ... }
 
-=head3 Sigspace
+=head3 X<Sigspace|regex adverb,:sigspace;regex adverb,:s>
 
 The B<C<:sigspace>> or B<C<:s>> adverb makes whitespace significant in a regex.
 
@@ -809,7 +809,7 @@ matching adverbs only make sense while matching a string against a regex.
 They can never appear inside a regex, only on the outside - either as part of
 an C<m/.../> match, or as arguments to a match method.
 
-=head3 Continue
+=head3 X<Continue|matching adverb,:continue;matching adverb,:c>
 
 The C<:continue> or short C<:c> adverb takes an argument. The argument is the
 position where the regex should start to search. By default, it searches from
@@ -824,7 +824,7 @@ the start of the string, but C<:c> overrides that.
 
 TODO
 
-=head2 Global
+=head3 X<Global|regex adverb,:global;regex adverb,:g>
 
 Instead of searching for just one match and returning a L<Match|/type/Match>, search
 for every non-overlapping match and return them in a L<List|/type/List>.
@@ -837,7 +837,7 @@ for every non-overlapping match and return them in a L<List|/type/List>.
 
 C<:g> is a shortcut for C<:global>.
 
-=head3 Pos
+=head3 X<Pos|regex adverb,:pos;regex adverb,:p>
 
 Anchor the match at a specific position in the string: