Commit 2e92958

Changes capitalization (#2146) and anchors (to avoid #561)
1 parent 984ad94 commit 2e92958

5 files changed: +60 -44 lines changed

doc/Language/5to6-nutshell.pod6

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ features and idioms are not).
 Hence this should not be mistaken for a beginner tutorial or a promotional
 overview of Perl 6; it is intended as a technical reference for Perl 6
 learners with a strong Perl 5 background and for anyone porting Perl 5 code
-to Perl 6 (though note that L<#Automated Translation> might be more
+to Perl 6 (though note that L<#Automated translation> might be more
 convenient).

 A note on semantics; when we say "now" in this document, we mostly just

doc/Language/glossary.pod6

Lines changed: 8 additions & 1 deletion
@@ -865,7 +865,14 @@ is its usual acronym.

 =head1 property

-In this context, it either refers to an L<object property|https://docs.perl6.org/language/objects#index-entry-Property>, which is the value of an instance variable, or an L<Unicode property|https://docs.perl6.org/language/regexes#Unicode_Properties> which are codepoint features that allow programs to identify what kind of entity they represent, that is, if they are a letter, or a number, or something completely different like a control character.
+In this context, it either refers to an
+L<object property|/language/objects#index-entry-Property>,
+which is the value of an instance variable, or an
+L<Unicode property|/language/regexes#Unicode_properties>
+which are codepoint features that
+allow programs to identify what kind of entity they represent, that is, if they
+are a letter, or a number, or something completely different like a control
+character.

 X<|pugs>
 =head1 pugs

doc/Language/regexes.pod6

Lines changed: 47 additions & 38 deletions
@@ -10,7 +10,7 @@ Regular expressions, I<regexes> for short, are a sequence of characters that
 describe a pattern of text. Pattern matching is the process of matching
 those patterns to actual text.

-=head1 X<Lexical Conventions|quote,/ /;quote,rx;quote,m>
+=head1 X<Lexical conventions|quote,/ /;quote,rx;quote,m>

 Perl 6 has special syntax for writing regexes:

@@ -242,14 +242,15 @@ Note that the character classes C«<same>», C«<wb>» and C«<ww>» are
 so called zero-width assertions, which do not really match a
 character.

-=head2 X«Unicode Properties|regex,<:property>»
+=head2 X«Unicode properties|regex,<:property>»

 The character classes mentioned so far are mostly for convenience; another
 approach is to use Unicode character properties. These come in the form
 C«<:property>», where C<property> can be a short or long Unicode General
 Category name. These use pair syntax.

-To match against a Unicode property you can use either smartmatch or L<C<uniprop>|/routine/uniprop>:
+To match against a Unicode property you can use either smartmatch or
+L<C<uniprop>|/routine/uniprop>:

 "a".uniprop('Script'); # OUTPUT: «Latin␤»
 "a" ~~ / <:Script<Latin>> /; # OUTPUT: «「a」␤»
@@ -325,7 +326,7 @@ parentheses; for example:

 say $0 if 'perl6' ~~ /\w+(<:Ll+:N>)/ # OUTPUT: «「6」␤»

-=head2 X«Enumerated Character Classes and Ranges|regex,<[ ]>;regex,<-[ ]>»
+=head2 X«Enumerated character classes and ranges|regex,<[ ]>;regex,<-[ ]>»

 Sometimes the pre-existing wildcards and character classes are not
 enough. Fortunately, defining your own is fairly simple. Within C«<[ ]>»,
@@ -587,7 +588,7 @@ string of non-whitespace characters.
 Even in non-backtracking contexts, the alternation operator C<||> tries
 all the branches in order until the first one matches.

-=head1 X<Longest Alternation: C<|>|regex,|>
+=head1 X<Longest alternation: C<|>|regex,|>

 In short, in regex branches separated by C<|>, the longest token match wins,
 independent of the textual ordering in the regex. However, what C<|> really
@@ -649,7 +650,7 @@ Arrays can also be interpolated into a regex to achieve the same effect:
 my @increasingly-edible = <f fo foo food>;
 say 'food' ~~ /@increasingly-edible/; # OUTPUT: «「food」␤»

-This is documented further under L<Regex Interpolation|#Regex_Interpolation>,
+This is documented further under L<Regex Interpolation|#Regex_interpolation>,
 below.

 =head1 X<Conjunction: C<&&>|regex,&&>
@@ -686,7 +687,7 @@ Regexes search an entire string for matches. Sometimes this is not what
 you want. Anchors match only at certain positions in the string, thereby
 anchoring the regex match to that position.

-=head2 X<Start of String and End of String|regex,^;regex,$>
+=head2 X<Start of string and end of string|regex,^;regex,$>

 The C<^> anchor only matches at the start of the string:

@@ -729,7 +730,7 @@ The following is a multi-line string:
 # 'and' is at the start of a line -- not the string
 say so $str ~~ /^and /; # OUTPUT: «False␤»

-=head2 X<Start of Line and End of Line|regex,^^;regex,$$>
+=head2 X<Start of line and end of line|regex,^^;regex,$$>

 The C<^^> anchor matches at the start of a logical line. That is, either
 at the start of the string, or after a newline character. However, it does not
@@ -774,7 +775,7 @@ two leading spaces each.
 # matched at the last line
 say so $str ~~ / '."' $$/; # OUTPUT: «True␤»

-=head2 X«Word Boundary|regex, <|w>;regex, <!|w>»
+=head2 X«Word boundary|regex, <|w>;regex, <!|w>»

 To match any word boundary, use C«<|w>» or C«<?wb>». This is similar to
 X«C<\b>|regex deprecated,\b» of other languages.
@@ -860,7 +861,7 @@ lookahead and lookbehind assertions.
 Technically, anchors are also zero-width assertions, and they can look
 both ahead and behind.

-=head2 X<Lookahead Assertions|regex,before>
+=head2 X<Lookahead assertions|regex,before>

 To check that a pattern appears before another pattern, use a
 lookahead assertion via the C<before> assertion. This has the form:
@@ -942,7 +943,9 @@ These are, as in the case of lookahead, zero-width assertions which do not I<con
 say "atfoobar" ~~ / (.**3) .**2 <?after foo> bar /;
 # OUTPUT: «「atfoobar」␤ 0 => 「atf」␤»

-where we capture the first 3 of the 5 characters before bar, but only if C<bar> is preceded by C<foo>. The fact that the assertion is zero-width allows us to use part of the characters in the assertion for capture.
+where we capture the first 3 of the 5 characters before bar, but only if C<bar>
+is preceded by C<foo>. The fact that the assertion is zero-width allows us to
+use part of the characters in the assertion for capture.



@@ -1065,7 +1068,9 @@ it in a variable first:


 X<|:my>
-C<:my> helps scoping the C<$c> variable within the regex and beyond; in this case we can use it in the next sentence to show what has been matched inside the regex. This can be used for debugging inside regular expressions, for instance:
+C<:my> helps scoping the C<$c> variable within the regex and beyond; in this
+case we can use it in the next sentence to show what has been matched inside the
+regex. This can be used for debugging inside regular expressions, for instance:

 my $paragraph="line\nline2\nline3";
 $paragraph ~~ rx| :my $counter = 0; ( \V* { ++$counter } ) *%% \n |;
@@ -1086,7 +1091,8 @@ say HasOur.parse('Þor is mighty'); # OUTPUT: «「Þor is mighty」␤»
 say $HasOur::our; # OUTPUT: «Þor␤»
 =end code

-Once the parsing has been done successfully, we use the FQN name of the C<$our> variable to access its value, that can be none other than C<Þor>
+Once the parsing has been done successfully, we use the FQN name of the C<$our>
+variable to access its value, that can be none other than C<Þor>.

 =head2 X<Named captures|regex, Named captures>

@@ -1136,9 +1142,10 @@ C<\K>.
 say 'abc' ~~ / a <( b )> c/; # OUTPUT: «「b」␤»
 say 'abc' ~~ / <(a <( b )> c)>/; # OUTPUT: «「bc」␤»

-As in the example above, you can see C«<(» sets the start point and C«)>» sets the
-endpoint; since they are actually independent of each other, the inner-most start point
-wins (the one attached to C<b>) and the outer-most end wins (the one attached to C<c>).
+As in the example above, you can see C«<(» sets the start point and C«)>» sets
+the endpoint; since they are actually independent of each other, the inner-most
+start point wins (the one attached to C<b>) and the outer-most end wins (the one
+attached to C<c>).

 =head1 Substitution

@@ -1430,7 +1437,7 @@ list of predefined subrules is listed in
 L<S05-regex|https://design.perl6.org/S05.html#Predefined_Subrules> of design
 documents.

-=head1 X<Regex Interpolation|regex, Regex Interpolation>
+=head1 X<Regex interpolation|regex, Regex Interpolation>

 If you want to build a regex using a pattern given at runtime, regex
 interpolation is what you are looking for.
@@ -1575,7 +1582,7 @@ like C<:overlap> are appended to the match call:
 }
 # OUTPUT: «ba␤aA␤»

-=head2 X<Regex Adverbs|regex adverb,:ignorecase;regex adverb,:i>
+=head2 X<Regex adverbs|regex adverb,:ignorecase;regex adverb,:i>

 Adverbs that appear at the time of a regex declaration are part of the
 actual regex and influence how the Perl 6 compiler translates the regex into
@@ -2150,12 +2157,13 @@ my $string = 'PostgreSQL is an SQL database!';
 say $string ~~ /(.+)(SQL) (.+) $1/; # OUTPUT: 「PostgreSQL is an SQL」
 =end code

-What happens in the above example is that the string has to be matched against the
-second occurrence of the word I<SQL>, eating all characters before and leaving out
-the rest.
+What happens in the above example is that the string has to be matched against
+the second occurrence of the word I<SQL>, eating all characters before and
+leaving out the rest.

-Since it is possible to execute a piece of code within a regular expression, it is also possible
-to inspect the L<Match|/type/Match> object within the regular expression itself:
+Since it is possible to execute a piece of code within a regular expression, it
+is also possible to inspect the L<Match|/type/Match> object within the regular
+expression itself:

 =begin code :preamble<my $string = '';>
 my $iteration = 0;
@@ -2186,10 +2194,10 @@ Capture 2 = is an
 showing that the string has been split around the second occurrence of I<SQL>, that
 is the repetition of the first capture (C<$/[1]>).

-With that in place, it is now possible to see how the engine backtracks
-to find the above match: it does suffice to move the C<show-captures>
-in the middle of the regular expression, in particular before the repetition of the
-first capture C<$1> to see it in action:
+With that in place, it is now possible to see how the engine backtracks to find
+the above match: it does suffice to move the C<show-captures> in the middle of
+the regular expression, in particular before the repetition of the first capture
+C<$1> to see it in action:

 =begin code :preamble<my $string = '';>
 my $iteration = 0;
@@ -2207,8 +2215,8 @@ sub show-captures( Match $m ){
 $string ~~ / (.+)(SQL) (.+) { show-captures( $/ ); } $1 /;
 =end code

-The output will be much more verbose and will show several iterations, with the last one
-being the I<winning>. The following is an excerpt of the output:
+The output will be much more verbose and will show several iterations, with the
+last one being the I<winning>. The following is an excerpt of the output:

 =begin code :lang<text>
 === Iteration 1 ===
@@ -2260,11 +2268,11 @@ say $string ~~ /(.+)(SQL) (.+) $1/; # OUTPUT: 「PostgreSQL is an SQL」
 say $string ~~ / :r (.+)(SQL) (.+) $1/; # OUTPUT: Nil
 =end code

-The fact is that, as shown in the I<iteration 1> output, the first match
-of the regular expression engine will be C<PostgreSQL is an >, C<SQL>, C< database>
-that does not leave out any room for matching another occurrence of the word I<SQL>
-(as C<$1> in the regular expression). Since the engine is not able to get backward and change the
-path to match, the regular expression fails.
+The fact is that, as shown in the I<iteration 1> output, the first match of the
+regular expression engine will be C<PostgreSQL is an >, C<SQL>, C< database>
+that does not leave out any room for matching another occurrence of the word
+I<SQL> (as C<$1> in the regular expression). Since the engine is not able to get
+backward and change the path to match, the regular expression fails.

 It is worth noting that disabling backtracking will not prevent the engine
 to try several ways to match the regular expression.
@@ -2312,8 +2320,8 @@ Capture 1 = database!
 [SQL][ database!]
 =end code

-Even using the L<:r|/language/regexes#ratchet> adverb to prevent backtracking will not
-change things:
+Even using the L<:r|/language/regexes#ratchet> adverb to prevent backtracking
+will not change things:

 =begin code :preamble<my $string = '';>
 my $iteration = 0;
@@ -2345,8 +2353,9 @@ Capture 1 = database!
 [SQL][ database!]
 =end code

-This demonstrate that disabling backtracking does not mean disabling possible multiple
-iterations of the matching engine, but rather disabling the backward matching tuning.
+This demonstrate that disabling backtracking does not mean disabling possible
+multiple iterations of the matching engine, but rather disabling the backward
+matching tuning.


 =head1 C<$/> changes each time a regular expression is matched

doc/Language/traps.pod6

Lines changed: 1 addition & 1 deletion
@@ -934,7 +934,7 @@ When there are multiple matching alternations, for those separated by
 C<||>, the first matching alternation wins; for those separated by C<|>,
 which to win is decided by LTM strategy. See also:
 L<documentation on C<||>|/language/regexes#Alternation:_||> and
-L<documentation on C<|>|/language/regexes#Longest_Alternation:_|>.
+L<documentation on C<|>|/language/regexes#Longest_alternation:_|>.

 For simple regexes just using C<||> instead of C<|>
 will get you familiar semantics, but if writing grammars then it's useful to

doc/Language/variables.pod6

Lines changed: 3 additions & 3 deletions
@@ -369,7 +369,7 @@ The C<:> twigil declares a formal named parameter to a block or subroutine.
 Variables declared using this form are a type of placeholder variable too.
 Therefore the same things that apply to variables declared using the C<^>
 twigil also apply here (with the exception that they are not positional and
-therefore not ordered using Unicode order, of course). So this:
+therefore not ordered using Unicode order, of course). For instance:

 say { $:add ?? $^a + $^b !! $^a - $^b }( 4, 5 ) :!add
 # OUTPUT: «-1␤»
@@ -473,8 +473,8 @@ say $foo; # Exception! "Variable '$foo' is not declared"
 This dies because C<$foo> is only defined as long as we are in the same
 scope.

-In order to create more than one variable with a lexical scope in the same sentence
-surround the variables with parentheses:
+In order to create more than one variable with a lexical scope in the same
+sentence surround the variables with parentheses:

 my ( $foo, $bar );

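
The anchor changes in this commit follow a visible pattern: each link target is the heading text with spaces replaced by underscores, so lowercasing a heading changes its anchor as well. A minimal Python sketch of that mapping, inferred only from the links touched in this diff (the function name `heading_to_anchor` is hypothetical, and the real documentation build may apply further rules):

```python
def heading_to_anchor(heading: str) -> str:
    """Build a same-page anchor from heading text.

    Assumption: spaces become underscores and nothing else changes,
    which is the pattern visible in the links touched by this diff.
    """
    return "#" + heading.replace(" ", "_")


# Headings as they read before and after the capitalization change.
renamed = {
    "Unicode Properties": "Unicode properties",
    "Regex Interpolation": "Regex interpolation",
    "Longest Alternation: |": "Longest alternation: |",
}

for old, new in renamed.items():
    print(f"{heading_to_anchor(old)} -> {heading_to_anchor(new)}")
```

Under this assumption, any link that still points at an old capitalized anchor needs the same update, as the traps.pod6 hunk does for `#Longest_alternation:_|`.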
