diff --git a/doc/Language/5to6-nutshell.pod6 b/doc/Language/5to6-nutshell.pod6 index 8bb1fb331..a1afbbb2a 100644 --- a/doc/Language/5to6-nutshell.pod6 +++ b/doc/Language/5to6-nutshell.pod6 @@ -12,7 +12,7 @@ features and idioms are not). Hence this should not be mistaken for a beginner tutorial or a promotional overview of Perl 6; it is intended as a technical reference for Perl 6 learners with a strong Perl 5 background and for anyone porting Perl 5 code -to Perl 6 (though note that L<#Automated Translation> might be more +to Perl 6 (though note that L<#Automated translation> might be more convenient). A note on semantics; when we say "now" in this document, we mostly just diff --git a/doc/Language/glossary.pod6 b/doc/Language/glossary.pod6 index 60ad14b23..c73c4eb98 100644 --- a/doc/Language/glossary.pod6 +++ b/doc/Language/glossary.pod6 @@ -865,7 +865,14 @@ is its usual acronym. =head1 property -In this context, it either refers to an L, which is the value of an instance variable, or an L which are codepoint features that allow programs to identify what kind of entity they represent, that is, if they are a letter, or a number, or something completely different like a control character. +In this context, it either refers to an +L, +which is the value of an instance variable, or an +L +which are codepoint features that +allow programs to identify what kind of entity they represent, that is, if they +are a letter, or a number, or something completely different like a control +character. X<|pugs> =head1 pugs diff --git a/doc/Language/regexes.pod6 b/doc/Language/regexes.pod6 index 87e686faa..81f6b7f09 100644 --- a/doc/Language/regexes.pod6 +++ b/doc/Language/regexes.pod6 @@ -10,7 +10,7 @@ Regular expressions, I for short, are a sequence of characters that describe a pattern of text. Pattern matching is the process of matching those patterns to actual text. -=head1 X +=head1 X Perl 6 has special syntax for writing regexes: @@ -242,14 +242,15 @@ Note that the character classes C«», C«» and C«» are so called zero-width assertions, which do not really match a character. -=head2 X«Unicode Properties|regex,<:property>» +=head2 X«Unicode properties|regex,<:property>» The character classes mentioned so far are mostly for convenience; another approach is to use Unicode character properties. These come in the form C«<:property>», where C can be a short or long Unicode General Category name. These use pair syntax. -To match against a Unicode property you can use either smartmatch or L|/routine/uniprop>: +To match against a Unicode property you can use either smartmatch or +L|/routine/uniprop>: "a".uniprop('Script'); # OUTPUT: «Latin␤» "a" ~~ / <:Script> /; # OUTPUT: «「a」␤» @@ -325,7 +326,7 @@ parentheses; for example: say $0 if 'perl6' ~~ /\w+(<:Ll+:N>)/ # OUTPUT: «「6」␤» -=head2 X«Enumerated Character Classes and Ranges|regex,<[ ]>;regex,<-[ ]>» +=head2 X«Enumerated character classes and ranges|regex,<[ ]>;regex,<-[ ]>» Sometimes the pre-existing wildcards and character classes are not enough. Fortunately, defining your own is fairly simple. Within C«<[ ]>», @@ -587,7 +588,7 @@ string of non-whitespace characters. Even in non-backtracking contexts, the alternation operator C<||> tries all the branches in order until the first one matches. -=head1 X|regex,|> +=head1 X|regex,|> In short, in regex branches separated by C<|>, the longest token match wins, independent of the textual ordering in the regex. However, what C<|> really @@ -649,7 +650,7 @@ Arrays can also be interpolated into a regex to achieve the same effect: my @increasingly-edible = ; say 'food' ~~ /@increasingly-edible/; # OUTPUT: «「food」␤» -This is documented further under L, +This is documented further under L, below. =head1 X|regex,&&> @@ -686,7 +687,7 @@ Regexes search an entire string for matches. Sometimes this is not what you want. Anchors match only at certain positions in the string, thereby anchoring the regex match to that position. -=head2 X +=head2 X The C<^> anchor only matches at the start of the string: @@ -729,7 +730,7 @@ The following is a multi-line string: # 'and' is at the start of a line -- not the string say so $str ~~ /^and /; # OUTPUT: «False␤» -=head2 X +=head2 X The C<^^> anchor matches at the start of a logical line. That is, either at the start of the string, or after a newline character. However, it does not @@ -774,7 +775,7 @@ two leading spaces each. # matched at the last line say so $str ~~ / '."' $$/; # OUTPUT: «True␤» -=head2 X«Word Boundary|regex, <|w>;regex, » +=head2 X«Word boundary|regex, <|w>;regex, » To match any word boundary, use C«<|w>» or C«». This is similar to X«C<\b>|regex deprecated,\b» of other languages. @@ -860,7 +861,7 @@ lookahead and lookbehind assertions. Technically, anchors are also zero-width assertions, and they can look both ahead and behind. -=head2 X +=head2 X To check that a pattern appears before another pattern, use a lookahead assertion via the C assertion. This has the form: @@ -942,7 +943,9 @@ These are, as in the case of lookahead, zero-width assertions which do not I bar /; # OUTPUT: «「atfoobar」␤ 0 => 「atf」␤» -where we capture the first 3 of the 5 characters before bar, but only if C is preceded by C. The fact that the assertion is zero-width allows us to use part of the characters in the assertion for capture. +where we capture the first 3 of the 5 characters before bar, but only if C +is preceded by C. The fact that the assertion is zero-width allows us to +use part of the characters in the assertion for capture. @@ -1065,7 +1068,9 @@ it in a variable first: X<|:my> -C<:my> helps scoping the C<$c> variable within the regex and beyond; in this case we can use it in the next sentence to show what has been matched inside the regex. This can be used for debugging inside regular expressions, for instance: +C<:my> helps scoping the C<$c> variable within the regex and beyond; in this +case we can use it in the next sentence to show what has been matched inside the +regex. This can be used for debugging inside regular expressions, for instance: my $paragraph="line\nline2\nline3"; $paragraph ~~ rx| :my $counter = 0; ( \V* { ++$counter } ) *%% \n |; @@ -1086,7 +1091,8 @@ say HasOur.parse('Þor is mighty'); # OUTPUT: «「Þor is mighty」␤» say $HasOur::our; # OUTPUT: «Þor␤» =end code -Once the parsing has been done successfully, we use the FQN name of the C<$our> variable to access its value, that can be none other than C<Þor> +Once the parsing has been done successfully, we use the FQN name of the C<$our> +variable to access its value, that can be none other than C<Þor>. =head2 X @@ -1136,9 +1142,10 @@ C<\K>. say 'abc' ~~ / a <( b )> c/; # OUTPUT: «「b」␤» say 'abc' ~~ / <(a <( b )> c)>/; # OUTPUT: «「bc」␤» -As in the example above, you can see C«<(» sets the start point and C«)>» sets the -endpoint; since they are actually independent of each other, the inner-most start point -wins (the one attached to C) and the outer-most end wins (the one attached to C). +As in the example above, you can see C«<(» sets the start point and C«)>» sets +the endpoint; since they are actually independent of each other, the inner-most +start point wins (the one attached to C) and the outer-most end wins (the one +attached to C). =head1 Substitution @@ -1430,7 +1437,7 @@ list of predefined subrules is listed in L of design documents. -=head1 X +=head1 X If you want to build a regex using a pattern given at runtime, regex interpolation is what you are looking for. @@ -1575,7 +1582,7 @@ like C<:overlap> are appended to the match call: } # OUTPUT: «ba␤aA␤» -=head2 X +=head2 X Adverbs that appear at the time of a regex declaration are part of the actual regex and influence how the Perl 6 compiler translates the regex into @@ -2150,12 +2157,13 @@ my $string = 'PostgreSQL is an SQL database!'; say $string ~~ /(.+)(SQL) (.+) $1/; # OUTPUT: 「PostgreSQL is an SQL」 =end code -What happens in the above example is that the string has to be matched against the -second occurrence of the word I, eating all characters before and leaving out -the rest. +What happens in the above example is that the string has to be matched against +the second occurrence of the word I, eating all characters before and +leaving out the rest. -Since it is possible to execute a piece of code within a regular expression, it is also possible -to inspect the L object within the regular expression itself: +Since it is possible to execute a piece of code within a regular expression, it +is also possible to inspect the L object within the regular +expression itself: =begin code :preamble my $iteration = 0; @@ -2186,10 +2194,10 @@ Capture 2 = is an showing that the string has been split around the second occurrence of I, that is the repetition of the first capture (C<$/[1]>). -With that in place, it is now possible to see how the engine backtracks -to find the above match: it does suffice to move the C -in the middle of the regular expression, in particular before the repetition of the -first capture C<$1> to see it in action: +With that in place, it is now possible to see how the engine backtracks to find +the above match: it does suffice to move the C in the middle of +the regular expression, in particular before the repetition of the first capture +C<$1> to see it in action: =begin code :preamble my $iteration = 0; @@ -2207,8 +2215,8 @@ sub show-captures( Match $m ){ $string ~~ / (.+)(SQL) (.+) { show-captures( $/ ); } $1 /; =end code -The output will be much more verbose and will show several iterations, with the last one -being the I. The following is an excerpt of the output: +The output will be much more verbose and will show several iterations, with the +last one being the I. The following is an excerpt of the output: =begin code :lang === Iteration 1 === @@ -2260,11 +2268,11 @@ say $string ~~ /(.+)(SQL) (.+) $1/; # OUTPUT: 「PostgreSQL is an SQL」 say $string ~~ / :r (.+)(SQL) (.+) $1/; # OUTPUT: Nil =end code -The fact is that, as shown in the I output, the first match -of the regular expression engine will be C, C, C< database> -that does not leave out any room for matching another occurrence of the word I -(as C<$1> in the regular expression). Since the engine is not able to get backward and change the -path to match, the regular expression fails. +The fact is that, as shown in the I output, the first match of the +regular expression engine will be C, C, C< database> +that does not leave out any room for matching another occurrence of the word +I (as C<$1> in the regular expression). Since the engine is not able to get +backward and change the path to match, the regular expression fails. It is worth noting that disabling backtracking will not prevent the engine to try several ways to match the regular expression. @@ -2312,8 +2320,8 @@ Capture 1 = database! [SQL][ database!] =end code -Even using the L<:r|/language/regexes#ratchet> adverb to prevent backtracking will not -change things: +Even using the L<:r|/language/regexes#ratchet> adverb to prevent backtracking +will not change things: =begin code :preamble my $iteration = 0; @@ -2345,8 +2353,9 @@ Capture 1 = database! [SQL][ database!] =end code -This demonstrate that disabling backtracking does not mean disabling possible multiple -iterations of the matching engine, but rather disabling the backward matching tuning. +This demonstrate that disabling backtracking does not mean disabling possible +multiple iterations of the matching engine, but rather disabling the backward +matching tuning. =head1 C<$/> changes each time a regular expression is matched diff --git a/doc/Language/traps.pod6 b/doc/Language/traps.pod6 index ad82200e9..f9af692ba 100644 --- a/doc/Language/traps.pod6 +++ b/doc/Language/traps.pod6 @@ -934,7 +934,7 @@ When there are multiple matching alternations, for those separated by C<||>, the first matching alternation wins; for those separated by C<|>, which to win is decided by LTM strategy. See also: L|/language/regexes#Alternation:_||> and -L|/language/regexes#Longest_Alternation:_|>. +L|/language/regexes#Longest_alternation:_|>. For simple regexes just using C<||> instead of C<|> will get you familiar semantics, but if writing grammars then it's useful to diff --git a/doc/Language/variables.pod6 b/doc/Language/variables.pod6 index e7940273d..dcca5f434 100644 --- a/doc/Language/variables.pod6 +++ b/doc/Language/variables.pod6 @@ -369,7 +369,7 @@ The C<:> twigil declares a formal named parameter to a block or subroutine. Variables declared using this form are a type of placeholder variable too. Therefore the same things that apply to variables declared using the C<^> twigil also apply here (with the exception that they are not positional and -therefore not ordered using Unicode order, of course). So this: +therefore not ordered using Unicode order, of course). For instance: say { $:add ?? $^a + $^b !! $^a - $^b }( 4, 5 ) :!add # OUTPUT: «-1␤» @@ -473,8 +473,8 @@ say $foo; # Exception! "Variable '$foo' is not declared" This dies because C<$foo> is only defined as long as we are in the same scope. -In order to create more than one variable with a lexical scope in the same sentence -surround the variables with parentheses: +In order to create more than one variable with a lexical scope in the same +sentence surround the variables with parentheses: my ( $foo, $bar );