@@ -921,6 +921,65 @@ of the assignment operators instead:
=head1 Regexes

+=head1 Interpolation constructs
+
+Perl 6 offers several constructs to generate regexes at runtime through
+interpolation (see their detailed description
+L<here|/language/regexes#Regex_interpolation>). When a regex generated this way
+contains only characters that match themselves, some of these constructs behave
+identically, as if they were equivalent alternatives. As soon as the generated
+regex contains metacharacters, however, they behave differently, which may come
+as a confusing surprise.
+
+The first two constructs that may easily be confused with each other are
+C«$variable» and C«<$variable>». The former causes the (stringified) variable to
+match literally, while the latter causes the (stringified) variable to match as
+a regex. As long as the variable comprises only characters that, in a regex,
+match themselves (i.e. alphanumeric characters and the underscore), there is no
+distinction between the constructs:
+
+    my $variable = 'camelia';
+    say ‘I ♥ camelia’ ~~ / $variable /;   # OUTPUT: 「camelia」
+    say ‘I ♥ camelia’ ~~ / <$variable> /; # OUTPUT: 「camelia」
+
+But when the variable is changed to comprise regex metacharacters, i.e.
+characters that are neither alphanumeric nor the underscore C<_>, the outputs
+become different:
+
+    my $variable = '#camelia';
+    say ‘I ♥ #camelia’ ~~ / $variable /;   # OUTPUT: 「#camelia」
+    say ‘I ♥ #camelia’ ~~ / <$variable> /; # !! Error: malformed regex
+
+What happens here is that the string C<#camelia> contains the metacharacter
+C<#>. In the context of a regex, this character should be quoted to match
+literally; without quoting, the C<#> is parsed as the start of a comment that
+runs until the end of the line, which in turn causes the regex not to be
+terminated, and thus to be malformed.
+
+Two other constructs that must similarly be distinguished from one another are
+C«$(code)» and C«<{code}>». The former construct runs user-specified code within
+the regex and interpolates the (stringified) return value literally. The latter
+also runs user-specified code within the regex, but interpolates the
+(stringified) return value as a regex. So, like before, as long as the return
+value comprises only characters that match literally in a regex, there is no
+distinction between the two:
+
+    my $variable = 'ailemac';
+    say ‘I ♥ camelia’ ~~ / $($variable.flip) /;   # OUTPUT: 「camelia」
+    say ‘I ♥ camelia’ ~~ / <{$variable.flip}> /;  # OUTPUT: 「camelia」
+
+But when the return value is changed to comprise regex metacharacters, the
+outputs diverge:
+
+    my $variable = 'ailema.';
+    say ‘I ♥ camelia’ ~~ / $($variable.flip) /;   # OUTPUT: Nil
+    say ‘I ♥ camelia’ ~~ / <{$variable.flip}> /;  # OUTPUT: 「camelia」
+
+In this case the return value of the code is the string C<.amelia>, which
+contains the metacharacter C<.>. The attempt by C«$(code)» to match the dot
+literally fails, because the target string contains no literal dot; the attempt
+by C«<{code}>» to match the dot as a regex wildcard succeeds. Hence the
+different outputs.
+
=head2 C<|> vs C<||>: which branch will win

To match one of several possible alternatives, C<||> or C<|> will be used. But