Permalink
Fetching contributors…
Cannot retrieve contributors at this time
509 lines (365 sloc) 17.1 KB
=begin pod :tag<perl6>
=TITLE Quoting constructs
=SUBTITLE Writing strings, word lists, and regexes in Perl 6
=head1 The Q lang
Strings are usually represented in Perl 6 code using some form of quoting
construct. The most minimalistic of these is C<Q>, usable via the shortcut
C<「…」>, or via C<Q> followed by any pair of delimiters surrounding your
text. Most of the time, though, the most you'll need is C<'…'> or C<"…">,
described in more detail in the following sections.
=head2 X<Literal strings: Q|quote,Q;quote,「 」>
=for code :allow<B> :skip-test
B<Q[>A literal stringB<]>
B<>More plainly.B<>
B<Q ^>Almost any non-word character can be a delimiter!B<^>
B<Q 「「>Delimiters can be repeated/nested if they are adjacent.B<」」>
Delimiters can be nested, but in the plain C<Q> form, backslash escapes
aren't allowed. In other words, basic C<Q> strings are as literal as
possible.
Some delimiters are not allowed immediately after C<Q>, C<q>, or C<qq>. Any
characters that are allowed in L<identifiers|/language/syntax#Identifiers> are
not allowed to be used, since in such a case, the quoting construct together
with such characters are interpreted as an identifier. In addition, C<( )> is
not allowed because that is interpreted as a function call. If you still wish
to use those characters as delimiters, separate them from C<Q>, C<q>, or C<qq>
with a space. Please note that some natural languages use a left delimiting
quote on the right side of a string. C<Q> will not support those as it relies
on unicode properties to tell left and right delimiters apart.
=for code :allow<B> :skip-test
B<Q'>this will not work!B<'>
B<Q(>this won't work either!B<)>
The examples above will produce an error. However, this will work
=for code :allow<B> :skip-test
B<Q (>this is fine, because of space after QB<)>
B<Q '>and so is thisB<'>
Q<Make sure you B«<»matchB«>» opening and closing delimiters>
Q{This is still a closing curly brace → B<\>}
These examples produce:
=begin code :skip-test
this is fine, because of space after Q
and so is this
Make sure you <match> opening and closing delimiters
This is still a closing curly brace → \
=end code
X<|:x (quoting adverb)>X<|:exec (quoting adverb)>X<|:w (quoting adverb)>X<|:words (quoting adverb)>X<|:ww (quoting adverb)>X<|:quotewords (quoting adverb)>X<|:q (quoting adverb)>X<|:single (quoting adverb)>X<|:qq (quoting adverb)>X<|:double (quoting adverb)>X<|:s (quoting adverb)>X<|:scalar (quoting adverb)>X<|:a (quoting adverb)>X<|:array (quoting adverb)>X<|:h (quoting adverb)>X<|:hash (quoting adverb)>X<|:f (quoting adverb)>X<|:function (quoting adverb)>X<|:c (quoting adverb)>X<|:closure (quoting adverb)>X<|:b (quoting adverb)>X<|:backslash (quoting adverb)>X<|:to (quoting adverb)>X<|:heredoc (quoting adverb)>X<|:v (quoting adverb)>X<|:val (quoting adverb)>
The behaviour of quoting constructs can be modified with adverbs, as explained
in detail in later sections.
=begin table
Short Long Meaning
:x :exec Execute as command and return results
:w :words Split result on words (no quote protection)
:ww :quotewords Split result on words (with quote protection)
:q :single Interpolate \\, \qq[...] and escaping the delimiter with \
:qq :double Interpolate with :s, :a, :h, :f, :c, :b
:s :scalar Interpolate $ vars
:a :array Interpolate @ vars
:h :hash Interpolate % vars
:f :function Interpolate & calls
:c :closure Interpolate {...} expressions
:b :backslash Enable backslash escapes (\n, \qq, \$foo, etc)
:to :heredoc Parse result as heredoc terminator
:v :val Convert to allomorph if possible
=end table
X<|escaping quote>
=head2 X<Escaping: q|quote,q;quote,' '>
=begin code :skip-test
'Very plain';
q[This back\slash stays];
q[This back\\slash stays]; # Identical output
q{This is not a closing curly brace → \}, but this is → };
Q :q $There are no backslashes here, only lots of \$\$\$>!$;
'(Just kidding. There\'s no money in that string)';
'No $interpolation {here}!';
Q:q!Just a literal "\n" here!;
=end code
The C<q> form allows for escaping characters that would otherwise end the
string using a backslash. The backslash itself can be escaped, too, as in
the third example above. The usual form is C<'…'> or C<q> followed by a
delimiter, but it's also available as an adverb on C<Q>, as in the fifth and
last example above.
These examples produce:
=for code :allow<B> :skip-test
Very plain
This back\slash stays
This back\slash stays
This is not a closing curly brace → } but this is →
There are no backslashes here, only lots of $$$!
(Just kidding. There's no money in that string)
No $interpolation {here}!
Just a literal "\n" here
The C<\qq[...]> escape sequence enables
L«C<qq> interpolation|/language/quoting#Interpolation:_qq» for a portion
of the string. Using this escape sequence is handy when you have HTML markup
in your strings, to avoid interpretation of angle brackets as hash keys:
my $var = 'foo';
say '<code>$var</code> is <var>\qq[$var.uc()]</var>';
# OUTPUT: «<code>$var</code> is <var>FOO</var>␤»
=head2 X<Interpolation: qq|quote,qq;quote," ">
=begin code :allow<B L> :skip-test
my $color = 'blue';
L<say> B<">My favorite color is B<$color>!B<">
My favorite color is blue!
=end code
X<|\ (quoting)>
The C<qq> form – usually written using double quotes – allows for
interpolation of backslash sequences and variables, i.e., variables can be
written within the string so that the content of the variable is inserted into
the string. It is also possible to escape variables within a C<qq>-quoted
string:
=begin code :allow<B> :skip-test
say B<">The B<\>$color variable contains the value '$color'B<">;
The $color variable contains the value 'blue'
=end code
Another feature of C<qq> is the ability to interpolate Perl 6 code from
within the string, using curly braces:
=begin code :allow<B L> :skip-test
my ($x, $y, $z) = 4, 3.5, 3;
say "This room is B<$x> m by B<$y> m by B<$z> m.";
say "Therefore its volume should be B<{ $x * $y * $z }> m³!";
This room is 4 m by 3.5 m by 3 m.
Therefore its volume should be 42 m³!
=end code
By default, only variables with the C<$> sigil are interpolated normally.
This way, when you write C<"documentation@perl6.org">, you aren't
interpolating the C<@perl6> variable. If that's what you want to do, append
a C<[]> to the variable name:
=begin code :allow<B> :skip-test
my @neighbors = "Felix", "Danielle", "Lucinda";
say "@neighborsB<[]> and I try our best to coexist peacefully."
Felix Danielle Lucinda and I try our best to coexist peacefully.
=end code
Often a method call is more appropriate. These are allowed within C<qq>
quotes as long as they have parentheses after the call. Thus the following
code will work:
=begin code :allow<B L> :skip-test
say "@neighborsB<.L<join>(', ')> and I try our best to coexist peacefully."
Felix, Danielle, Lucinda and I try our best to coexist peacefully.
=end code
However, C<"@example.com"> produces C<@example.com>.
To call a subroutine use the C<&>-sigil.
X<|& (interpolation)>
say "abc&uc("def")ghi";
# OUTPUT: «abcDEFghi␤»
Postcircumfix operators and therefore L<subscripts|/language/subscripts> are
interpolated as well.
my %h = :1st; say "abc%h<st>ghi";
# OUTPUT: «abc1ghi␤»
To enter unicode sequences use C<\x> or C<\x[]> with the hex-code of the
character or a list of characters.
my $s = "I \x2665 Perl 6!";
say $s;
# OUTPUT: «I ♥ Perl 6!␤»
$s = "I really \x[2661,2665,2764,1f495] Perl 6!";
say $s;
# OUTPUT: «I really ♡♥❤💕 Perl 6!␤»
You can also use L«unicode names|/language/unicode#Entering_unicode_codepoints_and_codepoint_sequences»
, L<named sequences|/language/unicode#Named_sequences>
and L<name aliases|/language/unicode#Name_aliases> with L«\c[]|/language/unicode#Entering_unicode_codepoints_and_codepoint_sequences».
my $s = "Camelia \c[BROKEN HEART] my \c[HEAVY BLACK HEART]!";
say $s;
# OUTPUT: «Camelia 💔 my ❤!␤»
Interpolation of undefined values will raise a control exception that can be
caught in the current block with
L<CONTROL|/language/phasers#CONTROL>.
=begin code
sub niler {Nil};
my Str $a = niler;
say("$a.html", "sometext");
say "alive"; # this line is dead code
CONTROL { .die };
=end code
=head2 Word quoting: qw
X<|qw word quote>
=for code :allow<B L> :skip-test
B<qw|>! @ # $ % ^ & * \| < > B<|> eqv '! @ # $ % ^ & * | < >'.words.list
B<q:w {> [ ] \{ \} B<}> eqv ('[', ']', '{', '}')
B<Q:w |> [ ] { } B<|> eqv ('[', ']', '{', '}')
The C<:w> form, usually written as C<qw>, splits the string into
"words". In this context, words are defined as sequences of non-whitespace
characters separated by whitespace. The C<q:w> and C<qw> forms inherit the
interpolation and escape semantics of the C<q> and single quote string
delimiters, whereas C<Qw> and C<Q:w> inherit the non-escaping semantics of
the C<Q> quoter.
This form is used in preference to using many quotation marks and commas for
lists of strings. For example, where you could write:
my @directions = 'left', 'right,', 'up', 'down';
It's easier to write and to read this:
my @directions = qw|left right up down|;
=head2 Word quoting: < >
X«|< > word quote»
=for code :allow<B L>
say B«<»a b cB«>» L<eqv> ('a', 'b', 'c'); # OUTPUT: «True␤»
say B«<»a b 42B«>» L<eqv> ('a', 'b', '42'); # OUTPUT: «False␤», the 42 became an IntStr allomorph
say < 42 > ~~ Int; # OUTPUT: «True␤»
say < 42 > ~~ Str; # OUTPUT: «True␤»
The angle brackets quoting is like C<qw>, but with extra feature that lets you
construct L<allomorphs|/language/glossary#index-entry-Allomorph> or literals
of certain numbers:
say <42 4/2 1e6 1+1i abc>.perl;
# OUTPUT: «(IntStr.new(42, "42"), RatStr.new(2.0, "4/2"), NumStr.new(1000000e0, "1e6"), ComplexStr.new(<1+1i>, "1+1i"), "abc")␤»
To construct a L«C<Rat>|/type/Rat» or L«C<Complex>|/type/Complex» literal, use
angle brackets around the number, without any extra spaces:
say <42/10>.^name; # OUTPUT: «Rat␤»
say <1+42i>.^name; # OUTPUT: «Complex␤»
say < 42/10 >.^name; # OUTPUT: «RatStr␤»
say < 1+42i >.^name; # OUTPUT: «ComplexStr␤»
Compared to C<42/10> and C<1+42i>, there's no division (or addition) operation
involved. This is useful for literals in routine signatures, for example:
=begin code :skip-test
sub close-enough-π (<355/113>) {
say "Your π is close enough!"
}
close-enough-π 710/226; # OUTPUT: «Your π is close enough!␤»
# WRONG: can't do this, since it's a division operation
sub compilation-failure (355/113) {}
=end code
=head2 X<Word quoting with quote protection: qww|quote,qww>
The C<qw> form of word quoting will treat quote characters literally, leaving them in the
resulting words:
say qw{"a b" c}.perl; # OUTPUT: «("\"a", "b\"", "c")␤»
Thus, if you wish to preserve quoted sub-strings as single items in the resulting words
you need to use the C<qww> variant:
say qww{"a b" c}.perl; # OUTPUT: «("a b", "c")␤»
=head2 X<Word quoting with interpolation: qqw|quote,qqw>
The C<qw> form of word quoting doesn't interpolate variables:
my $a = 42; say qw{$a b c}; # OUTPUT: «$a b c␤»
Thus, if you wish for variables to be interpolated within the quoted string,
you need to use the C<qqw> variant:
my $a = 42;
my @list = qqw{$a b c};
say @list; # OUTPUT: «[42 b c]␤»
Note that variable interpolation happens before word splitting:
my $a = "a b";
my @list = qqw{$a c};
.say for @list; # OUTPUT: «a␤b␤c␤»
=head2 X<<<Word quoting with interpolation and quote protection: qqww|quote,qqww;>>>
The C<qqw> form of word quoting will treat quote characters literally, leaving them in the
resulting words:
my $a = 42; say qqw{"$a b" c}.perl; # OUTPUT: «("\"42", "b\"", "c")␤»
Thus, if you wish to preserve quoted sub-strings as single items in the resulting words
you need to use the C<qqww> variant:
my $a = 42; say qqww{"$a b" c}.perl; # OUTPUT: «("42 b", "c")␤»
Quote protection happens before interpolation, and interpolation happens before word splitting,
so quotes coming from inside interpolated variables are just literal quote characters:
my $a = "1 2";
say qqww{"$a" $a}.perl; # OUTPUT: «("1 2", "1", "2")␤»
my $b = "1 \"2 3\"";
say qqww{"$b" $b}.perl; # OUTPUT: «("1 \"2 3\"", "1", "\"2", "3\"")␤»
=head2 X<<<Word quoting with interpolation and quote protection: « »|quote,<< >>;quote,« »>>>
This style of quoting is like C<qqww>, but with the added benefit of
constructing L<allomorphs|/language/glossary#index-entry-Allomorph> (making it
functionally equivalent to L<qq:ww:v|#index-entry-:val_(quoting_adverb)>). The
ASCII equivalent to C<« »> are double angle brackets C«<< >>».
# Allomorph Construction
my $a = 42; say « $a b c ».perl; # OUTPUT: «(IntStr.new(42, "42"), "b", "c")␤»
my $a = 42; say << $a b c >>.perl; # OUTPUT: «(IntStr.new(42, "42"), "b", "c")␤»
# Quote Protection
my $a = 42; say « "$a b" c ».perl; # OUTPUT: «("42 b", "c")␤»
my $a = 42; say << "$a b" c >>.perl; # OUTPUT: «("42 b", "c")␤»
=head2 X<Shell quoting: qx|quote,qx>
To run a string as an external program, not only is it possible to pass the
string to the C<shell> or C<run> functions but one can also perform shell
quoting. There are some subtleties to consider, however. C<qx> quotes
I<don't> interpolate variables. Thus
my $world = "there";
say qx{echo "hello $world"}
prints simply C<hello>. Nevertheless, if you have declared an environment
variable before calling C<perl6>, this will be available within C<qx>, for
instance
=begin code :skip-test
WORLD="there" perl6
> say qx{echo "hello $WORLD"}
=end code
will now print C<hello there>.
The result of calling C<qx> is returned, so this information can be assigned
to a variable for later use:
my $output = qx{echo "hello!"};
say $output; # OUTPUT: «hello!␤»
See also L<shell|/routine/shell>, L<run|/routine/run> and L<Proc::Async> for
other ways to execute external commands.
=head2 X<Shell quoting with interpolation: qqx|quote,qqx>
If one wishes to use the content of a Perl 6 variable within an external
command, then the C<qqx> shell quoting construct should be used:
my $world = "there";
say qqx{echo "hello $world"}; # OUTPUT: «hello there␤»
Again, the output of the external command can be kept in a variable:
my $word = "cool";
my $option = "-i";
my $file = "/usr/share/dict/words";
my $output = qqx{grep $option $word $file};
# runs the command: grep -i cool /usr/share/dict/words
say $output; # OUTPUT: «Cooley␤Cooley's␤Coolidge␤Coolidge's␤cool␤...»
See also L<run|/routine/run> and L<Proc::Async> for better ways to
execute external commands.
=head2 X<Heredocs: :to|quote,heredocs :to>
A convenient way to write multi-line string literals are I<heredocs>, which
let you choose the delimiter yourself:
=begin code
say q:to/END/;
Here is
some multi-line
string
END
=end code
The contents of the heredoc always begin on the next line, so you can (and
should) finish the line.
=begin code :skip-test
my $escaped = my-escaping-function(q:to/TERMINATOR/, language => 'html');
Here are the contents of the heredoc.
Potentially multiple lines.
TERMINATOR
=end code
If the terminator is indented, that amount of indention is removed from the
string literals. Therefore this heredoc
=begin code
say q:to/END/;
Here is
some multi line
string
END
=end code
produces this output:
=begin code :skip-test
Here is
some multi line
string
=end code
Heredocs include the newline from before the terminator.
To allow interpolation of variables use the C<qq> form, but you will then have to escape meta characters
C<{\> as well as C<$> if it is not the sigil for a defined variable. For example:
my $f = 'db.7.3.8';
my $s = qq:to/END/;
option \{
file "$f";
};
END
say $s;
would produce:
=begin code :skip-test
option {
file "db.7.3.8";
};
=end code
You can begin multiple Heredocs in the same line.
=begin code
my ($first, $second) = qq:to/END1/, qq:to/END2/;
FIRST
MULTILINE
STRING
END1
SECOND
MULTILINE
STRING
END2
=end code
=head2 Unquoting
Literal strings permit interpolation of embedded quoting constructs by using the escape sequences such as these:
my $animal="quaggas";
say 'These animals look like \qq[$animal]'; # OUTPUT: «These animals look like quaggas␤»
say 'These animals are \qqw[$animal or zebras]'; # OUTPUT: «These animals are quaggas or zebras␤»
In this example, C<\qq> will do double-quoting interpolation, and C<\qqw> word quoting with interpolation. Escaping any other quoting construct as above will act in the same way, allowing interpolation in literal strings.
=head1 Regexes
For information about quoting as applied in regexes see the L<regular
expression documentation|/language/regexes>.
=end pod
# vim: expandtab softtabstop=4 shiftwidth=4 ft=perl6