=encoding utf8
=head1 TITLE
Synopsis 6: Subroutines
=head1 AUTHORS
Damian Conway <damian@conway.org>
Allison Randal <al@shadowed.net>
Larry Wall <larry@wall.org>
Daniel Ruoso <daniel@ruoso.com>
=head1 VERSION
Created: 21 Mar 2003
Last Modified: 23 Jan 2012
Version: 154
This document summarizes Apocalypse 6, which covers subroutines and the
new type system.
=head1 Subroutines and other code objects
C<Routine> is the parent type of all keyword-declared code blocks.
All routines are born with undefined values of C<$_>, C<$!>,
and C<$/>, unless the routine declares them otherwise explicitly.
A compilation unit is also considered a routine, or you would not be
able to reference C<$!> or C<$/> in it. (Non-routine code C<Block>s,
declared with C<< -> >> or with bare curlies, are born only with C<$_>,
which is aliased to its C<< OUTER::<$_> >> unless bound as a parameter.
A block generally uses the C<$!> and C<$/> defined by the innermost
enclosing routine, unless C<$!> or C<$/> is explicitly declared in
the block. A conditional thunk follows the same rules, except that
a thunk has no scope to declare a new variable. Note however that
any and all lazy constructs, whether block-based or thunk-based,
such as gather or async or C<< ==> >> should declare their own C<$/>
and C<$!> so that the user's values for those variables cannot be
clobbered asynchronously. And this parenthetical remark is starting to
be seriously misplaced...)
B<Subroutines> (keyword: C<sub>) are non-inheritable routines with
parameter lists.
B<Methods> (keyword: C<method>) are inheritable routines which always
have an associated object (known as their invocant) and belong to a
particular kind or class.
B<Submethods> (keyword: C<submethod>) are non-inheritable methods, or
subroutines masquerading as methods. They have an invocant and belong to
a particular kind or class.
B<Regexes> (keyword: C<regex>) are methods (of a grammar) that perform
pattern matching. Their associated block has a special syntax (see
Synopsis 5). (We also use the term "regex" for anonymous patterns
of the traditional form.)
B<Tokens> (keyword: C<token>) are regexes that perform low-level
non-backtracking (by default) pattern matching.
B<Rules> (keyword: C<rule>) are regexes that perform non-backtracking
(by default) pattern matching and that also enable whitespace
dwimmery.
B<Macros> (keyword: C<macro>) are routines whose calls execute as soon
as they are parsed (i.e. at compile-time). Macros may return another
source code string or a parse-tree.
=head1 Routine modifiers
B<Multis> (keyword: C<multi>) are routines that can have multiple
variants that share the same name, selected by arity, types, or some
other constraints.
B<Prototypes> (keyword: C<proto>) specify the commonalities (such
as parameter names, fixity, and associativity) shared by all multis
of that name in the scope of the C<proto> declaration. A C<proto>
also adds an implicit C<multi> to all routines of the same short
name within its scope, unless they have an explicit modifier.
(This is particularly useful when adding to rule sets or when attempting
to compose conflicting methods from roles.) Abstractly, the C<proto>
is a generic wrapper around the dispatch to the C<multi>s. Each C<proto>
is instantiated into an actual dispatcher for each scope that
needs a different candidate list.
B<Only> (keyword: C<only>) routines do not share their short names
with other routines. This is the default modifier for all routines,
unless a C<proto> of the same name was already in scope. (For subs,
the governing C<proto> must have been declared in the same file, so
C<proto> declarations from the setting or other modules don't have
this effect unless explicitly imported.)
A modifier keyword may occur before the routine keyword in a named routine:
only sub foo {...}
proto sub foo {...}
dispatch sub foo {...} # internal
multi sub foo {...}
only method bar {...}
proto method bar {...}
dispatch method bar {...} # internal
multi method bar {...}
If the routine keyword is omitted, it defaults to C<sub>.
Modifier keywords cannot apply to anonymous routines.
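As a minimal sketch of how C<multi> variants that share one short name
are selected by type and by arity, consider the following. The name
C<describe> is hypothetical, and the behavior shown is that of a
present-day Rakudo-style implementation, which this synopsis does not
by itself guarantee.

```raku
# Three candidates share the short name "describe".
multi sub describe(Int $n) { "integer $n" }
multi sub describe(Str $s) { "string $s" }
multi sub describe($a, $b) { "two arguments: $a and $b" }

say describe(42);       # integer 42
say describe("hi");     # string hi
say describe(1, 2);     # two arguments: 1 and 2
```

No explicit C<proto> is declared here, so the candidates are managed by
an autogenerated one.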
A C<proto> is a generic dispatcher, which any given scope with a unique
candidate list will instantiate into a C<dispatch> routine. Hence
a C<proto> is never called directly, much like a C<role> can't be
used as an instantiated object.
When you call any routine (or method, or rule) that may have multiple
candidates, the basic dispatcher is really only calling an "only"
sub or method--but if there are multiple candidates, the "only" that
will be found is really a dispatcher. This instantiated C<dispatch>
is always called first (at least in the abstract--this can often be
optimized away). In essence, a C<dispatch> is dispatched exactly
like an C<only> sub, but the C<dispatch> itself may delegate to any
of the candidates it is "managing".
It is the C<dispatch>'s responsibility to first vet the arguments for all the
candidates; any call that does not successfully bind the C<dispatch>'s signature fails outright.
(Its signature is a copy of one belonging to the C<proto> from which it was instantiated.)
The C<dispatch> does not necessarily send the original capture to its candidates, however.
Named arguments that bind to positionals in the C<dispatch> sig will become positionals
for all subsequent calls to its managed multis.
The dispatch then considers its list of managed candidates from the
viewpoint of the caller or object, sorts them into some order, and
dispatches them according to the rules of multiple dispatch as defined
for each of the various dispatchers. In the case of multi subs, the
candidate list is known at compile time. In the case of multi methods,
it may be necessary to generate (or regenerate) the candidate list at
run time, depending on what is known when about the inheritance tree.
This default dispatch behavior is symbolized within the original
C<proto> by a block consisting of a single C<*> (that is, a
"whatever"). Hence the typical C<proto> will simply have a body
of C<{*}>.
proto method bar {*}
(We don't use C<...> for that because it would fail at run time,
and the proto's instantiated C<dispatch> blocks are not stubs, but
are intended to be executed.)
Other statements may be inserted before and after the C<{*}>
statement to capture control before or after the multi dispatch:
proto foo ($a,$b) { say "Called with $a $b"; {*}; say "Returning"; }
(That C<proto> is only good for C<multi>s with side effects and no return
value, since it returns the result of C<say>, which might not be what
you want. See below for how to fix that.)
The syntactic form C<&foo> (without a modifying signature) can never
refer to a C<multi> candidate or a generic C<proto>. It may only
refer to the single C<only> or C<dispatch> routine that would first
be called by C<foo()>. Individual C<multi>s may be named by appending
a signature to the noun form: C<&foo:($,$,*@)>.
We used the term "managed" loosely above to indicate the set of C<multi>s in
question; the "managed set" is more accurately defined as the intersection
of all the C<multi>s in the C<proto>'s downward scope with all the C<multi>s that
are visible to the caller's upward-looking scope. For ordinary routines
this means looking down lexical scopes and looking up lexical scopes. [This
is more or less how C<multi>s already behave.]
For methods this means looking down or up the inheritance tree; "managed set"
in this case translates to the intersection of all methods in the C<proto>'s
class or its subclasses with all C<multi> methods visible to the object in its
parent classes, that is, the parent classes of the object's actual type on
whose behalf the method was called. [Note, this is a change from prior
multi method semantics, which restricted multimethods to a single class;
the old semantics is equivalent to defining a C<proto> in every class that has
multimethods. The new way gives the user the ability to intermix C<multi>s at
different inheritance levels.]
Also, the old semantics of C<proto> providing the most-default C<multi> body
is hereby deprecated. Default C<multi>s should be marked with "C<is default>".
It is still possible to provide default behavior in the C<proto>, however, by
using it as a wrapper:
    my proto sub foo (@args) {
        do-something-before(@args);
        {*}  # call into the managed set, then come back
        do-something-after(@args);
    }
Note that this returns the value of C<do-something-after()>, not that of the C<multi>.
There are two ways to get around that. Here's one way:
    my proto sub foo (@args) {
        ENTER do-something-before(@args);
        {*}
        LEAVE do-something-after(@args);
    }
Alternately, you can spell out what C<{*}> is actually sugar for,
which would be some dispatcher macro such as:
    my proto sub foo (|$cap (@args)) {
        do-something-before(@args);
        my |$retcap := MULTI-DISPATCH-CALLWITH(&?ROUTINE,$cap);
        do-something-after(@args);
        return |$retcap;
    }
which optimizes (we hope) to an inlined multidispatcher to locate all
the candidates for these arguments (hopefully memoized), create the dynamic
scope of a dispatch, start the dispatch, manage C<callnext> and C<lastcall>
semantics, and return the result of whichever C<multi> succeeded, if any.
Which is why we have C<{*}> instead.
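A further possibility is to save the result of C<{*}> and return it
explicitly. This sketch assumes the implementation lets C<{*}> be used as
a term that yields the result of the dispatch, as current Rakudo appears
to. The names C<describe> and C<@events> are hypothetical.

```raku
my @events;

proto sub describe($x) {
    @events.push: "before";
    my $result = {*};        # dispatch to the managed multis
    @events.push: "after";
    $result                  # hand back the multi's value, not push's
}
multi sub describe(Int $x) { "Int $x" }
multi sub describe(Str $x) { "Str $x" }

say describe(42);       # Int 42
say describe("abc");    # Str abc
say @events;            # [before after before after]
```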
Another common variant would be to propagate control to the
outer/higher routine that would have been found if this one didn't
exist:
my proto method foo { {*}; UNDO nextsame; } # failover to super foo
Note that, in addition to making C<multi>s work similarly to each other,
the new C<proto> semantics greatly simplify top-level dispatchers, which
never have to worry about C<multi>s, because C<multi>s are always in the
second half of the double dispatch (again, just in the abstract, since
the first dispatch can often be optimized away, as if the C<proto> were
inlined). So in the abstract, C<foo()> only ever calls a single
C<only>/C<proto> routine, and we know which one it is at compile time.
This is less of a shift for method dispatch, which already assumed that there
is something like a single proto in each class that redispatches inside
the class. Here the change is that the multi-method dispatcher needs to look
more widely for its candidates than the current class. But note that our
semantics were inconsistent before, insofar as regex methods already had to
look for this larger managed set in order to do transitive LTM correctly.
Now the semantics of normal method C<proto>s and regex C<proto>s are nearly
identical, apart from the fact that regex candidate lists naturally have
fancier tiebreaking rules involving longest token matching.
A C<dispatch> must be generated for every scope that contains one or more C<multi>
declarations. This is done by searching backwards and outwards (or up the
inheritance chain for methods) for a C<proto> to instantiate. If no such
C<proto> is found, a "most generic" C<proto> will be generated, something like:
proto sub foo (*@, *%) {*}
proto method foo (*@, *%) {*}
Obviously, no named-to-positional remapping can be done in this case.
[Conjecture: we could instead autogen a more specific signature for
each such autogenerated C<dispatch> once we know its exact candidate
set, such that consistent use of positional parameter names is rewarded
with positional names in the generated signature, which could remap
named parameters.]
=head2 Named subroutines
The general syntax for named subroutines is any of:
my RETTYPE sub NAME ( PARAMS ) TRAITS {...} # lexical only
sub NAME ( PARAMS ) TRAITS {...} # same as "my"
our RETTYPE sub NAME ( PARAMS ) TRAITS {...} # package-scoped
The return type may also be put inside the parentheses:
sub NAME (PARAMS --> RETTYPE) {...}
Unlike in Perl 5, named subroutines are considered expressions,
so this is valid Perl 6:
my @subs = (sub foo { ... }, sub bar { ... });
Another difference is that subroutines default to C<my> scope rather
than C<our> scope. However, subroutine dispatch searches lexical
scopes outward, and subroutines are also allowed to be I<postdeclared>
after their use, so you won't notice this much. A subroutine that is
not declared yet may be called using parentheses around the arguments;
in the absence of parentheses, the subroutine call is assumed to take
multiple arguments in the form of a list operator.
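The points above can be sketched as follows, assuming a Rakudo-style
implementation. The names C<double>, C<plus-one>, and C<minus-one> are
hypothetical.

```raku
say double(21);                          # 42: the call precedes the declaration

sub double(Int $n --> Int) { $n * 2 }    # return type inside the signature

# Named subs are expressions, so they can appear as list elements.
my @subs = (sub plus-one($x) { $x + 1 }, sub minus-one($x) { $x - 1 });
say @subs[0](10);                        # 11
say @subs[1](10);                        # 9
```

The parentheses on C<double(21)> are what allow the call to be parsed
before the declaration has been seen.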
=head2 Anonymous subroutines
The general syntax for anonymous subroutines is:
sub ( PARAMS ) TRAITS {...}
But one can also use the C<anon> scope modifier to introduce the return type first:
anon RETTYPE sub ( PARAMS ) TRAITS {...}
When an anonymous subroutine will be assigned to a scalar variable,
the variable can be declared with the signature of the routines that
will be assigned to it:
my $grammar_factory:(Str, int, int --> Grammar);
$grammar_factory = sub (Str $name, int $n, int $x --> Grammar) { ... };
Covariance allows a routine whose return type is more derived than the one
in the scalar's signature to be assigned to that scalar.
Contravariance allows a routine whose parameter types are less derived
than those in the scalar's signature to be assigned to that scalar. The
compiler may choose to enforce (by type-checking) such assignments at
compile-time, if possible. Such type annotations are intended to help the
compiler optimize code to the extent such annotations are included and/or to
the extent they aid in type inference.
The same signature can be used to mark the type of a closure parameter to
another subroutine:
sub (int $n, &g_fact:(Str, int, int --> Grammar) --> Str) { ... }
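A small sketch of a signature-constrained closure parameter, assuming an
implementation that checks the constraint at bind time (current Rakudo
documents this for C<&>-sigiled parameters). The names C<apply-twice>
and C<$add-three> are hypothetical.

```raku
# &f must accept one Int and promise an Int in return.
sub apply-twice(&f:(Int --> Int), Int $x) { f(f($x)) }

my $add-three = sub (Int $n --> Int) { $n + 3 };
say apply-twice($add-three, 1);    # 7
```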
B<Trait> is the name for a compile-time (C<is>) property.
See L<"Properties and traits">.
=head2 Perl5ish subroutine declarations
You can declare a sub without a parameter list, as in Perl 5:
sub foo {...}
This is equivalent to one of:
sub foo () {...}
sub foo (*@_) {...}
sub foo (*%_) {...}
sub foo (*@_, *%_) {...}
depending on whether either or both of those variables are used in the body of the routine.
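For instance, in this sketch (the sub names are hypothetical) each body
mentions only one of the two variables, so only the corresponding slurpy
parameter is implied:

```raku
sub total   { [+] @_ }                    # behaves like: sub total (*@_)
sub options { %_.keys.sort.join(",") }    # behaves like: sub options (*%_)

say total(1, 2, 3);       # 6
say options(:b, :a);      # a,b
```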
Positional arguments implicitly come in via the C<@_> array, but
unlike in Perl 5 they are C<readonly> aliases to actual arguments:
sub say { print qq{"@_[]"\n}; } # args appear in @_
sub cap { $_ = uc $_ for @_ } # Error: elements of @_ are read-only
Also unlike in Perl 5, Perl 6 has true named arguments, which come in
via C<%_> instead of C<@_>. (To construct pseudo-named arguments that
come in via C<@_> as in Perl 5, the p5-to-p6 translator will define and use the ugly
C<< p5=> >> operator instead of Perl 6's C<< => >> Pair constructor.)
If you need to modify the elements of C<@_> or C<%_>, declare the
array or hash explicitly with the C<is rw> trait:
sub swap (*@_ is rw, *%_ is rw) { @_[0,1] = @_[1,0]; %_<status> = "Q:S"; }
Note: the C<rw> container trait is automatically distributed to the
individual elements by the slurpy star even though there is no
actual array or hash passed in. More precisely, the slurpy star
means the declared formal parameter is I<not> considered readonly; only
its elements are. See L</Parameters and arguments> below.
Note also that if the sub's block contains placeholder variables
(such as C<$^foo> or C<$:bar>), those are considered to be formal
parameters already, so in that case C<@_> or C<%_> fill the role of
sopping up unmatched arguments. That is, if those containers are
explicitly mentioned within the body, they are added as slurpy
parameters. This allows you to easily customize your error message
on unrecognized parameters. If they are not mentioned in the body,
they are not added to the signature, and normal dispatch rules will
simply fail if the signature cannot be bound.
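A minimal sketch of placeholder parameters, assuming Rakudo-style
behavior in which placeholders are ordered alphabetically and C<@_>
collects whatever is left over. The names C<hypotenuse> and
C<head-count> are hypothetical.

```raku
sub hypotenuse { sqrt($^a ** 2 + $^b ** 2) }   # formal parameters: $a, $b
say hypotenuse(3, 4);                           # 5

sub head-count { ($^head, @_.elems) }           # @_ soaks up the extras
say head-count("x", 1, 2, 3);                   # (x 3)
```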
=head2 Blocks
Raw blocks are also executable code structures in Perl 6.
Every block defines an object of type C<Block> (which C<does Callable>), which may either be
executed immediately or passed on as a C<Block> object. How a block is
parsed is context dependent.
A bare block where an operator is expected terminates the current
expression and will presumably be parsed as a block by the current
statement-level construct, such as an C<if> or C<while>. (If no
statement construct is looking for a block there, it's a syntax error.)
This form of bare block requires leading whitespace because a bare
block where a postfix is expected is treated as a hash subscript.
A bare block where a term is expected merely produces a C<Block> object.
If the term bare block occurs in a list, it is considered the final
element of that list unless followed immediately by a comma or colon
(intervening C<\h*> or "unspace" is allowed).
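As a brief sketch of the term case (the variable name C<$double> is
hypothetical), a bare block in term position is not run immediately but
yields an object that can be called later:

```raku
my $double = { $_ * 2 };    # term position: produces a Block, does not run it
say $double.^name;          # Block
say $double(21);            # 42
```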
=head2 "Pointy blocks"
Semantically the arrow operator C<< -> >> is almost a synonym for the
C<sub> keyword as used to declare an anonymous subroutine, insofar as
it allows you to declare a signature for a block of code. However,
the parameter list of a pointy block does not require parentheses,
and a pointy block may not be given traits. In most respects,
though, a pointy block is treated more like a bare block than like
an official subroutine. Syntactically, a pointy block may be used
anywhere a bare block could be used:
my $sq = -> $val { $val**2 };
say $sq(10); # 100
my @list = 1..3;
for @list -> $elem {
say $elem; # prints "1\n2\n3\n"
}
It also behaves like a block with respect to control exceptions.
If you C<return> from within a pointy block, the block is transparent
to the return; it will return from the innermost enclosing C<sub> or
C<method> (et al.), not from the block itself. It is referenced by C<&?BLOCK>,
not C<&?ROUTINE>.
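The transparency to C<return> can be sketched like this (the name
C<first-even> is hypothetical):

```raku
sub first-even(@nums) {
    for @nums -> $n {
        return $n if $n %% 2;   # leaves first-even itself, not just the block
    }
    Nil                         # reached only when no even number was found
}

say first-even([1, 3, 4, 5]);   # 4
```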
A normal pointy block's parameters default to C<readonly>, just like
parameters to a normal sub declaration. However, the double-pointy variant
defaults parameters to C<rw>:
for @list <-> $elem {
$elem++;
}
This form applies C<rw> to all the arguments:
for @kv <-> $key, $value {
$key ~= ".jpg";
$value *= 2 if $key ~~ :e;
}
=head2 Stub declarations
To predeclare a subroutine without actually defining it, use a "stub block":
sub foo {...} # Yes, those three dots are part of the actual syntax
The old Perl 5 form:
sub foo;
is a compile-time error in Perl 6 (because it would imply that the body of the
subroutine extends from that statement to the end of the file, as C<class> and
C<module> declarations do). The only allowed use of the semicolon form is to
declare a C<MAIN> sub--see L</Declaring a MAIN subroutine> below.
Redefining a stub subroutine does not produce an error, but redefining
an already-defined subroutine does. If you wish to redefine a defined sub,
you must explicitly use the "C<supersede>" declarator. (The compiler may
refuse to do this if it has already committed to the previous definition.)
The C<...> is the "yadayadayada" operator, which is executable but
returns a failure. You can also use C<???> to produce a warning,
or C<!!!> to always die. These also officially define stub blocks.
Any of these yada operators will be taken as a stub if used as the main
operator of the first statement in the block. (Statement modifiers
are allowed on that statement.) The yada operators differ from their
respective named functions in that they all default to a message
such as: "Unimplemented stub of sub foo was executed".
It has been argued that C<...> as literal syntax is confusing when
you might also want to use it for metasyntax within a document.
Generally this is not an issue in context; it's never an issue in the
program itself, and the few places where it could be an issue in the
documentation, a comment will serve to clarify the intent, as above.
The rest of the time, it doesn't really matter whether the reader
takes C<...> as literal or not, since the purpose of C<...> is to
indicate that something is missing whichever way you take it.
=head2 Globally scoped subroutines
Subroutines and variables can be declared in the global namespace
(or any package in the global namespace), and are thereafter visible
everywhere in the program via the GLOBAL package (or one of its
subpackages). They may be made directly visible by importation,
but may not otherwise be called with a bare identifier, since subroutine
dispatch only looks in lexical scopes.
Global subroutines and variables are normally referred to by prefixing
their identifiers with the C<*> twigil, to allow dynamically scoped overrides.
GLOBAL::<$next_id> = 0;
sub GLOBAL::saith($text) { say "Yea verily, $text" }
    module A {
        my $next_id = 2;      # hides any global or package $next_id
        &*saith($next_id);    # print the lexical $next_id;
        &*saith($*next_id);   # print the dynamic $next_id;
    }
To disallow dynamic overrides, you must access the globals directly:
GLOBAL::saith($GLOBAL::next_id);
The fact that this is verbose is construed to be a feature. Alternately,
you may play aliasing tricks like this:
    module B {
        import GLOBAL <&saith $next_id>;
        saith($next_id);      # Unambiguously the global definitions
    }
Despite the fact that subroutine dispatch only looks in lexical scopes, you
can always call a package subroutine directly if there's a lexical alias
to it, as the C<our> declarator does:
module C;
our sub saith($text) { say "Yea verily, $text" }
saith("I do!");    # okay
C::saith("I do!"); # also okay
=head2 Dynamically scoped subroutines
Similarly, you may define dynamically scoped subroutines:
my sub myfunc ($x) is dynamic { ... }
my sub &*myfunc ($x) { ... } # same thing
This may then be invoked via the syntax for dynamic variables:
&*myfunc(42);
=head2 Lvalue subroutines
Lvalue subroutines return a "proxy" object that can be assigned to.
It's known as a proxy because the object usually represents the
purpose or outcome of the subroutine call.
Subroutines are specified as being lvalue using the C<is rw> trait.
An lvalue subroutine may return a variable:
my $lastval;
sub lastval () is rw { return $lastval }
or the result of some nested call to an lvalue subroutine:
sub prevval () is rw { return lastval() }
or a specially tied proxy object, with suitably programmed
C<FETCH> and C<STORE> methods:
    sub checklastval ($passwd) is rw {
        return Proxy.new:
            FETCH => method {
                return lastval();
            },
            STORE => method ($val) {
                die unless check($passwd);
                lastval() = $val;
            };
    }
Other methods may be defined for specialized purposes such as temporizing
the value of the proxy.
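A runnable sketch of both styles under a present-day Rakudo-style
implementation follows. It relies on the value of the last statement
rather than on C<return>, on the assumption that a plain C<return> may
strip the container in such an implementation. The names C<capped> and
C<$stored> are hypothetical.

```raku
my $lastval = 1;
sub lastval() is rw { $lastval }    # hands back the variable itself
lastval() = 5;
say $lastval;                       # 5

my $stored = 0;
sub capped() is rw {
    Proxy.new(
        FETCH => method ()     { $stored },
        STORE => method ($val) { $stored = ($val min 100) },
    )
}
capped() = 250;                     # STORE clamps the value to 100
say capped();                       # 100
```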
=head2 Operator overloading
Operators are just subroutines with special names and scoping.
An operator name consists of a grammatical category name followed by
a single colon followed by an operator name specified as if it were
one or more strings. So any of these indicates the same binary addition operator:
infix:<+>
infix:«+»
infix:<<+>>
infix:['+']
infix:["+"]
Use the C<&> sigil just as you would on ordinary subs.
Unary operators are defined as C<prefix> or C<postfix>:
sub prefix:<OPNAME> ($operand) {...}
sub postfix:<OPNAME> ($operand) {...}
Binary operators are defined as C<infix>:
sub infix:<OPNAME> ($leftop, $rightop) {...}
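As a concrete sketch of these forms (the operator names C<!> and C<avg>
are chosen only for illustration):

```raku
sub postfix:<!>(Int $n)   { [*] 1..$n }        # factorial as a postfix operator
sub infix:<avg>($a, $b)   { ($a + $b) / 2 }    # a word-named binary operator

say 5!;          # 120
say 4 avg 8;     # 6
```

Both operators are lexically scoped, like any other C<sub> declared
without C<our>.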
Bracketing operators are defined as C<circumfix> where a term is expected
or C<postcircumfix> where a postfix is expected. A two-element slice
containing the leading and trailing delimiters is the name of the
operator.
sub circumfix:<LEFTDELIM RIGHTDELIM> ($contents) {...}
sub circumfix:['LEFTDELIM','RIGHTDELIM'] ($contents) {...}
Contrary to Apocalypse 6, there is no longer any rule about splitting an even
number of characters. You must use a two-element slice. Such names
are canonicalized to a single form within the symbol table, so you
must use the canonical name if you wish to subscript the symbol table
directly (as in C<< PKG::{'infix:<+>'} >>). Otherwise any form will
do. (Symbolic references do not count as direct subscripts since they
go through a parsing process.) The canonical form always uses angle
brackets and a single space between slice elements. Any angle brackets
within the elements are backslash-escaped, so C<< PKG::circumfix:['<','>'] >> is canonicalized
to C<<< PKG::{'circumfix:<\< \>>'} >>>, and decanonicalizing may always
be done left-to-right.
Operator names can be any sequence of non-whitespace characters
including Unicode characters. For example:
sub infix:<(c)> ($text, $owner) { return $text but Copyright($owner) }
method prefix:<±> (Num $x --> Num) { return +$x | -$x }
multi sub postfix:<!> (Int $n) { $n < 2 ?? 1 !! $n*($n-1)! }
macro circumfix:«<!-- -->» ($text) is parsed / .*? / { "" }
my $document = $text (c) $me;
my $tolerance = ±7!;
<!-- This is now a comment -->
Whitespace may never be part of the name (except as separator
within a C<< <...> >> or C<«...»> slice subscript, as in the example above).
A null operator name does not define a null or whitespace operator, but
a default matching subrule for that syntactic category, which is useful when
there is no fixed string that can be recognized, such as tokens beginning
with digits. Such an operator I<must> supply an C<is parsed> trait.
The Perl grammar uses a default subrule for the C<:1st>, C<:2nd>, C<:3rd>,
etc. regex modifiers, something like this:
sub regex_mod_external:<> ($x) is parsed(token { \d+[st|nd|rd|th] }) {...}
Such default rules are attempted in the order declared. (They always follow
any rules with a known prefix, by the longest-token-first rule.)
Although the name of an operator can be installed into any package or
lexical namespace, the syntactic effects of an operator declaration are
always lexically scoped. Operators other than the standard ones should
not be installed into the C<*> namespace. Always use exportation to make
non-standard syntax available to other scopes.
=head1 Parameters and arguments
Perl 6 subroutines may be declared with parameter lists.
By default, all parameters are readonly aliases to their corresponding
arguments--the parameter is just another name for the original
argument, but the argument can't be modified through it. This is
vacuously true for value arguments, since they may not be modified in
any case. However, the default forces any container argument to also
be treated as an immutable value. This extends down only one level;
an immutable container may always return an element that is mutable if
it so chooses. (For this purpose a scalar variable is not considered
a container of its singular object, though, so the top-level object
within a scalar variable is considered immutable by default. Perl 6
does not have references in the same sense that Perl 5 does.)
To allow modification, use the C<is rw> trait. This requires a mutable
object or container as an argument (or some kind of type object that
can be converted to a mutable object, such as might be returned
by an array or hash that knows how to autovivify new elements).
Otherwise the signature fails to bind, and this candidate routine
cannot be considered for servicing this particular call. (Other multi
candidates, if any, may succeed if they don't require C<rw> for this
parameter.) In any case, failure to bind does not by itself cause
an exception to be thrown; that is completely up to the dispatcher.
To pass-by-copy, use the C<is copy> trait. An object container will
be cloned whether or not the original is mutable, while an (immutable)
value will be copied into a suitably mutable container. The parameter
may bind to any argument that meets the other typological constraints
of the parameter.
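The difference between C<is rw> and C<is copy> can be sketched as
follows (the names C<bump> and C<bumped-copy> are hypothetical):

```raku
sub bump($n is rw)           { $n++ }        # modifies the caller's variable
sub bumped-copy($n is copy)  { $n++; $n }    # modifies a private copy only

my $x = 1;
bump($x);
say $x;                  # 2
say bumped-copy($x);     # 3
say $x;                  # 2, the caller's variable is untouched
```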
If you have a readonly parameter C<$ro>, it may never be passed on to
a C<rw> parameter of a subcall, whether or not C<$ro> is currently
bound to a mutable object. It may only be rebound to readonly or
copy parameters. It may also be rebound to a parcel parameter (see
"C<is parcel>" below), but modification will fail as in the case where
an immutable value is bound to a C<parcel> parameter.
Aliases of C<$ro> are also readonly, whether generated explicitly with C<:=>
or implicitly within a C<Capture> object (which are themselves immutable).
Also, C<$ro> may not be returned from an lvalue subroutine or method.
Parameters may be required or optional. They may be passed by position,
or by name. Individual parameters may confer an item or list context
on their corresponding arguments, but unlike in Perl 5, this is decided
lazily at parameter binding time.
Arguments destined for required positional parameters must come before
those bound to optional positional parameters. Arguments destined
for named parameters may come before and/or after the positional
parameters. (To avoid confusion it is highly recommended that all
positional parameters be kept contiguous in the call syntax, but
this is not enforced, and custom arg list processors are certainly
possible on those arguments that are bound to a final slurpy or
arglist variable.)
A signature containing a name collision is considered a compile time
error. A name collision can occur between positional parameters, between
named parameters, or between a positional parameter and a named one.
The sigil is not considered in such a comparison, except in the case of
two positional parameters -- in other words, a signature in which two
or more parameters are identical except for the sigil is still OK (but
you won't be able to pass values by that name).
:($a, $a) # wrong, two $a
:($a, @a) # OK (but don't do that)
:($a, :a($b)) # wrong, one a, one a through renaming
:($a, :a(@b)) # wrong
:(:$a, :@a) # wrong
=head2 Named arguments
Named arguments are recognized syntactically at the "comma" level.
Since parameters are identified using identifiers, the recognized
syntaxes are those where the identifier in question is obvious.
You may use either the adverbial form, C<:name($value)>, or the
autoquoted arrow form, C<< name => $value >>. These must occur at
the top "comma" level, and no other forms are taken as named pairs
by default. Pairs intended as positional arguments rather than named
arguments may be indicated by extra parens or by explicitly quoting
the key to suppress autoquoting:
doit :when<now>,1,2,3; # always a named arg
doit (:when<now>),1,2,3; # always a positional arg
doit when => 'now',1,2,3; # always a named arg
doit (when => 'now'),1,2,3; # always a positional arg
doit 'when' => 'now',1,2,3; # always a positional arg
Only bare keys with valid identifier names are recognized as named arguments:
doit when => 'now'; # always a named arg
doit 'when' => 'now'; # always a positional arg
doit 123 => 'now'; # always a positional arg
doit :123<now>; # always a positional arg
Going the other way, pairs intended as named arguments that don't look
like pairs must be introduced with the C<|> prefix operator:
$pair = :when<now>;
doit $pair,1,2,3; # always a positional arg
doit |$pair,1,2,3; # always a named arg
doit |get_pair(),1,2,3; # always a named arg
doit |('when' => 'now'),1,2,3; # always a named arg
Note the parens are necessary on the last one due to precedence.
Likewise, if you wish to pass a hash and have its entries treated as
named arguments, you must dereference it with a C<|>:
%pairs = (:when<now>, :what<any>);
doit %pairs,1,2,3; # always a positional arg
doit |%pairs,1,2,3; # always named args
doit |%(get_pair()),1,2,3; # always a named arg
doit |%('when' => 'now'),1,2,3; # always a named arg
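One way to see which slot each form lands in is to give C<doit> a slurpy array and a slurpy hash and count what arrives. The body of C<doit> below is a hypothetical stand-in, not something this document defines, and the sketch assumes an implementation that follows the rules above.

```raku
# Hypothetical doit: reports how many arguments arrived in each slot.
sub doit (*@pos, *%named) { "{@pos.elems} positional, {%named.elems} named" }

my $pair  = :when<now>;
my %pairs = (:when<now>, :what<any>);

my @seen =
    doit(:when<now>, 1, 2, 3),        # 3 positional, 1 named
    doit((:when<now>), 1, 2, 3),      # 4 positional, 0 named
    doit('when' => 'now', 1, 2, 3),   # 4 positional, 0 named
    doit($pair, 1, 2, 3),             # 4 positional, 0 named
    doit(|$pair, 1, 2, 3),            # 3 positional, 1 named
    doit(|%pairs, 1, 2, 3);           # 3 positional, 2 named
```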
Variables with a C<:> prefix in rvalue context autogenerate pairs, so you
can also say this:
$when = 'now';
doit $when,1,2,3; # always a positional arg of 'now'
doit :$when,1,2,3; # always a named arg of :when<now>
In other words C<:$when> is shorthand for C<:when($when)>. This works
for any sigil:
    :$what      :what($what)
    :@what      :what(@what)
    :%what      :what(%what)
    :&what      :what(&what)
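The shorthand can be checked by building the pairs directly. The variable names below are illustrative only.

```raku
my $when = 'now';
my @what = 1, 2, 3;

my $short = :$when;         # shorthand form
my $long  = :when($when);   # longhand form, same key and value
my $list  = :@what;         # key is "what", value is the array
```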
Ordinary hash notation will just pass the value of the hash entry as a
positional argument regardless of whether it is a pair or not.
To pass both key and value out of a hash as a positional pair, use C<:p>
instead:
doit %hash<a>:p,1,2,3;
doit %hash{'b'}:p,1,2,3;
The C<:p> stands for "pairs", not "positional"--the C<:p> adverb may be
placed on any C<Associative> access subscript to make it mean "pairs" instead of "values".
If you want the pair (or pairs) to be interpreted as named arguments,
you may do so by prefixing with the C<< prefix:<|> >> operator:
doit |(%hash<a>:p),1,2,3;
doit |(%hash{'b'}:p),1,2,3;
(The parens are required to keep the C<:p> adverb from attaching to the C<< prefix:<|> >> operator.)
C<Pair> constructors are recognized syntactically at the call level and
put into the named slot of the C<Capture> structure. Hence they may be
bound to positionals only by name, not as ordinary positional C<Pair>
objects. Leftover named arguments can be slurped into a slurpy hash.
Because named and positional arguments can be freely mixed, the
programmer always needs to disambiguate pair literals from named
arguments with parentheses or quotes:
# Named argument "a"
push @array, 1, 2, :a<b>;
# Pair object (a=>'b')
push @array, 1, 2, (:a<b>);
push @array, 1, 2, 'a' => 'b';
Perl 6 allows multiple same-named arguments, and records the relative
order of arguments with the same name. When more than one argument
has the same name, the C<@> sigil in the parameter list causes the
arguments to be concatenated:
sub fun (Int @x) { ... }
fun( x => 1, x => 2 ); # @x := (1, 2)
fun( x => (1, 2), x => (3, 4) ); # @x := (1, 2, 3, 4)
Other sigils bind only to the I<last> argument with that name:
sub fun (Int $x) { ... }
fun( x => 1, x => 2 ); # $x := 2
fun( x => (1, 2), x => (3, 4) ); # $x := (3, 4)
This means a hash holding default values must come I<before> known named
parameters, similar to how hash constructors work:
# Allow "x" and "y" in %defaults to be overridden
f( |%defaults, x => 1, y => 2 );
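A minimal sketch of this override pattern follows. The body of C<f> and the contents of C<%defaults> are hypothetical, chosen only to make the effect visible.

```raku
# Hypothetical routine and defaults hash, for illustration only.
sub f (:$x, :$y, :$z) { "x=$x y=$y z=$z" }

my %defaults = x => 0, y => 0, z => 9;

# The explicit x and y come after the flattened hash, so they win;
# z is not overridden and keeps its value from %defaults.
my $result = f( |%defaults, x => 1, y => 2 );   # "x=1 y=2 z=9"
```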
=head2 Invocant parameters
A method invocant may be specified as the first parameter in the parameter
list, with a colon (rather than a comma) immediately after it:
method get_name ($self:) {...}
method set_name ($_: $newname) {...}
The corresponding argument (the invocant) is evaluated in item context
and is passed as the left operand of the method call operator:
print $obj.get_name();
$obj.set_name("Sam");
The invocant is actually stored as the first positional argument of a C<Capture>
object. It is special only to the dispatcher, otherwise it's just a normal
positional argument.
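The declarations and calls above can be put together as follows. The C<Person> class and its C<name> attribute are hypothetical, introduced only so that the methods have something to act on.

```raku
# Hypothetical class, for illustration only.
class Person {
    has $.name is rw;
    method get_name ($self:)          { $self.name }
    method set_name ($self: $newname) { $self.name = $newname }
}

my $obj = Person.new(name => "Pat");
$obj.set_name("Sam");          # $obj is bound to $self
my $name = $obj.get_name();    # "Sam"
```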
Single-dispatch semantics may also be requested by using the indirect object syntax, with a colon
after the invocant argument. The colon is just a special form of the comma, and has the
same precedence:
set_name $obj: "Sam";
$obj.set_name("Sam"); # same as the above
An invocant is the topic of the corresponding method if that formal
parameter is declared with the name C<$_>. A method's invocant
always has the alias C<self>. Other styles of self can be declared
with the C<self> pragma.
If you have a call of the form:
foo(|$capture)
the compiler must defer until run time the decision whether to treat it
as a method or function dispatch, since that depends on whether the
supplied C<Capture>'s first argument is marked as an invocant. For
ordinary calls, however, this can always be determined at compile time.
=head2 Longname parameters
A routine marked with C<multi> can mark part of its parameters to
be considered in the multi dispatch. These are called I<longnames>;
see S12 for more about the semantics of multiple dispatch.
You can choose part of a C<multi>'s parameters to be its longname
by putting a double semicolon after the last parameter to be included:
multi sub handle_event ($window, $event;; $mode) {...}
multi method set_name ($self: $name;; $nick) {...}
A parameter list may have at most one double semicolon; parameters
after it are never considered for multiple dispatch (except of course
that they can still "veto" if their number or types mismatch).
[Conjecture: It might be possible for a routine to advertise multiple
long names, delimited by single semicolons. See S12 for details.]
If the parameter list for a C<multi> contains no semicolons to delimit
the list of important parameters, then all positional parameters are
considered important. If it's a C<multi method> or C<multi submethod>,
an additional implicit unnamed C<self> invocant is added to the
signature list unless the first parameter is explicitly marked with a colon.
=head2 Required parameters
Required parameters are specified at the start of a subroutine's parameter
list:
sub numcmp ($x, $y) { return $x <=> $y }
Required parameters may optionally be declared with a trailing C<!>,
though that's already the default for positional parameters:
sub numcmp ($x!, $y!) { return $x <=> $y }
The corresponding arguments are evaluated in item context and may be
passed positionally or by name. To pass an argument by name,
specify it as a pair: C<< I<parameter_name> => I<argument_value> >>.
$comparison = numcmp(2,7);
$comparison = numcmp(x=>2, y=>7);
$comparison = numcmp(y=>7, x=>2);
Pairs may also be passed in adverbial pair notation:
$comparison = numcmp(:x(2), :y(7));
$comparison = numcmp(:y(7), :x(2));
Passing the wrong number of required arguments to a normal subroutine
is a fatal error. Passing a named argument that cannot be bound to a normal
subroutine is also a fatal error. (Methods are different.)
The number of required parameters a subroutine has can be determined by
calling its C<.arity> method:
$args_required = &foo.arity;
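For instance, assuming the two routines below (the second has a placeholder body and optional parameters marked with C<?>), C<.arity> counts only the required positional parameters:

```raku
sub numcmp    ($x, $y)              { $x <=> $y }
sub my_substr ($str, $from?, $len?) { $str }      # placeholder body

my $two = &numcmp.arity;      # 2: both parameters are required
my $one = &my_substr.arity;   # 1: only $str is required
```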
=head2 Optional parameters
Optional positional parameters are specified after all the required
parameters and each is marked with a C<?> after the parameter:
sub my_substr ($str, $from?, $len?) {...}
Alternatively, optional parameters may be marked by supplying a default value.
The C<=> sign introduces a default value:
sub my_substr ($str, $from = 0, $len = Inf) {...}
Default values can be calculated at run-time. They may even use the values of
preceding parameters:
sub xml_tag ($tag, $endtag = matching_tag($tag) ) {...}
Arguments that correspond to optional parameters are evaluated in
item context. They can be omitted, passed positionally, or passed by
name:
my_substr("foobar"); # $from is 0, $len is infinite
my_substr("foobar",1); # $from is 1, $len is infinite
my_substr("foobar",1,3); # $from is 1, $len is 3
my_substr("foobar",len=>3); # $from is 0, $len is 3
Missing optional arguments default to their default values, or to
an undefined value if they have no default. (A supplied argument that is
undefined is not considered to be missing, and hence does not trigger
the default. Use C<//=> within the body for that.)
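The distinction between a missing argument and an undefined one can be sketched as follows. Both routines are hypothetical and exist only to show when a default is applied.

```raku
# Hypothetical routines, for illustration only.
sub label     ($text = 'default')      { $text.defined ?? $text !! 'undefined' }
sub range_end ($from, $to = $from + 10) { $to }   # default uses an earlier parameter

my $missing  = label();        # 'default': argument omitted, default applies
my $supplied = label(Any);     # 'undefined': an undefined argument was supplied
my $end      = range_end(5);   # 15: computed from $from at call time
```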
You may check whether an optional parameter was bound to anything
by calling C<VAR($param).defined>.
=head2 Named parameters
Named-only parameters follow any required or optional parameters in the
signature. They are marked by a prefix C<:>:
sub formalize($text, :$case, :$justify) {...}
This is actually shorthand for:
sub formalize($text, :case($case), :justify($justify)) {...}
If the longhand form is used, the label name and variable name can be
different:
sub formalize($text, :case($required_case), :justify($justification)) {...}
so that you can use more descriptive internal parameter names without
imposing inconveniently long external labels on named arguments.
Multiple name wrappings may be given; this allows you to give both a
short and a long external name:
sub globalize (:g(:global($gl))) {...}
Or equivalently:
sub globalize (:g(:$global)) {...}
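A short sketch of how these declarations behave at the call site is shown here. The routine bodies are hypothetical, since the declarations above leave them elided.

```raku
# Hypothetical bodies, for illustration only.
sub formalize ($text, :case($required_case), :justify($justification)) {
    # The external label is "case"; the internal name is $required_case.
    ($required_case // '') eq 'upper' ?? $text.uc !! $text
}
sub globalize (:g(:$global)) { $global }

my $shout = formalize("hello", case => 'upper');   # "HELLO"
my $short = globalize(:g);                         # True, via the short name
my $long  = globalize(:global);                    # True, via the long name
```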
Arguments that correspond to named parameters are evaluated in item
context. They can only be passed by name, so it doesn't matter what
order you pass them in:
$formal = formalize($title, case=>'upper');
$formal = formalize($title, justify=>'left');
$formal = formalize($title, :justify<right>, :case<title>);
See S02 for the correspondence between adverbial form and arrow notation.
While named and positional arguments may be intermixed, it is suggested
that you keep all the positionals in one place for clarity unless you
have a good reason not to. This is likely bad style:
$formal = formalize(:justify<right>, $title, :case<title>, $date);
Named parameters are optional unless marked with a following C<!>.