-
Notifications
You must be signed in to change notification settings - Fork 292
/
syntax.pod6
595 lines (407 loc) · 17.9 KB
/
syntax.pod6
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
=begin pod
=TITLE Syntax
=SUBTITLE General rules of Perl 6 syntax
Perl 6 borrows many concepts from human language. Which is not surprising,
considering it was designed by a linguist.
It reuses common elements in different contexts, has the notion of nouns
(terms) and verbs (operators), is context-sensitive (in the every day sense, not
necessarily in the Computer Science interpretation), so a symbol can have a
different meaning depending on whether a noun or a verb is expected.
It is also self-clocking, so that the parser can detect most of the common
errors and give good error messages.
=head1 Lexical Conventions
Perl 6 code is Unicode text, current implementations support UTF-8 as the
input encoding. More detail on Perl 6 syntax and Unicode is discussed
in L<Unicode versus Texas symbols|/language/unicode_texas> as well as
parts of other documents related to those topics.
=head2 Free Form
It is free-form, in the sense that you are mostly free to
chose the amount of whitespace you chose, though in some cases, the presence
or absence of whitespace carries meaning.
So you can write
=begin code
if True {
say "Hello";
}
=end code
or
=begin code
if True {
say "Hello";
}
=end code
or
=begin code
if True { say "Hello" }
=end code
or even
=begin code
if True {say "Hello"}
=end code
though you can't leave out any of the remaining whitespace.
=head2 X<Unspace|syntax,\>
In many places where the compiler would not allow a space you can use any
whitespace that is quoted with a backslash. Unspaces in tokens are not
supported. Newlines that are unspaced still count when the compiler produces
line numbers. Use cases for unspace are separation of postfix operators and
routine argument lists.
sub alignment(+@l) { +@l };
sub long-name-alignment(+@l) { +@l };
alignment\ (1,2,3,4).say;
long-name-alignment(3,5)\ .say;
=head2 Separating Statements
A Perl 6 program is a list of statements, separated by semicolons C<;>.
A semicolon after the final statement (or after the final statement inside a
block) is optional, though it's good form to include it.
A closing curly brace followed by a newline character implies a statement
separator, which is why you don't need to write a semicolon after an C<if>
statement block.
=begin code
if True {
say "Hello";
}
say "world";
=end code
Both semicolons are optional here, but leaving them out increases the chance
of syntax errors when adding more lines later.
You do need to include a semicolon between the C<if> block and the say statement if you want them all on one line.
=begin code
if True { say "Hello" }; say "world";
# ^^^ this ; is required
=end code
=head2 Comments
Comments are parts of the program text only intended for human readers, and
the Perl 6 compilers does not evaluate them as program text.
Comments count as whitespace in places where the absence or presence of
whitespace disambiguates possible parses.
=head3 Single-line comments
The most common form of comments in Perl 6 starts with a single hash character
C<#> and goes until the end of the line.
=begin code
if $age > 250 { # catch obvious outliers
# this is another comment!
die "That doesn't look right"
}
=end code
=head3 Multi-line / embedded comments
Multi-line and embedded comments start with a hash character, followed by a
backtick, and then some opening bracketing character, and end with the matching
closing bracketing character. The content can not only span multiple lines, but
can also be embedded inline.
=begin code
if #`( why would I ever write an inline comment here? ) True {
say "something stupid";
}
=end code
Brackets inside the comment can be nested, so in C<#`{ a { b } c }>, the
comment goes until the very end of the string. You may also use more complex
brackets, such as C<#`{{ double-curly-brace }}>, which might help disambiguate
from nested brackets.
=head3 Pod comments
Pod syntax can be used for multi-line comments
=begin code
say "this is code";
=begin comment
Here are several
lines
of comment
=end comment
say 'code again';
=end code
=head2 Identifiers
Identifiers are a grammatical building block that occur in several places. An
identifier is a primitive name, and must start with an alphabetic character
(or an underscore), followed by zero or more word characters (alphabetic,
underscore or number). You can also embed dashes C<-> or single quotes C<'>
in the middle, but not two in a row, and only if followed immediately by an
alphabetic character.
# valid identifiers:
x
something-longer
with-numbers1234
don't
fish-food
π
# not valid identifiers:
with-numbers1234-5
42
is-prime?
fish--food
Names of constants, types (including classes and modules) and routines (subs
and methods) are identifiers, and they also appear in variable names (usually
proceeded by a sigil; see L<variables|/language/variables> for more details.)
Namespaces are provided by L<packages|/language/packages>. By separating
identifiers with double colons, the right most name is inserted into existing
or automatically created packages.
my Int $Foo::Bar::buzz = 42;
dd $Foo::Bar::buzz; # OUTPUT«Int $v = 42»
Identifiers can contain colon pairs. The entire colon pair becomes part of the
name of the identifier.
my $foo:bar = 1;
my $foo:bar<2> = 2;
dd MY::.keys;
# OUTPUT«("\$=pod", "!UNIT_MARKER", "EXPORT", "\$_", "\$!", "::?PACKAGE",
# "GLOBALish", "\$¢", "\$=finish", "\$/", "\$foo:bar<1>", "\$foo:bar<2>",
# "\$?PACKAGE").Seq»
Unicode superscript numerals are exponents and not part of an identifier;
e.g. C<$x²> does the square of variable C<$x>. Subscript numerals are TBD.
=head1 Statements and Expressions
Perl 6 programs are made of lists of statements. A special case of a statement
is an I<expression>, which returns a value. For example C<if True { say 42 }>
is syntactically a statement, but not an expression, whereas C<1 + 2> is an
expression (and thus also a statement).
The C<do> prefix turns statements into expressions. So while
my $x = if True { 42 }; # Syntax error!
is an error,
my $x = do if True { 42 };
assigns the return value of the if statement (here C<42>) to the variable
C<$x>.
=head1 Terms
Terms are the basic nouns that, optionally together with operators, can form
expressions. Examples for terms are variables (C<$x>), barewords such as type
names (C<Int>), literals (C<42>), declarations (C<sub f() { }>) and calls
(C<f()>).
For example, in the expression C<2 * $salary>, C<2> and C<$salary> are two
terms (an L<integer|/type/Int> literal and a L<variable|/language/variables>).
=head2 Variables
Variables typically start with a special character called the I<sigil>, and
are followed by an identifier. Variables must be declared before you can use
them.
# declaration:
my $number = 21;
# usage:
say $number * 2;
See the L<documentation on variables|/language/variables> for more details.
=head2 Barewords (Constants, Type Names)
Pre-declared identifiers can be terms on their own. Those are typically type
names or constants, but also the term C<self> which refers to an object that
a method was called on (see L<objects|/language/objects>), and sigilless
variables:
say Int; # (Int)
# ^^^ type name (built in)
constant answer = 42;
say answer;
# ^^^^^^ constant
class Foo {
method type-name {
self.^name;
# ^^^^ built-in term 'self'
}
}
say Foo.type-name; # Foo
# ^^^ type name
=head2 Packages and Qualified Names
Named entities, such as variables, constants, classes, modules, subs, etc, are
part of a namespace. Nested parts of a name use C<::> to separate the
hierarchy. Some examples:
$foo # simple identifiers
$Foo::Bar::baz # compound identifiers separated by ::
$Foo::($bar)::baz # compound identifiers that perform interpolations
Foo::Bar::bob(23) # function invocation given qualified name
See the L<documentation on packages|/language/packages> for more details.
=head2 Literals
A L<literal|https://en.wikipedia.org/wiki/Literal_%28computer_programming%29>
is a representation of a constant value in source code. Perl 6 has literals
for several built-in types, like L<strings|/type/Str>, several numeric types,
L<pairs|/type/Pair> and more.
=head3 String literals
String literals are surrounded by quotes:
say 'a string literal';
say "a string literal\nthat interprets escape sequences";
See L<quoting|/language/quoting> for many more options.
=head3 X<Number literals|0b (radix form);0d (radix form);0o (radix form);0x (radix form)>
Number literals are generally specified in base ten, unless a prefix like
C<0x> (heB<x>adecimal, base 16), C<0o> (B<o>ctal, base 8) or C<0b>
(B<b>inary, base 2) or an explicit base in adverbial notation like
C<< :16<A0> >> specifies it otherwise. Unlike other programming languages,
leading zeros do I<not> indicate base 8; instead a compile-time warning is
issued.
In all literal formats, you can use underscores to group digits;
they don't carry any semantic; the following literals all evaluate to the same
number:
1000000
1_000_000
10_00000
100_00_00
=head4 Int literals
Integers default to signed base-10, but you can use other bases. For details,
see L<Int|/type/Int>.
-2 # actually not a literal, but unary - operator applied to numeric literal 2
12345
0xBEEF # base 16
0o755 # base 8
:3<1201> # arbitrary base, here base 3
=head4 Rat literals
L<Rat|/type/Rat> literals (rationals) are very common, and take the place of decimals or floats in many other languages. Integer division also results in a C<Rat>.
1.0
3.14159
-2.5 # Not actually a literal, but still a Rat
:3<21.0012> # Base 3 rational
⅔
2/3 # Not actually a literal, but still a Rat
=head4 Num literals
Scientific notation with an integer exponent to base ten after an C<e> produces
L<floating point number|/type/Num>:
1e0
6.022e23
1e-9
-2e48
2e2.5 # error
=head4 Complex literals
L<Complex|/type/Complex> numbers are written either as an imaginary number
(which is just a rational number with postfix C<i> appended), or as a sum of
a real and an imaginary number:
1+2i
6.123e5i # note that this is 6.123e5 * i and not 6.123 * 10 ** (5i)
=head3 Pair literals
L<Pairs|/type/Pair> are made of a key and a value, and there are two basic forms for
constructing them: C<< key => 'value' >> and C<:key('value')>.
=head4 Arrow pairs
Arrow pairs can have an expression or an identifier on the left-hand side:
identifier => 42
"identifier" => 42
('a' ~ 'b') => 1
=head4 Adverbial pairs (colon pairs)
Short forms without explicit values:
my $thing = 42;
:$thing # same as thing => $thing
:thing # same as thing => True
:!thing # same as thing => False
The variable form also works with other sigils, like C<:&callback> or
C<:@elements>.
Long forms with explicit values:
:thing($value) # same as thing => $value
:thing<quoted list> # same as thing => <quoted list>
:thing['some', 'values'] # same as thing => ['some', 'values']
:thing{a => 'b'} # same as thing => { a => 'b' }
=head3 Array literals
A pair of square brackets can surround an expression to form an itemized
L<Array|/type/Array> literal; typically there is a comma-delimited list
inside:
say ['a', 'b', 42].join(' '); # a b 42
# ^^^^^^^^^^^^^^ Array constructor
The array constructor flattens non-itemized arrays and lists, but not itemized
arrays themselves:
my @a = 1, 2;
# flattens:
say [@a, 3, 4].elems; # 4
# does not flatten:
say [[@a], [3, 4]].elems; # 2
=head3 Hash literals
A pair of curly braces can surround a list of pairs to form a
L<Hash|/type/Hash> literal; typically there is a comma-delimited list of pairs
inside. If a non-pair is used, it is assumed to be a key and the next element
is the value. Most often this is used with simple arrow pairs.
say { a => 3, b => 23, :foo, :dog<cat>, "french", "fries" };
# a => 3, b => 23, dog => cat, foo => True, french => fries
say {a => 73, foo => "fish"}.keys.join(" "); # a foo
# ^^^^^^^^^^^^^^^^^^^^^^^^ Hash constructor
When assigning to a C<%> sigil variable, the curly braces are optional.
my %ages = fred => 23, jean => 87, ann => 4;
By default keys in C<{ }> are forced to strings. To compose a hash with
non-string keys, use a colon prefix:
my $when = :{ (now) => "Instant", (DateTime.now) => "DateTime" };
Note that with objects as keys, you cannot access non-string keys as strings:
:{ -1 => 41, 0 => 42, 1 => 43 }<0>; # Any
:{ -1 => 41, 0 => 42, 1 => 43 }{0}; # 42
=head3 Regex literals
A L<Regex|/type/Regex> is declared with slashes like C</foo/>. Note that this C<//> syntax is shorthand for the full C<rx//> syntax.
/foo/ # Short version
rx/foo/ # Longer version
Q :regex /foo/ # Even longer version
my $r = /foo/; # Regexes can be assigned to variables
=head3 Signature literals
Signatures can be used standalone for pattern matching, in addition to the
typical usage in sub and block declarations. A standalone signature is declared
starting with a colon:
say "match!" if 5, "fish" ~~ :(Int, Str); #=> match!
my $sig = :(Int $a, Str);
say "match!" if (5, "fish") ~~ $sig; #=> match!
given "foo", 42 {
when :(Str, Str) { "This won't match" }
when :(Str, Int $n where $n > 20) { "This will!" }
}
See the L<Signatures|/type/Signature> documentation for more about signatures.
=head2 Declarations
=head3 Variable declaration
my $x; # simple lexical variable
my $x = 7; # initialize the variable
my Int $x = 7; # declare the type
my Int:D $x = 7; # specify that the value must be defined (not undef)
my Int $x where { $_ > 3 } = 7; # constrain the value based on a function
my Int $x where * > 3 = 7; # same constraint, but using L<Whatever> short-hand
See L<Variable Declarators and
Scope|/language/variables#Variable_Declarators_and_Scope>
for more details on other scopes (our, has).
=head3 Subroutine declaration
# The signature is optional
sub foo { say "Hello!" }
sub say-hello($to-whom) { say "Hello $to-whom!" }
You can also assign subroutines to variables.
my &f = sub { say "Hello!" } # Un-named sub
my &f = -> { say "Hello!" } # Lambda style syntax. The & sigil indicates the variable holds a function
my $f = -> { say "Hello!" } # Functions can also be put into scalars
=head3 Module, Class, Role, and Grammar declaration
There are several types of compilation units (packages), each declared with a
keyword, a name, some optional traits, and a body of functions.
module Gar { }
class Foo { }
role Bar { }
grammar Baz { }
You can declare a unit of things without explicit curly brackets.
unit module Gar;
# ... stuff goes here instead of in {}'s
=head3 Multi-dispatch declaration
See also L<Multi-dispatch|/language/functions#Multi-dispatch>.
Subroutines can be declared with multiple signatures.
multi sub foo() { say "Hello!" }
multi sub foo($name) { say "Hello $name!" }
Inside of a class, you can also declare multi-dispatch methods.
multi method greet { }
multi method greet(Str $name) { }
=head1 Subroutine calls
See L<functions|/language/functions>.
=comment TODO
foo; # Invoke the function foo with no arguments
foo(); # Invoke the function foo with no arguments
&f(); # Invoke &f, which contains a function
&f.(); # Same as above, needed to make the following work
my @functions = ({say 1}, {say 2}, {say 3});
@functions>>.();
# Method invocation. Object (instance) is $person, method is set-name-age
$person.set-name-age('jane', 98); # Most common way
$person.set-name: 'jane', 98; # Precedence drop
set-name($person: 'jane', 98); # Invocant marker
=head1 Operators
See L<Operators|/language/operators> for lots of details.
Operators are functions with a more symbol heavy and composable syntax. Like
other functions, operators can be multi-dispatch to allow for context-specific
usage.
There are five types (arrangements) for operators, each taking either one or two arguments.
++$x # prefix, operator comes before single input
5 + 3 # infix, operator is between two inputs
$x++ # postfix, operator is after single input
<the blue sky> # circumfix, operator surrounds single input
%foo<bar> # postcircumfix, operator comes after first input and surrounds second
=head2 Meta Operators
Operators can be composed. A common example of this is combining an infix
(binary) operator with assignment. You can combine assignment with any binary
operator.
$x += 5 # Adds 5 to $x, same as $x = $x + 5
$x min= 3 # Sets $x to the smaller of $x and 3, same as $x = $x min 3
$x .= child # Equivalent to $x = $x.child
Wrap an infix operator in C<[ ]> to create a new reduction operator that works
on a single list of inputs, resulting in a single value.
[+] <1 2 3 4 5> # 15
(((1 + 2) + 3) + 4) + 5 # equivalent expanded version
Wrap an infix operator in C<« »> (or the ASCII equivalent C<<< >>>) to create a
new hyper operator that works pairwise on two lists.
<1 2 3> «+» <4 5 6> # <5 7 9>
The direction of the arrows indicates what to do when the lists are not the same size.
@a «+« @b # Result is the size of @b, elements from @a will be re-used
@a »+» @b # Result is the size of @a, elements from @b will be re-used
@a «+» @b # Result is the size of the biggest input, the smaller one is re-used
@a »+« @b # Exception if @a and @b are different sizes
You can also wrap a unary operator with a hyper operator.
-« <1 2 3> # <-1 -2 -3>
=end pod