Commit 04ff529

[advent] add article, possibly for tomorrow: predictive parsing
1 parent 9c670f8 commit 04ff529

1 file changed

+39
-0
lines changed

@@ -0,0 +1,39 @@
=head1 Why Perl syntax does what you want

Opening the fifth door of our advent calendar, we don't find a recipe of how
to do something cool with Perl 6 - rather an explanation of how some of the
intuitiveness of the language works.

As an example, consider these two lines of code:

    say 6 / 3;
    say 'Price: 15 Euro' ~~ /\d+/;

They print out C<2> and C<15>, respectively. For a Perl programmer this is not
surprising. But look closer: the forward slash C</> serves two very different
purposes. In the first line it is the numerical division operator, and in the
second line it delimits a regex.

How can Perl know which meaning a C</> has? It certainly doesn't look at the
text after the slash to decide, because a regex can look just like normal
code.

The answer is that Perl keeps track of what it expects. The two most
important things it can expect are I<terms> and I<operators>.

A I<term> can be a literal like C<23> or C<"a string">. After the parser finds
such a literal, there can either be the end of a statement (indicated by a
semicolon), or an I<operator> like C<+>, C<*> or C</>. After an operator, the
parser expects a term again.

And that's already the answer: When the parser expects a term, a slash is
recognized as the start of a regex. When it expects an operator, it counts as
a numerical division operator.

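The bookkeeping described above can be sketched as a toy tokenizer. This is
only an illustration, written in Python and not taken from any Perl 6
implementation: the function name C<tokenize>, the token kinds and the four
patterns are assumptions made for the example, and they cover just numbers,
quoted strings, a handful of operators and slash-delimited regexes.

```python
import re

# Toy token patterns; an illustrative subset, not Perl 6's real grammar.
NUMBER = re.compile(r"\d+")
STRING = re.compile(r"'[^']*'|\"[^\"]*\"")
REGEX = re.compile(r"/[^/]*/")
OPERATOR = re.compile(r"~~|[+*/]")


def tokenize(source):
    """Split a statement into (kind, text) pairs, tracking what is expected."""
    tokens = []
    expecting = "term"  # a statement starts where a term is expected
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        if source[pos] == ";" and expecting == "operator":
            # After a term, a semicolon may end the statement.
            tokens.append(("end", ";"))
            expecting = "term"
            pos += 1
            continue
        if expecting == "term":
            # Only in term position may a slash open a regex.
            candidates = (("number", NUMBER), ("string", STRING), ("regex", REGEX))
        else:
            # Only in operator position is a slash the division operator.
            candidates = (("operator", OPERATOR),)
        for kind, pattern in candidates:
            match = pattern.match(source, pos)
            if match:
                tokens.append((kind, match.group()))
                pos = match.end()
                # A term is followed by an operator, an operator by a term.
                expecting = "operator" if expecting == "term" else "term"
                break
        else:
            raise SyntaxError(f"expected {expecting} at position {pos}")
    return tokens
```

Calling C<tokenize("6 / 3 / 2;")> classifies both slashes as operators, even
though C</ 3 /> on its own would look like a regex, because the sketch never
inspects the text after the slash; it only consults C<expecting>.
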
This has far-reaching consequences. Subroutines can be called without
parentheses, and after a subroutine name an argument list is expected,
which starts with a term. Type names, on the other hand, are followed by
operators, so all type names must be known at parse time; otherwise the
parser could not tell them apart from subroutine names.

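The difference between subroutine names and type names can be sketched the
same way. The Python snippet below is a simplified illustration, not how any
real Perl 6 parser is implemented: C<KNOWN_TYPES> and both function names are
made up for the example, and it assumes that any identifier not registered as
a type is treated as a subroutine name.

```python
# Hypothetical set of type names the parser has already seen declared.
KNOWN_TYPES = {"Int", "Str"}


def expected_after_identifier(name, known_types=KNOWN_TYPES):
    """Return what the parser expects right after a bare identifier."""
    # Assumption for this sketch: anything not known as a type is taken to be
    # a subroutine name, so an argument list (starting with a term) follows.
    return "operator" if name in known_types else "term"


def slash_meaning_after(name, known_types=KNOWN_TYPES):
    """Classify a slash that directly follows the identifier."""
    if expected_after_identifier(name, known_types) == "operator":
        return "division"
    return "regex start"
```

With these definitions a slash after C<Int> is classified as division, while
a slash after an unregistered name such as C<half> is classified as the start
of a regex. Registering the name as a type flips the result, which is why the
set of type names has to be complete before the slash is reached.
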
On the upside, many characters can be reused for two different syntaxes in a
very convenient way.
