Perl originated as a programming language intended to gather and summarize information from text files. It's still strong in text processing, but Perl 5 is also a powerful general-purpose programming language. Perl 6 is even better.

Suppose that you host a table tennis tournament. The referees tell you the results of each game in the format Player1 Player2 | 3:2, which means that Player1 won against Player2 by 3 to 2 sets. You need a script that sums up how many matches and sets each player has won to determine the overall winner.

The input data looks like this:

Beth Ana Charlie Dave
Ana Dave | 3:0
Charlie Beth | 3:1
Ana Beth | 2:3
Dave Charlie | 3:0
Ana Charlie | 3:1
Beth Dave | 0:3

The first line is the list of players. Every subsequent line records a result of a match.

Here's one way to solve that problem in Perl 6:
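The listing below is a reconstruction pieced together from the walkthrough in the rest of this chapter, so treat it as a sketch and not necessarily the author's exact code. It assumes the sample input shown above has been saved in a file named scores in the current directory; the exact wording of the string passed to say is also an assumption.

```perl6
use v6;

# Assumption: a file named 'scores' with the sample data exists in the
# current directory.
my $file  = open 'scores';
my @names = $file.get.words;

my %matches;
my %sets;

for $file.lines -> $line {
    my ($pairing, $result) = $line.split(' | ');
    my ($p1, $p2)          = $pairing.words;
    my ($r1, $r2)          = $result.split(':');

    # count the sets each player has won
    %sets{$p1} += $r1;
    %sets{$p2} += $r2;

    # count the match for whoever won more sets
    if $r1 > $r2 {
        %matches{$p1}++;
    } else {
        %matches{$p2}++;
    }
}

# sort by sets first, then by matches (stable sort), winner first
my @sorted = @names.sort({ %sets{$_} }).sort({ %matches{$_} }).reverse;

for @sorted -> $n {
    say "$n has won %matches{$n} matches and %sets{$n} sets";
}
```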

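This produces the output:

Ana has won 2 matches and 8 sets
Dave has won 2 matches and 6 sets
Charlie has won 1 matches and 4 sets
Beth has won 1 matches and 4 sets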

Every Perl 6 program should begin with use v6;. This line tells the compiler which version of Perl the program expects. Should you accidentally run the file with Perl 5, you'll get a helpful error message.

A Perl 6 program consists of zero or more statements. A statement ends with a semicolon or a curly bracket at the end of a line:
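As a small illustration of the two ways a statement can end, here is a sketch with made-up names ($greeting is not part of the tournament program):

```perl6
my $greeting = 'hello';        # this statement ends with a semicolon

if $greeting eq 'hello' {
    say $greeting;
}                              # the closing curly bracket at the end of the
                               # line ends the if statement; no semicolon needed

say 'done';
```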

my declares a lexical variable. Lexical variables are visible only in the current block from the point of declaration to the end of the block. If there's no enclosing block, it's visible throughout the remainder of the file. A block is any part of the code enclosed between curly braces { }.
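The scoping rule can be sketched like this; the variable names $outer and $inner are hypothetical and chosen only for the example:

```perl6
my $outer = 'declared at file level';

{
    my $inner = 'declared inside the block';
    say $outer;     # fine: $outer is visible for the rest of the file
    say $inner;     # fine: we are still inside the block
}

# say $inner;       # would not compile: $inner is not visible out here
say $outer;
```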

A variable name begins with a sigil, which is a non-alphanumeric symbol such as $, @, %, or & (or occasionally the double colon ::). Sigils indicate the structural interface for the variable, such as whether it should be treated as a single value, a compound value, a subroutine, and so on. After the sigil comes an identifier, which may consist of letters, digits, and the underscore. Between letters you can also use a dash - or an apostrophe ', so isn't and double-click are valid identifiers.
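To see such identifiers in use, here is a short sketch; the variables and their values are invented for illustration:

```perl6
my $double-click = 2;       # a dash between letters is part of the name
my $isn't        = False;   # so is an apostrophe between letters

say $double-click + 1;      # 3
say $isn't;                 # False
```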

The $ sigil indicates a scalar variable, that is, a variable that stores a single value.

The built-in function open opens a file, here named scores, and returns a file handle, an object representing that file. The equals sign = assigns that file handle to the variable on the left, which means that $file now stores the file handle.

'scores' is a string literal. A string is a piece of text, and a string literal is a string which appears directly in the program. In this line, it's the argument provided to open.

The right-hand side calls a method (a named group of behavior) named get on the file handle stored in $file. The get method reads and returns one line from the file, removing the line ending. words is also a method, called on the string returned from get. words decomposes its invocant (the string on which it operates) into a list of words, which here means strings without whitespace. It turns the single string 'Beth Ana Charlie Dave' into the list of strings 'Beth', 'Ana', 'Charlie', 'Dave'.

Finally, this list gets stored in the array @names. The @ sigil marks the declared variable as an Array. Arrays store ordered lists.
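The steps just described (opening the file, reading the first line, and splitting it into names) can be sketched as follows. So that the example runs on its own, it first writes a small file with the hypothetical name scores-demo; the chapter's program reads a file named scores instead.

```perl6
# Assumption: create a small demo file so the sketch is self-contained.
spurt 'scores-demo', "Beth Ana Charlie Dave\nAna Dave | 3:0\n";

my $file  = open 'scores-demo';   # $file now stores the file handle
my @names = $file.get.words;      # read one line, split it into words

say @names.elems;                 # 4
say @names[0];                    # Beth
$file.close;
```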

These two lines of code declare two hashes. The % sigil marks each variable as a Hash. A Hash is an unordered collection of pairs of keys and values. Other programming languages call that a hash table, dictionary, or map. You can query a hash table for the value that corresponds to a certain $key with %hash{$key}. (Unlike Perl 5, in Perl 6 the sigil does not change when accessing an array or hash with [ ] or { }. This is called sigil invariance.)

In the score counting program, %matches stores the number of matches each player has won. %sets stores the number of sets each player has won.
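A minimal sketch of declaring the two hashes and looking up a value by key; the stored numbers are sample values, not something the program assigns directly:

```perl6
my %matches;
my %sets;

%matches{'Ana'} = 2;            # store a value under the key 'Ana'
my $key = 'Ana';
say %matches{$key};             # 2

# Sigil invariance: the @ stays an @ when indexing an array
my @names = 'Beth', 'Ana';
say @names[1];                  # Ana
```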

Sigils indicate the default access method for a variable. Variables with the @ sigil are accessed positionally; variables with the % sigil are accessed by string key. The $ sigil, however, indicates a general container that can hold anything and be accessed in any manner. A scalar can even contain a compound object like an Array or a Hash; the $ sigil signifies that it should be treated as a single value, even in a context that expects multiple values (as with an Array or Hash).
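The difference between an array and a scalar that holds an array shows up when looping. The following sketch uses hypothetical names (@list, $scalar and the two counters):

```perl6
my @list   = 1, 2, 3;
my $scalar = @list;            # the scalar holds the whole Array as one value

my $array-iterations  = 0;
my $scalar-iterations = 0;

$array-iterations++  for @list;     # loops over three items
$scalar-iterations++ for $scalar;   # loops once: $scalar is a single value

say $array-iterations;         # 3
say $scalar-iterations;        # 1
say $scalar[1];                # 2, positional access still works
```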

for produces a loop that runs the block delimited by curly brackets and containing ... once for each item of the list, setting the variable $line to the current value of each iteration. $file.lines produces a list of the lines read from the file scores, starting with the line where the previous calls to $file.get left off, and going all the way to the end of the file.

During the first iteration, $line will contain the string Ana Dave | 3:0. During the second, Charlie Beth | 3:1, and so on.
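The loop can be tried without a file by calling lines on a string. In this sketch the variable $remaining is a stand-in for the rest of the scores file; in the chapter's program the list comes from $file.lines.

```perl6
# Assumption: a string stands in for the unread part of the file.
my $remaining = "Ana Dave | 3:0\nCharlie Beth | 3:1\n";

for $remaining.lines -> $line {
    say $line;      # first 'Ana Dave | 3:0', then 'Charlie Beth | 3:1'
}
```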

my can declare multiple variables simultaneously. The right-hand side of the assignment is a call to a method named split, passing along the string ' | ' as an argument.

split decomposes its invocant into a list of strings, so that joining the list items with the separator ' | ' produces the original string.

$pairing gets the first item of the returned list, and $result the second.

After processing the first line, $pairing will hold the string Ana Dave and $result 3:0.
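The split step described above, applied to the first result line of the sample data, might look like this:

```perl6
my $line = 'Ana Dave | 3:0';

# declare two variables at once and assign the two parts to them
my ($pairing, $result) = $line.split(' | ');

say $pairing;   # Ana Dave
say $result;    # 3:0
```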

The next two lines follow the same pattern:

The first extracts and stores the names of the two players in the variables $p1 and $p2. The second extracts the results for each player and stores them in $r1 and $r2.

After processing the first line of the file, the variables contain the values:
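A sketch of those two lines, together with the values every variable holds after the first result line, based on the description above:

```perl6
my $line = 'Ana Dave | 3:0';
my ($pairing, $result) = $line.split(' | ');

my ($p1, $p2) = $pairing.words;        # the two player names
my ($r1, $r2) = $result.split(':');    # the sets won by each player

# Values after processing the first result line:
#   $line     'Ana Dave | 3:0'
#   $pairing  'Ana Dave'
#   $result   '3:0'
#   $p1       'Ana'
#   $p2       'Dave'
#   $r1       '3'
#   $r2       '0'
say "$p1 $r1, $p2 $r2";                # Ana 3, Dave 0
```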

The program then counts the number of sets each player has won:

This is a shortcut for:

+= $r1 means increase the value in the variable on the left by $r1. In the first iteration %sets{$p1} is not yet set, so it defaults to a special value called Any. The addition and incrementing operators treat Any as a number with the value of zero.

Before these two lines execute, %sets is empty. Adding to an entry not in the hash will cause that entry to spring into existence just-in-time, with a value starting at zero. (This is autovivification). After these two lines have run for the first time, %sets contains 'Ana' => 3, 'Dave' => 0. (The fat arrow => separates key and value in a Pair.)
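The counting step and its long form can be sketched as follows, with the player names and results from the first line of the sample data filled in by hand:

```perl6
my %sets;
my ($p1, $p2, $r1, $r2) = 'Ana', 'Dave', '3', '0';

%sets{$p1} += $r1;      # shortcut for: %sets{$p1} = %sets{$p1} + $r1;
%sets{$p2} += $r2;      # shortcut for: %sets{$p2} = %sets{$p2} + $r2;

say %sets{'Ana'};       # 3
say %sets{'Dave'};      # 0
```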

If $r1 has a larger value than $r2, %matches{$p1} increments by one. If $r1 is not larger than $r2, %matches{$p2} increments. Just as in the case of +=, if either hash value did not exist previously, it is autovivified by the increment operation.
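The comparison described above might be written like this; the values are again those of the first result line:

```perl6
my %matches;
my ($p1, $p2, $r1, $r2) = 'Ana', 'Dave', '3', '0';

if $r1 > $r2 {
    %matches{$p1}++;    # player 1 won more sets
} else {
    %matches{$p2}++;    # otherwise player 2 gets the match
}

say %matches{'Ana'};            # 1
say %matches{'Dave'}:exists;    # False, nothing was stored for Dave
```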

$thing++ is short for $thing += 1 or $thing = $thing + 1, with the small exception that the return value of the expression is $thing before the increment, not the incremented value. As in many other programming languages, you can also use ++ as a prefix, in which case it returns the incremented value; my $x = 1; say ++$x prints 2.
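The difference between the postfix and prefix forms can be checked with a few lines (the variable $x is only for illustration):

```perl6
my $x = 1;

say $x++;   # 1, the value before the increment
say $x;     # 2
say ++$x;   # 3, the value after the increment
```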

This line consists of three individually simple steps. An array's sort method returns a sorted version of the array's contents. However, the default sort on an array sorts by its contents. To print player names in winner-first order, the code must sort the array by the scores of the players, not their names. The sort method's argument is a block used to transform the array elements (the names of players) to the data by which to sort. The array items are passed in through the topic variable $_.

You have seen blocks before: both the for loop -> $line { ... } and the if statement worked on blocks. A block is a self-contained piece of Perl 6 code with an optional signature (the -> $line part). See signatures for more information.

The simplest way to sort the players by score would be @names.sort({ %matches{$_} }), which sorts by number of matches won. However Ana and Dave have both won two matches. That simple sort doesn't account for the number of sets won, which is the secondary criterion to decide who has won the tournament.

When two array items have the same value, sort leaves them in the same order as it found them. Computer scientists call this a stable sort. The program takes advantage of this property of Perl 6's sort to achieve the goal by sorting twice: first by the number of sets won (the secondary criterion), then by the number of matches won.

After the first sorting step, the names are in the order Beth Charlie Dave Ana. After the second sorting step, it's still the same, because no one has won fewer matches but more sets than someone else. Such a situation is entirely possible, especially at larger tournaments.

sort sorts in ascending order, from smallest to largest. This is the opposite of the desired order. Therefore, the code calls the .reverse method on the result of the second sort, and stores the final list in @sorted.
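The sorting steps can be tried in isolation. This sketch fills %matches and %sets by hand with the totals that follow from the sample data; @by-sets is a hypothetical intermediate variable added only to show the result of the first sort.

```perl6
my @names   = <Beth Ana Charlie Dave>;
my %matches = Ana => 2, Dave => 2, Charlie => 1, Beth => 1;
my %sets    = Ana => 8, Dave => 6, Charlie => 4, Beth => 4;

# first sort: by sets won (the secondary criterion)
my @by-sets = @names.sort({ %sets{$_} });
say @by-sets.join(' ');     # Beth Charlie Dave Ana

# both sorts, then reverse so that the winner comes first
my @sorted = @names.sort({ %sets{$_} }).sort({ %matches{$_} }).reverse;
say @sorted.join(' ');      # Ana Dave Charlie Beth
```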

To print out the players and their scores, the code loops over @sorted, setting $n to the name of each player in turn. Read this code as "For each element of @sorted, set $n to the element, then execute the contents of the following block." say prints its arguments to the standard output (the screen, normally), followed by a newline. (Use print if you don't want the newline at the end.)

When you run the program, you'll see that say doesn't print the contents of that string verbatim. In place of $n it prints the contents of the variable $n, that is, the names of players stored in $n. This automatic substitution of code with its contents is called interpolation. Interpolation happens only in strings delimited by double quotes "...". Single quoted strings '...' do not interpolate:
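A short sketch of the difference between the two kinds of quotes; the variable $names and its value are made up for the example:

```perl6
my $names = 'things';

say 'Do not call me $names';    # Do not call me $names
say "Do not call me $names";    # Do not call me things
```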

Double quoted strings in Perl 6 can interpolate variables with the $ sigil as well as blocks of code in curly braces. Since any arbitrary Perl code can appear within curly braces, Arrays and Hashes may be interpolated by placing them within curly braces.

Arrays within curly braces are interpolated with a single space character between each item. Hashes within curly braces are interpolated as a series of lines. Each line will contain a key, followed by a tab character, then the value associated with that key, and finally a newline.
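Interpolation through curly braces can be sketched as follows. The names @flavours and %stock are hypothetical, and the hash has a single entry so that the output does not depend on key order.

```perl6
my @flavours = <vanilla peach>;
my %stock    = vanilla => 3;

say "1 + 2 is {1 + 2}";         # 1 + 2 is 3
say "we have {@flavours}";      # we have vanilla peach
say "{%stock}";                 # vanilla, a tab character, then 3
```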

When array and hash variables appear directly in a double-quoted string (and not inside curly brackets), they are only interpolated if their name is followed by a postcircumfix, that is, a bracketing pair that follows a term. It's also OK to have a method call between the variable name and the postcircumfix.
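The postcircumfix rule can be seen in a few lines. The array @flavours is a made-up example:

```perl6
my @flavours = <vanilla peach>;

say "we have @flavours";                    # we have @flavours
say "we have @flavours[0]";                 # we have vanilla

# an empty pair of brackets interpolates the whole array
say "we have @flavours[]";                  # we have vanilla peach

# method calls ending in a postcircumfix
say "we have @flavours.sort()";             # we have peach vanilla

# chained method calls
say "we have @flavours.sort.join(', ')";    # we have peach, vanilla
```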

Exercises

1. The input format of the example program is redundant: the first line containing the name of all players is not necessary, because you can find out which players participated in the tournament by looking at their names in the subsequent rows.

How can you change the program if the first input line is omitted? Hint: %hash.keys returns a list of all keys stored in %hash.

Answer: Remove the line my @names = $file.get.words;, and change:

... into:
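One way to write that change, following the hint about %hash.keys, is sketched below. The hashes are filled by hand with the totals from the sample data so that the example runs on its own. Because hash keys come back in no particular order, players who are tied on both matches and sets may appear in either order.

```perl6
my %matches = Ana => 2, Dave => 2, Charlie => 1, Beth => 1;
my %sets    = Ana => 8, Dave => 6, Charlie => 4, Beth => 4;

# before:
#   my @sorted = @names.sort({ %sets{$_} }).sort({ %matches{$_} }).reverse;

# after: take the player names from the keys of %sets
my @sorted = %sets.keys.sort({ %sets{$_} }).sort({ %matches{$_} }).reverse;

say @sorted[0];     # Ana
say @sorted[1];     # Dave
```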

2. Instead of removing the redundancy, you can also use it to warn if a player appears that wasn't mentioned in the first line, for example due to a typo. How would you modify your program to achieve that?

Answer: Introduce another hash with the names of the legitimate players as keys, and look in this hash when the name of a player is read:
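A sketch of that approach follows. The hash name %legitimate-players, the warning text, and the misspelled name Bth are assumptions made for the example. To keep it self-contained, the result lines are taken from an array instead of a file, and the set and match counting is left out.

```perl6
my @names = <Beth Ana Charlie Dave>;

# one entry per legitimate player
my %legitimate-players;
for @names -> $n {
    %legitimate-players{$n} = 1;
}

# Assumption: two sample result lines, the second with a typo in a name
my @lines = 'Ana Dave | 3:0', 'Ana Bth | 2:3';

for @lines -> $line {
    my ($pairing, $result) = $line.split(' | ');
    my ($p1, $p2)          = $pairing.words;

    for $p1, $p2 -> $p {
        if !%legitimate-players{$p} {
            say "Warning: '$p' is not on our list!";
        }
    }
}
```

Run as is, this prints a single warning for Bth.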
