Skip to content

Latest commit

 

History

History
345 lines (265 loc) · 14.4 KB

multi-dispatch.pod

File metadata and controls

345 lines (265 loc) · 14.4 KB

JSON, short for Javascript Object Notation, is a simple data exchange format that is often used for communicating with web services. It supports arrays, hashes, numbers, strings, boolean values and null, the undefined value.

The example below presents a section of JSON::Tiny, a minimal library for converting Perl 6 data structures to JSON. The other part of that module, which parses JSON and turns it into Perl 6 data structures, will be presented later (TODO: reference grammar chapter).

The full code, containing additional documentation and tests, can be found at http://github.com/moritz/json/.

# TODO: Clarify numeric types. Only need one of the following two, but
# maybe s/Num/Numeric/.
multi to-json(Num  $d) { ~$d }
multi to-json(Int  $d) { ~$d }
multi to-json(Bool $d) { $d ?? 'true' !! 'false'; }
multi to-json(Str  $d) {
    '"'
    ~ $d.trans(['"',  '\\',   "\b", "\f", "\n", "\r", "\t"]
            => ['\"', '\\\\', '\b', '\f', '\n', '\r', '\t'])
    ~ '"'
}
multi to-json(Array $d) {
    return  '[ '
            ~ $d.values.map({ to-json($_) }).join(', ')
            ~ ' ]';
}
multi to-json(Hash  $d) {
    return '{ '
            ~ $d.pairs.map({ to-json(.key) ~ ' : ' ~ to-json(.value) }).join(', ')
            ~ ' }';
}

multi to-json($d where undef) { 'null' }
multi to-json($d) {
    die "Can't serialize an object of type " ~ $d.WHAT.perl
}

This code defines a single multi sub named to-json, which takes one argument and turns that into a string. However there are many candidates of that sub: subs with the same name, but different signatures.

The various candidates all look like

multi to-json(Bool $data) { code here }
multi to-json(Num  $data) { code here }

Which one is actually called depends on the type of the data passed to the subroutine. So if you call to-json(Bool::True), the first one is called. If you pass a Num to it, the second one is called.

The candidates for handling Num and Int are very simple; since JSON's and Perl 6's number formats coincide, the JSON converter can simply rely on Perl's conversion of these numbers to strings. The Bool candidate returns the literal strings 'true' or 'false'.

The Str candidate does a bit more work: it add quotes at the begin and the end, and substitutes those characters that the JSON spec does not allow in strings - a tabulator by \t, a newline by \n and so on.

The next one is to-json(Array $d), which converts all elements of the array to JSON, joins them with commas and surrounds them with square brackets. The part that converts the individual elements calls to-json again. This doesn't necessarily call the candidate it was called from, but again the one that fits to the type of the argument.

The candidate that processes hashes turns them into the form { "key1" : "value1", "key2" : [ "second", "value" ] } or so. It does this again by recursing into to-json.

Constraints

to-json($d where undef) { 'null' }

This candidate adds something new: it doesn't contain a type defintion, and thus the type of the parameter defaults to Any, which is the root of the "normal" branch of the type hierarchy (more on that later). More interestingly there's the where undef clause, which is a so-called subset type: it matches only some values of the type Any.

The where undef part is actually translated to a smart match. We could write it out explicitly as $d where { $d ~~ undef }. And yes, the curly braces can contain arbitrary code. Whenever the compiler performs a type check on the parameter $d, it first checks the nominal type (which is Any here), and if that check succeeds, it calls the code block. The whole type check will only be considered successful if that returns a true value.

You can abuse this to count how often a type check is performed:

my $counter = 0;
multi a(Int $x) { };
multi a($x) { }
multi a($x where { $counter++; True }) { };

a(3);
say $counter;
a('str');
say $counter;

This piece of code defines three multis, one of which increases a counter whenever its where clause, also called constraint, is executed. Any Perl 6 compiler is free to optimize away type checks it knows will succeed, but if it does not, the second line with say prints a higher number than the first.

(TODO: insert side note "Don't do that at home, kids! Type checks with side effects are a really bad idea in real world code")

Narrowness

Back to the JSON example, there's one candidate not yet explained.

multi to-json($d) {
    die "Can't serialize an object of type " ~ $d.WHAT.perl
}

This has not explicit types or constraint on its parameter at all, so it defaults to Any - and thus matches any object we might pass to it. The code just complains that it doesn't know what to do with the argument, because JSON is just defined for some basic structures.

That might look simple at first, but if you look closer you find that it doesn't just match for objects of all type for which no other candidate is defined - it matches for all objects. Including Int, Bool, Num and so on. So for a call like to-json(2) there are two matching candidates - Bool and Any.

If you try it out, you'll find that the dispatcher (which is the part of the compiler that decides which candidate to call) still calls the Int candidate. It's not about going through them in order either - you can even move the last catch-all candidate to be first and the code still works.

The explanation is that since Int is a type that conforms to Any, it is considered a narrower match for an integer. More generally speaking if you have two types A and B, and A conforms to B (or A ~~ B, as the Perl 6 programmer says), then an object which conforms to A does so more narrowly than to B. And in the case of a multi dispatch the narrowest match always wins.

Also a successfully evaluated constraint makes a match narrower than the absence of a constraint, so in the case of

multi to-json($d) { ... }
multi to-json($d where undef) { ... }

an undefined value is dispatched to the second candidate.

However a matching constraint always contributes less to narrowness than a more specific match in the nominal type.

TODO: Better example
multi a(Any $x where { $x > 0 }) { 'Constraint'   }
multi a(Int $x)                  { 'Nominal type' }
say a(3), ' wins';       # says "Nominal type wins"

This is a restriction that allows a clever optimization in the compiler: It can sort all candidates by narrowness once, and can quickly find the candidate with the best matching signature just by looking at nominal type constraints, which are very cheap to check. Only then will it run the constraint checks (which tend to be far more expensive), and if they fail it considers candidates that are less narrow by nominal types.

Multiple arguments

Multi dispatch is not limited to one parameter and argument. Candidate signatures may contain any number of positional and named arguments, both explicit and slurpy. However only positional parameters contribute to the narrowness of a match.

# RAKUDO currently doesn't like this with two 
# anonymous arguments, see RT #69798
# as a workaround one can add name to one of them.
enum Symbol <Rock Paper Scissors>;
multi wins(Scissors $, Paper    $) { +1 }
multi wins(Paper    $, Rock     $) { +1 }
multi wins(Rock     $, Scissors $) { +1 }
multi wins(::T      $, T        $) {  0 }
multi wins(         $,          $) { -1 }

sub play($a, $b) {
    given wins($a, $b) {
        when +1 { say 'Player One wins' }
        when  0 { say 'Draw'            }
        when -1 { say 'Player Two wins' }
    }
}

play(Scissors, Paper);
play(Paper,    Paper);
play(Rock,     Paper);

This example demonstrates how a popular game can be decided completely by multi dispatch. Boths players independently select a symbol (either rock, paper, or scissors), and scissors win against paper, paper wraps rock, and scissors can't cut rock, but go blunt trying. If both players select the same item, it's a draw.

The example code above creates a type for each possible symbol by declaring an enumerated type, or enum. For each combination of chosen symbols for which Player One wins there's a candidate of the form

multi wins(Scissors $, Paper $) { +1 }

The only new thing here is that the parameters don't have names. Since they are not used in the body of the subroutine anywhere, there's no use in forcing the programmer to come up with a clever name. A $ in a signature just stands for a single, anonymous scalar variable.

The fourth candidate, multi wins(::T $, T $) { 0 } uses ::T, which is a type capture (similar to generics or templates in other programming languages). It binds the nominal type of the first argument to T, which can then act as a type constraint. That means that if you pass a Rock as the first argument, T is an alias for Rock inside the rest of the signature, and also in the body of the routine. So the signature (::T $, T $) is bindable only by two objects of the same type, or if the second is of a subtype of the first.

So in the case of our game it matches only for two objects of the same type, and the routine returns 0 to indicate a draw.

The final candidate is just a fallback for the cases not covered yet - which is when Player Two wins.

If the (Scissors, Paper) candidate matches the supplied argument list, it is two steps narrower than the (Any, Any) fallback, because both Scissors and Paper are direct subtypes of Any, so both contribute one step.

If the (::T, T) candidate matches, the type capture in the first parameter does not contribute any narrowness - it is not a constraint after all. However T is used as a constraint for the second parameter, and accounts for some many step of narrowness as the number of inheritance steps between T and Any. So passing two Rocks means that ::T, T matches one step narrower than Any, Any. So a possible candidate

multi wins(Rock $, Rock $) {
    say "Two rocks? What is this, 20,000 years ago?"
}

would win against (::T, T).

Bindability checks

Implicit constraints can be applied through traits:

multi swap($a is rw, $b is rw) {
    ($a, $b) = ($b, $a);
}

This routine simply exchanges the contents of its two arguments. To do that is has to bind the two arguments as rw, that is as both readable and writable. Trying to call that with an immutable value (for example a number literal) fails.

The built-in function substr can not only extract parts of strings, but also modify them:

# substr(String, Start, Length)
say substr('Perl 5', 0, 4);         # prints Perl
my $p = 'Perl 5';
# substr(String, Start, Length, Substitution)
substr($p, 6, 1, '6');
# now $p contains the string 'Perl 6'

Seeing these two use cases you already know that the three-argument version and the four-argument version are handled by different candidates: the latter binds its first argument as rw:

multi substr($str, $start = 0, $length = *) { ... }
multi substr($str is rw, $start, $length, $substitution) { ... }

This is also an example of candidates with different arity (that is, expecting a different number of arguments). This is seldom really necessary, because it is often a better alternative to make parameters optional. Cases where an arbitrary number of arguments are allowed are handled with slurpy parameters instead:

sub mean(*@values) {
    ([+] @values) / @values;
}

Protos

You have two options to write multi subs: either you start every candidate with multi sub ... or multi ..., or you declare once and for all that the compiler shall view every sub of a given name as a multi candidate. You can do that by installing a proto routine:

proto to-json($) { ... }       # literal ... here
# automatically a multi
sub to-json(Bool $d) { $d ?? 'true' !! 'false' }

Nearly all Perl 6 built-in functions and operators export a proto definition, preventing accidental overriding. =for footnote One of the very rare exceptions is the smart match operator infix:<~~> which is not overloadable easily, but which redispatches to overloadable multi methods. =for footnote To hide all candidates of a multi and replace them by another sub, you can declare it as only sub YourSub, though at the time of writing it is not supported by any compiler.

Multi Methods

Not only subroutines can act as multis, but also methods. For multi method dispatch the invocant participates just like a positional parameter.

The main difference between sub and method calls is where the dispatcher searches for the routines: It looks for subroutines in the current and outer lexical scopes, whereas it looks for methods in the class of the invocant, and recursively in any parent classes.

# XXX should this explanation moved to the OO tutorial?
# XXX     jnthn: in my opinion, yes

# TODO: Multi method dispatch example

With methods you are not limited to calling only one candidate. With the syntax $object.?method it is no error if no matching candidate was found, $object.*method calls all matching candidates and considers it OK to not to dispatch to any candidate if none matched, and $object.+method calls at least one matching candidate.

Toying with the candidate list

For each multi dispatch a list of candidates is built, all of which satisfy the nominal type constraints. For a normal sub or method call, the dispatcher just invokes the first candidate which also passes all additional constraint checks.

A routine can choose to delegate its work to the others candidates in that list. By calling the callsame primitive it calls the next candidate, passing along the arguments it had received itself. If it wants to pass different arguments, it can do so by calling callwith instead. After the called routine has done its work, the callee can continue its work.

If there's no further work to be done, the routine can decide to hand control completely to the next candidate by calling nextsame or nextwith.

This is often used if an object has to clean up after itself. A sub class then can provide its own cleanup method for cleaning the own backyard, and then delegate to its parent class method by calling nextsame to do the rest of the un-dirtying work.

POD ERRORS

Hey! The above document had some coding errors, which are explained below:

Around line 1:

Unknown directive: =head0