
Adjust proto semantics to address various concerns

The concerns in question are admirably laid out in:

    http://6guts.wordpress.com/2010/10/17/wrestling-with-dispatch/

With the new design, proto routines are no longer thought of as being
called directly, but are generic.  Instead they are instantiated
into "dispatch" routines, where "dispatch" is the same semantic
slot as "only", distinguished only to differentiate them from true
"only" routines so that we can calculate candidate sets correctly
(to which true "only" routines are opaque but "dispatch" routines
are transparent).  In all other respects a dispatch routine is just
an autogenerated "only".  (It is not anticipated that a user would
ever want to write a dispatch directly, but I could be wrong.)

Each instantiated dispatch routine manages its own candidate list.

We also allow for a proto to be autogenerated if none is found in
the outer context.  This should fix complaints about required "proto"
declarations, I hope.
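
The design described above can be modelled outside the compiler. The following is a minimal sketch in Python, not the actual implementation: the `Proto`, `Dispatch` and `Scope` classes and their methods are hypothetical names, candidate selection is reduced to "first match wins" instead of the real narrowness sort, and it assumes that candidates from outer scopes remain visible in inner scopes.

```python
from dataclasses import dataclass


@dataclass
class Proto:
    """Generic dispatcher template; in this model it is never called."""
    name: str
    autogenerated: bool = False


@dataclass
class Dispatch:
    """A proto instantiated for one scope, owning that scope's candidates."""
    proto: Proto
    candidates: list  # list of (parameter_types, body) pairs

    def __call__(self, *args):
        # Simplification: first matching candidate wins; the real rules
        # sort candidates by narrowness before choosing.
        for types, body in self.candidates:
            if len(types) == len(args) and all(
                isinstance(arg, typ) for arg, typ in zip(args, types)
            ):
                return body(*args)
        raise TypeError(f"no {self.proto.name} candidate accepts {args!r}")


class Scope:
    """Hypothetical lexical scope holding proto and multi declarations."""

    def __init__(self, outer=None):
        self.outer = outer
        self.protos = {}   # short name -> Proto
        self.multis = {}   # short name -> list of (types, body)

    def chain(self):
        # Yield this scope, then each enclosing scope, innermost first.
        scope = self
        while scope is not None:
            yield scope
            scope = scope.outer

    def instantiate(self, name):
        # Search outward for a governing proto; autogenerate one if absent.
        proto = next(
            (s.protos[name] for s in self.chain() if name in s.protos),
            Proto(name, autogenerated=True),
        )
        # Assumption: outer candidates stay visible from inner scopes.
        candidates = [c for s in self.chain() for c in s.multis.get(name, [])]
        return Dispatch(proto, candidates)


outer = Scope()
outer.protos["foo"] = Proto("foo")
outer.multis["foo"] = [((int,), lambda n: f"Int {n}")]

inner = Scope(outer)
inner.multis["foo"] = [((str,), lambda s: f"Str {s}")]
inner.multis["bar"] = [((int,), lambda n: n + 1)]  # no proto anywhere

outer_foo = outer.instantiate("foo")
inner_foo = inner.instantiate("foo")
inner_bar = inner.instantiate("bar")

print(inner_foo("x"), inner_foo(1), inner_bar.proto.autogenerated)
# prints: Str x Int 1 True
```

In this model both dispatchers share one `Proto` but hold different candidate lists, which is the sense in which each instantiated dispatch routine "manages its own candidate list". `bar` has no proto in any scope, so it gets an autogenerated one.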
1 parent 32511f7 commit 60aef3acd56f47b5a78721ca886b9fd3e22b366e @TimToady TimToady committed Oct 26, 2010
Showing with 141 additions and 83 deletions.
  1. +10 −8 S02-bits.pod
  2. +63 −24 S06-routines.pod
  3. +3 −3 S11-modules.pod
  4. +62 −45 S12-objects.pod
  5. +3 −3 S13-overloading.pod
S02-bits.pod
@@ -14,7 +14,7 @@ Synopsis 2: Bits and Pieces
Created: 10 Aug 2004
Last Modified: 25 Oct 2010
- Version: 226
+ Version: 227
This document summarizes Apocalypse 2, which covers small-scale
lexical items and typological issues. (These Synopses also contain
@@ -2325,12 +2325,14 @@ then dereference the result of that.
=item *
-With multiple dispatch, C<&foo> is actually the name of the C<proto> controlling a set
-of candidate functions (which you can use as if it were an ordinary function, because
-a C<proto> is really an C<only> function with pretentions to management of a dispatcher).
-However, in that case C<&foo> by itself is not sufficient to uniquely
-name a specific function. To do that, the type may be refined by
-using a signature literal as a postfix operator:
+With multiple dispatch, C<&foo> is actually the name of a C<dispatch>
+routine (instantiated from a C<proto>) controlling a set of candidate
+functions (which you can use as if it were an ordinary function,
+because a C<dispatch> is really an C<only> function with pretensions
+to management of a dispatcher). However, in that case C<&foo>
+by itself is not sufficient to uniquely name a specific function.
+To do that, the type may be refined by using a signature literal as
+a postfix operator:
&foo:(Int,Num)
@@ -4073,7 +4075,7 @@ be multis, or a compile-time error is declared and you must predeclare,
even if one postdeclaration is obviously "closer". A single
C<proto> predeclaration may make all postdeclared C<multi> work fine,
since that's a run-time dispatch, and all multis are effectively
-visible at the point of the controlling C<proto> declaration.
+visible by the time a C<dispatch>'s candidate list is generated.
Parsing of a bareword function as a provisional call is always done
the same way list operators are treated. If a postdeclaration
S06-routines.pod
@@ -16,8 +16,8 @@ Synopsis 6: Subroutines
Created: 21 Mar 2003
- Last Modified: 6 Oct 2010
- Version: 144
+ Last Modified: 25 Oct 2010
+ Version: 145
This document summarizes Apocalypse 6, which covers subroutines and the
new type system.
@@ -64,50 +64,72 @@ also adds an implicit C<multi> to all routines of the same short
name within its scope, unless they have an explicit modifier.
(This is particularly useful when adding to rule sets or when attempting
to compose conflicting methods from roles.) Abstractly, the C<proto>
-is a wrapper around the dispatch to the C<multi>s.
+is a generic wrapper around the dispatch to the C<multi>s. Each C<proto>
+is instantiated into an actual dispatcher for each scope that
+needs a different candidate list.
B<Only> (keyword: C<only>) routines do not share their short names
with other routines. This is the default modifier for all routines,
-unless a C<proto> of the same name was already in scope.
+unless a C<proto> of the same name was already in scope. (For subs,
+the governing C<proto> must have been declared in the same file, so
+C<proto> declarations from the setting or other modules don't have
+this effect unless explicitly imported.)
A modifier keyword may occur before the routine keyword in a named routine:
only sub foo {...}
proto sub foo {...}
+ dispatch sub foo {...} # internal
multi sub foo {...}
only method bar {...}
proto method bar {...}
+ dispatch method bar {...} # internal
multi method bar {...}
If the routine keyword is omitted, it defaults to C<sub>.
Modifier keywords cannot apply to anonymous routines.
+A C<proto> is a generic dispatcher, which any given scope with a unique
+candidate list will instantiate into a C<dispatch> routine. Hence
+a C<proto> is never called directly, much like a C<role> can't be
+used as an instantiated object.
+
When you call any routine (or method, or rule) that may have multiple
-candidates, the C<proto> is always called first (at least in the abstract--this
-can often be optimized away). In essence, a C<proto> is dispatched exactly like
-an C<only> sub, but the C<proto> itself may delegate to any of the candidates
-it is "managing".
-
-It is the C<proto>'s responsibility to first vet the arguments for all the
-candidates; any call that does not match the proto's signature fails outright.
-Named arguments that bind to positionals in the C<proto> sig will become positionals
+candidates, the basic dispatcher is really only calling an "only"
+sub or method--but if there are multiple candidates, the "only" that
+will be found is really a dispatcher. This instantiated C<dispatch>
+is always called first (at least in the abstract--this can often be
+optimized away). In essence, a C<dispatch> is dispatched exactly
+like an C<only> sub, but the C<dispatch> itself may delegate to any
+of the candidates it is "managing".
+
+It is the C<dispatch>'s responsibility to first vet the arguments for all the
+candidates; any call that does not successfully bind the C<dispatch>'s signature fails outright.
+(Its signature is a copy of one belonging to the C<proto> from which it was instantiated.)
+The C<dispatch> does not necessarily send the original capture to its candidates, however.
+Named arguments that bind to positionals in the C<dispatch> sig will become positionals
for all subsequent calls to its managed multis.
-The proto then builds (or otherwise acquires) a list of its managed candidates
-from the viewpoint of the caller or object, sorts them into some order,
-and dispatches them according to the rules of multiple dispatch as defined
-for each of the various dispatchers.
+The dispatch then considers its list of managed candidates from the
+viewpoint of the caller or object, sorts them into some order, and
+dispatches them according to the rules of multiple dispatch as defined
+for each of the various dispatchers. In the case of multi subs, the
+candidate list is known at compile time. In the case of multi methods,
+it may be necessary to generate (or regenerate) the candidate list at
+run time, depending on what is known when about the inheritance tree.
-This default behavior is implied by a block containing of a single
-C<*> (that is, a "whatever"). Hence the typical C<proto> will simply
-have a body of C<{*}>.
+This default dispatch behavior is symbolized within the original
+C<proto> by a block consisting of a single C<*> (that is, a
+"whatever"). Hence the typical C<proto> will simply have a body
+of C<{*}>.
proto method bar {*}
(We don't use C<...> for that because it would fail at run time,
-and C<proto> blocks are not stubs, but are intended to be executed.)
+and the proto's instantiated C<dispatch> blocks are not stubs, but
+are intended to be executed.)
Other statements may be inserted before and after the C<{*}>
statement to capture control before or after the multi dispatch:
@@ -118,10 +140,11 @@ statement to capture control before or after the multi dispatch:
value, since it returns the result of C<say>, which might not be what
you want. See below for how to fix that.)
-The syntactic form C<&foo> (without a modifying signature) can never refer to
-a C<multi> candidate. It may only refer to the single C<only> or C<proto> routine
-that would first be called by C<foo()>. Individual C<multi>s may be named by
-appending a signature to the noun form: C<&foo:($,$,*@)>.
+The syntactic form C<&foo> (without a modifying signature) can never
+refer to a C<multi> candidate or a generic C<proto>. It may only
+refer to the single C<only> or C<dispatch> routine that would first
+be called by C<foo()>. Individual C<multi>s may be named by appending
+a signature to the noun form: C<&foo:($,$,*@)>.
We used the term "managed" loosely above to indicate the set of C<multi>s in
question; the "managed set" is more accurately defined as the intersection
@@ -203,6 +226,22 @@ Now the semantics of normal method C<proto>s and regex C<proto>s are nearly
identical, apart from the fact that regex candidate lists naturally have
fancier tiebreaking rules involving longest token matching.
+A C<dispatch> must be generated for every scope that contains one or more C<multi>
+declarations. This is done by searching backwards and outwards (or up the
+inheritance chain for methods) for a C<proto> to instantiate. If no such
+C<proto> is found, a "most generic" C<proto> will be generated, something like:
+
+ proto sub foo (*@, *%) {*}
+ proto method foo (*@, *%) {*}
+
+Obviously, no named-to-positional remapping can be done in this case.
+
+[Conjecture: we could instead autogen a more specific signature for
+each such autogenerated C<dispatch> once we know its exact candidate
+set, such that consistent use of positional parameter names is rewarded
+with positional names in the generated signature, which could remap
+named parameters.]
+
=head2 Named subroutines
The general syntax for named subroutines is any of:
S11-modules.pod
@@ -13,8 +13,8 @@ Synopsis 11: Modules
Created: 27 Oct 2004
- Last Modified: 9 Jul 2010
- Version: 33
+ Last Modified: 25 Oct 2010
+ Version: 34
=head1 Overview
@@ -110,7 +110,7 @@ to the caller's package.
Any C<proto> declaration that is not declared C<my> is exported by default.
Any C<multi> that depends on an exported C<proto> is also automatically exported.
-(It is not currently allowed to have a C<multi> without a C<proto>.)
+Any autogenerated C<proto> is assumed to be exported by default.
=head1 Dynamic exportation
S12-objects.pod
@@ -13,8 +13,8 @@ Synopsis 12: Objects
Created: 27 Oct 2004
- Last Modified: 24 Jul 2010
- Version: 108
+ Last Modified: 25 Oct 2010
+ Version: 109
=head1 Overview
@@ -976,37 +976,68 @@ want to do something with ordered side effects, such as I/O.
=head1 Multisubs and Multimethods
-The "long name" of a subroutine or method includes the type signature
-of its invocant arguments. The "short name" doesn't. If you put
-C<multi> in front of any sub (or method) declaration, it allows
+The "long name" of a subroutine or method includes the type
+signature of its invocant arguments. The "short name" doesn't.
+
+If you put C<multi> in front of any sub declaration, it allows
multiple long names to share a short name, provided all of them are
-declared C<multi>, and there is a single C<proto> that manages them.
-If a sub (or method) is not marked
-with C<multi> and it is not within the package or lexical scope of
-a C<proto> of the same short name, it is considered unique, an I<only>
-sub. You may mark a sub explicitly as C<only> if you're worried it
-might be within the scope of a C<proto>, and you want to suppress
-any other declarations within this scope. An C<only> sub (or method)
-doesn't share with anything outside of it or declared prior to it.
-Only one such sub (or method) can inhabit a given namespace, and it
-hides any outer subs (or less-derived methods) of the same short name.
-
-The default C<proto> declarations provided by Perl from the global
-scope are I<not> automatically propagated to the user's scope
-unless explicitly imported, so a C<sub> declaration there that
-happens to be the same as a global multi is considered C<only> unless
-explicitly marked C<multi>. In the absence of such an explicit C<sub>
-declaration, however, the global proto is used by the compiler in
-the analysis of any calls to that short name. (Since only list
-operators may be post-declared, as soon as the compiler sees a
-non-listop operator it is free to apply the global C<proto> since
-any user-defined C<only> version of it must of necessity be declared
-earlier in the user's lexical scope or not at all.)
-
-A C<proto> always functions as a dispatcher around any C<multi>s declared after it in the same scope,
+declared C<multi>, or there is a single prior or outer C<proto> in
+the same file that causes all unmarked subs to default to multi in
+that lexical scope. If a sub is not marked with C<multi> and it is
+not governed within that same file by a C<proto> of the same short
+name, it is considered unique, an I<only> sub. (An imported C<proto>
+can function as such a governing declaration.)
+
+For method declarations, the C<proto>, C<multi>, and C<only>
+declarations work similarly but not identically. The explicit
+declarations work the same, except that calculation of governance
+and candidate sets proceeds via the inheritance tree rather than via
+lexical scoping. The other difference is that a proto method of a
+given short name forces all unmarked method declarations to assume
+multi in all subclasses regardless of which file they are declared in,
+unless explicitly overridden via C<only method>.
+
+An C<only> sub (or method) doesn't share with anything outside of it
+or declared prior to it. Only one such sub (or method) can inhabit a
+given namespace (lexical scope or class), and it hides any outer subs
+(or less-derived methods) of the same short name. It is illegal for
+a C<multi> or C<proto> declaration to share the same scope with an
+C<only> declaration of the same short name.
+
+Since they come from a different file, the default C<proto>
+declarations provided by Perl from the setting scope do I<not>
+automatically set the defaults in the user's scope unless explicitly
+imported, so a C<sub> declaration there that happens to be the same
+as a setting C<proto> is considered C<only> unless explicitly marked
+C<multi>. (This allows us to add new C<proto> declarations in the
+setting without breaking the user's old code.) In the absence of
+such an explicit C<sub> declaration, however, the C<proto> from the
+innermost outer lexical scope is used by the compiler in the analysis
+of any calls to that short name. (Since only list operators may be
+post-declared, as soon as the compiler sees a non-listop operator it
+is free to apply the setting's C<proto> since any user-defined C<only>
+version of it must of necessity be declared or imported earlier in
+the user's file or not at all.)
+
+A C<proto> always functions as a dispatcher around any C<multi>s
+declared after it in the same scope. More specifically, it is the
+generic prototype of a dispatcher, which must be instantiated anew
+in each scope that has a different candidate list. (This works
+much like type punning from roles to classes. Or you can think of
+this dispatcher as a currying of the proto's code with the candidate
+list appropriate to the scope.) For the sake of discussion, let us
+say that there is a declarator equivalent to C<only> that is instead
+spelled C<dispatch>. Generally a user never writes a C<dispatch> sub
+(it might not even be allowed); a C<dispatch> is always instantiated
+from the governing C<proto>. A new C<dispatch> sub or method is
+autogenerated in any scope that contains C<multi> declarations.
+Since C<dispatch> is nearly identical to C<only>, saying C<&foo>
+always refers to the innermost visible C<dispatch> or C<only> sub,
+never to a C<proto> or C<multi>. Likewise, C<$obj.can('foo')> will
+return the most-derived C<dispatch> or C<only> method.
Within its scope, the signature of a C<proto> also nails down the presumed
-order and naming of positional parameters, so that any C<multi> call with
+order and naming of positional parameters, so that any call to that short name with
named arguments in that scope can presume to rearrange those arguments
into positional parameters based on that information. (Unrecognized names
remain named arguments.) Any other type information or traits attached
@@ -1200,20 +1231,6 @@ A C<method> or C<submethod> doesn't ordinarily participate in any
subroutine-dispatch process. However, they can be made to do so if
prefixed with a C<my> or C<our> declarator.
-Conjecture: In order to specify dispatch that includes the return
-type context, it is necessary to place the return type before the double
-semicolon:
-
- multi infix:<..>(Int $min, Int $max --> Iterator;; Int $by = 1) {...}
- multi infix:<..>(Int $min, Int $max --> Selector;; Int $by = 1) {...}
-
-Note that such a declaration might have to delay dispatch until the
-actual desired type is known! (Generally, you might just consider
-returning a flexible C<Range> object instead of an anonymous partial
-dispatch that may or may not be resolved at compile time via type
-inferencing. Therefore return-type tiebreaking need not be supported
-in 6.0.0 unless some enterprising soul decides to make it work.)
-
=head2 Method call vs. Subroutine call
The caller indicates whether to make a method call or subroutine
@@ -1270,7 +1287,7 @@ to an exact type match on the invocant, just as ordinary submethods are.
Perl 6.0.0 is not required to support multiple dispatch on named
parameters, only on positional parameters. Note however that any
-C<proto> will map named arguments to known declared positional
+dispatcher derived from C<proto> will map named arguments to known declared positional
parameters and call the C<multi> candidates with positionals for
those arguments rather than named arguments.
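
The named-to-positional mapping described above can be sketched as a small binding step. This is an illustrative Python model only, not compiler code: the `remap` function is a hypothetical name, it assumes every positional parameter in the dispatcher's signature is required, and it uses `None` to stand for an autogenerated `(*@, *%)` signature, which has no parameter names to map onto.

```python
def remap(positional_names, args, named):
    """Return (positionals, named) as forwarded to the candidates.

    positional_names is the ordered list of positional parameter names in
    the dispatcher's signature, or None for an autogenerated (*@, *%)
    signature, which has no names to map onto.
    """
    if positional_names is None:
        return list(args), dict(named)
    if len(args) > len(positional_names):
        raise TypeError("too many positional arguments")
    for name in positional_names[:len(args)]:
        if name in named:
            raise TypeError(f"{name!r} passed both positionally and by name")
    positionals, leftover = list(args), dict(named)
    # Assumption: every positional parameter is required in this model.
    for name in positional_names[len(args):]:
        if name not in leftover:
            raise TypeError(f"missing argument {name!r}")
        positionals.append(leftover.pop(name))
    return positionals, leftover


print(remap(["min", "max"], (1,), {"max": 10, "by": 2}))
# prints: ([1, 10], {'by': 2})
print(remap(None, (1,), {"max": 10, "by": 2}))
# prints: ([1], {'max': 10, 'by': 2})
```

With a declared signature, `max` moves into the second positional slot and the unrecognized `by` stays named. With the generic signature, the call is forwarded unchanged, matching the S06 remark that no remapping can be done for an autogenerated proto.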
S13-overloading.pod
@@ -13,8 +13,8 @@ Synopsis 13: Overloading
Created: 2 Nov 2004
- Last Modified: 9 Jul 2010
- Version: 14
+ Last Modified: 25 Oct 2010
+ Version: 15
=head1 Overview
@@ -49,7 +49,7 @@ existing built-in sub, say something like:
multi sub uc (TurkishStr $s) {...}
-A C<multi> is automatically exported if goverened by a proto that is exported.
+A C<multi> is automatically exported if governed by a proto that is exported.
It may also be explicitly exported:
multi sub uc (TurkishStr $s) is exported {...}
