Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Browse files
Browse the repository at this point in the history
Add initial draft of S07-lists.pod, the new Synopsis 7.
- Loading branch information
Showing
1 changed file
with
268 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,268 @@ | ||
|
|
||
| =encoding utf8 | ||
|
|
||
| =head1 TITLE | ||
|
|
||
| Synopsis 7: Lists and Iteration | ||
|
|
||
| =head1 AUTHORS | ||
|
|
||
| Patrick R. Michaud <pmichaud@pobox.com | ||
|
|
||
| =head1 VERSION | ||
|
|
||
| Created: 12 July 2012 | ||
|
|
||
| Last Modified: 12 July 2012 | ||
| Version: 1 | ||
|
|
||
| =head1 Overview | ||
|
|
||
| Lists and arrays have always been one of Perl's fundamental data types, | ||
| and Perl 6 is no different. However, lists in Perl 6 have been greatly | ||
| extended to accommodate lazy lists, infinite lists, lists of mutable | ||
| and immutable elements, typed lists, flattening behaviors, and so on. | ||
| So where lists and arrays in Perl 5 tended to be finite sequences of | ||
| scalar values, in Perl 6 we have additional dimensions of behavior | ||
| that must be addressed. | ||
|
|
||
| =head2 The C<List> type | ||
|
|
||
| The C<List> class is the base class for dealing with other types of | ||
| lists, including C<Array>. To the programmer, a C<List> is a potentially | ||
| lazy and infinite sequence of elements. A C<List> is mutable, in that | ||
| one can manipulate the sequence via operations such as C<push>, C<pop>, | ||
| C<shift>, C<unshift>, C<splice>, etc. A C<List>'s elements may be | ||
| either mutable or immutable. | ||
|
|
||
| C<List> objects are C<Positional>, meaning they can be bound to | ||
| array variables and support the postfix C<.[]> operator. | ||
|
|
||
| Lists are also lazy, in that the elements of a C<List> may | ||
| come from generator functions (called C<Iterator>s) that produce | ||
| elements on demand. | ||
|
|
||
| =head2 The C<Array> type | ||
|
|
||
| An C<Array> is simply a C<List> in which all of the elements are held | ||
| in scalar containers. This allows assignment to the elements of the | ||
| array. | ||
|
|
||
| =head2 The C<Parcel> type | ||
|
|
||
| The comma operator (C<< infix:<,> >>) creates C<Parcel> objects. | ||
| These should not be confused with lists; a C<Parcel> represents | ||
| a raw syntactic sequence of elements. A C<Parcel> is immutable, | ||
| although the elements of a C<Parcel> may be either mutable or | ||
| immutable. | ||
|
|
||
| The name "Parcel" is derived from the phrase "parenthesis cell", | ||
| since many C<Parcel> objects appear inside of parentheses. | ||
| However, except for the empty parcel, it's the comma operator | ||
| that creates C<Parcel> objects. | ||
|
|
||
| () # empty Parcel | ||
| (1) # an Int | ||
| (1,2) # a Parcel with two Ints | ||
| (1,) # a Parcel with one Int | ||
|
|
||
| A C<Parcel> is also C<Positional>, and uses flattening context for | ||
| list operations such as C<.[]> and C<.elems>. See "Flattening contexts" | ||
| below. | ||
|
|
||
| =head2 Flattening contexts, the C<Iterable> type | ||
|
|
||
| C<List> and C<Parcel> objects can have other container objects | ||
| as elements. In some contexts we want to interpolate the values | ||
| of container objects into the surrounding C<List> or C<Parcel>, | ||
| while in other contexts we want any subcontainers to be preserved. | ||
| Such interpolation is known as "flattening". | ||
|
|
||
| The C<Iterable> type is performed by container and generator objects | ||
| that will interpolate their values in flattening contexts. Both | ||
| C<List> and C<Parcel> are C<Iterable>. C<Iterable> implies support | ||
| for C<.iterator>, which returns an object capable of producing the | ||
| elements of the C<Iterable>. | ||
|
|
||
| Objects held in scalar containers are never interpolated in | ||
| flattening context, even if the object is C<Iterable>. | ||
|
|
||
| my @a = 3,4,5; | ||
| for 1,2,@a { ... } # five iterations | ||
|
|
||
| my $s = @a; | ||
| for 1,2,$s { ... } # three iterations | ||
|
|
||
| Here, both C<$s> and C<@a> refer to the same underlying C<Array> object, | ||
| but the presence of the scalar container prevents C<$s> from being | ||
| flattened into the C<for> loop. The C<.list> or C<.flat> method | ||
| may be used to restore the flattening behavior: | ||
|
|
||
| for 1,2,$s.list { ... } # five iterations | ||
| for 1,2,@($s) { ... } # five iterations | ||
|
|
||
| Conversely, the C<.item> or C<$()> contextualizer can be used to | ||
| prevent an C<Iterable> from being interpolated: | ||
|
|
||
| my @b = 1,2,@a; # @b has five elements | ||
| my @c = 1,2,@a.item; # @c has three elements | ||
| my @c = 1,2,$(@a); # same thing | ||
|
|
||
| =head2 Iterators | ||
|
|
||
| An C<Iterator> is an object that is able to generate or produce | ||
| elements of a list (called "reification"). Because a single | ||
| C<Iterator> may be directly or indirectly referenced by | ||
| multiple container objects simultaneously, C<Iterator>s provide | ||
| an immutable view of the sequence of elements produced, leaving | ||
| it up to containers to provide any mutability. | ||
|
|
||
| =head3 The C<.reify> method | ||
|
|
||
| The C<.reify($n)> method requests an C<Iterator> to return a C<Parcel> | ||
| consisting of at least C<$n> reified elements, followed by any additional | ||
| iterators needed for the remaining elements in the sequence. For example: | ||
|
|
||
| my $r = 10..50; | ||
| say $r.iterator.reify(5).perl; # (10, 11, 12, 13, 14, 15..50) | ||
|
|
||
| Iterators are allowed to return more or fewer elements than requested | ||
| by C<$n>; it's up to the caller to properly handle any shortage or | ||
| excess. (In general an iterator is expected to always reify at | ||
| least one element, however.) If C<$n> is C<*>, then the iterator | ||
| is asked to generate elements according to whatever is most natural | ||
| for the thing being iterated. For a C<Range> this might be a substantial | ||
| number (or even all) of the elements of the range; for a filehandle | ||
| it might read a single record; for a C<List> it may return only the | ||
| non-lazy portion of the list. | ||
|
|
||
| Once C<.reify> is called on an iterator, the iterator is expected | ||
| to return the same results for all subsequent calls to C<.reify>. | ||
| This is true even if the C<$n> argument is different from one call | ||
| to the next. The general pattern is often something like: | ||
|
|
||
| class SomeIter is Iterator { | ||
| has $!reified; | ||
|
|
||
| method reify($n = 1) { | ||
| unless $!reified.defined { | ||
| # generate return Parcel and bind to $!reified | ||
| } | ||
| $!reified; | ||
| } | ||
| } | ||
|
|
||
| =head3 The C<.infinite> method | ||
|
|
||
| Because lists in Perl 6 can be lazy, they can also be infinite. | ||
| Each C<Iterator> must provide an C<.infinite> method, which returns | ||
| a value indicating the knowable finiteness of the iteration: | ||
|
|
||
| .infinite Meaning | ||
| ---------------------------------------------- | ||
| True iteration is known to be infinite | ||
| False iteration is known to be finite | ||
| Mu finiteness isn't currently known | ||
|
|
||
| As an example, C<Range> iterators can generally know finiteness simply | ||
| by looking at the endpoint of the C<Range>. The iterator for the | ||
| C<< infix:<...> >> sequence operator treats any sequence ending in | ||
| C<*> as being "known infinite", all other C<...> sequences have unknown | ||
| finiteness. In the general case it's not possible for loop iterators | ||
| to definitively know if the sequence will be finite or infinite. | ||
| (Conjecture: There will be a syntax or mechanism for the programmer | ||
| to indicate that such sequences are to be treated as known finite | ||
| or known infinite.) | ||
|
|
||
| =head2 Levels of laziness | ||
|
|
||
| Laziness is one of the primary virtues of a Perl programmer; it is | ||
| also one of the virtues of Perl 6. However, it's also possible to | ||
| succumb to false laziness, so there are times when Perl 6 chooses | ||
| to be eager instead of lazy. | ||
|
|
||
| Perl 6 defines four levels of laziness: | ||
|
|
||
| =over | ||
|
|
||
| =item Strictly Lazy | ||
|
|
||
| Does not evaluate anything unless explicitly required by the caller, | ||
| including not traversing non-lazy objects. This behavior generally | ||
| occurs only by a pragma or by explicit lazy primitives. | ||
|
|
||
| =item Mostly Lazy | ||
|
|
||
| Try to obtain available elements without causing eager evaluation | ||
| of other lazy objects. However, the implementation is allowed to | ||
| do batching for efficiency. The programmer must not rely on | ||
| circular side effects when the implementation is working ahead | ||
| like this, or the results will be indeterminate. (However, this | ||
| does not rule out pure I<definitional> circularity; earlier array | ||
| values may be used in the definition of subsequent values.) | ||
|
|
||
| =item Mostly Eager | ||
|
|
||
| Obtain all leading items that are not known to be infinite. | ||
| The remainder of the items are left for lazy evaluation. | ||
| As with the "mostly lazy" case, the programmer must not depend | ||
| on circular side effects in any situation where the implementation | ||
| is allowed to work ahead. Note that there are no language-defined | ||
| limits on the amount of conjectural evaluation allowed, up to the | ||
| size of available memory; however, an implementation may allow | ||
| the arbitrary limitation of workahead by use of pragmas. | ||
|
|
||
| In any case there should be no profound semantic differences | ||
| between "mostly lazy" and "mostly eager". These levels are | ||
| primarily just hints to the implementation as to whether you are | ||
| likely to want to use all the values in the list. Nothing | ||
| transactional should be allowed to depend on the difference. | ||
|
|
||
| =item Strictly Eager | ||
|
|
||
| Obtain all items, failing immediately with an appropriate message | ||
| if asked to evaluate any data structures known to be infinite. | ||
| (Data structures that are effectively infinite but not provably | ||
| so will likely exhaust memory instead.) | ||
|
|
||
| =back | ||
|
|
||
| =head2 The laziness level of some common operations | ||
|
|
||
| =over | ||
|
|
||
| =item List assignment; my @a = @something; | ||
|
|
||
| In order to provide p5-like behavior in list assignment, it is performed | ||
| at the Mostly Eager level. This means that if you do | ||
|
|
||
| my @a = foo(); | ||
|
|
||
| it will eagerly evaluate the return value from C<foo()> to place | ||
| elements into C<@a>, stopping only when encountering something | ||
| that is "known infinite" or upon reaching the end of the sequence. | ||
|
|
||
| =item C<for>, C<map>, C<grep>, etc. | ||
|
|
||
| The C<for> loop and looping routines such as C<map>, C<grep>, etc. are | ||
| Mostly Lazy by default, but will be eager when evaluated in sink context. | ||
|
|
||
| =item Slurpy parameters | ||
|
|
||
| Slurpy parameters are Mostly Lazy. | ||
|
|
||
| =item Feed operators | ||
|
|
||
| The feed operator is strictly lazy, meaning that no operation | ||
| should be performed before the user requests any element. That's how | ||
|
|
||
| my @a <== grep { ... } <== map { ... } <== grep { ... } <== 1, 2, 3 | ||
|
|
||
| is completely lazy, even if 1,2,3 is a fairly small known compact | ||
| list. | ||
|
|
||
| =back | ||
|
|
||
| =head2 Common list operations | ||
|
|
||
|
|