Skip to content
This repository
Browse code

Add initial draft of S07-lists.pod, the new Synopsis 7.

  • Loading branch information...
commit 89b9891ea16cbabd2beccf5decc50a770cd2fbb6 1 parent a1a0394
Patrick R. Michaud authored July 25, 2012

Showing 1 changed file with 268 additions and 0 deletions. Show diff stats Hide diff stats

  1. 268  S07-lists.pod
268  S07-lists.pod
Source Rendered
... ...
@@ -0,0 +1,268 @@
  1
+
  2
+=encoding utf8
  3
+
  4
+=head1 TITLE
  5
+
  6
+Synopsis 7: Lists and Iteration
  7
+
  8
+=head1 AUTHORS
  9
+
  10
+    Patrick R. Michaud <pmichaud@pobox.com
  11
+
  12
+=head1 VERSION
  13
+
  14
+    Created: 12 July 2012
  15
+
  16
+    Last Modified: 12 July 2012
  17
+    Version: 1
  18
+
  19
+=head1 Overview
  20
+
  21
+Lists and arrays have always been one of Perl's fundamental data types,
  22
+and Perl 6 is no different.  However, lists in Perl 6 have been greatly
  23
+extended to accommodate lazy lists, infinite lists, lists of mutable
  24
+and immutable elements, typed lists, flattening behaviors, and so on.  
  25
+So where lists and arrays in Perl 5 tended to be finite sequences of 
  26
+scalar values, in Perl 6 we have additional dimensions of behavior
  27
+that must be addressed.
  28
+
  29
+=head2 The C<List> type
  30
+
  31
+The C<List> class is the base class for dealing with other types of
  32
+lists, including C<Array>.  To the programmer, a C<List> is a potentially
  33
+lazy and infinite sequence of elements.  A C<List> is mutable, in that
  34
+one can manipulate the sequence via operations such as C<push>, C<pop>,
  35
+C<shift>, C<unshift>, C<splice>, etc.  A C<List>'s elements may be
  36
+either mutable or immutable.
  37
+
  38
+C<List> objects are C<Positional>, meaning they can be bound to
  39
+array variables and support the postfix C<.[]> operator.
  40
+
  41
+Lists are also lazy, in that the elements of a C<List> may 
  42
+come from generator functions (called C<Iterator>s) that produce
  43
+elements on demand.
  44
+
  45
+=head2 The C<Array> type
  46
+
  47
+An C<Array> is simply a C<List> in which all of the elements are held
  48
+in scalar containers.  This allows assignment to the elements of the
  49
+array.
  50
+
  51
+=head2 The C<Parcel> type
  52
+
  53
+The comma operator (C<< infix:<,> >>) creates C<Parcel> objects.  
  54
+These should not be confused with lists; a C<Parcel> represents
  55
+a raw syntactic sequence of elements.  A C<Parcel> is immutable,
  56
+although the elements of a C<Parcel> may be either mutable or
  57
+immutable.
  58
+
  59
+The name "Parcel" is derived from the phrase "parenthesis cell", 
  60
+since many C<Parcel> objects appear inside of parentheses.  
  61
+However, except for the empty parcel, it's the comma operator 
  62
+that creates C<Parcel> objects.  
  63
+
  64
+    ()       # empty Parcel
  65
+    (1)      # an Int
  66
+    (1,2)    # a Parcel with two Ints
  67
+    (1,)     # a Parcel with one Int
  68
+
  69
+A C<Parcel> is also C<Positional>, and uses flattening context for
  70
+list operations such as C<.[]> and C<.elems>.  See "Flattening contexts"
  71
+below.
  72
+
  73
+=head2 Flattening contexts, the C<Iterable> type
  74
+
  75
+C<List> and C<Parcel> objects can have other container objects
  76
+as elements.  In some contexts we want to interpolate the values
  77
+of container objects into the surrounding C<List> or C<Parcel>,
  78
+while in other contexts we want any subcontainers to be preserved.
  79
+Such interpolation is known as "flattening".
  80
+
  81
+The C<Iterable> type is performed by container and generator objects
  82
+that will interpolate their values in flattening contexts.  Both
  83
+C<List> and C<Parcel> are C<Iterable>.  C<Iterable> implies support
  84
+for C<.iterator>, which returns an object capable of producing the
  85
+elements of the C<Iterable>.
  86
+
  87
+Objects held in scalar containers are never interpolated in
  88
+flattening context, even if the object is C<Iterable>.
  89
+
  90
+    my @a = 3,4,5;
  91
+    for 1,2,@a { ... }        # five iterations
  92
+
  93
+    my $s = @a;
  94
+    for 1,2,$s { ... }        # three iterations
  95
+
  96
+Here, both C<$s> and C<@a> refer to the same underlying C<Array> object,
  97
+but the presence of the scalar container prevents C<$s> from being
  98
+flattened into the C<for> loop.  The C<.list> or C<.flat> method
  99
+may be used to restore the flattening behavior:
  100
+
  101
+    for 1,2,$s.list { ... }   # five iterations
  102
+    for 1,2,@($s) { ... }     # five iterations
  103
+
  104
+Conversely, the C<.item> or C<$()> contextualizer can be used to
  105
+prevent an C<Iterable> from being interpolated:
  106
+
  107
+    my @b = 1,2,@a;           # @b has five elements
  108
+    my @c = 1,2,@a.item;      # @c has three elements
  109
+    my @c = 1,2,$(@a);        # same thing
  110
+
  111
+=head2 Iterators
  112
+
  113
+An C<Iterator> is an object that is able to generate or produce
  114
+elements of a list (called "reification").  Because a single
  115
+C<Iterator> may be directly or indirectly referenced by
  116
+multiple container objects simultaneously, C<Iterator>s provide 
  117
+an immutable view of the sequence of elements produced, leaving
  118
+it up to containers to provide any mutability.
  119
+
  120
+=head3 The C<.reify> method
  121
+
  122
+The C<.reify($n)> method requests an C<Iterator> to return a C<Parcel>
  123
+consisting of at least C<$n> reified elements, followed by any additional
  124
+iterators needed for the remaining elements in the sequence.  For example:
  125
+
  126
+    my $r = 10..50;
  127
+    say $r.iterator.reify(5).perl;  # (10, 11, 12, 13, 14, 15..50)
  128
+
  129
+Iterators are allowed to return more or fewer elements than requested
  130
+by C<$n>; it's up to the caller to properly handle any shortage or 
  131
+excess.  (In general an iterator is expected to always reify at
  132
+least one element, however.)  If C<$n> is C<*>, then the iterator
  133
+is asked to generate elements according to whatever is most natural 
  134
+for the thing being iterated.  For a C<Range> this might be a substantial
  135
+number (or even all) of the elements of the range; for a filehandle
  136
+it might read a single record; for a C<List> it may return only the
  137
+non-lazy portion of the list.
  138
+
  139
+Once C<.reify> is called on an iterator, the iterator is expected
  140
+to return the same results for all subsequent calls to C<.reify>.
  141
+This is true even if the C<$n> argument is different from one call
  142
+to the next.  The general pattern is often something like:
  143
+
  144
+    class SomeIter is Iterator {
  145
+        has $!reified;
  146
+
  147
+        method reify($n = 1) {
  148
+            unless $!reified.defined {
  149
+                # generate return Parcel and bind to $!reified
  150
+            }
  151
+            $!reified;
  152
+        }
  153
+    }
  154
+
  155
+=head3 The C<.infinite> method
  156
+
  157
+Because lists in Perl 6 can be lazy, they can also be infinite.
  158
+Each C<Iterator> must provide an C<.infinite> method, which returns
  159
+a value indicating the knowable finiteness of the iteration:
  160
+
  161
+    .infinite     Meaning
  162
+    ----------------------------------------------
  163
+    True          iteration is known to be infinite
  164
+    False         iteration is known to be finite
  165
+    Mu            finiteness isn't currently known
  166
+
  167
+As an example, C<Range> iterators can generally know finiteness simply
  168
+by looking at the endpoint of the C<Range>.  The iterator for the
  169
+C<< infix:<...> >> sequence operator treats any sequence ending in 
  170
+C<*> as being "known infinite", all other C<...> sequences have unknown
  171
+finiteness.  In the general case it's not possible for loop iterators
  172
+to definitively know if the sequence will be finite or infinite.
  173
+(Conjecture:  There will be a syntax or mechanism for the programmer
  174
+to indicate that such sequences are to be treated as known finite
  175
+or known infinite.)
  176
+
  177
+=head2 Levels of laziness
  178
+
  179
+Laziness is one of the primary virtues of a Perl programmer; it is
  180
+also one of the virtues of Perl 6.  However, it's also possible to
  181
+succumb to false laziness, so there are times when Perl 6 chooses 
  182
+to be eager instead of lazy.
  183
+
  184
+Perl 6 defines four levels of laziness:
  185
+
  186
+=over
  187
+
  188
+=item Strictly Lazy
  189
+
  190
+Does not evaluate anything unless explicitly required by the caller,
  191
+including not traversing non-lazy objects.  This behavior generally
  192
+occurs only by a pragma or by explicit lazy primitives.
  193
+
  194
+=item Mostly Lazy
  195
+
  196
+Try to obtain available elements without causing eager evaluation
  197
+of other lazy objects.  However, the implementation is allowed to
  198
+do batching for efficiency.  The programmer must not rely on
  199
+circular side effects when the implementation is working ahead
  200
+like this, or the results will be indeterminate.  (However, this
  201
+does not rule out pure I<definitional> circularity; earlier array
  202
+values may be used in the definition of subsequent values.)
  203
+
  204
+=item Mostly Eager
  205
+
  206
+Obtain all leading items that are not known to be infinite.
  207
+The remainder of the items are left for lazy evaluation.  
  208
+As with the "mostly lazy" case, the programmer must not depend 
  209
+on circular side effects in any situation where the implementation 
  210
+is allowed to work ahead.  Note that there are no language-defined 
  211
+limits on the amount of conjectural evaluation allowed, up to the 
  212
+size of available memory; however, an implementation may allow 
  213
+the arbitrary limitation of workahead by use of pragmas.
  214
+
  215
+In any case there should be no profound semantic differences
  216
+between "mostly lazy" and "mostly eager".  These levels are
  217
+primarily just hints to the implementation as to whether you are
  218
+likely to want to use all the values in the list.  Nothing
  219
+transactional should be allowed to depend on the difference.
  220
+
  221
+=item Strictly Eager
  222
+
  223
+Obtain all items, failing immediately with an appropriate message
  224
+if asked to evaluate any data structures known to be infinite.
  225
+(Data structures that are effectively infinite but not provably
  226
+so will likely exhaust memory instead.)
  227
+
  228
+=back
  229
+
  230
+=head2 The laziness level of some common operations
  231
+
  232
+=over
  233
+
  234
+=item List assignment; my @a = @something;
  235
+
  236
+In order to provide p5-like behavior in list assignment, it is performed
  237
+at the Mostly Eager level.  This means that if you do
  238
+
  239
+    my @a = foo();
  240
+
  241
+it will eagerly evaluate the return value from C<foo()> to place
  242
+elements into C<@a>, stopping only when encountering something
  243
+that is "known infinite" or upon reaching the end of the sequence.
  244
+
  245
+=item C<for>, C<map>, C<grep>, etc.
  246
+
  247
+The C<for> loop and looping routines such as C<map>, C<grep>, etc. are
  248
+Mostly Lazy by default, but will be eager when evaluated in sink context.
  249
+
  250
+=item Slurpy parameters
  251
+
  252
+Slurpy parameters are Mostly Lazy.
  253
+
  254
+=item Feed operators
  255
+
  256
+The feed operator is strictly lazy, meaning that no operation
  257
+should be performed before the user requests any element.  That's how
  258
+
  259
+  my @a <== grep { ... } <== map { ... } <== grep { ... } <== 1, 2, 3
  260
+
  261
+is completely lazy, even if 1,2,3 is a fairly small known compact
  262
+list.
  263
+
  264
+=back
  265
+
  266
+=head2 Common list operations
  267
+
  268
+

0 notes on commit 89b9891

Please sign in to comment.
Something went wrong with that request. Please try again.