Commit 5893559

jstuder-gh authored and zoffixznet committed
Rewrite, remove inaccurate Range truncation info (#1681)
* Rewrite, remove inaccurate Range truncation info

Previously, this section stated that:
* All ranges (including non-lazy ones) will truncate the output/omit undefined elements.

Rewrote the section to remove that inaccuracy and address the following:
* If the element at index [x] does not exist then the result will be an undefined element (when subscripting Positionals).
* This behavior can be modified using the :v adverb.
* Using lazy Iterables as subscripts yields a different behavior. Only returns defined elements (most of the time; see bullets below).
* Behind the scenes the elements of the lazy subscript are being reified up until the point that the element in the Positional is undefined.
* Continued use of lazy Lists will begin to return undefined elements as new elements in the lazy List are reified.
* Include justification for these behaviors. Sometimes returning undefined elements is desirable.
* DESIRABLE: assignment. eg @A[^10] = 1..10;
* NOT DESIRABLE: Out-of-memory due to uncontrolled indexing.

This removes some information about lack of protection against runaway slices and reification of subscript elements in instances where the undefined values are not being returned from the collection (eg, instances where the same index is being used over and over again). This seemed a little out of place in this section and probably fits better elsewhere. Perhaps it would be appropriate in the Traps section?

SEE ALSO: [Issue #1679](#1679)
https://irclog.perlgeek.de/perl6/2017-11-18#i_15466865

* Incorporate feedback on "Truncating Slices"

Incorporate feedback by zoffix. Includes:
* Replace "defined" with "exists".
* Remove unnecessarily detailed explanation of reification and Rakudo's slicing implementation.
* Move indexing "drift" explanation to Traps.

See <#1681>.

* Revise Lazy Lists section and add index

Added an index for the "Lazy List" section in list.pod6. Revised it to address these points:
* Fix error with .elems (outdated behavior?).
* Show small example of laziness and its semantics.
* Outline the common use case of infinite lists

Also linked to this section from "Truncating Slices".
1 parent 564aae2 commit 5893559

3 files changed: +71 -52 lines changed
doc/Language/list.pod6

Lines changed: 22 additions & 3 deletions
@@ -194,16 +194,35 @@ value, but unlike the above options, it will break L<Scalars|/type/Scalar>.
     say (1, slip($(2, 3)), 4) eqv (1, 2, 3, 4); # OUTPUT: «False␤»

 =head1 Lazy Lists
+X<|lazy (property of List)>

 Lists can be lazy, what means that their values are computed on demand and
 stored for later use. To create a lazy list use
 L<gather/take|/language/control#gather/take> or the
 L<sequence operator|/language/operators#infix_...>. You can also write a class that
 implements the role L<Iterable|/type/Iterable> and returns C<True> on a call to
-L<is-lazy|/routine/is-lazy>. Please note that some methods like C<elems> may cause the
-entire list to be computed what will fail if the list is also infinite.
+L<is-lazy|/routine/is-lazy>. Please note that some methods like C<elems> cannot be
+called on a lazy List and will result in a thrown L<Exception|/type/Exception>.

-    my @l = 1,2,4,8...Inf;
+    # This list is lazy and elements will not be available
+    # until explicitly requested.
+
+    my @l = lazy 0..5;
+    say @l.is-lazy; # OUTPUT: «True␤»
+    say @l[]; # OUTPUT: «[...]␤»
+
+    # Once all elements have been retrieved, the List
+    # is no longer considered lazy.
+
+    eager @l; # Forcing eager evaluation
+    say @l.is-lazy; # OUTPUT: «False␤»
+    say @l[]; # OUTPUT: «[0 1 2 3 4 5]␤»
+
+A common use case for lazy Lists are the processing of infinite sequences of numbers,
+whose values have not been computed yet and cannot be computed in their entirety.
+Specific values in the List will only be computed when they are needed.
+
+    my @l = 1, 2, 4, 8 ... Inf;
     say @l[0..16];
     # OUTPUT: «(1 2 4 8 16 32 64 128 256 512 1024 2048 4096 8192 16384 32768 65536)␤»

doc/Language/subscripts.pod6

Lines changed: 29 additions & 49 deletions
@@ -208,7 +208,7 @@ Be aware that slices are controlled by the I<type> of what is passed to
 (L<one dimension of|#Multiple dimensions>) the subscript, not its length.
 In particular the I<type> can be any of the following:

-=item a Range or a lazy Iterable, that L<truncates|#Truncating slices> in [ ]
+=item a lazy Iterable, that L<truncates|#Truncating slices> in [ ]

 =item '*' (whatever-star), that returns the full slice (as if all keys/indices were specified)

@@ -251,68 +251,48 @@ dimensions>) the subscript is preserved across the slice operation

 =head2 Truncating slices

-Normally, referring to nonexistent elements in a slice subscript causes the
-output list to contain undefined values (or L<whatever else|
+Referring to nonexistent elements in a slice subscript causes the
+output C<List> to contain undefined values (or L<whatever else|
 #Nonexistent elements> the collection in question chooses to return for
-nonexistent elements). However if the outer object passed to
-(L<one dimension of|#Multiple dimensions>) of a positional subscript is a
-L<Range>, it will be automatically truncated to the actual size of the
-collection:
+nonexistent elements):

-    my @letters = <a b c d e f>;
-    say @letters[3, 4, 5, 6, 7]; # OUTPUT: «(d e f (Any) (Any))␤»
-    say @letters[3 .. 7]; # OUTPUT: «(d e f)␤»
-
-L<From-the-end|#From the end> indices are allowed as range end-points.
-
-=begin code :preamble<my @array;>
-say @array[*-3 .. *]; # select the last three elements
+=begin code
+my @letters = <a b c d e f>;
+say @letters[3..7]; # OUTPUT: «(d e f (Any) (Any))␤»
 =end code

-A similar thing is done for lazy sequences, but it is often impossible to
-determine whether the sequence is infinite. Just as often, the first part
-of the sequence is already known, and it would be silly to pretend we
-did not know it. As a stopgap measure to prevent runaway generation of huge
-lists, a lazy subscript will not truncate as long as it does not have to
-lazily generate values, but once it starts generating values lazily, it
-will stop if it generates a value that points to a nonexistent index.
+This behavior, while at first glance may seem unintuitive, is desirable in
+instances when you want to assign a value at an index in which a value does
+not currently exist.

-=begin code :skip-test
-say @letters[0, 2, 4 ... *]; # Every other element of the array.
-=end code
-
-This feature is more for protection against accidental out-of-memory
-problems than for actual use. Since some lazy sequences cache their
-results, every time they are used in a truncation, they accumulate one
-more known element. Things like this should probably be avoided rather
-than used for effect:
+=begin code
+my @letters;
+say @letters; # OUTPUT: «[]␤»

-=begin code :skip-test
-my @a = 2, 3 ... *;
-say flat @letters[0, 7, @a]; # OUTPUT: «(a (Any) c d e f)␤»
-say flat @letters[0, 7, @a]; # OUTPUT: «(a (Any) c d e f (Any))␤»
+@letters[^10] = 'a'..'z';
+say @letters; # OUTPUT: «[a b c d e f g h i j]␤»
 =end code

-The runaway protection is not perfect. The indices are eagerly evaluated,
-with the only stop condition being truncation. This is to provide
-mostly consistent results when there is self-reference/mutation inside
-the indices. As such, the following will most likely hang until all
-memory has been consumed:
+If you want the resulting slice to only include existing elements, you can
+silently skip the non-existent elements using the L<#:v> adverb.

-=begin code :preamble<my @letters;>
-@letters[0 xx *];
+=begin code
+my @letters = <a b c d e f>;
+say @letters[3..7]:v; # OUTPUT: «(d e f)␤»
 =end code

-So, to safely use lazy indices, they should be one-shot things which
-are guaranteed to overrun the array. The following alternate formulation
-will produce a fully lazy result (but will not truncate):
+The behavior when indexing a collection via L<lazy|/language/list#Lazy_Lists>
+subscripts is different than when indexing with their eager counterparts.
+When accessing via a lazy subscript, the resulting slice will be truncated.

-=begin code :preamble<my @letters;>
-my $a = (0 xx *).map({ @letters[$_] }); # "a", "a", "a" ... forever
+=begin code :preamble<my @letters = <a b c d e f>;>
+say @letters[lazy 3..7]; # OUTPUT: «(d e f)␤»
+say @letters[ 3..*]; # OUTPUT: «(d e f)␤»
 =end code

-If you I<don't> want to specify your slice as a range/sequence but still want
-to silently skip nonexistent elements, you can use the L<#:v> adverb.
+This behavior exists as a precaution to prevent runaway generation of massive,
+potentially infinite C<Lists> and the out-of-memory issues that occur as a
+result.

 =head2 Zen slices

doc/Language/traps.pod6

Lines changed: 20 additions & 0 deletions
@@ -414,6 +414,26 @@ default behavior to unsafe variant, so just by forgetting some quotes
 you are risking to introduce either a bug or maybe even a security
 hole. To stay on the safe side, refrain from using C<«»>.

+=head2 "Drift" when indexing with lazy Iterable
+
+If you are L<indexing with a lazy Iterable|/language/subscripts#Truncating_slices>,
+be aware that (in the Rakudo implementation) each subsequent use
+L<reifies|/language/glossary#index-entry-Reify> and caches an additional element in
+the lazy C<Iterable>, introducing undefined elements into the resulting slice. Some
+lazy C<Iterables> (such as C<List>) will cache each newly produced element while
+others (such as C<Seq>) will discard them. Being aware of how each C<Iterable> deals
+with it's lazily produced elements will help you to avoid unexpected results.
+
+    my @letters = <a b c d e f>;
+    my @i = 3..*;
+    .say for @letters[@i] xx 3;
+
+    #`( OUTPUT:
+    (d e f)
+    (d e f (Any))
+    (d e f (Any) (Any))
+    )
+
 =head1 Strings

 =head2 Quotes and interpolation
