documentation on containers, assignment and binding

moritz · moritz · commit 3118235c64b2 · 2013-03-05T22:11:02.000+01:00
diff --git a/lib/containers.pod b/lib/containers.pod
@@ -0,0 +1,162 @@
+=begin pod
+
+=head1 Perl 6 Containers: a low-level approach
+
+This article started as a conversion on IRC explaining the difference between
+the C<Array> and the C<List> type in Perl 6. It explains the levels of
+indirection involved in dealing with variables and container elements.
+
+=head2 What is a variable?
+
+Some people like to say "everything is an object", but in fact a variable is
+not a user-exposed object in Perl 6.
+
+When the compiler encounters a variable declaration like C<my $x>, it
+registers it in some internal symbol table. This internal symbol table is used
+to detect undeclared variables, and to tie the code generation for the
+variable to the correct scope.
+
+At run time, a variable appears as an entry in a I<lexical pad>, short
+I<lexpad>. This is a per-scope data structure that stores a pointer for each
+variable.
+
+In the case of C<my $x>, the lexpad entry for the variable C<$x> is a pointer
+to an object of type C<Scalar>, usually just called I<the container>.
+
+=head2 Scalar containers
+
+Although objects of type C<Scalar> are everywhere in Perl 6, you usually never
+see them directly as objects, because most operations I<decontainerize>, which
+means they act on the C<Scalar> container's contents instead of the container
+itself.
+
+In a code like
+
+    my $x = 42;
+    say $x;
+
+the assignment C<$x = 42> stores a pointer to the C<Int> object 42 in the
+scalar container to which the lexpad entry for C<$x> points.
+
+The assignment operator asks the container on the left to store the value on
+its right. What exactly that means is up to the container type. For C<Scalar>
+it means "replace the previously stored value with the new one".
+
+Note that subroutine signatures allow passing around of containers:
+
+    sub f($a is rw) {
+        $a = 23;
+    }
+    my $x = 42;
+    f($x);
+    say $x;         # 23
+
+Inside the subroutine, the lexpad entry for C<$a> points to the same
+container that C<$x> points to outside the subroutine. Which is why assignment
+to C<$a> also modifies the contents of C<$x>.
+
+=head2 Binding
+
+Next to assignment, Perl 6 also supports I<binding> with the C<:=> operator.
+When binding a value or a
+container to a variable, the lexpad entry of the variable is modified (and not
+just the container it points to). If you write
+
+    my $x := 42;
+
+then the lexpad entry for C<$x> directly points to the C<Int> 42. Which means
+that you cannot assign to it anymore:
+
+    $ perl6 -e 'my $x := 42; $x = 23'
+    Cannot modify an immutable value
+       in block  at -e:1
+
+You can also bind variables to other variables:
+
+    my $a = 0;
+    my $b = 0;
+    $a := $b;
+    $b = 42;
+    say $a;         # 42
+
+Here after the initial binding, the lexpad entries for C<$a> and C<$b> both
+point to the same scalar container, so assigning to one variable also changes
+the contents of the other.
+
+You've seen this situation before: it is exactly what happened with the
+signature parameter marked as C<is rw>.
+
+=head2 Scalar Containers and Listy Things
+
+There are a number of positional container types with slightly different
+semantics in Perl 6. The most basic one is I<Parcel>, short for I<Parenthesis
+cell>. It is created by the comma operator, and often delimited by round
+parenthesis -- hence the name.
+
+    say (1, 2, 3).WHAT;     # (Parcel)
+
+A parcel is immutable, which means you cannot change the number of elements in
+a parcel. But if one of the elements happens to be a scalar container, you can
+still assign to it:
+
+    my $x = 42;
+    ($x, 1, 2)[0] = 23;
+    say $x;                 # 23
+    ($x, 1, 2)[1] = 23;     # Error: Cannot modify an immutable value
+
+So the parcel doesn't care about whether its elements are values or
+containers, they just store and retrieve whatever was given to them.
+
+A C<List> has the same attitude of indifference towards containers. But it
+allows modifying the length (for example with C<push>, C<pop>, C<shift> and
+C<unshift>), and it is also lazy.
+
+An C<Array> is just like a list, except that it forces all its elements to be
+containers. Thus you can say
+
+    my @a = 1, 2, 3;
+    @a[0] = 42;
+    say @a;         # 42 2 3
+
+and C<@a> actually stores three scalar containers. C<@a[0]> returns one of
+them, and the assignment operator replaces the integer value stored in that
+container with the new one, C<42>.
+
+=head2 Assigning and Binding to Array Variables
+
+Assigning to a scalar variable and to an array variable both do basically the
+same thing: discard the old value(s), and enter some new value(s).
+
+Still it's easy to observe how different they are:
+
+    my $x = 42; say $x.WHAT;    # (Int)
+    my @a = 42; say @a.WHAT;    # (Array)
+
+This is because the C<Scalar> container type hides itself well, but C<Array>
+makes no such effort.
+
+To place a non-C<Array> into an array variable, binding works:
+
+    my @a := (1, 2, 3);
+    say @a.WHAT;                # (Capture)
+
+=head2 Binding to Array Elements
+
+As a curious side note, Perl 6 supports binding to array elements:
+
+    my @a = (1, 2, 3);
+    @a[0] := my $x;
+    $x = 42;
+    say @a;                     # 42 2 3
+
+If you've read and understood the previous explanations, it is now time to
+wonder how this can possibly work. After all binding to a variable requires a
+lexpad entry for that variable, and while there is one for an array, there
+aren't lexpad entries for each array element (you cannot expand the lexpad at
+run time).
+
+The answer is that binding to array elements is recognized at the syntax
+level, and instead of emitting code for a normal binding operation, a special
+method on the array is called that knows how to do the binding itself.
+
+=end pod