doc/Language/containers.pod6

=begin pod

=TITLE Containers

=SUBTITLE A low-level explanation of Perl 6 containers

This article started as a conversation on IRC explaining the difference
between the C<Array> and the C<List> type in Perl 6. It explains the levels
of indirection involved in dealing with variables and container elements.

=head1 What is a variable?

Some people like to say "everything is an object", but in fact a variable is
not a user-exposed object in Perl 6.

When the compiler encounters a variable declaration like C<my $x>, it
registers it in some internal symbol table. This internal symbol table is
used to detect undeclared variables and to tie the code generation for the
variable to the correct scope.

At run time, a variable appears as an entry in a I<lexical pad>, or
I<lexpad> for short. This is a per-scope data structure that stores a
pointer for each variable.

In the case of C<my $x>, the lexpad entry for the variable C<$x> is a
pointer to an object of type C<Scalar>, usually just called I<the
container>.

=head1 Scalar containers

Although objects of type C<Scalar> are everywhere in Perl 6, you rarely
see them directly as objects, because most operations
I<decontainerize>, which means they act on the C<Scalar> container's
contents instead of the container itself.

In code like

    my $x = 42;
    say $x;

the assignment C<$x = 42> stores a pointer to the C<Int> object 42 in the
scalar container to which the lexpad entry for C<$x> points.

The assignment operator asks the container on the left to store the value on
its right. What exactly that means is up to the container type. For
C<Scalar> it means "replace the previously stored value with the new one".

Note that subroutine signatures allow passing around of containers:

    sub f($a is rw) {
        $a = 23;
    }
    my $x = 42;
    f($x);
    say $x;         # 23

Inside the subroutine, the lexpad entry for C<$a> points to the same
container that C<$x> points to outside the subroutine. Which is why
assignment to C<$a> also modifies the contents of C<$x>.

Likewise a routine can return a container if it is marked as C<is rw>:

    my $x = 23;
    sub f() is rw { $x };
    f() = 42;
    say $x;         # 42

For explicit returns, C<return-rw> instead of C<return> must be used.

Returning a container is how C<is rw> attribute accessors work. So

    class A {
        has $.attr is rw;
    }

is equivalent to

    class A {
        has $!attr;
        method attr() is rw { $!attr }
    }

Scalar containers are transparent to type checks and most kinds of read-only
accesses. A C<.VAR> makes them visible:

    my $x = 42;
    say $x.^name;       # Int
    say $x.VAR.^name;   # Scalar

And C<is rw> on a parameter requires the presence of a writable Scalar
container:

    sub f($x is rw) { say $x };
    f 42;   # Parameter '$x' expected a writable container, but got Int value

=head1 Binding

Next to assignment, Perl 6 also supports I<binding> with the C<:=> operator.
When binding a value or a container to a variable, the lexpad entry of the
variable is modified (and not just the container it points to). If you write

    my $x := 42;

then the lexpad entry for C<$x> directly points to the C<Int> 42. Which
means that you cannot assign to it anymore:

    $ perl6 -e 'my $x := 42; $x = 23'
    Cannot modify an immutable value
       in block  at -e:1

You can also bind variables to other variables:

    my $a = 0;
    my $b = 0;
    $a := $b;
    $b = 42;
    say $a;         # 42

Here, after the initial binding, the lexpad entries for C<$a> and C<$b> both
point to the same scalar container, so assigning to one variable also
changes the contents of the other.

You've seen this situation before: it is exactly what happened with the
signature parameter marked as C<is rw>.

X<|\,container binding
Sigilless variables also bind by default and so do parameters with the trait C<is raw>.

    my $a = 42;
    my \b = $a;
    b++;
    say $a;         # 43

    sub f($c is raw) { $c++ }
    f($a);
    say $a;         # 44

=head1 Scalar containers and listy things

There are a number of positional container types with slightly different
semantics in Perl 6. The most basic one is L<List|/type/List>
It is created by the comma operator.

    say (1, 2, 3).^name;    # List

A list is immutable, which means you cannot change the number of elements
in a list. But if one of the elements happens to be a scalar container,
you can still assign to it:

    my $x = 42;
    ($x, 1, 2)[0] = 23;
    say $x;                 # 23
    ($x, 1, 2)[1] = 23;     # Error: Cannot modify an immutable value

So the list doesn't care about whether its elements are values or
containers, they just store and retrieve whatever was given to them.

Lists can also be lazy, so elements at the end are generated on demand from an
iterator.

An C<Array> is just like a list, except that it forces all its elements to
be containers, which means that you can always assign to elements:

    my @a = 1, 2, 3;
    @a[0] = 42;
    say @a;         # 42 2 3

C<@a> actually stores three scalar containers. C<@a[0]> returns one of
them, and the assignment operator replaces the integer value stored in that
container with the new one, C<42>.

An L<Array> also has methods that can change the number of elements, notably
C<push>, C<pop>, C<shift>, C<unshift> and C<splice>.

=head1 Assigning and binding to array variables

Assigning to a scalar variable and to an array variable both do basically
the same thing: discard the old value(s), and enter some new value(s).

Nevertheless, it's easy to observe how different they are:

    my $x = 42; say $x.^name;   # Int
    my @a = 42; say @a.^name;   # Array

This is because the C<Scalar> container type hides itself well, but C<Array>
makes no such effort. Also assignment to an array variable is coercive, so
you can assign a non-array value to an array variable.

To place a non-C<Array> into an array variable, binding works:

    my @a := (1, 2, 3);
    say @a.WHAT;                # (List)

=head1 Binding to array elements

As a curious side note, Perl 6 supports binding to array elements:

    my @a = (1, 2, 3);
    @a[0] := my $x;
    $x = 42;
    say @a;                     # 42 2 3

If you've read and understood the previous explanations, it is now time to
wonder how this can possibly work. After all, binding to a variable requires
a lexpad entry for that variable, and while there is one for an array, there
aren't lexpad entries for each array element (you cannot expand the lexpad
at run time).

The answer is that binding to array elements is recognized at the syntax
level and instead of emitting code for a normal binding operation, a special
method (called C<BIND-KEY>) is called on the array. This method handles binding
to array elements.

Note that, while supported, one should generally avoid directly binding
uncontainerized things into array elements.  Doing so may produce
counter-intuitive results when the array is used later.

    my @a = (1, 2, 3);
    @a[0] := 42;         # This is not recommended, use assignment instead.
    my $b := 42;
    @a[1] := $b;         # Nor is this.
    @a[2] = $b;          # ...but this is fine.
    @a[1, 2] := 1, 2;    # This is not allowed and will fail.

Operations that mix Lists and Arrays generally protect against such
a thing happening accidentally.

=head1 Flattening, items and containers

The C<%> and C<@> sigils in Perl 6 generally indicate multiple values
to an iteration construct, whereas the C<$> sigil indicates only one value.

    my @a = 1, 2, 3;
    for @a { };         # 3 iterations
    my $a = (1, 2, 3);
    for $a { };         # 1 iteration

Contrary to earlier versions of Perl 6, and to Perl 5, C<@>-sigiled variables
do not flatten in list context:

    my @a = 1, 2, 3;
    my @b = @a, 4, 5;
    say @b.elems;               # 3

There are operations that flatten out sublists that are not inside a scalar
container: slurpy parameters (C<*@a>) and explicit calls to C<flat>:


    my @a = 1, 2, 3;
    say (flat @a, 4, 5).elems;  # 5

    sub f(*@x) { @x.elems };
    say f @a, 4, 5;             # 5

As hinted above, scalar containers prevent that flattening:

    sub f(*@x) { @x.elems };
    say f $@a, 4, 5;            # 3

The C<@> character can also be used as a prefix to remove a scalar container:

    my $x = (1, 2, 3);
    .say for @$x;               # 3 iterations

Methods generally don't care whether their invocant is in a scalar, so

    my $x = (1, 2, 3);
    $x.map(*.say);              # 3 iterations

maps over a list of three elements, not of one.

=head1 Custom containers

To provide custom containers Perl 6 provides the class C<Proxy>. It takes two
methods that are called when values are stored or fetched from the container.
Type checks are not done by the container itself and other restrictions like
readonlyness can be broken. The returned value must therefore be of the same
type as the type of the variable it is bound to. We can use type captures to
work with types in Perl 6.

    sub lucky(::T $type) {
        my T $c-value; # closure variable
        return Proxy.new(
            FETCH => method () { $c-value },
            STORE => method (T $new-value) {
                X::OutOfRange.new(what => 'number', got => '13', range => '-Inf..12, 14..Inf').throw
                    if $new-value == 13;
                $c-value = $new-value;
            }
        );
    }

    my Int $a := lucky(Int);
    say $a = 12;
    say $a = 'FOO'; # X::TypeCheck::Binding
    say $a = 13; # X::OutOfRange

=end pod