# Nested Arrays (Continued)

## First Contact

### Definitions

We have already met *nested* arrays in [the chapter about Data and Variables](./Data-and-Variables.ipynb); let us just remind ourselves of some definitions:

An array is said to be *generalised* or *nested* when one or more of its items are not simple scalars, but scalars containing "enclosed" arrays (this term will be explained soon).

Such an array can be created in many ways, although until now we have only covered the simplest one, called *vector notation*, or *strand notation*.
Using this notation, the items of an array are just juxtaposed, and each item can be identified as a separate item because:

 - it is separated from its neighbours by **blanks**, or
 - it is embedded within **quotes**, or
 - it is an expression embedded within **parentheses**, or
 - it is a **variable name**, or the name of a niladic function which returns a result.
 
Just to demonstrate how it works, we will create a nested vector and a nested matrix:

In [1]:
one ← 2 2⍴8 6 2 4
two ← 'Hello'

In [2]:
nesVec ← 87 24 'John' 51 (78 45 23) 85 one 69
]display nesVec

In [3]:
nesMat ← 2 3⍴'Dyalog' 44 two 27 one (2 3⍴1 2 0 0 0 5)
]display nesMat

Later, we will provide a more formal description of this notation.

### Enclose & Disclose

It seems so easy to create and work with nested arrays.
Couldn't we turn a simple array into a nested array by, for example, replacing one item of a simple matrix with a vector?

First, we create a simple matrix:

In [4]:
⎕← mat ← 2 3⍴87 63 52 74 11 62

Then we try to change it into a nested array:

In [5]:
mat[1;2] ← 10 20 30

LENGTH ERROR
      mat[1;2]←10 20 30
              ∧


It doesn't work!

We cannot replace **one** item with an array of **three** items.

`mat[1;2]` is a scalar.
We can only replace it with a scalar.

#### Enclose

Let us now use a little trick to make the assignment above work.
We just have to zip up the three values into a single "bag", using a function called *enclose*, represented by the symbol `⊂`.

Then we will be able to replace one item by one bag!

In [6]:
mat[1;2] ← ⊂10 20 30
mat

Now it works!

We can, of course, do the same with character data, but we now know that an expression like

In [7]:
mat[2;3] ← 2 4⍴'JohnPete'

LENGTH ERROR
      mat[2;3]←2 4⍴'JohnPete'
              ∧


is incorrect;
we must enclose the array like this:

In [8]:
mat[2;3] ← ⊂2 4⍴'JohnPete'

The result is what we expected:

In [9]:
]display mat

The result of *enclose* is always a scalar: cf. [the section below](./Nested-Arrays-Continued.ipynb#Simple-and-Other-Scalars).

#### Disclose

If we look at the contents of `mat[2;3]`, we see a little 2 by 4 matrix, but if we look at its shape, we see that surprisingly it has no shape.
Its rank is zero, so it must be a scalar!

In [10]:
mat[2;3]

In [11]:
⍴mat[2;3]

As we can see, its shape is empty.
And its rank is zero:

In [12]:
⍴⍴mat[2;3]

The explanation is obvious:
we have put this little matrix into a bag (a scalar), so we now see the bag, and not its contents.
If we want to see its contents, we must extract them from the bag, using a function called *disclose*, which is represented by the symbol `⊃`.

With it, we now have access to the matrix:

In [13]:
⍴⊃mat[2;3]

And its rank is two, as expected:

In [14]:
⍴⍴⊃mat[2;3]

We experience the same behaviour if we try to extract one item from a nested vector.

Let us recall the nested vector `nesVec`:

In [15]:
nesVec

We can use similar expressions to the ones we used on `mat`:

In [16]:
⍴nesVec[5]

The above looks like a scalar; it is a scalar, containing an enclosed vector.

Once we disclose it, we gain access to its contents (three elements, in this case):

In [17]:
⍴⊃nesVec[5]

In fact, this should not have come as a complete surprise to us.
Earlier we learned that the shape of the result of an indexing operation is identical to the shape of the indices.
In this case (as well as in the matrix case above), the index specifies a scalar.
Hence, it would be incorrect to expect anything other than a scalar as the result of the indexing operation!

#### Mnemonics

It is easy to remember how to generate the two symbols for *enclose* and *disclose* on a US or UK keyboard:

 - *Disclose* `⊃` is generated by <kbd>APL</kbd>+<kbd>X</kbd>, as in e**X**tract; and
 - *Enclose* `⊂` is generated by <kbd>APL</kbd>+<kbd>Z</kbd>, as in **Z**ip-up.

For reference, the actual symbols are called *left shoe* and *right shoe*, respectively for `⊂` and `⊃`; "enclose" and "disclose" are the names of the functions.

#### Simple and Other Scalars

We know that the result of *enclose* is **always** a scalar, but there is a difference between enclosing a scalar number or character, and enclosing any other array.

When appropriate, we shall use four different terms:

 - *simple scalar* refers to a single number or letter (rank zero);
 - *enclosed array* refers to a scalar that is the result of enclosing anything other than a simple scalar;
 - *item* refers to a scalar that is a constituent of an array, whether it is a simple scalar or an enclosed array; and
 - *nested array* is an array in which at least one of the items is an enclosed array.

Always remember these important points:

 - *enclose* does nothing to a simple scalar - it returns the scalar unchanged. The same is true of *disclose*;
 - all items of an array are effectively scalars, whether they are simple scalars or enclosed arrays: their rank is 0, and their shape is empty;
 - a single item can be replaced only by another single item: a simple scalar, or an array of values zipped up using *enclose* (to form an enclosed array); and
 - *vector notation* (*strand notation*) avoids the use of *enclose*, because of the conventions used to separate individual items from one another.

Let us create four vectors:

In [18]:
a ← 'coffee'
b ← 'tea'
c ← 'chocolate'

In [19]:
v ← a b c

The last statement is just a simpler way to write:

In [20]:
v ← (⊂a),(⊂b),⊂c

So, we can see that each of the items of `v` is an enclosed character vector.
Thus,

In [21]:
⍴v[1]

is `⍬`, not `6`.

Here is another example:

In [22]:
nesVec[1 5 6] ← 'Yes' 987 'Hello'
]display nesVec

If we use any additional *enclose* primitives, the results are very different.
And the results also vary depending on where the *enclose* primitives are used.

Here are two examples:

In [23]:
nesVec[1 5 6] ← 'Yes' 987 (⊂'Hello')
]display nesVec

In [24]:
nesVec[1 5 6] ← ⊂'Yes' 987 'Hello'
]display nesVec


## Depth & Match

### Enclosing Scalars

Applied to a simple scalar, *enclose* does nothing; the enclose of a simple scalar is the same simple scalar:

In [25]:
]display 35

In [26]:
]display ⊂35

However, when applied to any other array, *enclose* puts a "bag" around it.

First, we start with a simple vector:

In [27]:
]display 2 4 8

If we use *enclose* once, we get a scalar containing a numeric vector:

In [28]:
]display ⊂2 4 8

With one more *enclose*, we get a scalar containing another scalar, itself containing a numeric vector.
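To make the double enclosure concrete, the sketch below displays it and then checks its shape and depth. It is an illustrative addition rather than one of the original cells, and it uses the *depth* function (monadic `≡`), which is only introduced in the next section.

```APL
⍝ Two nested boxes appear around the numeric vector
]display ⊂⊂2 4 8

⍝ The shape is empty: the result is still a scalar
⍴⊂⊂2 4 8

⍝ Depth 3: one level for the vector, plus one for each enclose
≡⊂⊂2 4 8
```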

### Depth

Suppose that we write a function `Process`, which takes as its argument a vector consisting of: the name of a town, the number of inhabitants, a country code, and the turnover of our company in that town.

For example, we could call the function as `Process 'Lyon' 466400 'FR' 894600`.

For the purpose of this example, the function will just display the items it receives in its argument.
We choose to write it with the following syntax:

In [29]:
]dinput
Process ← {
    (town pop coun tov) ← ⍵
    ⎕← (15↑'Town = '),town
    ⎕← (15↑'Population = '),⍕pop
    ⎕← (15↑'Country = '),coun
    ⎕← (15↑'Turnover = '),⍕tov
}

Perhaps this is not the smartest thing we could do, but we did it!

Now, let us execute the function and verify that it works properly:

In [30]:
Process 'York' 186800 'GB' 540678

This looks promising, but what will happen if the user forgets one of the items that the function expects?
Let's test it:

In [31]:
Process 'York' 186800 'GB'

LENGTH ERROR
Process[1] (town pop coun tov)←⍵
                              ∧


As we might expect, an error message is issued: we cannot put 3 values into 4 variables!

Let us add a little test to our function to check whether or not the right argument has 4 items.

Here is the new version; notice the new line of code:

In [32]:
]dinput
Process ← {
    4≠≢⍵: 'Hey, weren''t you supposed to provide 4 values?'
    (town pop coun tov) ← ⍵
    ⎕← (15↑'Town = '),town
    ⎕← (15↑'Population = '),⍕pop
    ⎕← (15↑'Country = '),coun
    ⎕← (15↑'Turnover = '),⍕tov
}

It seems to work well now:

In [33]:
Process 'York' 186800 'GB'

But one day the user forgets all but one of the items, and just types the name of the town.
If the user is (un)lucky enough to type a town name with four letters, here is what happens:

In [34]:
Process 'York'

This trivial example shows that when nested arrays are involved, it is not sufficient to rely on the shape of an array;
we need additional information: specifically, is it a simple or a nested array?
To help distinguish between simple and nested arrays, APL provides a function named *depth*.
It is represented by the monadic use of the symbol `≡`.

Here is a set of rules that define how to determine the depth of an array:

 - the depth of a simple scalar is 0;
 - the depth of any other array of any shape is 1, if all of its items are simple scalars.

We call such an array a *simple array*, so we can instead say:

 - the depth of a non-scalar, simple array is 1;
 - the depth of any other array is equal to the depth of its deepest item plus 1; and
 - the depth is positive if the array is uniform (all of its items have the same depth), and negative if it is not.

Another intuitive definition of *depth* is this: if we show the array with `]display`, the depth is the number of boxes we must pass through to reach its deepest item.

Here are some examples:

In [35]:
≡ 540678

As seen above, a simple scalar has depth 0.

The following vector contains only simple scalars. Its depth is 1:

In [36]:
≡ 15 84 37 11

The rank of an array doesn't directly influence its depth.
If we reshape the vector above into a matrix, its depth is still 1 because it contains only simple scalars:

In [37]:
≡ 2 2⍴15 84 37 11

Now, let us consider this nested vector:

In [38]:
≡ vec1 ← (4 3) 'Yes' (8 7 5 6) (2 4)

It is composed of four enclosed vectors, each of depth 1 - so `vec1` has depth 2.
Now let us change the expression slightly:

In [39]:
≡ vec2 ← (4 3) 'Yes' (8 7 5) 6 (2 4)

This vector is no longer uniform: it contains four enclosed vectors and one simple scalar, so its depth is negative.
The *magnitude* of the depth has not changed, since it reports the highest level of nesting.

In this context, the word "uniform" only means that the array contains items of the same *depth*.

 - `vec2` is not uniform: it contains vectors (of depth 1) mixed with a scalar (of depth 0); and
 - `vec1` is uniform: all its items are vectors (of depth 1), even though they do not have the same shape, the same type, and certainly not the same content.
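Returning to the `Process` example, *depth* gives us the extra information that the length test lacked. The following is a minimal sketch, not part of the original function: `IsValid` is a hypothetical helper name, and the check assumes that a valid argument is a 4-item vector whose depth has magnitude 2.

```APL
⍝ Hypothetical helper: accept only a 4-item argument
⍝ whose depth has magnitude 2 (a vector containing vectors).
IsValid ← {(4=≢⍵)∧2=|≡⍵}

IsValid 'York' 186800 'GB' 540678   ⍝ 1: four items, nested (depth ¯2)
IsValid 'York'                      ⍝ 0: four items, but simple (depth 1)
IsValid 'York' 186800 'GB'          ⍝ 0: nested, but only three items
```

Taking the magnitude with `|` means the test does not care whether the argument is uniform. It is still only a rough check: it does not verify that each item has the expected type.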

## Each

### Definition and Examples

To avoid the necessity of processing the items of an array one after the other in an explicitly programmed loop, one can use a monadic operator called *each*.
It is represented by the diaeresis symbol `¨`, which is typed with <kbd>APL</kbd>+<kbd>Shift</kbd>+<kbd>1</kbd>.

As its name implies, *each* applies the function on its left (its *operand*) to each of the items of the array on its right (if the function is monadic), or to each pair of corresponding items of the arrays on its left and right (if the function is dyadic).

Let us try it with some small nested vectors and a monadic function:

In [40]:
vec3 ← (5 2) (7 10 23) (52 41) (38 5 17 22)
vec4 ← (15 12) 71023 (2 2⍴⍳4) (74 85 96)
vec5 ← (7 5 1) (19 14 13) (33 44 55)

Now, we can ask for the shape of `vec3`:

In [41]:
⍴vec3

Using `¨`, we can ask for the shape of *each of the items* of `vec3`:

In [42]:
⍴¨vec3

We can do the same with the second vector:

In [43]:
⍴¨vec4

Beware! One item of `vec4` is a scalar, so its shape is empty, as shown above.
If `]box` were off, this could look odd at first sight:

In [44]:
]box off

In [45]:
⍴¨vec4

In [46]:
]box on

If the function specified as the operand to *each* is dyadic, the derived function is also dyadic.
As usual, if one of the arguments is a scalar, the scalar is automatically repeated to match the shape of the other argument.
For example, take the following vector with the names of some months:

In [47]:
monVec ← 'January' 'February' 'March' 'April' 'May' 'June'

To take the first 3 letters of *each* vector in that vector of vectors, we would write:

In [48]:
3↑¨monVec

As we have just shown, there is no need to repeat the `3` to have the same shape as `monVec`.

Naturally, the operand to *each* can also be a *user-defined function*, provided that it can be applied to all of the items of the argument array(s):

In [49]:
Average ← {(+/⍵)÷≢⍵}
Average¨vec3

<!-- begin remark -->
***Remark***:

 > In fact, *each* is a bit more than a "hidden" loop.
 >
 > Please, remember that all items of an array are scalars - either simple scalars or enclosed arrays.
 > So, in an expression like `⍴¨vec5`, shouldn't we expect the result to be just a list of three empty vectors, since the shape of a scalar is an empty vector?
 >
 > No, the *each* operator is smarter than that.
 > For each item of the argument array, the item is first *disclosed* (the "bag" is opened), the function is applied to the disclosed item, and the result is *enclosed* again to form a scalar (i.e., put into a new "bag").
 > Finally, all the new bags (scalars) are arranged in exactly the same structure (rank and shape) as the original argument array to form the final result.
<!-- end -->

So,

In [50]:
⍴¨vec5

is in fact equivalent to

In [51]:
(⊂⍴⊃vec5[1]), (⊂⍴⊃vec5[2]), (⊂⍴⊃vec5[3])

In [52]:
(⍴¨vec5)≡(⊂⍴⊃vec5[1]), (⊂⍴⊃vec5[2]), (⊂⍴⊃vec5[3])

If the operand to *each* is a dyadic function, the corresponding items of the left and right arguments are both disclosed before applying the function.

We have seen that the operand to *each* may be a primitive function or a user-defined function.
It may also be a *derived function* returned by another operator.
For example, in the following expressions, the operand to *each* is not `/`, but the derived function `+/`.

In this example, we sum the numbers inside each item of the vector:

In [53]:
+/¨vec3

In this next one, it still works, even though one item is a matrix:

In [54]:
+/¨vec4

Beware: in some cases, the same derived function can be applied with or without the help of *each*, but the result will not be the same at all:

In [55]:
]display vec5

Without `¨`, `+/` sums the three sub-vectors together:

In [56]:
+/vec5

With `¨`, `+/¨` will compute the sum of *each* of the sub-vectors:

In [57]:
+/¨vec5

### The Use of Each

*Each* is a "loop cruncher".
Instead of programming loops, in APL you can apply any function to each of the items of an array, each of which may contain a complex set of data.

This operator is also useful combined with *match* when a simple equal sign would have caused an error.
For example, to compare two lists of names:

In [58]:
'John' 'Julius' 'Jim' 'Jean' ≡¨ 'John' 'Oops' 'Jim' 'Jeff'

When used inappropriately, the *each* operator can sometimes use a large amount of memory for its intermediate results, so you may need to use it with some care.

Suppose that we have a huge list `customerTover`, of turnover amounts, one item per customer (we have more than 5,000 of them!).
Each item contains a matrix having a varying number of rows (products) and 52 columns (weeks).
Our task is to calculate the total average turnover per week per customer.
No problem, that's just `(+/¨+⌿¨customerTover)÷52`.

However, if `customerTover` is very large, and we do not have much workspace left, the above expression may easily cause a `WS FULL` error.

The reason is that the intermediate expression `+⌿¨customerTover` produces a list of 52 amounts per customer, and that may require more workspace than we have room for.

Instead, we can put the entire expression into a function.
As is often the case in APL (and in programming, in general), the hardest part of writing a function is finding a good name for it.
Fortunately, we can get by without a name if we use an anonymous dfn, with `{(+/+⌿⍵)÷52}¨customerTover`.

Because we have "isolated" the entire logical process in the function and used *each* to loop through the items one by one, we will at most have only one customer's data "active" at any time, and each intermediate result (a 52-item vector) will be thrown away before recalculating that for the next customer.
The result of each function call is just one number, so it is much less likely that we will run into `WS FULL` problems.
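On a small scale we can check that the two formulations give the same answer. The data below is invented purely for illustration (two customers instead of thousands), so it says nothing about memory use; it only confirms that the results agree.

```APL
⍝ Toy data (an assumption for illustration): two customers,
⍝ with 2 and 3 products respectively, over 52 weeks.
customerTover ← (2 52⍴1) (3 52⍴2)

(+/¨+⌿¨customerTover)÷52        ⍝ 2 6
{(+/+⌿⍵)÷52}¨customerTover      ⍝ 2 6
```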

### Three Compressions!

In the following we will show three expressions which look similar, but their results are very different.
Let us first recall that `vec5` consists of three vectors, each containing three items:

In [59]:
vec5

What is the result of a *compression*?

In [60]:
1 0 1/vec5

Above, the vector `1 0 1` applies to the three items of `vec5`, compressing out the middle one.

In [61]:
]display 1 0 1/vec5

As mentioned, the compression applies to the items of `vec5`, as it would to any vector.
So, the second item has been removed.

If we use `1 0 1/¨vec5`, do you think the result is the same?
Are you sure?
It is not displayed the same way:

In [62]:
1 0 1/¨vec5

Things are different here: each item of `1 0 1` is paired with the corresponding sub-vector, like this:

 - `1/7 5 1` gives `7 5 1`;
 - `0/19 14 13` gives `⍬`; and
 - `1/33 44 55` gives `33 44 55`.

We can see this more clearly thanks to `]display`:

In [63]:
]display 1 0 1/¨vec5

There is a third way of using *compress*.
If we *enclose* the left argument, the entire mask `1 0 1` is applied to each sub-vector.
The second item of each sub-vector has been removed:

In [64]:
]display (⊂1 0 1)/¨vec5

## Processing Nested Arrays

We have already seen a number of operations involving nested arrays; we shall explore some more in this section.
Because nested arrays generally tend to have a rather simple - or at least uniform - structure, we can illustrate the operations using our little vectors.

### Scalar Dyadic Functions

You can refer to [this section](Some-Primitive-Functions.ipynb#Scalar-vs-Non-scalar-Functions) concerning the application of scalar dyadic functions to nested arrays.

However, let us here explore again how *each* applies to scalar dyadic functions:

In [65]:
vec5

In [66]:
vec5 + 100 20 1

100, 20, and 1 are added to the three sub-vectors, respectively.

Using *each*, the result is still the same:

In [67]:
vec5 +¨ 100 20 1

If we *enclose* the right argument, then `100 20 1` becomes a scalar, and gets added to each of the three sub-vectors:

In [68]:
vec5 +¨ ⊂100 20 1

If we drop the *each* operator, the result is the same because the scalar on the right is extended to match the shape of the left vector:

In [69]:
vec5 + ⊂100 20 1

In fact, *each* is a superfluous operator when used with scalar dyadic functions, because scalar dyadic functions are *pervasive*, as seen in [a previous section](Some-Primitive-Functions.ipynb#Scalar-vs-Non-scalar-Functions).

### Juxtaposition vs Catenation

When you *catenate* a number of arrays, for example `v ← a,b,c`, you create a new array with the **contents** of `a`, `b`, and `c` catenated together to make a single new array, as we have seen many times before.

Let us use a small vector and see how it works:

In [70]:
small ← 3 4 5

In [71]:
1 2,small,6 7

As we can see, the result is a *simple* vector.

What happens here is of course that the first 3-item vector `small` and the 2-item vector `6 7` are combined into one 5-item vector.
Then, this 5-item vector is combined with the 2-item vector `1 2` to form the resulting 7-item vector.
Both the final and the interim results are *simple* vectors.

We can now explain what happens when you *juxtapose* two or more arrays (*strand notation*), for example `v ← a b c d e`: each array is enclosed, and the resulting scalars are catenated together.

Such an expression produces a vector made of as many items as we have arrays on the right.
In the example that follows, the result is a *nested* vector:

In [72]:
1 2 small 6 7

This is what we call *vector notation* or *strand notation*.
In this case, we juxtaposed five arrays, so we created a nested array of length five.

What happens here is that each of the five arrays is first enclosed, and then the resulting five scalars are catenated together to produce the 5-item vector.
Please remember that enclosing a simple scalar does not change it, so you can only see the difference for the array `small`:

In [73]:
(1 2) small 6 7

Here, we juxtaposed four arrays, two of which are vectors.
It is, again, an example of *strand notation*.

In other words, juxtaposition works on arrays seen as building blocks, while catenation works on the contents of the arrays.

It may help you to know that there is a strict relationship between catenation and *strand notation*:
`a b c` is the same as `(⊂a),(⊂b),(⊂c)`.

Here is an example:

In [74]:
a ← ⍬
b ← 'apl'
c ← 42

In [75]:
a b c

In [76]:
(⊂a),(⊂b),(⊂c)

The two results look the same; we can be sure they *are* the same by using `≡`:

In [77]:
a b c≡(⊂a),(⊂b),(⊂c)

Now, we will turn our attention to two other expressions that give the same result,

In [78]:
(1 2) small,6 7

and

In [79]:
(1 2) small 6 7

These two expressions give the same result, but for a different reason than the one explained above.
In fact, `small` is **not** catenated to the vector `6 7` as in the first example above.
To read this expression correctly, we must recall that the comma *is* an APL function:

 - its right argument is the vector `6 7`, of course; and
 - its left argument is whatever is on its left, up to the next function.
 As there is no such function (parentheses are not functions), the left argument is the result of the entire expression to the left of the comma, i.e., the 2-item vector `(1 2) small`.

So, the result is that the 2-item vector `(1 2) small` is combined with the 2-item vector `6 7` to form the resulting 4-item vector.

**Remember this**: when interpreting an expression, you must never "break" a sequence of juxtaposed arrays (a *strand*), even if it is a nested vector.

So, in the previous example, the left argument to *catenate* is this whole array:

In [80]:
(1 2) small

When *catenate* is executed, the two items of this argument are catenated to the two items `6 7` of the right argument, making the same 4-item nested vector as in the previous example.

Can you predict the result of `(1 2),small 6 7`?

### Characters and Numbers

We have a character matrix `cm` and a numeric matrix `nm`:

In [81]:
⎕← cm ← 3 7⍴'FrancisCarmen Luciano'

In [82]:
⎕RL ← 73
⎕← nm ← (?3 4⍴200000)÷100

We would like to have them displayed side by side.

#### Solution 1

The first idea is to just type `cm nm`:

In [83]:
cm nm

The format of the result is not ideal; some values have two decimal digits, and some have only one or none.
But there is a much more important problem.
Imagine that we would like to draw a line on the top of the report.
We can catenate a single dash along the first dimension:

In [84]:
'-'⍪cm nm

This is not what we expected: the dash has been placed on the left, not on the top!
The reason is that the expression `cm nm` does not produce a matrix, but a 2-item nested vector.
And when one catenates a scalar to a vector, it is inserted before its first item or after the last one, to produce a longer vector.
This cannot produce a matrix, unless *laminate* is used, but we shall not try that now.

#### Solution 2

Well, if juxtaposition doesn't achieve what we want, why shouldn't we catenate our two matrices?

In [85]:
cm,nm

This is almost the same presentation, but not exactly; this is a matrix!

Now, let us try to draw the line:

In [86]:
'-'⍪cm,nm

Horrible!
What happened?

When we catenated `cm` (shape `3 7`) with `nm` (shape `3 4`), we produced a 3 by 11 matrix.
So, when we further catenated a dash on top of it, the dash was repeated 11 times to fit the last dimension of the matrix.
This is why we obtained 7 dashes on top of the 7 text columns, and 4 dashes, one on top of each of the 4 numeric columns.
This is still not what we want!

#### Solution 3

The final solution will be the following: convert the numbers into text, using the *format* function, and then catenate one character matrix to another character matrix:

In [87]:
'-'⍪cm,9 2⍕nm

Now, the line is exactly where we want it and the numbers are nicely formatted.

**Exercise 1**: try to deduce the results of the following 3 expressions (depth, rank, shape), and then verify your solutions on the computer:

```APL
(⊂cm) (⊂nm)
(⊂cm),(⊂nm)
cm,⊂nm
```

### Some More Operations

Let us use `vec5` once more.

#### Reduction

In [91]:
+/vec5

Notice the box around the final result!

The three enclosed arrays (scalars) have been added together, and the result is therefore an enclosed array (a scalar).
You can tell this from the output, because there is a box around the result.

We know that the reduction of a vector (rank 1) produces a scalar (rank 0), and this rule still applies here.

To obtain the _contents_ of the (enclosed) vector, we must disclose the result:

In [93]:
⊃+/vec5

The same thing can be observed if we try to collect all the values contained in `vec5` into a single vector, by catenating them together:

In [94]:
,/vec5

It worked, but here again we might want to disclose the result:

In [95]:
⊃,/vec5

#### Index Of and Membership

The function *index of* (dyadic `⍳`) may be used to search for (find the position of) items in a nested vector:

In [96]:
vec5 ⍳ (19 14 13)(1 5 7)

This is correct: the first vector appears in `vec5` as `vec5[2]`, and the second vector is not present.

But beware, there is a booby trap:

In [97]:
vec5 ⍳ (19 14 13)

`(19 14 13)` is not a nested array.
`vec5` is searched for each of these three numbers individually, and they are not found.

To get the expected result, we need to enclose the right argument to *index of*:

In [98]:
vec5 ⍳ ⊂19 14 13

It is also important to be aware of this when using *membership*:

In [99]:
(3 4 5)(7 5 1) ∊ vec5

In [100]:
(7 5 1) ∊ vec5

In [101]:
(⊂7 5 1) ∊ vec5

#### Indexing

The rules we saw about indexing remain true: when one indexes a vector by an array, the result has the same shape as the array.
If the vector is nested, the result is generally nested too:

In [102]:
]display vec4

In [103]:
]display vec4[2 2⍴4 2 1 3]

We have also seen, in [a previous section](./Data-and-Variables.ipynb#Array-Indexing), that a nested array can be used as an index.
For example, to index items scattered throughout a matrix, the array that specifies the indices is composed of 2-item vectors (row and column indices):

In [105]:
⎕← tests ← 6 3⍴11 26 22 14 87 52 30 28 19 65 40 55 19 31 64 33 70 44

In [106]:
tests[(2 3)(5 1)(1 2)]

In [107]:
tests[2 2⍴(2 3)(5 1)(1 2)]

Let us try to obtain the same result with the *index* function, or *squad*:

In [108]:
(2 3)(5 1)(1 2) ⌷ tests

LENGTH ERROR
      (2 3)(5 1)(1 2)⌷tests
                     ∧


The above cannot work.
*Index* expects a 2-item vector: a list of rows and a list of columns.

In [109]:
(2 3)(5 1)(1 2) ⌷¨ tests

RANK ERROR
      (2 3)(5 1)(1 2)⌷¨tests
                     ∧


This second attempt also won't work: each item of the left argument cannot be associated with a corresponding item of `tests`, because they do not have the same shape.

In order to get this to work, we need to enclose `tests`:

In [110]:
(2 3)(5 1)(1 2) ⌷¨ ⊂tests

This last expression worked correctly. **Each** couple of indices is paired with `tests` as a whole because it has been enclosed, and therefore the scalar on the right is extended to match the 3-item vector on the left.
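As a quick check, we can compare this with the bracket indexing used earlier. The snippet recreates `tests` so that it can be run on its own.

```APL
tests ← 6 3⍴11 26 22 14 87 52 30 28 19 65 40 55 19 31 64 33 70 44

tests[(2 3)(5 1)(1 2)]                              ⍝ 52 19 26
(2 3)(5 1)(1 2) ⌷¨ ⊂tests                           ⍝ 52 19 26
tests[(2 3)(5 1)(1 2)] ≡ (2 3)(5 1)(1 2) ⌷¨ ⊂tests  ⍝ 1
```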

#### Always Keep In Mind the Following Rules

 - The items of a nested array are scalars and are therefore always processed as scalars.

For example:

In [112]:
(5 6)(4 2)×10 5

Here, `(5 6)` is multiplied by `10` and `(4 2)` is multiplied by `5`.

 - A single list of values placed between parentheses is not a nested array:

In [113]:
(45 77 80)

The parentheses do nothing here.

 - An expression is always evaluated from right to left, one function at a time. Note that strands can be easy to miss when determining what the left argument of a function is.

In the expression `2×a 3+b`, the left argument of the *plus* function is not `3` alone, but the vector `a 3`.
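A small sketch with concrete values makes the difference visible. The names `p` and `q` and their values are hypothetical, chosen only for this illustration so that the variables used elsewhere in the chapter are left untouched.

```APL
⍝ Hypothetical values, chosen only for this illustration.
p ← 10
q ← 1 2

2×p 3+q       ⍝ 22 10        evaluated as 2×((p 3)+q)
2×p (3+q)     ⍝ 20 (8 10)    parentheses break the strand, so the result is nested
```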

Before we go any further with nested arrays, we recommend that you try to solve some exercises.

### Intermission Exercises

**Exercise 2**:

You are given three numeric vectors:

In [1]:
a ← 1 2 3
b ← 4 5 6
c ← 7 8 9

Try to predict the results given by the following expressions in terms of depth, rank, and shape.
Then check your results using `]display`, or the appropriate primitives.

 1. `a b c × 1 2 3`
 1. `(10 20),a`
 1. `(10 20),a b`
 1. `a b 2 × c[2]`
 1. `10×a 20×b`

**Exercise 3**:

Same question for the following expressions:

 1. `+/a b c`
 1. `+/¨a b c`
 1. `1 0 1/¨a b c`
 1. `(a b c)⍳(4 5 6)`
 1. `1 10 3 ∊ a`
 1. `(⊂1 0 1)/¨a b c`
 1. `1 10 3 ∊ a b c`

**Exercise 4**:

Consider the following nested array:

In [4]:
⊢na ← 1 2 (2 2⍴3 4 5 6)7 8

What are the results of `+/na` and `,/na`?

## Split and Mix

We saw that in some cases we can choose to represent data either as a matrix or as a nested vector; remember `monMat` and `monVec`.

Two primitive monadic functions are provided to switch from one form to the other:

 - _Mix_ (`↑`) returns an array of _higher rank_ and _lower depth_ than that of its argument; and
 - _Split_ (`↓`) returns an array of _lower rank_ and _higher depth_ than that of its argument.

### Basic Use

Let us apply _mix_ to two small vectors:

In [10]:
vtex ← 'One' 'Two' 'Three'
vnum ← (6 2) 14 (7 5 3)
⎕← rtex ← ↑ vtex

Notice how we have converted a nested vector (of depth 2 and rank 1) into a simple matrix (of depth 1 and rank 2).

In [14]:
⎕← rnum ← ↑ vnum

In this example, we have converted a nested vector (of depth -2 and rank 1) into a simple matrix (of depth 1 and rank 2).

Of course the operation is possible only because the shorter items are padded with blanks (for text) or zeroes (for numbers), or more generally by the appropriate _fill item_ (this notion will be explained soon).

The last example above shows that when we say that the depth is reduced, we actually mean that the _magnitude_ of the depth is reduced.

And now, let us apply _split_ to the matrices we have just produced:

In [15]:
⎕← newtex ← ↓rtex

We converted a simple matrix (of depth 1 and rank 2) into a nested vector (of depth 2 and rank 1).

In [16]:
⎕← newnum ← ↓rnum

Note that the two new vectors (`newtex` and `newnum`) are not identical to the original ones (`vtex` and `vnum`) because, when they were converted into the matrices `rtex` and `rnum`, the shorter items were padded.
When one splits a matrix, the items of the result all have the same size.
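The padding can be made visible by comparing item lengths before and after the round trip. This short sketch redefines `vtex` so that it is self-contained:

```APL
vtex ← 'One' 'Two' 'Three'

≢¨vtex          ⍝ 3 3 5
≢¨↓↑vtex        ⍝ 5 5 5   the shorter names are now padded with blanks
vtex≡↓↑vtex     ⍝ 0
```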

#### Mix applied to heterogeneous data

The examples shown above represent very common uses of _mix_ and _split_.
However, it is of course also possible to apply the functions to heterogeneous data.

For example, we can mix text and numbers:

In [18]:
↑'Mixed' (11 43)

And we can also mix a simple vector with a nested one.
As expected, the result below is a 2 by 3 matrix:

In [19]:
↑ 'Yes' ('Oui' 'Da' 'Si')

### Axis Specification

#### Split

When we apply the function _split_ to an array, its rank will decrease, so we must specify which of its dimensions is to be suppressed.
If we don't specify it explicitly, the default is to suppress the last dimension.

Let us work on `chemistry`, a matrix we used earlier:

In [20]:
⎕← chemistry ← 3 5⍴'H2SO4CaCO3Fe2O3'

In this case, there are two possible uses of _split_: we can apply it either to the first dimension or to the second dimension.

If we specify the first axis, the matrix is split column-wise:

In [21]:
↓[1]chemistry

If we specify the second axis, the matrix is split row-wise:

In [24]:
↓[2]chemistry

If we omit the axis specification, _split_ defaults to the last axis:

In [25]:
↓chemistry

#### Mix

The use of _mix_ is a bit more complex because it adds a new dimension to an existing array.
So does the function _laminate_, and the two functions use the same convention to specify where to insert the new dimension.

If we apply the function _mix_ to a 3-item nested vector of vectors, in which the largest item is an enclosed 5-item vector, the result must be either a 5 by 3 matrix, or a 3 by 5 matrix (the default).

In the same way as for _laminate_, a new dimension is created.
This new dimension can be inserted before or after the existing dimension.
The programmer decides this by specifying an axis:

 - `[0.5]` inserts the new dimension **before** the existing one, resulting in a 5 by 3 matrix; or
 - `[1.5]` inserts the new dimension **after** the existing one, resulting in a 3 by 5 matrix.

In [1]:
↑[0.5]'One' 'Two' 'Three'

In [2]:
↑[1.5]'One' 'Two' 'Three'

The last example is the default behaviour, where the new dimension is inserted after the existing one:

In [4]:
↑'One' 'Two' 'Three'

Let us now work with a nested matrix:

In [19]:
⎕← friends ← 2 3⍴'John' 'Mike' 'Anna' 'Noah' 'Suzy' 'Paul'

The shape of this matrix is `2 3`, and its items are all of length `4`.
So, _mix_ can produce three different results, according to axis specifications as follows:

| With the axis | the new dimension is inserted | and the resulting shape is |
| -: | -: | -: |
| `[2.5]` | after `2 3` | <code>  2   3 4</code> |
| `[1.5]` | between `2` and `3` | <code>  2 4 3  </code> |
| `[0.5]` | before `2 3` | <code>4 2   3  </code> |

Each of these three cases is illustrated below.

In [20]:
↑[2.5]friends    ⍝ Default case, [2.5] was unnecessary.

In [21]:
⍴↑[2.5]friends

In [22]:
↑[1.5]friends

In [23]:
⍴↑[1.5]friends

In [24]:
↑[0.5]friends

In [25]:
⍴↑[0.5]friends

In the first example, the names are placed "horizontally" as rows in two sub-matrices.

In the second case, they are placed "vertically" in columns.

The third case is more difficult to read; the names are positioned perpendicularly to the matrices, with one letter in each.
You might like to imagine that the letters are arranged in a cube, and that you are viewing it from three different positions.

Notice that, naturally, there is a connection between using `↑[k]` and using _mix_ followed by _dyadic transpose_.
In fact, 