![indexing](./IMG/indexing.jpg)
<span>Photo by <a href="https://unsplash.com/@jbsinger1970?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Jonathan Singer</a> on <a href="https://unsplash.com/s/photos/bookshelf?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a></span>

# Indexing

> Elegance is not a dispensable luxury but a factor that decides between success and failure. --_Edsger Dijkstra_

There are several ways of indexing into arrays and vectors. Some might even say "too many". A good starting point is to read up on [indexing](https://help.dyalog.com/latest/#Language/Primitive%20Functions/Indexing.htm) in Dyalog's documentation.

Crucially, the elements of vectors and matrices are always scalars, but a scalar can be a boxed-up (enclosed) vector or matrix.
Indexing with `[]` or `⌷` returns the box, not its contents, although for a simple scalar the two are the same thing.

In [1]:
⎕IO ← 0
]box on -s=min -t=tree -f=on
]rows on
assert←{⍺←'assertion failure' ⋄ 0∊⍵:⍺ ⎕signal 8 ⋄ shy←0}

## Bracket indexing `[ ]`

Bracket indexing is similar to how C-like languages index into arrays, except that the indexing expression itself can be a vector or array:

In [6]:
⊢v ← 10?10 ⍝ Numbers 0-9 in random order
v[5]    
v[5 2]  
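To make the "vector or array" part concrete, here is a small sketch using a fixed vector `w` (made-up values, chosen so the output is reproducible, unlike the random `v` above). It suggests that the result takes on the shape of the index:

```apl
⎕IO ← 0                ⍝ same index origin as the rest of this page
w ← 10 20 30 40 50     ⍝ illustrative data, not from the examples above
w[1 3]                 ⍝ vector index → 20 40
w[2 2⍴0 1 2 3]         ⍝ matrix index → a 2×2 matrix: 10 20 / 30 40
⍴w[2 2⍴0 1 2 3]        ⍝ 2 2: the result has the shape of the index
```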

We can mutate the vector or array via a bracket index, too:

In [7]:
v[3] ← ¯1
v
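The same works for several positions at once. A minimal sketch, again with a made-up vector `w` so that the results are predictable:

```apl
⎕IO ← 0
w ← 10 20 30 40 50     ⍝ illustrative data
w[1 3] ← ¯1 ¯2         ⍝ one new value per index
w                      ⍝ 10 ¯1 30 ¯2 50
w[0 4] ← 0             ⍝ a scalar is extended to every selected position
w                      ⍝ 0 ¯1 30 ¯2 0
```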

Of course, this being APL, the idea extends to arrays of any shape, either by separating the axes with semicolons:

In [1]:
]DISPLAY m ← 3 3⍴9?9 ⍝ a 3×3 matrix of numbers 0-8 in random order
m[1;1]  ⍝ Row 1, col 1
m[1;]   ⍝ Row 1
m[;1]   ⍝ Col 1

or by supplying one or more enclosed vectors, each with a length equal to the rank of the array:

In [9]:
m[⊂1 1]            ⍝ Centre
m[(0 0)(1 1)(2 2)] ⍝ Three points along the main diagonal
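The two forms are not interchangeable when several indices are involved. A short sketch with a non-random matrix `k` (an illustrative name and value, not used elsewhere on this page) shows the difference: semicolon indexing takes every combination of the listed rows and columns, whereas enclosed coordinate vectors name individual cells.

```apl
⎕IO ← 0
k ← 3 3⍴⍳9          ⍝ rows are 0 1 2 / 3 4 5 / 6 7 8
k[0 1;0 1]          ⍝ rows 0-1 crossed with columns 0-1 → 2×2 matrix: 0 1 / 3 4
k[(0 0)(1 1)]       ⍝ just the two cells 0;0 and 1;1 → 0 4
```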

As mentioned above, bracket indexing references cells, not the values enclosed in the cells. For numeric or character scalars there is no difference, but the distinction becomes visible with a nested array:

In [10]:
⊢m ← 3 3⍴(1 2 3)(3 2 1)(4 5 6)(5 3 1)(5 6 8)(7 1 2)(4 3 9)(3 7 6)(4 5 1)

In [11]:
]DISPLAY m[1;1] ⍝ Note the returned enclosure.

## Squad-indexing, a.k.a. functional indexing `⌷`

Whilst bracket-style indexing feels immediately familiar for those of us coming from a different programming language tradition, it is somewhat frowned upon amongst APLers. One reason for this is that it binds differently; not strictly right-to-left.

There is an alternative native indexing method: [squad-indexing](https://aplwiki.com/wiki/Index_(function)). Squad, "squashed quad", is the glyph `⌷`. It can be seen mnemonically as two square brackets pushed together. Squad fixes some of the issues surrounding the bracket indexing method above (but introduces some new ones, too). The binding is now APL-sensible:

In [12]:
⊢m ← 3 3⍴(1 2 3)(3 2 1)(4 5 6)(5 3 1)(5 6 8)(7 1 2)(4 3 9)(3 7 6)(4 5 1)
1⌷m       ⍝ Row 1
1 1⌷m     ⍝ Cell 1 1
(⊂1 2)⌷m  ⍝ Rows 1 and 2
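One practical consequence of squad being an ordinary function is that it can be handed to operators, which bracket syntax cannot. As a rough sketch (with an illustrative matrix `k`), here it is combined with each, `¨`, to index once per coordinate pair:

```apl
⎕IO ← 0
k ← 3 3⍴⍳9                  ⍝ rows are 0 1 2 / 3 4 5 / 6 7 8
(0 0)(1 1)(2 2) ⌷¨ ⊂k       ⍝ one squad-index per coordinate pair → 0 4 8
```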

Picking cells from axes other than the first requires you to specify the axis explicitly with square brackets, which arguably looks a bit clumsy. Note that this isn't a bracket index, even though it looks like one. Pick a cell from axis 1, i.e. column 1 in our case:

In [13]:
1⌷[1]m

or we could avoid the bracketed axis specification by picking the row from the matrix's _transpose_, `⍉`:

In [14]:
1⌷⍉m

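Squad-index is an ordinary function, so on its own it returns a new array rather than mutating the original. It can still be used to update an array when wrapped in a selective assignment, described under "Assignable indexing expressions" below.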

## Boolean indexing: compress

But wait! There's more to APL indexing. In fact, much of APL's expressive power comes from its pervasive use of Boolean (bit) arrays, which are typically highly optimised. It's a concept you don't often see in non-array languages, although you may have met limited forms of it in bolt-on array libraries such as Python's [NumPy](https://numpy.org/). In other languages, similar functionality is usually achieved with a [filter](https://docs.python.org/3/library/functions.html#filter) function taking a predicate.

The core idea is actually quite simple: select cells from an array by using a boolean array as the indexing method, where a 1 means "yes, this one" and a 0 means "nope, not this one". We use [compress](https://aplwiki.com/wiki/Replicate) to do this in APL, one of the several things represented by a forward slash `/`.

In [15]:
select ← 0 0 1 0 1 1 0 1 1 0 ⍝ Select elements 2, 4, 5, 7 and 8
data   ← ⍳10
select/data
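In practice the mask is rarely typed in by hand; it is usually computed from the data with a comparison, which is the APL counterpart of the filter-with-predicate pattern. A minimal sketch:

```apl
⎕IO ← 0
data ← ⍳10
(data>5)/data         ⍝ keep elements greater than 5 → 6 7 8 9
(0=2|data)/data       ⍝ keep the even elements → 0 2 4 6 8
```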

Compress is a special case of the `replicate` function, where the left argument is a Boolean vector. However, we can view the left argument more generally as a specification of how many times we should pick each element. In the compression case, that's either 1 or 0. But in the more general case we need not constrain ourselves to binary -- we can pick _any_ number:

In [16]:
select ← 1 3 0 0 5 0 7 0 0 1
select/data

Replicate and compress work on higher-rank arrays too: `/` applies along the last axis, `⌿` along the first, and `/[axis]` along whichever axis you name:

In [17]:
m ← 3 3⍴9?9
]DISPLAY m
select ← 0 1 0
select⌿m ⍝ Replicate-first
select/m ⍝ Replicate
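Since the matrix above is random, here is the same idea as a sketch on a fixed matrix `k` (an illustrative name), including the bracketed-axis form and a non-Boolean left argument:

```apl
⎕IO ← 0
k ← 3 3⍴⍳9              ⍝ rows are 0 1 2 / 3 4 5 / 6 7 8
0 1 0⌿k                 ⍝ first axis: keeps row 1 → 1×3 matrix: 3 4 5
0 1 0/k                 ⍝ last axis: keeps column 1 → 3×1 matrix: 1 / 4 / 7
(0 1 0/[0]k)≡0 1 0⌿k    ⍝ 1: /[0] names the first axis explicitly
2 0 1/k                 ⍝ replicate columns → 0 0 2 / 3 3 5 / 6 6 8
```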

## Pick

Yet another way to index into arrays is to use [pick](https://aplwiki.com/wiki/Pick). Pick, eh... picks _elements_, not boxes, which often comes in handy. Used monadically, `⊃` picks the first element.

In [18]:
⊢m ← 3 3⍴(1 2 3)(3 2 1)(4 5 6)(5 3 1)(5 6 8)(7 1 2)(4 3 9)(3 7 6)(4 5 1)

(⊂1 1)⊃m ⍝ Element at 1;1 - note, no box
⊃m       ⍝ First element - note, no box
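The difference between the box and its contents can also be checked by looking at shapes. The sketch below uses `n`, a copy of the nested matrix under an illustrative name, and additionally shows that a longer left argument lets pick keep descending into the element it finds:

```apl
⎕IO ← 0
n ← 3 3⍴(1 2 3)(3 2 1)(4 5 6)(5 3 1)(5 6 8)(7 1 2)(4 3 9)(3 7 6)(4 5 1)
⍴n[1;1]          ⍝ empty: bracket indexing returns an enclosed scalar
⍴(⊂1 1)⊃n        ⍝ 3: pick returns the vector 5 6 8 itself
(1 1) 0⊃n        ⍝ 5: cell 1;1 first, then item 0 of the vector found there
```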

## Reach Indexing

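_Reach indexing_ is how a bracket index accesses elements inside nested arrays: each index is itself a nested path, giving first the position in the outer array and then the position within the element found there. Note that nested arrays carry with them performance penalties and are best avoided if at all possible.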

In [28]:
⊢G←2 3⍴('Adam' 1)('Bob' 2)('Carl' 3)('Danni' 4)('Eve' 5)('Frank' 6)
G[⊂(0 1)0] ⍝ First element of the vector nested at ⊂0 1 of G
G[((0 0)0)((1 2)1)] ⍝ Two reach indices at once: 'Adam' and 6
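One way to think about reach indexing is as one pick per path, with the results gathered into an array shaped like the index. The following sketch checks that reading against the examples above (`paths` is an illustrative name):

```apl
⎕IO ← 0
G ← 2 3⍴('Adam' 1)('Bob' 2)('Carl' 3)('Danni' 4)('Eve' 5)('Frank' 6)
((0 1) 0)⊃G                      ⍝ the same path given to pick → Bob
G[⊂(0 1) 0] ≡ ⊂((0 1) 0)⊃G       ⍝ 1: reach indexing returns the enclosed pick
paths ← ((0 0) 0)((1 2) 1)
G[paths] ≡ paths ⊃¨ ⊂G           ⍝ 1: one pick per path
```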

## Assignable indexing expressions

As we saw above, bracket indexing is _assignable_, meaning that we can mutate the array. It is not the only assignable indexing expression in APL. The full list of _selective assignment functions_ is available from the Dyalog [documentation](http://help.dyalog.com/18.0/index.htm#Language/Primitive%20Functions/Assignment%20Selective.htm#SelectiveAssignment). It's worth studying this manual page, as it unlocks quite a few crafty ways of getting data into arrays.

For example, we can change the diagonal of a matrix by assigning directly to a dyadic transpose, since `0 0⍉m` is the main diagonal of the matrix `m`:

In [19]:
]DISPLAY m ← 3 3⍴9?9
(0 0⍉m) ← ¯1 ¯1 ¯1 ⍝ 0 0⍉m is the main diagonal.
]DISPLAY m

Indeed, we can even assign via Boolean indexing expressions, which might not be immediately obvious:

In [20]:
select ← 0 0 1 0 1 1 0 1 1 0
data   ← ⍳10
(select/data) ← ¯1 ¯1 ¯1 ¯1 ¯1
data
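Combined with a computed mask, this gives a compact way to overwrite only the elements that satisfy a condition. A small sketch with made-up data `t`, replacing negative numbers with zero:

```apl
⎕IO ← 0
t ← 3 ¯1 4 ¯1 5       ⍝ illustrative data
((t<0)/t) ← 0         ⍝ overwrite every negative element with 0
t                     ⍝ 3 0 4 0 5
```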

Perhaps even less obvious is assigning to `take`:

In [21]:
⊢s ← 'This is a string'
(2↑s) ← '**'
s

...or even compress-each:

In [22]:
s←'This' 'is' (,'a') 'string' 'without' 'is.'
((s='i')/¨s)←'*'
s

## Sane indexing

Some APLers are unhappy with the squad-index semantics, and have proposed yet another mechanism, called _sane indexing_ or [select](https://aplwiki.com/wiki/Select). It can be defined as:

In [23]:
I←⌷⍨∘⊃⍨⍤0 99 ⍝ Sane indexing

For the purposes of this explanation, it matters less how that incantation hangs together (we'll return to how this works in the section on the _rank_ operator, `⍤`, later), but it does have a set of nice properties for the user.

Compare and contrast squad and sane indexing:

In [24]:
⊢m ← 3 3⍴(1 2 3)(3 2 1)(4 5 6)(5 3 1)(5 6 8)(7 1 2)(4 3 9)(3 7 6)(4 5 1)

1 2I m   ⍝ Sane: select leading axis cells 1 and 2, or m[1 2;]
1 2⌷ m   ⍝ Squad: select m[⊂1 2]

(⊂1 2)I m  ⍝ Sane: select m[⊂1 2]
(⊂1 2)⌷ m  ⍝ Squad: select m[1 2;]

So you can think of sane indexing as squad, but closer to the behaviour of the bracket indexing expression:

In [25]:
(0 0)(1 2)(2 2)I m ⍝ Multiple cells by index, like m[(0 0)(1 2)(2 2)]
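The correspondences claimed in the comments can be checked mechanically with match, `≡`. The sketch below repeats the definition of `I` from above and uses a simple, non-nested matrix `k` (an illustrative name) to keep the results easy to read:

```apl
⎕IO ← 0
I ← ⌷⍨∘⊃⍨⍤0 99            ⍝ definition copied from above
k ← 3 3⍴⍳9                ⍝ rows are 0 1 2 / 3 4 5 / 6 7 8
(1 2 I k) ≡ k[1 2;]       ⍝ 1: simple indices select major cells
((⊂1 2) I k) ≡ k[1;2]     ⍝ 1: an enclosed index addresses a single cell
((0 0)(1 2)(2 2) I k) ≡ k[(0 0)(1 2)(2 2)]   ⍝ 1: several cells at once
```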