![indexing](./IMG/indexing.jpg)
<span>Photo by <a href="https://unsplash.com/@jbsinger1970?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Jonathan Singer</a> on <a href="https://unsplash.com/s/photos/bookshelf?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a></span>

# Indexing

> Elegance is not a dispensable luxury but a factor that decides between success and failure. --_Edsger Dijkstra_

There are several ways of indexing into arrays and vectors. Some might even say "too many". A good starting point is to read up on [indexing](https://help.dyalog.com/latest/#Language/Primitive%20Functions/Indexing.htm) in Dyalog's documentation, and Richard Park's [webinar](https://dyalog.tv/Webinar/?v=AgYDvSF2FfU) on the topic is extremely helpful, too.

But as before, we begin by setting our _index origin_, which is especially important as we're about to discuss indexing. Whilst we're at it, let's also ensure we have output boxing turned on.

In [1]:
⎕IO ← 0

In [1]:
]box on

Anyway -- back to indexing. Crucially, elements of vectors and matrices are always scalars, but a scalar can be an enclosed vector or matrix. Indexing with `[]` or `⌷` returns the _box_, not the element _in_ the box. However, if the element is a simple scalar, it's the same thing.

Let's look at _bracket indexing_ first.

## Bracket indexing `[ ]`

[_Bracket indexing_](http://help.dyalog.com/18.0/index.htm#Language/Primitive%20Functions/Indexing.htm) is similar to how C-like languages index into arrays:

In [7]:
⎕ ← v ← 9 2 6 3 5 8 7 4 0 1
v[5]    ⍝ Grab the cell at index 5

However, unlike C (and its ilk), the indexing expression can be a vector, or even a higher-rank array:

In [8]:
v[5 2]  ⍝ Grab the cells at indices 5 and 2
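The higher-rank case isn't shown above, so here is a minimal sketch of it. It assumes `⎕IO←0` as elsewhere in this chapter, and uses a fresh copy of the example vector under the name `w` (our own name, chosen so that `v` is left alone). The result takes its shape from the index array:

```apl
⎕IO ← 0                    ⍝ same index origin as the rest of the chapter
w ← 9 2 6 3 5 8 7 4 0 1    ⍝ copy of the example vector; w is our own name
w[2 2⍴0 1 2 3]             ⍝ index with a 2×2 matrix: result is 2 2⍴9 2 6 3
⍴w[2 2⍴0 1 2 3]            ⍝ 2 2
```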

We can mutate the vector or array via a bracket index, too:

In [7]:
v[3] ← ¯1
v

Of course, this being APL, this idea extends to any shape of array, either by separating the axes with semicolons:

In [9]:
]DISPLAY m ← 3 3⍴4 1 6 5 2 9 7 8 3 ⍝ a 3×3 matrix
m[1;1]  ⍝ Row 1, col 1
m[1;]   ⍝ Row 1
m[;1]   ⍝ Col 1

or by supplying one or more _enclosed vectors_, each with a length equal to the _rank of the array_:

In [50]:
m[⊂1 1] ⍝ Centre

In [23]:
m[(0 0)(1 1)(2 2)] ⍝ Three points along the main diagonal

As indicated above, bracket indexing references _cells_, not the _values_ enclosed in the cells. For numeric or character scalars, there is no difference, but the difference is clear for a nested array:

In [51]:
⎕ ← m ← 3 3⍴(1 2 3)(3 2 1)(4 5 6)(5 3 1)(5 6 8)(7 1 2)(4 3 9)(3 7 6)(4 5 1)

In [11]:
]DISPLAY m[1;1] ⍝ Note the returned enclosure.

## Functional indexing `⌷`

Whilst bracket-style indexing feels immediately familiar for those of us coming from a different programming language tradition, it is somewhat frowned upon amongst APLers. One reason for this is that it doesn't follow APL's normal strict right-to-left evaluation order, as the indexing expression must always be evaluated first. As a consequence, it stands out a bit: it's neither a monadic nor a dyadic function call. Another reason is that bracket indexing doesn't work in _tacit_ functions, a topic we'll cover in a [later chapter](./tacit.ipynb).

There is an alternative native indexing method: _functional_, or [_Squad indexing_](https://aplwiki.com/wiki/Index_(function)). [_Squad_](http://help.dyalog.com/18.0/index.htm#Language/Primitive%20Functions/Index.htm), "squashed quad", is the glyph `⌷`. It can be seen mnemonically as the two square indexing brackets pushed together. _Squad_ fixes some of the issues surrounding the bracket indexing method above (but introduces some new ones, too). As _Squad_ is a normal dyadic function, it behaves just like any other of APL's dyadic functions:

In [25]:
⎕ ← m ← 3 3⍴(1 2 3)(3 2 1)(4 5 6)(5 3 1)(5 6 8)(7 1 2)(4 3 9)(3 7 6)(4 5 1)

In [26]:
1⌷m       ⍝ Row 1

In [27]:
1 1⌷m     ⍝ Cell 1 1

In [28]:
(⊂1 2)⌷m  ⍝ Rows 1 and 2

However, selecting cells from axes other than the first requires you to specify the axes explicitly with square brackets, which arguably looks a bit clumsy. Note that this isn't a bracket index, even though it looks like one. For example, here's how we select cell 2 from axis 1 (i.e. the third column):

In [53]:
2⌷[1]m

or we could avoid the bracketed axis specification by picking the row from the matrix's _Transpose_, `⍉`:

In [54]:
2⌷⍉m
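As a quick sanity check, the two _Squad_ spellings above can be compared against the bracket form `m[;2]` using _Match_ (`≡`). This is only a sketch: it assumes `⎕IO←0` and the nested matrix `m` defined earlier in this section, repeated here so the snippet stands alone:

```apl
⎕IO ← 0
m ← 3 3⍴(1 2 3)(3 2 1)(4 5 6)(5 3 1)(5 6 8)(7 1 2)(4 3 9)(3 7 6)(4 5 1)
(2⌷[1]m) ≡ m[;2]   ⍝ 1: Squad with an explicit axis
(2⌷⍉m) ≡ m[;2]     ⍝ 1: row 2 of the transpose is column 2 of m
```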

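_Squad_ has no direct counterpart to the `v[3] ← ¯1` form of assignment. To mutate an array through it, the whole indexing expression has to appear in parentheses on the left of the assignment, as a _selective assignment_ (see _Assignable indexing expressions_ below).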

Another issue with _Squad_ is that it flips the conventions established by the bracket indexing method. Let's return to a couple of our examples from the bracket indexing section, and compare those with how you'd achieve the same thing with _Squad_:

In [29]:
n ← 3 3⍴4 1 6 5 2 9 7 8 3

In [30]:
n[⊂1 1] ⍝ Centre

In [31]:
n[(0 0)(1 1)(2 2)] ⍝ Three points along the main diagonal

_Squad_'s indexing expression, unlike that of bracket indexing, specifies the coordinate for each axis in turn:

In [14]:
1 1⌷n ⍝ Centre

If we enclose the indexing expression we pick major cells, which arguably "feels" odd compared with how bracket indexing behaves:

In [52]:
(⊂1 1)⌷n ⍝ Repeat row 1

So, how do we choose the three diagonal cells with _Squad_? With great difficulty, as it turns out. For this we need _Sane indexing_, up next.

## Sane indexing

Some APLers are unhappy with _Squad_'s semantics, and have proposed yet another mechanism, called _Sane indexing_ or [_Select_](https://aplwiki.com/wiki/Select). It's not yet built into Dyalog, but it can be defined as:

In [35]:
I←⌷⍨∘⊃⍨⍤0 99 ⍝ Sane indexing

For the purposes of this explanation, it matters less how that incantation hangs together (we'll return to how this works in the section on the [_Rank_](./rank.ipynb) operator, `⍤`, later), but it does have a set of nice properties for the user.
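For the curious, one possible reading of the definition is "for each scalar of the left argument, disclose it and use it as a _Squad_ index into the whole right argument". The sketch below spells that reading out as a dfn named `J` (a hypothetical name of our own, not part of any library) and checks that it agrees with `I` on a few left arguments. It assumes `⎕IO←0`:

```apl
⎕IO ← 0
I ← ⌷⍨∘⊃⍨⍤0 99        ⍝ the tacit definition from above
J ← {(⊃⍺)⌷⍵}⍤0 99     ⍝ our own dfn spelling: disclose each left scalar, then Squad
m ← 3 3⍴(1 2 3)(3 2 1)(4 5 6)(5 3 1)(5 6 8)(7 1 2)(4 3 9)(3 7 6)(4 5 1)
(1 2 I m) ≡ 1 2 J m                           ⍝ 1
((⊂1 2) I m) ≡ (⊂1 2) J m                     ⍝ 1
((0 0)(1 2)(2 2) I m) ≡ (0 0)(1 2)(2 2) J m   ⍝ 1
```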

Compare and contrast _Squad_ and _Sane indexing_:

In [36]:
⎕ ← m ← 3 3⍴(1 2 3)(3 2 1)(4 5 6)(5 3 1)(5 6 8)(7 1 2)(4 3 9)(3 7 6)(4 5 1)

Index with a vector:

In [37]:
1 2I m   ⍝ Sane:  select leading axis cells 1 and 2, or m[1 2;]
1 2⌷ m   ⍝ Squad: select m[⊂1 2]

Index with an _enclosed_ vector:

In [39]:
(⊂1 2)I m  ⍝ Sane:  select m[⊂1 2]
(⊂1 2)⌷ m  ⍝ Squad: select m[1 2;]

So you can think of _Sane indexing_ as _Squad_, but closer to the behaviour of the bracket indexing expression. We can finally select a bunch of cells by index:

In [40]:
(0 0)(1 2)(2 2)I m ⍝ Multiple cells by index, like m[(0 0)(1 2)(2 2)]

## Boolean indexing: compress

But wait! There's more to APL indexing. In fact, much of APL's expressive power comes from its pervasive use of Boolean (bit) arrays, and operations on them are typically highly optimised. It's a concept you don't often see in non-array languages, but you may have been exposed to limited forms of it from bolt-on array libraries such as Python's [NumPy](https://numpy.org/). Similar functionality can be achieved in other languages using a [filter](https://docs.python.org/3/library/functions.html#filter) function that takes a predicate.

The core idea is actually quite simple: select cells from an array by using a Boolean array as the indexing method, where a 1 means "yes, this one" and a 0 means "nope, not this one". We use [_Compress_](http://help.dyalog.com/18.0/index.htm#Language/Primitive%20Functions/Replicate.htm) to do this in APL, one of the several things represented by a forward slash `/`.

In [41]:
data   ← 0 1 2 3 4 5 6 7 8 9
select ← 0 0 1 0 1 1 0 1 1 0 ⍝ Select elements 2, 4, 5, 7 and 8
select/data
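In practice the Boolean mask is rarely typed in by hand; it is usually the result of a comparison, which plays the role of the predicate in a `filter`. A minimal sketch, assuming `⎕IO←0` and the same `data` vector as above:

```apl
⎕IO ← 0
data ← 0 1 2 3 4 5 6 7 8 9
0=2|data           ⍝ mask of the even numbers: 1 0 1 0 1 0 1 0 1 0
(0=2|data)/data    ⍝ 0 2 4 6 8
(data>6)/data      ⍝ 7 8 9
```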

_Compress_ is really a special case of the _Replicate_ function, where the left argument is a Boolean vector. However, we can view the left argument more generally as a specification of how many times we should pick each element. In the compression case, that's either 1 or 0. In the more general case we need not constrain ourselves to binary -- we can pick _any_ number:

In [16]:
select ← 1 3 0 0 5 0 7 0 0 1
select/data

_Replicate_ and _Compress_ apply along the given axis in higher-rank arrays, either via _Replicate first_ (`⌿`) or by specifying the axis explicitly with the [_bracket axis notation_](https://help.dyalog.com/latest/#Language/Primitive%20Operators/Axis%20with%20Dyadic%20Operand.htm), `/[axis]`:

In [42]:
m ← 3 3⍴9?9
]DISPLAY m

In [43]:
select ← 0 1 0

In [44]:
select⌿m ⍝ Replicate first

In [45]:
select/m ⍝ Replicate
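The explicit-axis form mentioned above can be sketched as follows. To keep the results repeatable, this uses a fixed matrix under the name `mat` (our own name) rather than the random `m`, and assumes `⎕IO←0`, so that axis 0 is the first axis and axis 1 the last:

```apl
⎕IO ← 0
mat ← 3 3⍴⍳9                    ⍝ fixed 3×3 matrix holding 0..8; mat is our own name
select ← 0 1 0
select/[0]mat                   ⍝ keeps row 1 as a 1×3 matrix: 3 4 5
(select/[0]mat) ≡ select⌿mat    ⍝ 1: axis 0 is what Replicate first uses
(select/[1]mat) ≡ select/mat    ⍝ 1: the last axis is the default for /
```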

## Pick `⊃`

Yet another way to index into arrays is to use [_Pick_](http://help.dyalog.com/18.0/index.htm#Language/Primitive%20Functions/Pick.htm). _Pick_ eh... picks _elements_, not boxes, which often comes in handy. Used monadically, `⊃` picks the first element.

In [49]:
⎕ ← m ← 3 3⍴(1 2 3)(3 2 1)(4 5 6)(5 3 1)(5 6 8)(7 1 2)(4 3 9)(3 7 6)(4 5 1)

Element at 1;1 - note, no box:

In [48]:
(⊂1 1)⊃m

First element - note, no box:

In [46]:
⊃m
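To connect _Pick_ with the bracket form, the sketch below checks that picking at `1 1` matches disclosing the box that `m[⊂1 1]` returns. It also shows that a longer left argument keeps descending: each further item indexes one level deeper. It assumes `⎕IO←0` and the nested `m` from above:

```apl
⎕IO ← 0
m ← 3 3⍴(1 2 3)(3 2 1)(4 5 6)(5 3 1)(5 6 8)(7 1 2)(4 3 9)(3 7 6)(4 5 1)
((⊂1 1)⊃m) ≡ ⊃m[⊂1 1]   ⍝ 1: Pick gives the contents of the box
(1 1)2⊃m                 ⍝ 8: element 2 of the vector 5 6 8 found at row 1, col 1
```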

## Reach indexing

_Reach indexing_ extends bracket indexing so that it can reach _into_ the elements of a nested array, rather than stopping at the box. Note that nested arrays carry with them performance penalties and are best avoided if at all possible.

In [21]:
⎕ ← G ← 2 3⍴('Adam' 1)('Bob' 2)('Carl' 3)('Danni' 4)('Eve' 5)('Frank' 6)
G[⊂(0 1)0] ⍝ Element 0 of the vector nested at row 0, col 1 of G
G[((0 0)0)((1 2)1)]
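One way to think about each reach index is as a left argument to _Pick_, with the result boxed up again so that it fits into an array shaped like the index. The sketch below checks that reading against the two expressions above. It assumes `⎕IO←0` and the same `G`:

```apl
⎕IO ← 0
G ← 2 3⍴('Adam' 1)('Bob' 2)('Carl' 3)('Danni' 4)('Eve' 5)('Frank' 6)
(0 1)0⊃G                         ⍝ Bob
G[⊂(0 1)0] ≡ ⊂(0 1)0⊃G           ⍝ 1: reach indexing returns the picked item, enclosed
G[((0 0)0)((1 2)1)] ≡ 'Adam' 6   ⍝ 1: two reach indices give a two-element vector
```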

## Assignable indexing expressions

As we saw above, bracket indexing is _assignable_, meaning that we can mutate the array. It is not the only assignable indexing expression in APL. The full list of _selective assignment functions_ is available from the Dyalog [documentation](http://help.dyalog.com/latest/index.htm#Language/Primitive%20Functions/Assignment%20Selective.htm). It's worth studying this manual page, as it unlocks quite a few crafty ways of getting data into arrays.

For example, we can change the diagonal of a matrix by assigning directly to a [dyadic _Transpose_](./dyadictrn.ipynb) by noting that `0 0⍉m` is the main diagonal of the matrix `m`:

In [19]:
]DISPLAY m ← 3 3⍴9?9
(0 0⍉m) ← ¯1 ¯1 ¯1 ⍝ 0 0⍉m is the main diagonal.
]DISPLAY m

Indeed, we can even assign via Boolean indexing expressions, which might not be immediately obvious:

In [20]:
data   ← 0 1 2 3 4 5 6 7 8 9
select ← 0 0 1 0 1 1 0 1 1 0

(select/data) ← ¯1 ¯1 ¯1 ¯1 ¯1
data

Perhaps even less obvious is assigning to _Take_:

In [1]:
⎕ ← s ← 'This is a string'
(2↑s) ← '**'
s

...or even _Compress each_:

In [22]:
s←'This' 'is' (,'a') 'string' 'without' 'is.'
((s='i')/¨s)←'*'
s