# Glyphiary

> Are you quite sure that all those bells and whistles, all those wonderful facilities of your so called powerful programming languages, belong to the solution set rather than the problem set?
\--_Edsger Dijkstra_

Learning what each glyph does is an unavoidable investment of time. There are some mnemonic cues, sometimes based on where the glyphs sit on the keyboard, sometimes because related functions have visually similar glyphs. Other times all bets are off: here's looking at you, `/`...

We're not going to cover them all. Learn them a few at a time as the need arises. Use the language bar in RIDE. But let's run through some of the immediately handy ones. 

In [1]:
⎕IO ← 0
{}⎕SE.UCMD'box on -s=max -t=tree -f=on'

Let's have a random matrix for our demonstration purposes. We've met _Reshape_, `⍴`, already, but we'll get dyadic `?` -- called _Deal_ -- for free. It gives us a random selection of numbers from a set, without replacement:

In [2]:
⊢mat ← 3 4⍴12?12 ⍝ Ladies and gentlemen: our matrix

## Tally, Depth, Match: `≢≡`

Tally, monadic `≢`, gives the number of major cells in an array, kind of like Python's `len()`:

In [3]:
≢7 5 1 2 9
≢'Hello world'
≢mat

Pretty straightforward. Monadic _equal-underbar_, `≡`, is _depth_ -- the max level of nesting:

In [4]:
≡1 2 3 4
≡(1 2)(3 4)
≡((1 2)(2 3))((4 5)(6 7))
≡(1 2)(3 4)3

The last case, giving ¯2, means that the max depth is 2, but that not all cells are at the same depth. Depth is not rank. Say it with me: depth is not rank, depth is not rank, depth is not rank...
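For readers who think in Python, the depth rule can be sketched roughly as follows. This is only a model: Python lists stand in for APL arrays, anything that isn't a list counts as a simple scalar, and the function name `depth` is made up for the illustration.

```python
def depth(a):
    """Rough model of monadic ≡: lists are arrays, everything else is a simple scalar."""
    if not isinstance(a, list):
        return 0
    sub = [depth(x) for x in a]
    # Magnitude: one more than the deepest item
    magnitude = 1 + max((abs(d) for d in sub), default=0)
    # Negative sign: the items are not all nested to the same depth
    uneven = any(d < 0 for d in sub) or len(set(sub)) > 1
    return -magnitude if uneven else magnitude

print(depth([1, 2, 3, 4]))           # → 1
print(depth([[1, 2], [3, 4]]))       # → 2
print(depth([[1, 2], [3, 4], 3]))    # → -2
```

Note that nothing here looks at the _shape_ of anything, which is the point of the mantra above.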

Turning to the dyadic forms, `≡` is _match_, and with a pleasing visual symmetry, `≢` is _not-match_. Somewhat simplified, we can think of match as "deep equals for arrays": same rank, same shape, same depth, and every element the same, in the same order:

In [5]:
1 2 3 4 ≡ 1 2 3 4 5
1 2 3 4 ≡ 1 2 3 4
1 2 3 4 ≡ 4 1 2 3
1 2 3 4 ≡ 1 4⍴1 2 3 4

## Transpose, Reverse and Rotate: `⌽⊖⍉`

Three glyphs are dedicated to transposing, reversing and rotating arrays. They all look like a circle and a line: `⌽⊖⍉`. For example:

In [6]:
transpose ← ⎕ ← ⍉mat
reverse ← ⎕ ← ⌽mat
revfirst ← ⎕ ← ⊖mat

The two reverse glyphs mirror the issue we've seen with reduce (`/`) vs reduce-first (`⌿`) -- if you can, use the -first versions (the leading axis versions), and if you want to apply them along other axes, use either [rank](./rank.ipynb) `⍤` or the `[axis]` notation:

In [7]:
⊢⊖⍤1⊢mat ⍝ Apply reverse-first to second axis using rank
⊢⊖[1]mat ⍝ Apply reverse-first to second axis bracket-axis

Both transpose and reverse can be applied dyadically, too, which presents us with a slight conundrum: dyadic transpose requires a deeper understanding of APL that we don't yet have -- we'll push that one to its own [chapter](./dyadictrn.ipynb) later on. 

Dyadic `⊖` is actually _rotate first_:

In [8]:
1 2 ¯1 0⊖mat

Here the left argument vector specifies the per-column magnitude and direction of the rotation. 
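If the direction of the rotation is hard to keep straight, a rough Python model may help. It assumes the matrix is held as a list of equal-length rows; the name `rotate_first` is invented for this sketch and is not an APL or library function.

```python
def rotate_first(amounts, rows):
    """Rough model of dyadic ⊖ on a matrix: rotate column j upwards by amounts[j]."""
    n = len(rows)
    return [
        [rows[(i + amounts[j]) % n][j] for j in range(len(rows[0]))]
        for i in range(n)
    ]

m = [[0, 1, 2, 3],
     [4, 5, 6, 7],
     [8, 9, 10, 11]]

for row in rotate_first([1, 2, -1, 0], m):
    print(row)
# [4, 9, 10, 3]
# [8, 1, 2, 7]
# [0, 5, 6, 11]
```

A positive amount moves elements towards the start of the column, a negative amount towards the end, and 0 leaves the column alone.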

## Mix, Split, Take and Drop: `↓↑`

Mix `↑` trades one level of nesting for rank. It's easiest to visualise as a means of turning a nested vector into a matrix (but it works for any rank):

In [9]:
⊢v ← 'Hello' 'world'
↑v

Split `↓`, unsurprisingly, goes the other way, reducing rank:

In [10]:
⊢m ← 3 3⍴9?9
↓m

Mix and Split, when combined with Transpose, make for a bit of a power-combo, `↓⍉↑`, occasionally dubbed _remix_, or _zip_:

In [11]:
⊢v ← (0 6 3)(2 5 1)(4 7 8)
↓⍉↑v

Whilst it's tempting to think of remix as the `zip()` found in for example [Python](https://docs.python.org/3/library/functions.html#zip), note that it most likely behaves differently to what you're used to:

In [12]:
↓⍉↑(1 2 3 4)(5 6)(7 8 9 10)

```python
Python 3.9.0 (default, Nov 15 2020, 06:25:35) 
[Clang 10.0.0 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> l = [[1,2,3,4],[5,6],[7,8,9,10]]
>>> list(zip(*l))
[(1, 5, 7), (2, 6, 8)]
```

APL abhors ragged arrays and will inject the "prototype" element for whatever the element type is to ensure that all cells are the same size -- Python has no concept of array as such, and so abandons play if an element can't be filled. Mixing a vector with cells of unequal numbers of elements in each cell will show us what happens:

In [13]:
↑(1 2 3 4)(5 6)(7 8 9 10)
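To get the padding behaviour on the Python side, the closest standard tool is `itertools.zip_longest` rather than `zip`. The sketch below is a rough model only: the names `mix` and `remix` are made up here, and the fill value is hard-coded to 0 as an assumption, whereas APL derives it from the prototype of the data (a space for characters, for instance).

```python
from itertools import zip_longest

def mix(vectors, fill=0):
    """Pad every vector on the right with `fill` up to the longest length."""
    width = max(len(v) for v in vectors)
    return [list(v) + [fill] * (width - len(v)) for v in vectors]

def remix(vectors, fill=0):
    """Rough model of ↓⍉↑: transpose, padding short vectors instead of truncating."""
    return [list(column) for column in zip_longest(*vectors, fillvalue=fill)]

ragged = [[1, 2, 3, 4], [5, 6], [7, 8, 9, 10]]
print(mix(ragged))    # → [[1, 2, 3, 4], [5, 6, 0, 0], [7, 8, 9, 10]]
print(remix(ragged))  # → [[1, 5, 7], [2, 6, 8], [3, 0, 9], [4, 0, 10]]
```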

Dyadically, Mix and Split become _Take_ and _Drop_. Take ... takes cells: 

In [14]:
1↑(1 2 3 4)(5 6)(7 8 9 10) ⍝ Take 1
2↑(1 2 3 4)(5 6)(7 8 9 10) ⍝ Take 2

Note carefully that Take returns _cells_, not elements, even if you take 1. Recalling the [indexing](./indexing.ipynb) chapter, Take 1 behaves like Squad-0 in this respect, not like Pick-0:

In [15]:
0⌷(1 2 3 4)(5 6)(7 8 9 10) ⍝ Squad-0 returns a cell
0⊃(1 2 3 4)(5 6)(7 8 9 10) ⍝ Pick-0 returns an element

We can also use negative numbers to take from the rear:

In [16]:
¯1↑(1 2 3 4)(5 6)(7 8 9 10) ⍝ Take 1 from the back

Drop does what we hopefully expect:

In [17]:
1↓(1 2 3 4)(5 6)(7 8 9 10) ⍝ Drop 1 from the front
¯1↓(1 2 3 4)(5 6)(7 8 9 10) ⍝ Drop 1 from the back

Take and Drop work on arrays of any rank:

In [18]:
mat
1↑mat ⍝ Take first cell
1↓mat ⍝ Drop first cell

## Interval and index-of: `⍳`

Iota, `⍳`, called [interval](https://aplwiki.com/wiki/Index_Generator) (or _index generator_) when used monadically and [index-of](https://aplwiki.com/wiki/Index_Of) when used dyadically, is one to figure out early. Note that there is another glyph that looks similar, iota-underbar, `⍸`, that does something entirely different, so don't confuse the two!

The monadic case generates an integer interval, starting from `⎕IO`:

In [19]:
⍳10

Thinking of this as a monadic function taking a shape vector, this generalises to more complex shapes:

In [20]:
⍳3 4

In other words, iota generates all possible _indices_ into an array with the shape of its argument.
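In Python terms, this is roughly what `itertools.product` over one `range` per axis gives you. The sketch below assumes index origin 0, as set at the top of this chapter; the name `iota` is made up for the illustration, and it returns a flat list where APL would return an array of the given shape.

```python
from itertools import product

def iota(shape):
    """All indices into an array of the given shape, index origin 0, in row-major order."""
    if isinstance(shape, int):
        return list(range(shape))
    return list(product(*(range(n) for n in shape)))

print(iota(10))          # → [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(iota((3, 4))[:5])  # → [(0, 0), (0, 1), (0, 2), (0, 3), (1, 0)]
```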

In the dyadic form, iota becomes [index of](https://aplwiki.com/wiki/Index_Of), another useful thing to know. Index-of tells us the index of the first occurrence of an element:

In [21]:
'Hello world'⍳'o'

The right argument can have any shape, but the left argument is usually restricted to a vector.

In [22]:
'Hello world'⍳'od'

A nifty feature is that if an element of the right argument isn't found, the returned index is `⎕IO+≢⍺` -- one more than the last valid index into the left argument. With our `⎕IO←0`, that is simply `≢⍺`. This can be used to provide a default match for items not found:

In [23]:
⊢staff ← 'Adam' 'Bob' 'Charlotte'
lookup ← staff,⊂'Not found'
lookup[staff⍳'Bob' 'David']
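The same trick can be modelled in Python. This is a rough sketch assuming index origin 0, and `index_of` is a name invented here; Python's own `list.index` raises `ValueError` for a missing item rather than returning a sentinel index, so the sketch has to test membership first.

```python
def index_of(haystack, needles):
    """Rough model of dyadic ⍳ (origin 0): first index of each needle, or len(haystack) if absent."""
    return [haystack.index(x) if x in haystack else len(haystack) for x in needles]

staff = ['Adam', 'Bob', 'Charlotte']
lookup = staff + ['Not found']   # one extra slot for the "not found" index

print(index_of(staff, ['Bob', 'David']))                       # → [1, 3]
print([lookup[i] for i in index_of(staff, ['Bob', 'David'])])  # → ['Bob', 'Not found']
```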

## Ravel, Catenate, Enlist, Member: `,⍪∊`

Ravel, monadic `,`, and Enlist, monadic `∊`, do related things: ravel creates a vector of the _items_ of its argument, leaving any nested items intact, and enlist creates a vector of all the simple _elements_ of its argument, flattening any nesting along the way. For non-nested arrays, there is no difference:

In [24]:
simple ← 3 4⍴3 0 5 1 7 9 8 6 2 10 11 4
⊢ravel ← ,simple
⊢enlist ← ∊simple

For a nested array, the difference is clearer:

In [25]:
⊢nested ← ↑((2 3)(4 5))((6 7)(8 9))
⊢ravel ← ,nested
⊢enlist ← ∊nested
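The difference is the familiar one between flattening one level and flattening all the way down. A rough Python model, with nested lists standing in for the 2×2 nested matrix above and with function names chosen just for this sketch:

```python
def ravel(matrix):
    """One level only: string the items of a list-of-rows matrix into a single list."""
    return [item for row in matrix for item in row]

def enlist(a):
    """Recursive: flatten every level of nesting down to simple scalars."""
    if not isinstance(a, list):
        return [a]
    return [leaf for item in a for leaf in enlist(item)]

nested = [[[2, 3], [4, 5]],
          [[6, 7], [8, 9]]]

print(ravel(nested))   # → [[2, 3], [4, 5], [6, 7], [8, 9]]
print(enlist(nested))  # → [2, 3, 4, 5, 6, 7, 8, 9]
```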

In their dyadic guises, `,` becomes [catenate](https://aplwiki.com/wiki/Catenate), and `∊` becomes [member](https://aplwiki.com/wiki/Membership). 

Catenate merges its left and right arguments:

In [26]:
1 2 3 4 , 5 6 'hello'
1 2 3 4 5 6 , 'hello'

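The distinction above is perhaps not obvious -- and without `]box on` they would look identical. In the first expression, `5 6 'hello'` is a three-element nested vector, so the result has seven elements, the last of which is the whole string. In the second, `'hello'` is a five-character vector, and catenate appends each of its characters separately, giving eleven elements.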

Note that catenate is trailing axis. There is a leading axis version, too, `⍪`, called _catenate first_. (The name _laminate_ is usually reserved for catenate with a fractional axis, which joins arrays along a new axis.)

We can catenate higher-rank arrays, too:

In [27]:
(3 3⍴⍳9),(3 3⍴⍳9) ⍝ Catenate-last (new cols)
(3 3⍴⍳9)⍪(3 3⍴⍳9) ⍝ Catenate-first (new rows)

Dyadic `∊` is _membership_, another handy glyph in your arsenal:

In [28]:
'l'∊'Hello world'

It's not unlike Python's `in`:

```python
>>> 'l' in 'Hello world'
True
```
at least at a superficial level. The APL version extends naturally to arrays on the left, testing each element in turn:

In [29]:
'lo w'∊'Hello world'

whereas Python would see that as _is substring_:

```python
>>> 'lo w' in 'Hello world'
True
```

You can of course get a similar substring behaviour in APL, too, but you need a different approach:

In [30]:
'lo'(⍸⍷)'Hello world' ⍝ Index of start of substring

but we need a bit more flesh on our APL bones before we're ready for that -- see the 'find' section later!

## Selfie, Commute, Constant: `⍨`

A firm favourite, the _selfie_, mirroring the confused look of the APL neophyte: `⍨`. A monadic operator, the selfie commutes the left and right arguments of its operand function. At first, this seems beyond pointless -- worse, in fact: it seems to offer nothing but added, deliberate obfuscation:

In [31]:
⊢v ← ⍳5
v↑⍨1     ⍝ Take-1 but commute arguments ⍨

As it turns out, it has its legitimate uses. Consider the consequences of APL's right-to-left evaluation order. If you have a dyadic function application with a complex expression to the _left_, you're forced to introduce parentheses to ensure that the left side is fully evaluated before it is passed to the function. The commute operator, by shifting the complex expression to the _right_ side, avoids this.

Compare the following two equivalent forms (disregarding for the moment what they mean):

In [32]:
{⍵⊂⍨1,2≠/⍵}
{(1,2≠/⍵)⊂⍵}

So you could say "big deal, _one_ glyph fewer to type", and you'd have a point. But the main advantage is that with the selfie, we can preserve the right-to-left evaluation order. By having parenthesised part of the expression, we have an unnatural evaluation order. With a few well-placed selfies we can read the expression as Ken intended. 

As with anything, there's a balance to be struck here. For the learner, selfies do make expressions harder, not easier, to read. The process of "flipping selfied expressions" occasionally helps when trying to deconstruct something someone else wrote.

If selfie is given an _array_ operand, it becomes _constant_: the derived function ignores its arguments and always returns the operand, kind of like `{1}` when the operand is `1`:

In [33]:
1⍨42
1⍨¨⍳14

If we give a function operand but no left argument, we get yet another behaviour: the right argument is used on both sides, so `=⍨Y` is the same as `Y=Y`:

In [34]:
=⍨1 2 3 4 5
1 2 3 4 5 = 1 2 3 4 5

If APL didn't already have a _tally_ function built in (`≢`) we could make one as the sum-reduction of selfie-equal, counting the number of elements in a vector, say:

In [35]:
tally ← +/=⍨

In [36]:
tally 1 2 3 4 5
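If the operator business feels slippery, here is a rough Python model of `⍨` with a function operand. The name `commute` and the use of functions from the `operator` module are mine, purely for illustration; real APL functions are pervasive over arrays, which this sketch ignores.

```python
import operator

def commute(f):
    """Rough model of f⍨ for a two-argument Python function f."""
    def derived(*args):
        if len(args) == 1:      # monadic use: f⍨ y  is  y f y
            (y,) = args
            return f(y, y)
        x, y = args             # dyadic use: x f⍨ y  is  y f x
        return f(y, x)
    return derived

minus_swapped = commute(operator.sub)
print(minus_swapped(2, 10))     # 10 - 2 → 8
print(minus_swapped(7))         # 7 - 7 → 0

def tally(vector):
    """Every element equals itself, so summing the comparisons counts the elements."""
    return sum(commute(operator.eq)(x) for x in vector)

print(tally([1, 2, 3, 4, 5]))   # → 5
```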

## Unique, Union, Intersection, Without: `∪∩~`

We have the usual set operations at our disposal. Starting with _unique_, monadic `∪`, it does exactly what it says on the tin:

In [37]:
∪1 1 2 2 3 3 4 4 5 5 6 6
∪'hello world'

In its dyadic version, `∪` becomes _union_:

In [38]:
1 1 2 3 4 ∪ 1 2 5 6

Note that the arguments aren't proper sets. The above says "take ALL elements in the left argument, and add any element from the right which isn't already present".

_Intersection_, dyadic `∩`, works similarly:

In [39]:
1 1 2 3 4 ∩ 1 2 5 6

For each element to the left, keep it if it's also in the right.

Dyadic `~` is _without_ -- set difference:

In [40]:
1 1 2 3 4 5 ~ 1 3 5

The monadic `~` is boolean _not_:

In [41]:
~1 0 1 1 0 0 1
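Because the arguments are ordered vectors rather than proper sets, Python's `set` type is not a faithful model of union, intersection and without: it would drop the duplicates and lose the order. List comprehensions come closer. The function names below are invented for this rough sketch.

```python
def union(left, right):
    """All of left, followed by the elements of right that are not already in left."""
    return left + [x for x in right if x not in left]

def intersection(left, right):
    """Each element of left that also occurs in right, duplicates and order kept."""
    return [x for x in left if x in right]

def without(left, right):
    """Each element of left that does not occur in right."""
    return [x for x in left if x not in right]

print(union([1, 1, 2, 3, 4], [1, 2, 5, 6]))         # → [1, 1, 2, 3, 4, 5, 6]
print(intersection([1, 1, 2, 3, 4], [1, 2, 5, 6]))  # → [1, 1, 2]
print(without([1, 1, 2, 3, 4, 5], [1, 3, 5]))       # → [2, 4]
```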

## Grade up/down: `⍋⍒`

To me, it's a Christmas tree and a carrot, but these twins are called _grade up_, `⍋`, and _grade down_, `⍒`. They are APL's very clever mechanisms for ordering arrays. To sort an array, we do:

In [42]:
⊢data ← 11 19 24 4 15 21 2 20 10 13 23 3 1 17 12 22 14 16 6 18 5 9 8 7 0
data[⍋data]

So what does the grade-up actually do? Let's have a look:

In [43]:
⍋data

Grading an array (up or down) produces a vector of _indices_, not values. Consider the first element in the grade array. It says: the smallest element is to be found at index 24. The second-smallest is at index 12. The third-smallest is at index 6, and so on.

At first blush, this seems like a roundabout way to sort something. First generate an indexing expression, then select elements according to this indexing expression. However, doing it this way -- separating the determining of the order from the reordering of elements -- has a number of advantages, chiefly that we can as easily apply the ordering to another array, not just the one that we generated the ordering from.

In any sort of data processing or analysis, this crops up all the time: give the customer names, ordered by contract date. Sort the keys based on the values. That sort of thing. You can also answer questions such as _where_ is the smallest value?

In [44]:
⊢minidx ← ⊃⍋data ⍝ Index of smallest value: first element of grade-up
data[minidx]
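The idea of separating the ordering from the reordering carries over to Python, where it is sometimes called "argsort". The sketch below is a rough model with index origin 0; `grade_up` is a made-up name, and the customer names and contract dates are invented purely for the example.

```python
def grade_up(values):
    """Indices that would sort `values` in ascending order (ties keep their original order)."""
    return sorted(range(len(values)), key=lambda i: values[i])

data = [11, 19, 24, 4, 15, 21, 2, 20, 10, 13, 23, 3, 1,
        17, 12, 22, 14, 16, 6, 18, 5, 9, 8, 7, 0]
order = grade_up(data)
print(order[:3])                      # → [24, 12, 6]
print([data[i] for i in order][:5])   # → [0, 1, 2, 3, 4]

# Apply an ordering derived from one array to another (hypothetical data)
customers = ['Cleo', 'Abe', 'Bea']
contract_dates = ['2021-03-01', '2019-07-15', '2020-01-30']
print([customers[i] for i in grade_up(contract_dates)])  # → ['Abe', 'Bea', 'Cleo']
```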