**Notes & questions:**
 
 - `0÷0` and `0*0` are mentioned. Would it make sense to briefly discuss why the results of these expressions are mathematically incorrect in the specialist's section about `0÷0`?
 
 - The section about reduction had a warning about using `/` with non-commutative functions, which I think is very pertinent. It just looks awkward in its original place, because it comes at the end of the subsection on reducing binary data (where we only used commutative functions) and before a couple of other subsections where we only use commutative functions. I moved the warning to a small sub-section right before the "Application" sub-section, [here](#Reduction-With-Non-Commutative-Functions).
 
(Re: above) this makes sense if we take into account there already exists a really small subsection that warns the user about reduction of nested arrays, in which the shape and type of the sub-arrays must be compatible for the reduction. This new subsection on the non-commutative reductions is even larger than this one, so I guess it makes sense to have both as sub-sections OR both as warnings, although I prefer it like this.
 
 - Right above [Application 1](#Application-1) the author claims reduction with non-commutative functions is rare in business applications. I find it easy to believe, but is it true? Is it worth checking? Should I leave the claim as-is? Remove it?
 
 - Check if we can use the term "major cells" in the [General Case](#General-Case) of the *Index Of* function.
 
 - Check if the [Simple, Not Nested](#Simple,-Not-Nested) subsec is correct, and the whole section on *Where* in general.
 
 - The Index Generator was refactored, because many of its subsections moved elsewhere. "[Application 4](#Application-4)" and "[Comparison of Membership and Index Of](#Comparison-of-Membership,-Index-Of-and-Where)" moved to the section on ⍸ and the latter got slightly renamed, and "[Idioms](#Idioms)" moved to the end of the section on catenate. These changes need to be checked.
 
---

**References to fix:**

 - update reference to the Appendix on all the scalar dyadic functions (search the text for "is given in Appendix");
 - fix reference to appendix 4 right above [this](#Reduction-of-Binary-Data) (search for "It is also possible to write your own operators");
 - update reference to the chapter in the beginning of [this](#Scalar-vs.-Non-scalar-Functions) section;
 - update reference to the chapter after the `Bin` code cells in [this](#Reduction-of-Binary-Data) section (search for 'from the chapter on "Data and Variables"');
 - update reference to section on "Reduction of Binary Data" (search for "With what we learned in the section") in the [application 1](#Application-1);
 - update reference to "operators" chapter (search for "We shall learn more about *Axis*") in the sub sub section about Processing Arrays, inside the section on axis;
 - update reference to the "Application 2" subsection (search for "The expression we wrote in the") right in the beginning of [Our First Program](#Our-First-Program);
 - in the same place as above, fix reference to chapter on "User Defined Functions" (search for "these will be covered in detail");
 - fix reference to "user Defined Functions" chapter (search for "we shall see many other possibilities in the") right before the [Concatenation](#Concatenation) section;
 - ref to "Nested Arrays" chapter under [Replication](#Replication) (search for "is new and will be discussed in full");
 - ref to "Application 4" under [Discovery](#Discovery) (search for "there is another function that we can use");
 - ref to Specialist's Section under [Index Generator](#Index-Generator) (search for "you will find more information in the *Specialist's Section* at the end of this chapter");
 - ref to Discovery right after [Application 4](#Application-4) (search for "occurrence of a value in a vector (cf. the "Discovery" subsection)");
 - ref to The Index Function in [Increasing The Dimension](#Increasing-The-Dimension) (search for "(cf. the subsection with the same name)");
 - ref to Basic Approach: Compression in [First Question](#First-Question) (search for "As mentioned previously in the subsection about *Compression*");
 - In [Ravel](#Ravel) we talk about subsection on "Array Indexing" (search for "the result returned is a vector (we already mentioned this in the subsection on "Array Indexing")");
 - ref to "Our First Program" in [Empty Vectors and Black Holes](#Empty-Vectors-and-Black-Holes) (search for "a function to calculate the average of a vector (see the "Our First Program" section)");
 - ...

---

# Some Primitive Functions

## Definitions

In APL, data is processed using what we call *functions*. It is important to distinguish between two types of functions:

 1. ***Primitive Functions***:
     - they are part of the APL language;
     - they are represented by symbols like `⍴`, `⍉` and `⌈`;
     - they cannot be overwritten or removed.


 2. ***User-Defined Functions***:
     - as their name implies, they are written by the user;
     - they are represented by names, for example `Average` or `Budget`;
     - they can be overwritten and removed.
   
APL has a very rich set of primitive functions. In this chapter, we will explore just a few of them; many others will follow in subsequent chapters.

In the introduction to this book, we mentioned that in traditional mathematics, some symbols can be used with a single argument or two arguments. For example:

| | | |
| :- | :- | :- |
| In the expression | $a = x - y$ | the minus sign means subtract. |
| Whereas in | $a = -y$ | the minus sign indicates the negation of $y$. |

The first form is called the "**_dyadic_**" use of the symbol.
The second form is called the "**_monadic_**" use of the symbol.

It is the same in APL, where most of the symbols (functions) have a *monadic* and a *dyadic* meaning. For example, here `⍴` obtains the shape of the `1 2 3 4` vector:

In [1]:
⍴ 1 2 3 4

Whereas here `⍴` reshapes the vector `1 2 3 4` into a matrix of shape `2 2`:

In [2]:
2 2 ⍴ 1 2 3 4

There is, however, a major difference. In traditional mathematics, the symbol representing a monadic function is sometimes placed before its argument (as the $-$ in $a = -y$), sometimes after it (as the $!$ in $a = y!$), sometimes on both sides (as the $|\cdot|$ in $a = |y|$), and some other conventions may be found.

In APL, the symbol representing a monadic function is **always** placed before its argument, as the `⍴` in `⍴Var`.

## Some Scalar Dyadic Functions

### Definition and Examples

***Scalar dyadic functions*** are primitive functions which have the following properties:

 - they are *dyadic* (require an argument on both sides);
 - they work item by item (scalar by scalar);
 - they can work on two arrays of the same shape, in which case the result also has the same shape;
 - they can work on one array of any shape, and a single value (a ***scalar*** or any one-item array), in which case the result has the same shape as the non-singleton array.
 
The four basic arithmetic functions, ***Addition***, ***Subtraction***, ***Multiplication*** and ***Division*** are scalar dyadic functions. They apply themselves between each item of the left argument and the corresponding item of the right argument, like this:

In [3]:
5 3 2 9 + 2 6 8 4

The function is applied between each item of two 4-item vectors. The result is also a 4-item vector.

As an example of a function that is *not* a scalar function, let us look at the ***Reshape*** function. There is nothing in common between the shapes of its arguments:

In [4]:
2 3 ⍴ 6 8 2 1 9 3

In fact, the left argument has 2 items, the right one has 6 and the result in this case is a matrix.

Let us explore the behaviour of the basic arithmetic functions on vectors:

In [5]:
5 3 2 9 - 2 6 8 4

In [6]:
5 3 2 9 ÷ 2 6 4 7

In [7]:
Price ← 5.2 11.5 3.6 4 8.45
Qty ← 2 1 3 6 2
Costs ← Price × Qty
Costs

Scalar dyadic functions apply to arrays of any rank and shape.

As we saw in the introduction, a Sales Director makes forecasts for sales of 4 products over the coming 6 months, and assigns them to the variable `Forecast`. At the end of the 6 months, they record the actual values in the variable `Actual`. Here they are:

In [8]:
⎕RL ← 73
Forecast ← 10×?4 6⍴55
Forecast

In [9]:
⎕RL ← 73
Actual ← Forecast + ¯10+?4 6⍴20
Actual

```{admonition} Remark 
:class: tip
We initialise the `Forecast` and `Actual` variables with random values using the ***Roll*** function `?`. This is easier to type than a predefined set of values, and setting the random link `⎕RL` beforehand means we always get the same result. You will learn more about `?` and `⎕RL` later on in the book.
```

The first thing any self-respecting Sales Director will want to know is the difference between the expected and the actual results. This can be done easily by typing:

In [10]:
Actual - Forecast

Notice how subtracting two matrices gives a matrix of the same shape (recall that negative values are indicated by a high minus sign).

But remember, a scalar dyadic function may also be applied between a single value and an array of any shape.

For example, if we want to multiply `Forecast` by 2, we can type:

In [11]:
Forecast × 2        ⍝ same as 2 × Forecast
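The single value does not have to be a scalar: a one-item array behaves in the same way. As a small sketch (the numbers are made up for illustration), a one-item vector built with *Reshape* can be used wherever the scalar was used:

```apl
10 × 1 2 3            ⍝ scalar left argument: 10 20 30
(1⍴10) × 1 2 3        ⍝ one-item vector left argument: 10 20 30 as well
```

In both cases the result takes the shape of the 3-item vector.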

A complete list of ***Scalar Dyadic Functions*** is given in Appendix 1.

### Division By Zero

An expression such as `17÷0` leads to an error message:

In [12]:
17÷0

DOMAIN ERROR: Divide by zero
      17÷0
        ∧


This happens because zero does not belong to the domain of valid denominators.

However, notice what `0÷0` returns:

In [13]:
0÷0

Mathematically, `0÷0` is undefined. By default, however, APL gives the result 1, extending the rule that "any number divided by itself should give 1". Because this is sometimes inappropriate, it is possible to change the default behaviour (see the *Specialist's Section*).
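As a brief preview, and only as a sketch, Dyalog APL controls this behaviour through the system variable `⎕DIV` (the "division method"). Setting it to 1 makes any division by zero return 0:

```apl
⎕DIV ← 1      ⍝ switch to the alternative division method
0 ÷ 0         ⍝ 0
17 ÷ 0        ⍝ 0, instead of a DOMAIN ERROR
⎕DIV ← 0      ⍝ restore the default behaviour
```

Remember to restore the default afterwards, otherwise the rest of the session keeps the alternative behaviour.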

### Power

In APL, the mathematical notation $A^n$ is written `A*n`.

The function *Power* (`*`) accepts any value(s) for `n`: integer or decimal, positive, negative, or zero, according to traditional usage.

To calculate the values of $4^2$, $4^{1.4}$, $4^0$, $\sqrt 4$, $4^{-1}$, $4^{-2.1}$ and $4^5$, we just need to type

In [14]:
4 * 2 1.4 0 0.5 ¯1 ¯2.1 5

`0*0` gives `1`. Mathematically, $0^0$ is often left undefined, but taking it to be 1 is a common and fairly useful convention.

There is no special symbol in APL to represent a square root; it is obtained by raising a value to the power $\frac12$. If we take the square root of a negative number, then we get a complex number as a result:

In [15]:
¯1 * 0.5 
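Other roots can be obtained in the same way, by raising a value to the reciprocal of the root's order. A short sketch with arbitrary values:

```apl
27 * 1÷3        ⍝ cube root of 27
16 * 1÷4        ⍝ fourth root of 16
```

No parentheses are needed around `1÷3`: APL evaluates the division first and then uses its result as the right argument of `*`.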

### Maximum & Minimum

***Maximum*** (`⌈`) and ***Minimum*** (`⌊`) return respectively the larger of two values and the smaller of two values, whatever their signs. Because they are scalar dyadic functions, they can be applied item by item between any two compatible arrays.

In [16]:
75 ⌈ 83

In [17]:
19 ⌈ 11 22 ¯20 60

In [18]:
52 14 ¯37 18.44 ⌊ ¯60 15 ¯40 11.23

*Minimum* can be used to apply a limit to the values in an array. For example, to set a ceiling of 450 in the matrix `Forecast`, it is sufficient to type:

In [19]:
Forecast ⌊ 450

Notice how the larger values have been capped to 450.
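*Maximum* can be used in the same way to impose a lower limit, and the two can be combined to keep values within a range. The sketch below uses made-up numbers and an arbitrary lower limit of 100:

```apl
100 ⌈ 50 300 520 ⌊ 450      ⍝ 100 300 450
100 ⌈ Forecast ⌊ 450        ⍝ the same limits applied to the whole matrix
```

The *Minimum* is applied first, capping 520 to 450, and then the *Maximum* raises 50 to 100.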

### Relationship

As in traditional mathematics, APL provides the 6 relationship functions:

| APL | Meaning |
| :-: | :- |
| `A < B` | $A$ less than $B$ |
| `A ≤ B` | $A$ less than or equal to $B$ |
| `A = B` | $A$ equal to $B$ |
| `A ≥ B` | $A$ greater than or equal to $B$ |
| `A > B` | $A$ greater than $B$ |
| `A ≠ B` | $A$ not equal to $B$ |

These symbols are obtained by pressing the <kbd>APL</kbd> key, simultaneously with the keys <kbd>3</kbd> to <kbd>8</kbd>, respectively.

All these 6 functions return `1` if the relation is true, or `0` if it is false.

In [20]:
11 < 7

In [21]:
24 ≤ 24 11 33

In [22]:
5 = 9

In [23]:
3 8 7 ≥ 5 8 0

In [24]:
6 > 2 3⍴7 2 9 3 6 4

The results are called binary, or Boolean, values (Boolean refers to the name of the mathematician George Boole). They can be processed in many different ways and are extremely useful, as we shall soon see.

Note that none of the four symbols `<` `≤` `≥` `>` can be applied to character arrays. Only `=` and `≠` can be used with character arrays, as illustrated below:

In [25]:
'm' = 'm'

In [26]:
'm' = 'M'

In [27]:
'k' ≠ 'a'

In [28]:
'sorry' ≠ 'r'

Because these functions are scalar dyadic functions, they are applied between individual scalars (the letters), not words:

In [29]:
'gold' ≠ 'gulf'

For the same reason, the two words (considered as vectors) must be of equal size, otherwise we get an error:

In [30]:
'male' ≠ 'female'

LENGTH ERROR: Mismatched left and right argument shapes
      'male'≠'female'
            ∧


### Residue

The *Residue* function, represented by `|`, returns the remainder of a division.

In the expression `R ← X|Y`, `R` is the remainder of `Y` divided by `X` (**be careful**; the arguments of *Residue* are given in the reverse order of that used by *Division* `Y÷X`).

In [31]:
7 | 54

In [32]:
2 | 216 47 29 28        ⍝ Find even and odd numbers

In [33]:
X ← 7 4 11 ¯4.3 3 ¯5 6 ¯3
Y ← 54 84 119 19.6 29 43 ¯14 ¯14
X | Y

The function can be used with negative and decimal values, as seen above.
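One handy consequence, shown here as a small illustration with arbitrary values, is that a left argument of 1 extracts the fractional part of a number:

```apl
1 | 3.75 ¯2.25      ⍝ 0.75 0.75
```

The result for `¯2.25` is `0.75`, not `0.25`, because `¯2.25` is equal to `¯3 + 0.75`.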

The result `R` is always equal to `Y - (N×X)`, where `N` is the integer for which `R` lies between `0` and `X`: `R` may be equal to `0`, but never to `X`. Some calculations show that `N` is equal to `⌊(Y÷X)`, which means `X|Y` and `Y - (X × (⌊(Y÷X)))` give exactly the same result:

In [34]:
(X|Y)

In [35]:
(Y - (X × (⌊(Y÷X))))

Using a new dyadic primitive `≡` (obtained with <kbd>APL</kbd>+<kbd>Shift</kbd>+<kbd>;</kbd>) we can ask APL to check if the two arrays are the same, instead of having to compare each array item by item:

In [36]:
(X|Y) ≡ (Y - (X × (⌊(Y÷X))))

Keep reading the next section to see how `≡` works.

### Array Comparison

We have seen above that the dyadic functions `=` and `≠` are scalar, which means we cannot use them when we wish to check if two arrays are the same or when we wish to check if two arrays are different:

In [37]:
1 2 3 4 = 1 2 3 4

The result above says the arrays have matching items, but doesn't provide the summary information that *the arrays are the same*.

For a similar reason, we can't even compare vectors of different lengths because dyadic scalar functions expect the shapes of their arguments to match:

In [38]:
'male' = 'female'

LENGTH ERROR: Mismatched left and right argument shapes
      'male'='female'
            ∧


Instead, we must use the ***Match*** (`≡`) function to check if two arrays are exactly the same:

In [39]:
1 2 3 4 ≡ 1 2 3 4

In [40]:
'male' ≡ 'female'

The ***Not Match*** (`≢`) primitive (obtained with <kbd>APL</kbd>+<kbd>Shift</kbd>+<kbd>'</kbd>) is the counterpart to `≡` and checks if two arrays are different:

In [41]:
1 2 3 4 ≢ 1 2 3 4

In [42]:
'male' ≢ 'female'

## Order of Evaluation

Like other programming languages, APL allows the programmer to use parentheses to specify the order of evaluation of a complex expression. Thus the expression `5×(6+7)` means "add 6 to 7, then multiply by 5". In the absence of parentheses, most other programming languages employ rules of precedence to decide how a complex expression such as `5×6+7` would be evaluated. Typically, the result will be 37 because multiplication is given precedence over addition and is performed first.

When APL was designed, it was decided that the sheer number of primitive functions meant that a set of precedence rules would be impossibly complex to remember and apply. APL therefore has no precedence rules at all.

The solution adopted in APL is simple, and consistent with the rules we apply to calculate complex expressions in traditional algebra. Suppose, for example, that we need to calculate

$$\log \sin \sqrt {x \div 3}$$

To do this, we would first divide $x$ by 3, then take the square root of the result, next calculate its sine, and finally calculate the logarithm: each function applies to the result of the entire expression to its right. This is how it is done in mathematics, and so it is in APL. The only difference is that in APL there are *no* exceptions!

To evaluate

In [43]:
5 × 6 + 7

we first calculate

In [44]:
6 + 7

and then multiply by 5, giving 65:

In [45]:
5 × 13

By the use of parentheses we can instruct APL to do the multiplication first,

In [46]:
(5 × 6) + 7

which an experienced APL programmer would probably write as

In [47]:
7 + 5 × 6

```{admonition} Rule 
:class: tip
In an APL expression, each function takes as its right argument the result of the entire expression to its right. No functions have higher precedence than any others.

If the function is dyadic (takes both a left and a right argument), it takes as its left argument the array immediately to its left, delimited by the next function.

This is sometimes called "***Right to left evaluation***" (although this is not strictly correct).

If necessary, one can use ***Parentheses*** to force a different order of evaluation.
```

You must not be confused: each function is itself evaluated in its natural order, so `8÷4` gives 2, not 0.5! The expression "*right to left*" only means that the first operation executed is the rightmost one.
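A chain of subtractions is a good way to see the rule in action, because the result differs from what traditional left-to-right arithmetic would give. The values below are arbitrary:

```apl
10 - 3 - 2        ⍝ 9: evaluated as 10 - (3 - 2)
(10 - 3) - 2      ⍝ 5: the parentheses force the leftmost subtraction first
```

In the first expression, the leftmost `-` takes `10` as its left argument and the result of the whole expression `3 - 2` as its right argument.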

If the order of evaluation seems strange to you at first sight, just refer to a plain English sentence: "*take the top half of the bottom quarter*" does not mean "*take the top half first, and then take the bottom quarter*"; it means "_**first** split into quarters and take the bottom one, **then** split that quarter into two halves and take the top half of it_": this is exactly the way that APL works! Even in everyday English language, which we write from left to right, we implicitly use the "right to left evaluation" rule.

Let us apply this rule to some examples:

In [48]:
3×5+1

First we sum,

In [49]:
5+1

then we multiply:

In [50]:
3×6

Now an example with 3 operations:

In [51]:
3 6⌊4+2 9>7

First we perform the "greater than" comparison,

In [52]:
2 9>7

then we add 4

In [53]:
4+0 1

and finally we take the minimum:

In [54]:
3 6⌊4 5

An even more complex example, from the end of the subsection on the *Residue* (`|`) primitive function:

In [55]:
(X|Y)≡Y-X×⌊Y÷X

Notice that in the example above we still kept a set of parentheses around `X|Y`, because we want `≡` to compare `X|Y` with `Y-X×⌊Y÷X`. To get rid of those parentheses one would need to introduce an intermediate variable, for example like so:

In [56]:
R ← X|Y
R≡Y-X×⌊Y÷X

```{admonition} Warning 
:class: warning
In the beginning you may encounter some surprises. For example, if `V` is a vector, `1+⍴V` is different from `⍴V+1`. Let us see why, with the following vector:
```


In [57]:
V ← 5 2 7

The value of

In [58]:
1+⍴V

is

In [59]:
⍴V

followed by

In [60]:
1+3

Whereas the value of

In [61]:
⍴V+1

is

In [62]:
V+1

followed by

In [63]:
⍴6 3 8

which gives a different result.

This may be completely new to people who have experience with other programming languages, and is one of the reasons why we **recommend** that you do all of the exercises at the end of this chapter. With a little practice, you will soon find this simple rule very natural, and you will consider it a relief that you do not have to remember complex rules for function precedence.

## Monadic Scalar Functions

Most of the symbols we have encountered so far also have a monadic definition; let's look at them now.

### The Four Basic Symbols

We will begin with the four basic symbols: `+` `-` `×` `÷`

#### Conjugate

The Plus sign used monadically is the ***Conjugate*** function. It returns the ***complex conjugate*** of its argument:

In [64]:
+ 0J1

When the numbers have no complex part, the function acts as the identity function, as the complex conjugate of a real number is the number itself:

In [65]:
+ 1 2 3 ¯3 6.5346 56J0

For compatibility reasons, monadic `+` also acts as the identity function if the argument is a character:

In [66]:
+ 'a'

Because it is a scalar function, it naturally works with mixed arrays:

In [67]:
+ 'B' 3 'a' ¯3 0J1 'g' 4J¯4

#### Negative

The Minus sign is the ***Negative*** function. It returns the negation of its argument:

In [68]:
- 19 11 ¯33 0 ¯17

#### Signum

The Multiply symbol used monadically is the ***Signum*** function. It tells us the sign of its argument, using the following convention:

| Result | Value |
| -: | :- |
| `1` | The value is positive |
| `0` | The value is zero |
| `¯1` | The value is negative |

Some examples follow:

In [69]:
× 19 11 ¯33 0 ¯17

#### Reciprocal

No surprise: the Divide symbol gives the ***Reciprocal*** or ***Inverse*** of its argument:

In [70]:
÷ 2 ¯4 .3 .25 ¯7

### Other Scalar Monadic Functions

#### Exponential

The expression `*N` gives $e^N$, where $e$ is the base of the natural logarithm, approximately $2.71828$:

In [71]:
*1

In [72]:
* 1 0 3 ¯1

#### Floor and Ceiling

***Floor*** (`⌊`) rounds its argument down, while ***Ceiling*** (`⌈`) rounds its argument up, to the nearest smaller or larger integer value, respectively:

In [73]:
V ← 51.384 48.962 0 12.5 ¯73.27 ¯9.99
⌊V

In [74]:
⌈V

To round a value to the nearest integer, a commonly used method is to add 0.5 and then take the *Floor*, or alternatively, to subtract 0.5 and take the *Ceiling*, as shown here:

In [75]:
⌊V+0.5

In [76]:
⌈V-0.5

The results are the same, except when the decimal part of a number is 0.5. For example, 12.5 above gets rounded to 13 or 12 depending on which method is used.
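The same technique can round to a given number of decimal places, by scaling the values before rounding and scaling them back afterwards. The following is only a sketch; the name `Prices` and its values are made up for illustration:

```apl
Prices ← 51.384 48.962
0.01 × ⌊ 0.5 + 100 × Prices      ⍝ scale up by 100, round to the nearest integer, scale back down
```

Reading from right to left, the values are multiplied by 100, 0.5 is added, the *Floor* is taken, and the result is multiplied by 0.01, which rounds each value to two decimal places.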

#### Magnitude (Absolute Value)

The monadic stile `|` returns the absolute (unsigned) value of its argument, as shown:

In [77]:
V ← 29.2 49.3 ¯14.8 0 ¯37.2
|V

Monadic `×` and monadic `|` are related:

In [78]:
(|V) ≡ (V××V)

The above equality, which holds for any array `V` of real numbers, tells us that the magnitude of a number is equal to the number multiplied by its signum.

## Left and Right Tacks

### Same

APL also includes two primitive functions whose monadic name is the same: ***Same***. These two primitives are the ***Right Tack*** (`⊢`) and the ***Left Tack*** (`⊣`), obtained respectively with <kbd>APL</kbd>+<kbd>\\</kbd> and <kbd>APL</kbd>+<kbd>Shift</kbd>+<kbd>\\</kbd>.

Their monadic name is *Same* because they act as true identity functions, returning their argument completely unchanged:

In [79]:
⊢ 1 2 3

In [80]:
⊢ 'Banana'

In [81]:
⊢ 'Bag' 3 (1 2 'Bag') ('Bag' ('Bag' 3) (2 2⍴1 2 3 4))

Exactly the same thing happens (i.e. nothing happens) if you replace all the `⊢` above with `⊣`, for example:

In [82]:
⊣ 'Bag' 3 (1 2 'Bag') ('Bag' ('Bag' 3) (2 2⍴1 2 3 4))

These two functions may look useless, but they are not. For one, a neat little trick we can do with them is to display the value of a variable that we have just assigned; for example, compare the code cells below:

In [83]:
Matrix ← 2 3⍴1 2 3 4 5 6        ⍝ Nothing gets displayed explicitly

In [84]:
⊢Matrix← 2 3⍴1 2 3 4 5 6        ⍝ The value of Matrix is displayed 

In [85]:
⊣Matrix← 2 3⍴1 2 3 4 5 6        ⍝ The value of Matrix is displayed

These functions become even more useful if we take a look at their dyadic usages (explained next), combined with the power of operators or tacit programming (explained in detail in future chapters).

### Dyadic Usage

The *Left* and *Right Tacks* differ in their dyadic usage, of course, otherwise there would be no point in having two primitive functions that behaved in exactly the same manner.

When used dyadically, the *Left Tack* returns its left argument and the *Right Tack* returns its right argument. If you ever forget which dyadic tack returns what, just remember that each tack returns the argument to which it is pointing:

In [86]:
1 2 3 ⊣ 'Bananas'

In [87]:
1 2 3 ⊢ 'Bananas'

## Processing Binary Data

```{admonition} Remark 
:class: tip
Binary values are most often produced by the comparison functions that we have already seen. However, the result of *any* function (such as addition or subtraction) which is composed only of 1s and 0s can be used as a binary (or Boolean) value, and may be used as an argument to any of the special primitive functions that apply to Boolean values.
```

Among the various ways of producing binary results, *Membership* appears to be one of the most interesting tools.

### Membership

 - ***Membership*** tells whether the items of its left argument are present (`1`) or not (`0`) in the right argument, regardless of their position in it;
 - it accepts arguments of any shape or type;
 - the result produced always has the same shape as the **left** argument.
 
Some examples will help you understand the function:

In [88]:
23 14 41 19 ∊ 17 88 19 50 51 52 23 40

This means that 23 and 19 appear somewhere in the rightmost vector, whereas 14 and 41 do not. The left argument has 4 items, and so has the result.

The *Membership* function can operate on arguments of completely different shape. For example, it is possible to detect the presence of each item of a vector in a matrix, or vice versa.

In an earlier chapter we used a matrix containing the first 6 months of the year:

In [89]:
⊢MonMat ← 6 8⍴'January FebruaryMarch   April   May     June    '

We can ask if certain letters are present in this matrix:

In [90]:
'December' ∊ MonMat

The result shows that all letters of the word "December" appear in `MonMat`, except "D" and lowercase "m" (which should not be confused with the uppercase "M" of March and May).

In this case we used a vector left argument and a matrix right argument. Let's try it the other way around. The following expression tells us which letters in the matrix `MonMat` appear in the word "Century":

In [91]:
MonMat ∊ 'Century'

As you might imagine, any comparison between numbers and letters gives zero:

In [92]:
1952 ∊ '1952'

In [93]:
'1952' ∊ 1952

Remember that `'1952'` is a vector of 4 characters, none of which can be found in the number 1952.

We **recommend** that you do exercise 22 to discover all possibilities of *Membership*.

### Binary Algebra

Binary values can be processed using half a dozen specialised primitive functions, the main ones being ***And***, ***Or***, ***Xor*** and ***Not***. Additional functions will be described in the *Specialist's Section*.

The function ***And*** is represented by the symbol `∧` (<kbd>APL</kbd>+<kbd>0</kbd>), as it is in mathematics. It returns the result 1 if the left ***and*** the right arguments are both equal to 1, and 0 otherwise:

In [94]:
0 ∧ 0

In [95]:
0 ∧ 1

In [96]:
1 ∧ 0

In [97]:
1 ∧ 1

We can condense the four expressions above into a single one, given that `∧` is also a scalar function:

In [98]:
0 0 1 1 ∧ 0 1 0 1

The function ***Or*** is represented by the symbol `∨` (<kbd>APL</kbd>+<kbd>9</kbd>), as it is in mathematics. It returns the result 1 if the left ***or*** the right argument (or both) is equal to 1, and 0 otherwise.

The four possible cases are shown in the following expression:

In [99]:
0 0 1 1 ∨ 0 1 0 1

***Xor*** is short for ***eXclusive Or***. It returns the result 1 if exactly one of the arguments is equal to 1, but not if both are equal to 1.

In automation, the same function is generally represented by a circled Plus sign, like $\oplus$.

APL does not need a different symbol for the function, because ***Xor*** is the same as one of the comparison functions we have already met: `≠`

In [100]:
0 0 1 1 ≠ 0 1 0 1

The last function is the monadic function ***Not***. Represented by the *Tilde* `~` (<kbd>APL</kbd>+<kbd>T</kbd>), it converts 0 into 1 and 1 into 0:

In [101]:
~ 0 1 0 0 0 1 1

```{admonition} Remark 
:class: tip
- ***And***, ***Or*** and ***Xor*** are scalar dyadic functions;
- ***Not*** is a scalar monadic function;
- ***Membership*** is a dyadic function, but it is not a scalar function.
```

All these functions can be applied to binary data of any shape. For example, let us find which items of `Forecast` are greater than 350 thousand Euros and have also been exceeded by the `Actual` sales:

In [102]:
⊢bin ← (Forecast>350) ∧ (Actual>Forecast)

A side note: the parentheses around the rightmost expression `(Actual>Forecast)` are not strictly needed. However, they do no harm either, so we have added them here to help you read the expression, since you may not yet be fully familiar with APL's order of evaluation.
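If you want to convince yourself, the following check (a small sketch that reuses `bin` as defined above) compares the stored result with the same expression written without the rightmost parentheses:

```apl
bin ≡ (Forecast>350) ∧ Actual>Forecast      ⍝ 1: the two results match
```

The parentheses around `Forecast>350` must stay, because without them `350` would become the left argument of `∧`.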

### Without

Given a vector `A` and any array `B`, the expression `A~B` returns a copy of `A` from which every item that also appears in `B` has been removed. The size and shape of `B` are immaterial; only its individual items are used.

This function is called ***Without***.

In [103]:
'This Winter is warm' ~ MonMat

Notice how the right argument above (`MonMat`) is a matrix.

In [104]:
'Congratulations' ~ 'ceremony'

The uppercase "C" is preserved here because it is different from the lowercase "c".

In [105]:
Matrix

In [106]:
0 2 4 6 8 10 12 ~ Matrix

Of course, it also works on numbers.

## Processing Nested Arrays

When working with nested arrays, it is important to recognise whether or not you are using a scalar function.

### Scalar vs. Non-scalar Functions

In the "Data and Variables" chapter, we set up a nested vector `Children`, which is composed only of numeric items:

In [107]:
)copy DISPLAY
Children ← (6 2) (35 33 26 21) (7 7) 3 (19 14)
DISPLAY Children

The application of scalar functions is straightforward.

For example, when we add 50 to `Children`, the value 50 is added to each of the items of `Children`. As these items are themselves scalars or vectors, adding 50 means adding 50 to each of *their* individual items. This process continues through all levels of nesting, ensuring that 50 gets added to all the individual items of `Children`. The result is therefore structurally identical to `Children`:

In [108]:
DISPLAY Children + 50

One way of expressing this behaviour is to say that the scalar functions (both the dyadic and monadic ones) permeate down through the structure of nested arrays, until they reach the lowest-level items, and then apply themselves at this level. They are said to be ***pervasive*** functions.

Non-scalar functions, like *Membership*, are not pervasive.

In [109]:
What ← 19 (6 2) 3 (33 26)
What ∊ Children

The item `(6 2)` of `What` is also an item of `Children`, hence *Membership* gives the answer 1, and the same is true for the value 3.

In contrast, 19 is only an item of the fifth item of `Children`; it is not an *entire* item of `Children`. Because a non-pervasive function processes each item as a whole, 19 is not the same as `(19 14)`, so the answer is 0. The same goes for `(33 26)`, which is only part of the second item of `Children`.

### Be Careful With Shape/Type Compatibility

It is easy to add a vector of 5 scalar items to `Children`, because each of the 5 scalars can be added to the corresponding item of `Children`:

In [110]:
Children + 10 20 30 40 50

But if we try to add a vector of 5 sub-vectors to `Children`, we must ensure that the shape of each sub-vector is compatible with the shape of the corresponding item of `Children`, for example

In [111]:
Children + (4 8) (5 7 4 9) (1 ¯1) (100 200 500) (14 51)

or

In [112]:
Children + (4 8) 0 (1 ¯1) (100 200 500) ¯100

If there is any incompatibility, a `LENGTH ERROR` is issued:

In [113]:
Children + (1 2)(2 3)(3 4)(4 5)(5 6)

LENGTH ERROR: Mismatched left and right argument shapes
      Children+(1 2)(2 3)(3 4)(4 5)(5 6)
              ∧


All of the items of our vector could have been added to the corresponding items of `Children` except the second one. APL has detected and signalled this error.

You must also be careful if a nested or mixed array contains character data; it will not be possible to apply any arithmetic function to the array as a whole.
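As an illustration, here is a sketch with a hypothetical mixed vector named `Mixed`. Comparison with `=` is accepted, but addition is not:

```apl
Mixed ← 'a' 2 'b' 4      ⍝ hypothetical mixed vector of characters and numbers
Mixed = 'a' 3 'b' 4      ⍝ 1 0 1 1: = can compare items of any type
Mixed + 1                ⍝ DOMAIN ERROR: 'a' and 'b' cannot take part in an addition
```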

### Tally

We have seen the dyadic *Match* `≡` and *Not Match* `≢` functions, and now we will see how we can use their monadic versions to work with (nested) arrays.

We briefly mentioned ***Tally***, the monadic use of `≢`, in the first chapter. ***Tally*** does exactly what its name suggests: it *counts* the number of items an array has along its first dimension.

For a vector, this corresponds to its length:

In [114]:
≢ 1 2 3 4 5 6

In [115]:
≢ 'ui'

A nested vector is still a vector:

In [116]:
(4 5) 'a' 'Bag' (35 'Cat' 42)

In [117]:
≢ (4 5) 'a' 'Bag' (35 'Cat' 42)

For higher dimensional arrays, *Tally* returns the length of the first dimension:

In [118]:
≢ 3 3⍴9 8 7 6 5 4 3 2 1

In [119]:
≢ 2 5 27⍴0

### Depth

Dyadic `≡` and `≢` are very closely related, but their monadic versions aren't. The monadic function ***Depth*** `≡` helps us work with nested arrays, in that it helps us count how many levels of nesting there are.

The result of `≡` is really simple to understand. First of all, the depth of a scalar is 0:

In [120]:
≡1

In [121]:
≡'a'

In [122]:
≡4525324

Then, the depth of an array is 1 larger than the depth of its deepest item, or sub-array. For example, a simple vector like `1 2 3 4` only has scalars as items, whose depths are 0, so the depth of the vector will be 1.

In [123]:
≡1 2 3 4

Now let us consider a nested vector composed of simple vectors like the one we just saw.

In [124]:
⊢Nested ← (1 2 3 4) (3 5) (10 20 30)

The items of `Nested` are vectors of depth 1, so the depth of `Nested` should be 2:

In [125]:
≡Nested

`≡` has one more thing to it: if the nesting of the sub-arrays is not uniform then the result will be negative.

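For example, `1 2 3 4` has depth 1 and `42` has depth 0, so a vector composed of these two sub-arrays has depth `¯2`: the magnitude is still 1 more than the depth of the deepest item, and the negative sign flags the uneven nesting.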

In [126]:
≡(1 2 3 4) 42
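
As a further sketch of the same rule, assuming the default migration level (`⎕ML` below 2) and using a made-up array, deeper uneven nesting also gives a negative depth whose magnitude counts the levels of the deepest item:

```APL
≡ (1 2) (3 (4 5))       ⍝ ¯3: three levels at the deepest point, unevenly nested
| ≡ (1 2) (3 (4 5))     ⍝ 3: Magnitude discards the sign when only the depth matters
```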

## Reduction

### Presentation

A few pages ago we calculated the costs of some purchased goods:

In [127]:
⊢Costs ← Price × Qty

How much did we spend?

In [128]:
10.4 + 11.5 + 10.8 + 24 + 16.9

But writing things out like this is cumbersome and depends on us looking at the scalars in the `Costs` vector. What if the `Price` or the `Qty` changes?

Mathematicians are creative people who long ago devised the symbol $\sum$, always with a pretty collection of indices above and below, that is used to indicate the sum of some numbers. This notation can be complex to understand and is difficult to type on a keyboard.

In APL, the operation is written like this:

In [129]:
+/ Costs

Simple, isn't it? This expression gives the total of all the items in the vector. You can read this as "***Plus Reduction***" of the variable `Costs`.

To gain a better understanding of the process:

| | |
| :- | :- |
| When we write an expression such as | `+/ 21 45 18 27 11` |
| &nbsp;&nbsp;&nbsp;&nbsp; it works as if we had written | `21 + 45 + 18 + 27 + 11` |
| &nbsp;&nbsp;&nbsp;&nbsp; and we obtain the sum | `122` |
   
In fact, it works as if we had "inserted" the symbol `+` between the values.

| | |
| :- | :- |
| So, when we write | `×/ 21 45 18 27 11` |
| &nbsp;&nbsp;&nbsp;&nbsp; it is as if we had written | `21 × 45 × 18 × 27 × 11` |
| &nbsp;&nbsp;&nbsp;&nbsp; so, we get the product | `5051970` |
| Similarly, when we write | `⌈/ 21 45 18 27 11` |
| &nbsp;&nbsp;&nbsp;&nbsp; it is as if we had written | `21 ⌈ 45 ⌈ 18 ⌈ 27 ⌈ 11` |
| &nbsp;&nbsp;&nbsp;&nbsp; so, we obtain the largest item | `45` |

And so on...

```{admonition} Exercise 
:class: hint
Try to evaluate the following expression in your head or with pen and paper: `23⌈ ⌈ ⌈/ 17.81 21.41 9.34 16.53`

Don't panic! Remember to evaluate it symbol by symbol, from right to left.
```


### Definition

Reduction, represented by the symbol `/`, belongs to a special category of symbols called ***Operators***.

In most programming languages the word *operator* is used to describe operations like addition, subtraction, multiplication, and so on. In APL such operations are called *Functions*; typical examples are `+`, `-`, `×` and `⍴`. The word *operator* has a separate meaning in APL.

In APL a ***function*** works on an **array** or between two arrays to produce a result:

In [130]:
Price × Qty

Whereas an ***operator*** applies to one or two ***operands*** (its "arguments") to produce what we call a *derived function*. That is, after we use the *operator* on its *operands*, we get a (derived) function which may then be used with an array, or between two arrays, to produce a result:

In [131]:
+/

This is the representation APL gives to the *Plus Reduction*, where the tree-like structure shows that `+` is an operand to `/`. We may then use this derived function with an array in order to get a result:

In [132]:
⊢Stock ← +/ Qty

In the expression above, the symbol `/` is the operator. It takes the function `+` as its single operand ("argument") and produces the derived function `+/`. This derived function is then applied to `Qty`, giving a result which is assigned to `Stock` and displayed with `⊢`.

Please note that the argument to a monadic function is always to the right of the function, whereas the operand of a monadic operator is always to the left of the operator.

Many, although not all, of the APL primitive functions may be used as the operand to *Reduction*; you can even apply a user-defined function. This generality makes Reduction, and other operators, extremely powerful.

Dyalog APL provides a total of around 20 such powerful *operators*, listed in Appendix 4. It is also possible to write your own operators, just as it is possible to write your own functions.

### Reduction of Binary Data

Among the typical usages of *Reduction* are `∧` and `∨` applied to binary data.

 - `∧/ Bin` gives the result `1` if ***All*** the items of `Bin` are equal to 1;
 - `∨/ Bin` gives the result `1` if ***At least one*** of the items of `Bin` is equal to 1;
 - `+/ Bin` tells us ***How many*** items of `Bin` are equal to 1.
 
You can verify it on some small examples:

In [133]:
⊢Bin ← 1 1 1 0 1 0 1

In [134]:
∧/ Bin

In [135]:
∨/ Bin

In [136]:
+/ Bin

In [137]:
⊢AllOnes ← 1∨Bin

In [138]:
∧/ AllOnes

Let us revisit the vector named `Contents`, from the chapter on "Data and Variables":

In [139]:
⊢Contents ← 12 56 78 74 85 96 30 22 44 66 82 27

Now we will answer some questions about the values of the items of `Contents`:

Are all the values greater than 20?

In [140]:
∧/ Contents > 20

The answer is no.

Is there at least one value smaller than 30?

In [141]:
∨/ Contents < 30

The answer is yes.

How many values are smaller than 30?

In [142]:
+/ Contents < 30

The answer is 3.

### Reduction of Nested Arrays

When you apply reduction to a nested array, you must check that the items of the nested array are compatible (in shape and type) with the function that you intend to apply:

In [143]:
+/ (4 8) (1 4) 10 (9 5)

The expression above works because all 2-item vectors can be added together, and a single scalar (the 10) can be added to an array of any shape, because `+` is a scalar function.

Notice, however, that in the expression below the 3-item vector cannot be added to the other 2-item vectors, so APL reports an error:

In [144]:
+/ (4 8) (1 4) (1 2 3) (9 5)

LENGTH ERROR: Mismatched left and right argument shapes
      +/(4 8)(1 4)(1 2 3)(9 5)
      ∧


### Reduction With Non-Commutative Functions

Another thing to be careful about is the use of reduction with non-commutative functions, like `-` or `÷`. Reducing an array by such a function yields results which may be counter-intuitive, but which may nevertheless be useful in a number of applications.

For example, remember that

In [145]:
-/ 45 9 11 2 5

is equivalent to

In [146]:
45 - 9 - 11 - 2 - 5

which, by APL's order of evaluation, is equivalent to:

In [147]:
45 - (9 - (11 - (2 - 5)))

If, instead, we use the traditional mathematical convention that interprets

$$45 - 9 - 11 - 2 - 5$$

as

$$(((45 - 9) - 11) - 2) - 5$$

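we get the result 18, not the 50 that APL returns.

Because of its right-to-left evaluation, APL's `-/` computes the alternating sum $45 - 9 + 11 - 2 + 5$. This kind of "alternating series" can be useful for some calculations, although only rarely for business applications.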
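
The difference between the two conventions can be sketched with the same numbers. The third expression is only one possible way of obtaining the conventional left-to-right result: it subtracts the sum of the remaining values from the first one.

```APL
-/ 45 9 11 2 5              ⍝ APL's right-to-left reduction: 50
45 + ¯9 + 11 + ¯2 + 5       ⍝ the same alternating sum written out: 50
45 - +/ 9 11 2 5            ⍝ the conventional left-to-right result: 18
```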

### Application 1

The employees of a company are divided into three hierarchical categories, denoted simply 1, 2 and 3. Two variables contain the salaries and the categories of these employees; we define them below and then use `⍴` to show some of their initial values:

In [148]:
⎕RL ← 73
Salaries ← ?100⍴5000
10⍴Salaries

In [149]:
⎕RL ← 73
Categories ← ?100⍴3
10⍴Categories

We can see the salaries of the first three employees are, respectively, 2121, 4778 and 4914 (of some currency) and their respective categories are 1, 2 and 2.

With what we learned in the section about "Reduction of Binary Data" we can also find out how many employees belong in the third category:

In [150]:
+/ Categories = 3

Now the employees ask for an increase in their salaries. Each category of employee requests a different percentage increase, as shown in the following table:

| Category | Upgrade |
| :-: | :-: |
| 1 | 8% |
| 2 | 5% |
| 3 | 2% |

How much is that going to cost the company?

Let us just create a variable containing the three rates shown above:

In [151]:
⊢Rates ← 8 5 2 ÷ 100

The first employee is in category 1, so the rate that applies to this person is

In [152]:
Rates[1]

More generally, the rates applied to all of our employees can easily be obtained with `Rates[Categories]`:

In [153]:
10⍴Rates[Categories]

Having the rates, we only have to multiply them by the salaries to obtain the individual increases:

In [154]:
10⍴ Salaries × Rates[Categories]

Finally, by adding them all together, we discover how much it will cost the company:

In [155]:
+/ Salaries × Rates[Categories]

Note that:

 - the expression remains valid regardless of the number of employees or categories;
 - the result has been obtained without writing a program (no loops, no tests);
 - this expression can be phrased in the simplest possible English, namely:
 
 > *Sum the Salaries multiplied by Rates according to Categories*
 
This illustrates how the implementation of a solution in APL can be very close to the way that the solution would be expressed in everyday language. It also shows the advantage of not having to deal with trivial and "irrelevant" matters such as looping, memory allocation, declarations, etc. before a working solution can be developed.

### Application 2

Imagine now that we want to calculate the average of a set of values, for example the values contained in the variable `Contents`.

To do that, we must:

 - add all the values:

In [156]:
+/ Contents

 - count how many values we have:

In [157]:
≢Contents

 - divide one by the other:

In [158]:
(+/Contents) ÷ (≢Contents)

The result is 56.

Again, because of APL's simple rule for the order of evaluation, the rightmost set of parentheses could be omitted.
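
A short sketch of the expression without the rightmost parentheses: `≢Contents` is evaluated first because it stands to the right of `÷`. The leftmost parentheses must stay, otherwise the division would be applied to `Contents` before the summation.

```APL
(+/Contents) ÷ ≢Contents    ⍝ 56, as before
```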

## Axis Specification

### Totals in an Array

#### Processing Arrays

We have seen the result of applying reduction to vectors, but what about matrices and higher rank arrays?

As an example, let us recall the array `Prod`. Its three dimensions represent respectively:

 1. 5 years;
 2. 2 assembly lines;
 3. 12 months.

In [159]:
⎕RL ← 73
⊢Prod ← ?5 2 12⍴50

We can calculate totals along any of these 3 dimensions: years, lines and months.

We specify the dimension (or ***Axis***) between brackets after the *Reduction* symbol:

```APL
+/[Axis] Prod
```

For example, suppose we want to calculate the total monthly production values over the 5 years. Years are represented by the 1st dimension of `Prod`, so we write:

In [160]:
+/[1] Prod

We obtain a 2 by 12 matrix, giving the production of the 2 assembly lines, month by month. If we were to divide this matrix by 5, we would get the average production for each month, per assembly line.
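
The division mentioned above could be sketched as follows. The second line is an alternative that obtains the number of years with *Tally* instead of typing 5, which is a design choice of this sketch rather than something the text requires:

```APL
(+/[1] Prod) ÷ 5            ⍝ average production per month and per line over the 5 years
(+/[1] Prod) ÷ ≢Prod        ⍝ same result: ≢Prod is the length of the first dimension
```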

Now, let us add up the production numbers of the two assembly lines. Lines are represented by the 2nd dimension of `Prod`, so we write:

In [161]:
+/[2] Prod

We obtain a 5 by 12 matrix, with the total production of both assembly lines, month by month, in each of the 5 years.

And finally, let us calculate the annual production of each assembly line. Months are represented by the 3rd dimension of `Prod`, so we write:

In [162]:
+/[3] Prod

The result is a 5 by 2 matrix, in which the columns contain the annual production of the two assembly lines in each of the five years.

#### Axis Is Like an Operator

The dimension specified within brackets is the axis along which the function is applied.

This produces a derived function, and for this reason, the pair of *Axis* brackets is often called the ***Axis Operator***.

The syntax for *Axis* does not quite follow the general syntax for operators, but it shares all other properties with genuine operators. *Axis* takes a function as its left operand (the derived function `+/` in the last example above), the dimension specification as its "right operand" (3 in the example), and produces a derived function, which is applied to `Prod` to calculate the annual sums.

Viewed as an operator, *Axis* is therefore dyadic. It is, however, important to emphasise that its "right operand" is not `Prod`; it is the expression within the brackets. This is the first example of an operator that takes an array as an operand. We will find some more as we explore operators later on.

#### Processing Arrays

We shall learn more about *Axis* in the "Operators" chapter; let us first explore another simple use of this operator.

Suppose that we would like to multiply each of the rows (or columns) of a matrix by different values; we can use *Axis* to specify whether we multiply row-wise or column-wise. First, here is a matrix:

In [163]:
⎕RL ← 73
⊢Tam ← ?3 5⍴9

Let us multiply row-wise:

In [164]:
Tam×[1]5 2 10

And now column-wise:

In [165]:
Tam×[2]2 5 0 2 1

### The Shape of the Result

The dimensions of `Prod` are

In [166]:
⍴Prod

and we can see that the dimensions of `+/[1]Prod`, `+/[2]Prod` and `+/[3]Prod` depend on this shape. In fact, when the axis is 1, the shape of the result is the original shape without the 1st item:

In [167]:
⍴+/[1]Prod

When the axis is 2, the shape of the result is the original shape without the 2nd item:

In [168]:
⍴+/[2]Prod

And when the axis is 3, the shape of the result is the original shape without the 3rd item:

In [169]:
⍴+/[3]Prod

You can see that *Reduction* of a 3-D array gives a 2-D array, in which the summed dimension has "disappeared". This is the origin of the term "*Reduction*"; it *reduces* the *rank* of the array.

This rule will help you predict the dimensions of the result of a reduction.

```{admonition} Rule 
:class: tip
When ***Reduction*** is applied along the Nth dimension of an array, the shape of the result is the same as the shape of the array, but without its Nth item.

The ***Rank*** of the result is 1 less than the rank of the original array.
```

Whenever you want to calculate the sum along a particular dimension of an array, think of the dimensions in terms of concrete things: years, lines, months, etc. This should help you.

### Special Notations

By default, if no axis is specified, reduction is applied along the **last** dimension of the array.

This means `+/Prod` and `+/[3]Prod` are the same,

In [170]:
(+/Prod)≡(+/[3]Prod)

much like `+/Forecast` and `+/[2]Forecast` are the same:

In [171]:
(+/Forecast)≡(+/[2]Forecast)

But it is also common to work along the first dimension of an array. For this reason, APL includes a special symbol for reduction along the first dimension: `⌿` (you can type it with <kbd>APL</kbd>+<kbd>/</kbd>).

This means `+⌿Prod` and `+/[1]Prod` are the same,

In [172]:
(+⌿Prod)≡(+/[1]Prod)

much like `+⌿Forecast` and `+/[1]Forecast` are the same:

In [173]:
(+⌿Forecast)≡(+/[1]Forecast)

```{admonition} Note 
:class: note
If one specifies an axis after the symbol `/` or `⌿`, the function is applied along the specified *Axis*, regardless of the symbol that is actually used.
```

We demonstrate this note with two examples:

In [174]:
(+⌿[3]Prod)≡(+/[3]Prod)

In [175]:
(+⌿[1]Forecast)≡(+/[1]Forecast)

## Our First Program

The expression we wrote in the "Application 2" subsection to calculate the average of a set of values is one that we may want to use time and time again. So let us store it as a program, or, to use the proper APL terminology, as a ***User Defined Function***.

There are many different ways to define functions, and these will be covered in detail in the "User Defined Functions" chapter. For now we shall use the simplest, which is perfectly suitable for straightforward calculation functions like this one. Let's type:

In [176]:
Average ← {(+/⍵)÷(≢⍵)}

 - `Average` is the name of the function. It is followed by the definition, delimited by a pair of curly braces `{` and `}`;
 - `⍵` is a generic symbol that represents the array that will be passed as the right argument of the function;
 - `⍺` is a generic symbol that represents the array that will be passed as the left argument of the function, if any.
 
The symbols `⍵` and `⍺` are obtained using <kbd>APL</kbd>+<kbd>W</kbd> and <kbd>APL</kbd>+<kbd>A</kbd>, respectively.

For more complex multi-line functions it is obviously more appropriate to use a text editor. However, this is beyond the scope of this chapter.

Once defined, this function may be invoked directly, just as if it were a built-in (*primitive*) function:

In [177]:
Average Salaries

In [178]:
Average 12 74 56 23

The word `Average` can now be used in any APL expression. We have enriched the vocabulary which can be used to process data in this workspace (provided that we save it).
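
`Average` only uses `⍵`. As a small sketch of a function that also uses `⍺`, here is a hypothetical function named `Increase` (it is not part of this chapter's workspace) that raises the values given on its right by the percentage given on its left:

```APL
Increase ← {⍵×1+⍺÷100}      ⍝ hypothetical: ⍺ is a percentage, ⍵ the values to increase
5 Increase 200 400          ⍝ 210 420
```

The left argument is written to the left of the function name, just as with a dyadic primitive.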

Be patient: we shall see many other possibilities in the "User Defined Functions" chapter.

## Concatenation

***Concatenation*** is a dyadic function which joins two arrays together. It is represented by comma (`,`). The function name is normally abbreviated to ***Catenate***, and we will use both terms.

### Concatenating Vectors

*Catenate* is easy to understand:

In [179]:
A ← 24 15 67 89
B ← 11 33 75
A,B

It is like joining two sentences together, so it is easy to remember which symbol to use. You can see that the length of the result (`≢(A,B)`) is equal to the sum of the lengths of the arguments (`(≢A) + (≢B)`):

In [180]:
((≢A)+(≢B))=≢A,B

Character strings are processed in the same way:

In [181]:
C ← 'Tell me'
D ← 'More'
C,D

Note that there is no space inserted between the contents of the two vectors, just like there was no number inserted between `A` and `B` when we concatenated them above. When we concatenate a vector of 7 characters (like `C`) with a vector of 4 characters (like `D`), the result must have 11 characters.

When you concatenate an empty vector to another vector, the result is the same as the original. Let us define an empty numeric vector `V`:

In [182]:
⊢V ← 0⍴0

We could use `⍬` instead.

Notice how the numeric vector `A` remains unchanged:

In [183]:
A,V

Similarly, concatenating character strings with `''` does nothing:

In [184]:
C,'',D

### Concatenating Other Arrays

It is possible to concatenate two arrays if their shapes are compatible. The axis along which the concatenation is to be performed must be specified, if it is different from the default.

Let us use three matrices `A`, `B` and `C`:

In [185]:
⊢A ← 3 4 ⍴ 'A'

In [186]:
⊢B ← 2 4 ⍴ 'B'

In [187]:
⊢C ← 3 3 ⍴ 'C'

The possible concatenations are:

 - the vertical concatenation of `A` and `B`:

```{figure} ../res/Vertical_Concatenation_AB.png
---
name: Vertical_Concatenation_AB
---
Matrices `A` and `B` concatenated together vertically
```

 - and the horizontal concatenation of `A` and `C`:
 
```{figure} ../res/Horizontal_Concatenation_AC.png
---
name: Horizontal_Concatenation_AC
---
Matrices `A` and `C` concatenated together horizontally
```

It is not possible to concatenate `B` and `C` because none of their dimensions are compatible.

To concatenate vertically, we want the two matrices to be stacked on top of each other, creating a matrix with more *rows*, i.e. we want to create a matrix with a larger 1st dimension:

In [188]:
A,[1]B

The shape of the result is

In [189]:
⍴A,[1]B

and writing

In [190]:
B,[1]A

puts `B` on top of `A` instead.

Similarly, to stack matrices `A` and `C` horizontally we create a matrix where the 2nd dimension is larger:

In [191]:
A,[2]C

The shape of the result is

In [192]:
⍴A,[2]C

and writing

In [193]:
C,[2]A

puts `C` to the left of `A` instead.

In the same way as for *Reduction*, the *Axis* operator indicates which dimension will change during the operation, as we can see by inspecting the shapes of `A`, `B`, `C` and the stacked matrices:

In [194]:
⍴A

In [195]:
⍴B

In [196]:
⍴A,[1]B

Using `,[1]` changes the 1st dimension.

Using `,[2]` instead, we change the 2nd dimension:

In [197]:
⍴A

In [198]:
⍴C

In [199]:
⍴A,[2]C

The following nested array shows the four possible different catenations side by side:

In [200]:
(A,[1]B) (B,[1]A) (A,[2]C) (C,[2]A)
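
By contrast, `B` (2 rows, 4 columns) and `C` (3 rows, 3 columns) share no compatible dimension. As a quick sketch, both attempts below are expected to fail:

```APL
B,[1]C      ⍝ LENGTH ERROR: 4 columns against 3 columns
B,[2]C      ⍝ LENGTH ERROR: 2 rows against 3 rows
```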

```{admonition} Rule 
:class: tip
It is possible to ***Concatenate*** two arrays `A` and `B` along their *I*<sup>th</sup> dimension, provided that they have the same rank, and provided that all other dimensions have the same lengths.

The operation is written like this: `A,[I]B`

It is also possible to ***Concatenate*** an array `A` of rank *N* to another array `B` of rank *N-1*. The concatenation **must** then be done along a dimension of `A` such that its other dimensions are strictly identical to those of `B`.
```

For example, it is possible to concatenate a vector to a matrix, provided that the vector has the same length as the corresponding dimension of the matrix:

In [201]:
A,[1]'JUMP'

In [202]:
A,[2]'TOP'

In the first example, `'JUMP'` has length 4 and the shape of `A` is `3 4`, so we can see `'JUMP'` as having shape `1 4` and so we **must** catenate along `[1]`. The shape of the result is

In [203]:
⍴A,[1]'JUMP'

and we see it was the 1st dimension that changed.

In the second example, `'TOP'` has length 3 and the shape of `A` is `3 4`, so we can see `'TOP'` as having shape `3 1` and so we **must** catenate along `[2]`. The shape of the result is

In [204]:
⍴A,[2]'TOP'

and we see it was the 2nd dimension that changed.

```{admonition} Example 
:class: tip
We can add a row of totals to the bottom of a matrix `Y` with an expression like `Y,[1] (+/[1]Y)`:
```


In [205]:
Forecast,[1] (+/[1]Forecast)

The parentheses are for ease of interpretation; they are not necessary.
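
As a sketch, here is the same expression without the parentheses, followed by a variation that appends a column of totals on the right instead of a row at the bottom:

```APL
Forecast,[1]+/[1]Forecast     ⍝ same row of totals, without the parentheses
Forecast,[2]+/[2]Forecast     ⍝ a column of totals appended on the right
```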

In a similar way, it is possible to concatenate a matrix to a 3-D array.

```{admonition} Example 
:class: tip
We would like to append to `Prod` (a 3-D array) the production of a subcontractor, organised as an array of 5 years and 12 months.
```


In [206]:
⎕RL ← 73
⊢Subcon ← ?5 12⍴20

The shape of `Subcon` is

In [207]:
⍴Subcon

and the shape of `Prod` is

In [208]:
⍴Prod

The two **must** be concatenated along the 2nd dimension of `Prod` and the result will have the shape

In [209]:
⍴Prod,[2]Subcon

You see, it is as if `Subcon` had the length 1 along the concatenation dimension (the missing one), i.e. it is as if `Subcon` had shape `5 1 12`.

In [210]:
Prod,[2]Subcon

### Concatenating Scalars

When a scalar is concatenated to an array it is repeated as many times as necessary to match the length of the appropriate dimension of the array.

Here are two examples, using the matrix `A` from before:

In [211]:
A,[1]'-'

In [212]:
A,[2]'*'

This property is very useful, because it saves us working out how many items are needed to match the corresponding dimension of the array.

We can also concatenate two scalars. The result is of course a 2-item vector:

In [213]:
7,9

### Special Notations

By default, if no axis is specified, catenation works along the last dimension of the array(s).

So

In [214]:
A,C

is equivalent to

In [215]:
A,[2]C

In [216]:
(A,C)≡(A,[2]C)

APL also includes a special symbol that means "Concatenate along the **first** dimension"; this symbol is a comma topped by a minus sign: `⍪`.

It can be obtained by <kbd>APL</kbd>+<kbd>Shift</kbd>+<kbd>,</kbd>.

So

In [217]:
A⍪B

is equivalent to

In [218]:
A,[1]B

In [219]:
(A⍪B)≡(A,[1]B)

```{admonition} Note 
:class: note
If an axis is specified, the operation is processed according to the axis specification, regardless of the symbol (`,` or `⍪`) that is actually used.
```

This means `A,[2]C` and `A⍪[2]C` are both equivalent to `A,C`, while `A,[1]B` and `A⍪[1]B` are both equivalent to `A⍪B`.
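
These equivalences can be checked with *Match*. The two expressions below are a small sketch using the matrices `A`, `B` and `C` defined earlier, and both should return 1:

```APL
(A⍪[2]C)≡(A,C)      ⍝ ⍪ with an explicit axis 2 behaves like plain ,
(A⍪[1]B)≡(A⍪B)      ⍝ ⍪ with an explicit axis 1 behaves like plain ⍪
```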

### Idioms

It is common to see *Catenate* used in conjunction with *Reduce*, as `,/`. Let us start by understanding what it does:

In [220]:
0 (1 2) (3 4 5) (6 7 8 9)

In [221]:
,/0 (1 2) (3 4 5) (6 7 8 9)

In [222]:
'a' 'bcd' 'efghi'

In [223]:
,/ 'a' 'bcd' 'efghi'

We can see above that *Reduce by Catenate* joins the items of a vector all together.
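
One detail is worth a short sketch. Because *Reduction* lowers the rank by 1, reducing a vector gives a scalar, so the joined vector comes back enclosed in a scalar. The last line uses the primitive `⊃` to take the vector out of that scalar; `⊃` has not been presented in this section, so treat it as an assumption of the sketch:

```APL
⍴ ,/ 'a' 'bcd' 'efghi'        ⍝ empty shape: the result is a scalar
≡ ,/ 'a' 'bcd' 'efghi'        ⍝ 2: a scalar enclosing a simple vector
⊃ ,/ 'a' 'bcd' 'efghi'        ⍝ abcdefghi, the simple 9-item character vector
```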

We call these common combinations of primitives ***Idioms***. These are expressions which can be understood as an entity at first sight (with some practice!).

For someone who knows nothing of APL the expression above may take a bit of time to digest (even if they have an extensive knowledge of other programming languages), and they cannot readily appreciate that an APL programmer can understand it immediately, without having to read each of the symbols one by one.

This is not a paradox. For young children, reading the word "Daddy" is complex. It requires the comprehension of a sequence of letters one-by-one. I presume that you no longer do that, do you? **You do not read the letters**; you understand **the word** as a whole. This is exactly the same for the above idiom.

Incidentally, it is not just the APL programmer who is capable of processing an idiomatic expression in its entirety. Dyalog APL itself includes a special *Idiom Recognition* feature that speeds up the processing of APL code for many popular idioms. The Windows IDE even colours these idioms differently.

We have been using another short idiom fairly often: `≢⍴`, which computes the rank of an array. Other than the idioms automatically recognised by Dyalog APL, there are no objective criteria to determine whether a combination of primitives is an idiom or not: it depends on your skill, experience and even the context you are working in.

## Replication

### Basic Approach: Compression

To extract scattered values from a vector, we can use indexing:

In [224]:
Contents[5 6 11]

We can also use a new function named ***Compression*** (or *Compress*). It takes a Boolean vector as its left argument, and any array of appropriate shape as its right argument. The items of the right argument which match the 1s in the left argument are preserved, whereas those which match the 0s are removed. It acts like a mask or a filter:

In [225]:
0 1 1 0 / 42 15 79 66

In [226]:
1 0 1 0 0 0 0 1 1 / 'Drumstick'

This is extremely useful, because we can use *Compression* to select items which match a given condition.

For example, let us extract from `Contents` the values which are greater than 80.

The Boolean vector for the left argument is obtained by `Contents>80`, and the selection is made by:

In [227]:
(Contents>80) / Contents

Of course, the same operation can be applied to arrays of any rank. For higher dimensional arrays, we can use *Axis* to specify along which axis to compress. For example, consider a matrix of chemical formulas:

In [228]:
⊢Chemistry ← 3 5⍴'H2SO4CaCO3Fe2O3'

In [229]:
1 0 1 /[1] Chemistry

By using `/[1]` we are compressing the 1st dimension of `Chemistry`, hence selecting two rows, corresponding to the two 1s in the vector on the left.

In [230]:
1 1 0 1 0 /[2] Chemistry

If we use `/[2]` we are compressing the 2nd dimension of `Chemistry`, hence selecting three columns. In this example columns 3 and 5 have been removed.

*Compression* is an excellent tool which allows you to:

 - extract some useful items from a variable;
 - remove some unwanted items from a variable, which is the same thing.
 
```{admonition} Advice 
:class: hint
Every time you obtain a Boolean vector, you should immediately think of two major things you can do with it: *Count* or *select*.
```

For example, using `Contents`, we can produce a Boolean vector that shows which items are smaller than 50:

In [231]:
bin ← Contents < 50

Then, we can:

 - count the items that are smaller than 50:

In [232]:
+/ bin

 - select (or extract) said items:

In [233]:
bin / Contents

```{admonition} Hint 
:class: hint
Programmers who are new to APL and who are familiar with indexing as the natural selection mechanism may be tempted to use the Boolean selection vector to create some indices, and then use the indices to select the desired items.
```

This works very well, for example:

In [234]:
ix ← bin / ⍳⍴Contents
Contents[ix]

However, this is an unnecessary complication that wastes memory and processing time, compared to the straightforward selection shown above.

### Replication

In fact, *Compression* is just a special case of a more comprehensive function named ***Replication*** or ***Replicate***. Its left argument can be any vector of integer values, each of which produces the following result:

| Signum of the left item | Effect on the corresponding right item |
| :-: | :- |
| `1` | item is replicated the number of times specified by the left item |
| `0` | item is suppressed |
| `¯1` | item is replaced by as many "***Fill items***" as the magnitude of the left item indicates |

The concept of a "Fill item" is new and will be discussed in full in the "Nested Arrays" chapter. For now, you only need to know that the fill item for a simple numeric array is 0 and the fill item for a simple character array is a blank space.

Here are some examples, using the same left argument applied to numeric and character vectors:

In [235]:
0 1 3 0 / 42 15 79 66

42 and 66 have been removed, as their corresponding left items were 0. 15 was replicated 1 time and 79 was replicated 3 times, as indicated by their corresponding left items.

In [236]:
0 1 3 0 / 'boat'

In [237]:
2 ¯3 1 0 / 42 15 79 66

For the example above, 15 has been replaced by 3 zeroes because the fill item for simple numeric arrays is a zero.

In [238]:
2 ¯3 1 0 / 'boat'

For this example, the letter "o" was replaced by 3 blank spaces because the fill item for simple character arrays is " ".

### Scalar Left Argument

If the left argument of *Compression* or *Replication* is a scalar, it applies to all the items of the right argument.

In [239]:
v ← 'Phew'
1/v

When the left argument is 1 all the items are retained.

In [240]:
3/v        ⍝ Repeat each letter 3 times.

In [241]:
0/v        ⍝ Repeat each letter 0 times.

In [242]:
''≡0/v

As you can see above, when we use 0 as the left argument to *Replicate* we get an empty vector. Because `v` was a simple character vector, we get an empty character vector.
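
The numeric case can be sketched in the same way. With a simple numeric vector on the right, the result is expected to match the empty numeric vector `⍬` but not the empty character vector `''`:

```APL
⍬≡0/42 15 79 66     ⍝ 1: an empty numeric vector
''≡0/42 15 79 66    ⍝ 0: empty, but numeric rather than character
```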

### Special Notations

Like *Reduction* and *Catenation*, *Replication* works by default along the last dimension of an array. However, it is possible for it to work on any dimension by using the *Axis*. It is also possible to use `⌿`, which we have already seen, to work on the first dimension by default.

For example,

In [243]:
0 1 0 ⌿ Chemistry

which is equivalent to

In [244]:
0 1 0 /[1] Chemistry

Beware, the result obtained this way is **not** a vector, but a matrix having only one row:

In [245]:
⍴0 1 0⌿Chemistry

You must not confuse *Reduction* and *Replication*: even if the symbol used is the same, they are completely different operations:

 - ***Reduction*** takes a function as its left operand; it is a monadic *operator* (e.g. `+/ Contents`);
 - ***Replication*** takes a vector as its left argument and an array as its right argument; it is a dyadic *function* (e.g. `vec/ Contents`).
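
The contrast can be sketched with two short expressions that use made-up data. What stands to the left of `/` decides which of the two operations is meant:

```APL
+/ 1 0 1 1              ⍝ Reduction: the function + is the operand, the result is 3
1 0 1 1 / 'abcd'        ⍝ Replication: the vector is the left argument, the result is acd
```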

## Position (Index Of)

### Discovery

It is very often necessary to locate the positions of particular values in a list of items. To solve this, APL has a special function named ***Position*** (also called "***Index Of***"), represented by the Greek letter Iota (`⍳`). This symbol can be obtained by <kbd>APL</kbd>+<kbd>I</kbd> (the initial letter of Iota). Let us see how it works:

In [246]:
Vec ← 15 42 53 19 46 53 82 17 14 53 24
Vec ⍳ 19 14 53 49 15

Above we asked for the positions of 5 values (19, 14, 53, 49 and 15) and naturally we obtain 5 answers.

 - The result tells us that 19, 14 and 15 appear in positions 4, 9 and 1 respectively.

 - The result also tells us that 53 appears in position 3. This is of course true, but it also appears in positions 6 and 10, which are not included in the result. This is a necessary restriction: if we had searched for 5 values and obtained 7 results, it would not have been possible to say where each value appears. This is the reason why ***Index Of*** returns _only the first_ occurrence of each value.
 
We shall see later that this is an advantage. If instead we need to find all the positions in which a value occurs, there is another function that we can use (see the "Application 4" section below).

 - Surprisingly, the result tells us that 49 appears in position 12, though `Vec` has only 11 items! This is the way that *Index Of* indicates a missing value. We shall see that it is a great advantage, too.
 
The following rule explains how dyadic Iota works when the left argument is a vector:

```{admonition} Rule 
:class: tip
In the expression `R ← Haystack ⍳ Needles` we look for the `Needles` in the `Haystack`, and

- `Haystack` can be a vector of any type: numeric, character, mixed, nested;

- `Needles` can be any array (any type, any shape, any rank);

- `R` will have the same rank and shape as `Needles`;

- the items of `R` contain the positions of the first occurrence of the corresponding items of `Needles` in `Haystack`;

- items which do not appear in `Haystack` give the result `1+≢Haystack`.
```


In [247]:
'ABC' ⍳ 57

A number cannot occur in a 3-item character vector, so the result is 4.

In [248]:
4 8 ⍳ '4 8'

Similarly, characters cannot appear in a 2-item numeric vector, so each character results in a 3.

In [249]:
Alpha ← 'ABCDEFGHIJKLMNOPQRSTUVWXYZ 0123456789'
Alpha ⍳ Chemistry

The two lowercase letters in `Chemistry` give the answer 38 because `Alpha` has 37 items. Also notice that `Chemistry` and `Alpha ⍳ Chemistry` have the same shape, as the rule above specifies.

In [250]:
(⍴Chemistry)≡(⍴Alpha⍳Chemistry)

We can also use nested vectors:

In [251]:
'Tee' (3 7) 'Golf' ⍳ 3 7 (3 7) 'Tee' 'Green'

The function *Index Of* is one of the most important primitive functions in APL. It is very flexible and it can be used in many situations, as shown in the following examples.

```{admonition} Warning 
:class: warning
In the expression `A⍳B` we search for `B` in `A` whereas in `A∊B` we search for `A` in `B`. Do not be confused!
```
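
A short sketch may help to keep the two directions apart. The character vectors below are hypothetical and chosen only for illustration; in both expressions `'ABCDE'` is the set being searched.

```APL
'ABCDE' ⍳ 'DAZ'        ⍝ 4 1 6: where is each right-hand item in 'ABCDE'?
'DAZ' ∊ 'ABCDE'        ⍝ 1 1 0: is each left-hand item present in 'ABCDE'?
```

Note that the searched set is the left argument of `⍳` but the right argument of `∊`.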


### Application 3

A car manufacturer decided that they will offer their customers a discount on the catalogue price. The country has been split into 100 geographical areas, and the discount rate will depend on the geographic area according to the following table:

| Area | Discount |
| :-: | :-: |
| 17 | 9% |
| 50 | 8% |
| 59 | 6% |
| 84 | 5% |
| 89 | 4% |
| Others | 2% |

which we save to two vectors:

In [252]:
Area ← 17 50 59 84 89
Discount ← 9 8 6 5 4 2

The first task is to calculate the discount rate to be claimed for a potential customer who lives in area `D`; for example

In [253]:
D ← 84

Let us see if 84 is in the list of favoured areas:

In [254]:
Area ⍳ D

We can see that 84 is the 4th item in the list.

Let us find the current discount rate for this index position:

In [255]:
Discount[4]

This customer can claim a 5% discount.

We could simply write

In [256]:
Discount[Area⍳D]

Now, what if a customer lives in any other area, such as 75, 45 or 93?

The expression `Area⍳D` will return the result `6` for all these area codes, because these values are absent from `Area`.

Then `Discount[6]` will always find the rate 2%, as specified. Here we can see that it is an advantage that *Index Of* returns 1 + the number of items in the vector to be searched.

```{admonition} A Vector Solution 
:class: note
The importance of this approach to finding the discount rates is that it is vector-based. If publicity attracts crowds and therefore `D` is no longer a scalar but a vector, the solution is still valid.
```

As an example, consider the following vector of area codes and the respective discount rates:

In [257]:
D ← 24 75 89 60 92 50 51 50 84 66 17 89
Discount[Area⍳D]

We have achieved all this without writing a program, a "loop" or a "test". And it works for any number of areas. Readers who know other programming languages will probably appreciate the simplicity of this approach.

#### Changing The Frame of Reference

In reality, the expression that we have just written is an example of an algorithm for "*changing the frame of reference*". Don't panic, this term may seem esoteric, but the concept is simple: a list of area numbers (the initial set) is translated into a list of discount rates (the final set). The algorithm comprises only the function *Index Of* and indexing:

```{admonition} Algorithm 
:class: algorithm
`R ← FinalSet[InitialSet ⍳ Values]`
```

Let us imagine the initial set to be an alphabet composed of both lowercase and uppercase letters, and the final set to be composed of only uppercase letters, with a blank space in the middle:

In [258]:
AlphLower ← 'abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ'
AlphUpper ← 'ABCDEFGHIJKLMNOPQRSTUVWXYZ ABCDEFGHIJKLMNOPQRSTUVWXYZ*'

Now, let us write a sentence; we will write it in French in order to show what happens with missing characters:

In [259]:
Tale ← 'Le Petit Chaperon-Rouge a bouffé le Loup'

If we apply the algorithm seen above, the expression will convert the text from lower to upper case:

In [260]:
AlphUpper[AlphLower⍳Tale]

As one might expect, the characters `'-'` and `'é'`, which are absent from the initial alphabetic set, have been replaced by the `'*'`, the "extra" character at the end of the final set. This works because once again the final set is one item longer than the initial set.

Once more, the logical steps needed to solve the problem are easily translated into a programming solution, and the programmer can thereby direct all their attention to solving the problem.

### General Case

The left argument to *Index Of* need not be a vector. In fact, in the expression `Haystack ⍳ Needles`, `Haystack` can be a much more general array of any rank, type and shape, as long as it is _not_ a scalar. However, `Needles` must have a shape that is appropriate for the `Haystack` you use.

We will see what this means by using a 3-D `Haystack` as an example:

In [261]:
⊢Haystack ← 2 3 4⍴1 2 3 4 5 6 7 8 9 0

When *Index Of* is used, the first thing Dyalog APL does is look at its left argument and try to figure out the shape of the things contained in the `Haystack`. To do that, we first inspect the shape of the `Haystack`

In [262]:
⍴Haystack

and interpret it as

 > "*`Haystack` contains 2 items of shape `3 4`*"
 
which means we can look for needles with shape `3 4` inside the `Haystack`. Therefore, for `Needles` to be seen as "*containing items of shape `3 4`*" its shape must also end in `3 4`. Whatever is not the final `3 4` in the shape of `Needles` is what dictates the shape of the final result.

If the shape of `Needles` is too short for it to match the shape of the items the `Haystack` contains, we get a `RANK ERROR` because `Needles` doesn't have enough dimensions.

Similarly, if the shape of `Needles` is long enough but its trailing dimensions don't match those of the left argument, we get a `LENGTH ERROR` because the things we are comparing have different lengths along their dimensions.

In the table below we give examples of some `Haystack` and `Needles` shapes, along with the shape of the items that `⍳` thinks `Haystack` contains and the shape of the result (remember that `⍬` is the shape of a scalar):

| `⍴Haystack` | `⍴Needles` must end with | `⍴Needles` | `⍴R` |
| :- | :- | :- | :- |
| `2 3 4` | `3 4` | `3 4` | `⍬` |
| | | `3 1 3 4` | `3 1` |
| | | `3 1 4 4` | `LENGTH ERROR` |
| | | `1 2 3 3 4` | `1 2 3` |
| | | `3` | `RANK ERROR` |
| `6 2` | `2` | `4 2` | `4` |
| | | `2` | `⍬` |
| | | `3` | `LENGTH ERROR` |
| `5` | `⍬` | `1 2 3` | `1 2 3` |
| | | `1 3` | `1 3` |
| | | `5` | `5` |
| | | `⍬` | `⍬` |

As a practical example, if we take the `Haystack` we defined above, we can look for 3 by 4 matrices in there. Our `Needles` variable has 4 such matrices:

In [263]:
⊢Needles ← 4 3 4⍴1 2 3 4 5 6 7 8 9 0

In [264]:
Haystack ⍳ Needles

This shows that the first and second matrices were found in positions 1 and 2, respectively, and that the third and fourth matrices were not found at all, which is why they give 3 as a result: 3 is one more than 2, the number of *major cells* that `Haystack` has.

Now we write the rule that dictates how *Index Of* works in the general case:

```{admonition} Rule 
:class: tip
In the expression `R ← Haystack ⍳ Needles` we look for the `Needles` in the `Haystack`, and

- `Haystack` can be any array of rank `hr` with `hr` being at least 1;

- `Needles` can be any array of rank `nr` with `nr` being at least `hr-1`;

- the last `hr-1` numbers of the shape of `Needles` and the last `hr-1` numbers of the shape of `Haystack` must be the same;

- `R` has rank equal to `nr-(hr-1)` and shape equal to the first `nr-(hr-1)` numbers of the shape of `Needles`;

- the items of `R` contain the positions of the first occurrence of the corresponding items of `Needles` in `Haystack`;

- items which do not appear in `Haystack` give the result `1+m`, where `m` is the leading number in the shape of `Haystack`.
```
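
The rule can be tried out on a matrix `Haystack`, whose major cells are its rows. The sketch below uses two hypothetical variables, `Rows` and `Wanted`, which are not part of the chapter's data, and assumes the default index origin of 1.

```APL
Rows ← 6 2⍴⍳12                    ⍝ 6 major cells: (1 2) (3 4) (5 6) (7 8) (9 10) (11 12)
Wanted ← 4 2⍴3 4 11 12 1 3 5 6    ⍝ 4 candidate rows, each of shape 2
Rows ⍳ Wanted                     ⍝ 2 6 7 3: the row 1 3 is absent, so it gives 1+6
⍴Rows ⍳ Wanted                    ⍝ 4: the shape of Wanted without its trailing 2
```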


## Where

The primitive function ***Where*** is the monadic use of the *Iota Underbar* `⍸`, which you can type with <kbd>APL</kbd>+<kbd>Shift</kbd>+<kbd>I</kbd>. The simplest use case for this primitive is to give it a simple Boolean array, for which `⍸` finds ***Where*** the values 1 are.

For example, given the `Contents` vector:

In [265]:
Contents

What items are greater than 75?

In [266]:
Contents > 75

And *Where* are they?

In [267]:
⍸Contents > 75

They are at positions 3, 5, 6 and 11, referring respectively to the elements 78, 85, 96 and 82.

### Application 4

You probably remember that the function *Index Of* returns only the *first* occurrence of a value in a vector (cf. the "Discovery" subsection). Using *Where* we can find *all* the occurrences.

Here is a vector, in which we would like to find the positions of the number 19:

In [268]:
Vec ← 41 17 19 53 42 27 19 88 14 56 19 33

In [269]:
Vec=19

Now that we have a Boolean vector, we can just use `⍸` to find out *Where* the 1s are:

In [270]:
⍸Vec=19

It is as simple as that!

Of course the same search technique will work on characters, because what we really care about is the Boolean vector we generate at a later step, not what the initial vector was. For example, let us find all the letters "a" in a sentence:

In [271]:
Sentence ← 'Panama is a canal between Atlantic and Pacific'
⍸Sentence='a'

Having found all the "a"s, we may wish to find all the lowercase vowels. To do that, we need to create a Boolean vector with 1s in the positions with lowercase vowels. In the example above, `Sentence='a'` worked because we are allowed to compare a vector with a single scalar, but now we can't change this to

In [272]:
Sentence='aeiouy'        ⍝ 'y' is a vowel in many European languages

LENGTH ERROR: Mismatched left and right argument shapes
      Sentence='aeiouy'        ⍝ 'y' is a vowel in many European languages
              ∧


The code above does not work because `=` is trying to compare the two vectors item by item, only to realise the vector `'aeiouy'` is too short for that. Instead, what we can do is check whether or not each character of `Sentence` is a *member* of the character vector `'aeiouy'`:

In [273]:
Sentence∊'aeiouy'

Having computed this Boolean vector, the finishing touch is to compute *Where* the vowels were found:

In [274]:
⍸Sentence∊'aeiouy'

### Increasing The Dimension

Up until now we only used simple vectors as arguments to *Iota Underbar*, but in the beginning we talked about Boolean *arrays*, not *vectors*.

For higher dimensional arrays *Iota Underbar* behaves the same way: it returns the indices of the positions that contain 1s. Similar to what we did above, we might want to find all the lowercase vowels in the `MonMat` matrix:

In [275]:
MonMat

In [276]:
⍸MonMat∊'aeiouy'

The last item in the result above is `6 4`, which means that row 6, column 4 of `MonMat` contains a lowercase vowel: the final "e" in "June". We can verify this with the index function `⌷` you already learned about (cf. the subsection with the same name):

In [277]:
6⌷MonMat

In [278]:
6 4⌷MonMat

The example above also shows an interesting property of *Where*: it always returns a *vector*, regardless of the shape of the input, and each item of the resulting vector is a suitable index to `⌷`.
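
The same property can be seen on a small numeric matrix. The matrix `M` below is hypothetical, and the last line relies on the assumption that bracket indexing accepts a vector of row/column pairs and picks one item per pair.

```APL
M ← 2 3⍴3 9 4 7 2 8    ⍝ hypothetical matrix: rows 3 9 4 and 7 2 8
⍴⍸M>5                  ⍝ 3: a vector with one index per value above 5
1 2⌷M                  ⍝ 9: the first index found is 1 2
2 3⌷M                  ⍝ 8: the last index found is 2 3
M[⍸M>5]                ⍝ 9 7 8: all the values above 5, in row order
```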

### Simple, Not Nested

*Where* requires its argument to be a simple array. Each item of the argument is read as a count (0 or 1 in the Boolean case), and an item such as the vector `0 1` is not a number, so `⍸` cannot tell how many times the corresponding index should appear in the result.

In fact, if the argument to *Where* is a nested array, you get a `DOMAIN ERROR`:

In [279]:
⍸(0 1)(1 1)

DOMAIN ERROR
      ⍸(0 1)(1 1)
      ∧


### Beyond Boolean Arrays

Now that you have seen the most iconic use of *Where*, you will be shown the full specification for it, as we have only used Boolean arguments so far.

```{admonition} Rule 
:class: tip
In the expression `R ← ⍸Y`:

- `Y` must be a simple Boolean array or, more generally, a simple numeric array of non-negative integers;
- `R` is a vector, regardless of the shape of `Y`;
- each non-negative element of `Y` has its index repeated in `R` as many times as the element's value;
- if `Y` only contains zeroes, `R` is an empty vector;
- each element of `R` can be used as `i` in `i⌷Y` to retrieve an element from `Y`.
```

This repetition is similar to the way *Compress* works when the left argument only contains non-negative integers:

In [280]:
3 0 0 2 0 1 / 'words!'

The left argument to *Compress* tells it to use the 1st element 3 times, to use the 4th element 2 times and to use the 6th element 1 time.

The same argument to `⍸` returns the indices repeated as many times as they are needed:

In [281]:
⍸3 0 0 2 0 1

In this case, we could even achieve the same effect as the *Compress* expression above by using `[]` indexing:

In [282]:
'words!'[⍸3 0 0 2 0 1]
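
For a vector argument, this relationship can be written down directly: *Where* should give the same result as replicating the list of positions. The sketch below uses a hypothetical name `Counts` and spells the positions out by hand.

```APL
Counts ← 3 0 0 2 0 1
⍸Counts                               ⍝ 1 1 1 4 4 6
Counts / 1 2 3 4 5 6                  ⍝ 1 1 1 4 4 6
(⍸Counts) ≡ Counts / 1 2 3 4 5 6      ⍝ 1
```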

This breaks down if we want to include negative integers in the left argument to *Compress*, which *Where* doesn't support.

In fact, if the argument array contains negative integers or other numbers which aren't integers, `DOMAIN ERROR`s are issued:

In [283]:
⍸0 0 1 0 ¯1

DOMAIN ERROR: Where right argument must be non-negative
      ⍸0 0 1 0 ¯1
      ∧


In [284]:
⍸0 0 1 0 0.3

DOMAIN ERROR
      ⍸0 0 1 0 0.3
      ∧


### Comparison of *Membership*, *Index Of* and *Where*

We have discovered two different techniques that allow us to look up one set of values in another and to determine the positions of the items of one set in the other: one combines the primitive functions *Membership* and *Where*, the other uses *Index Of*. Depending on the problem that we have to solve, we can choose which of the two methods is most appropriate for the job in hand. Consider the following example:

#### Example

A company named "Blue Hammer Inc." has subsidiaries in a number of countries; each country being identified by a numeric code. The country names are stored in a matrix named `Countries`, and the country codes are stored in a vector named `Codes`. To make things easier to read, let us show those two variables:

In [285]:
⊢Countries ← 9 13⍴'France       Great BritainItaly        United StatesBelgium      Swiss        Sweden       Canada       Egypt        '

In [286]:
⊢Codes ← 50 43 12 83 64 34 66 81 37

Now let us show those two variables side by side:

In [287]:
Countries, Codes

So, Sweden is identified by 66 and Belgium is identified by 64.

All the sales made during the last month have been recorded in two vectors:

 - `BHCodes` identifies in which country each sale has been made, and
 - `BHAmounts` identifies the amount of each sale.
 
Here are the two vectors:

In [288]:
⊢BHCodes ← 83 66 12 83 43 66 50 81 12 83 12 66

In [289]:
⊢BHAmounts ← 609 727 458 469 463 219 431 602 519 317 663 631

Some countries have not sold anything (Belgium, for example) whereas other countries made several sales (Italy, for example).

#### First Question

We would like to focus on some selected countries (37, 43, 50 and 66) and calculate the total amount of their sales. Let's first identify which items of `BHCodes` are relevant:

In [290]:
Selected ← 37 43 50 66
BHCodes ∊ Selected

Then we can apply this filter to the amounts, and add them up:

In [291]:
(BHCodes ∊ Selected) / BHAmounts

In [292]:
+/ (BHCodes ∊ Selected) / BHAmounts

An alternative solution is to find the **positions** of the selected countries, then use this set of indices to get the amounts, and add them up. The result is, of course, the same:

In [293]:
Positions ← ⍸BHCodes ∊ Selected
+/BHAmounts[Positions]

As mentioned previously in the subsection about *Compression*, it is a kind of detour to solve this task using indexing, but here it serves to illustrate the different lookup methods.

Let us take a look at the selected countries and their positions in `BHCodes`:

In [294]:
Selected

In [295]:
Positions

Using *Membership*, we have obtained `Positions` which contains 5 items for the 4 countries in `Selected`. What does it tell us?

 - `Positions` contains the indices of *all* of the occurrences of the selected countries in the list of sales;
 - however, the items in `Positions` do not correspond to the items in `Selected` on a one-to-one basis; the first item of `Selected` is 37 and the first item of `Positions` is 2, yet the sale in position 2 was made in country #66, not #37;
 - it does not tell us that nothing was sold in country #37. Perhaps it would have been a good idea to identify this?
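
One possible way to spot the missing country is sketched here; it is only an illustration and not the only approach. The two variables are the ones defined above, repeated so that the block can be run on its own.

```APL
BHCodes ← 83 66 12 83 43 66 50 81 12 83 12 66
Selected ← 37 43 50 66
BHCodes ⍳ Selected                  ⍝ 13 5 7 2: 13 is 1+≢BHCodes, so 37 made no sale
(~Selected ∊ BHCodes) / Selected    ⍝ 37: the selected countries without any sale
```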

#### Second Question

Now, let us suppose that we want to display the names of the selected countries. To do this, we must determine the positions of the selected country codes in the entire list of country codes, and get the corresponding names.

If we use the *Membership* approach, here is what we get:

In [296]:
Selected

In [297]:
Positions ← ⍸Codes ∊ Selected
Countries[Positions;] , Selected

(As a side note, above we concatenated a character matrix with a numeric vector, so the result is a *Mixed* matrix.)

At first sight, this **seems** to be good: all the selected countries are displayed. However, they are not in the correct order: France is not 37, Sweden is not 50 and Egypt is not 66.

The problem with this method is the lack of a one-to-one correspondence between the selected countries and their positions in the list of country codes. The positions will always be in the order in which the countries appear in `Countries`, because of `⍸`. Moreover, the Boolean vector `Codes ∊ Selected` is completely independent of the order of the items in `Selected`: the expression returns the same result no matter how `Selected` is ordered.

The correct method to use in order to solve this task is to use the *Index Of* function (dyadic Iota):

In [298]:
Positions ← Codes ⍳ Selected
Countries[Positions;] , Selected

It is the one-to-one relationship between the items of the right argument to *Index Of* (`Selected`) and the items of its result (`Positions`) that guarantees a correct result.

#### Comparison

The following table summarises the most important properties of the two methods:

| Method | Properties |
| :- | :- |
| `Pos ← ⍸List ∊ Data` | the items in `Pos` do not have a 1-to-1 correspondence with the items in `Data` |
| | instead, the items in `Pos` correspond to the items in `List` |
| | `Pos` gives the positions of all the items of `List` that occur in `Data`, including repeated ones |
| | `Pos` does not explicitly identify missing values |
| `Pos ← List ⍳ Data` | the items in `Pos` **do** have a 1-to-1 correspondence with the items in `Data` |
| | `Pos` ignores multiple occurrences; just gives the first |
| | `Pos` identifies missing values |

The choice of method depends on the kind of problem you want to solve.
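
The properties in the table can be observed side by side on a tiny example. The values of `List` and `Data` below are hypothetical and serve only as an illustration.

```APL
List ← 10 20 30 20     ⍝ 20 occurs twice
Data ← 20 99 10        ⍝ 99 does not occur in List
⍸List ∊ Data           ⍝ 1 2 4: every matching position in List, in List order
List ⍳ Data            ⍝ 2 5 1: one answer per item of Data, 5 meaning "missing"
```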

## Index Generator

### Basic Usage

When used as a monadic function, the symbol *Iota* generates a vector of the first `N` positive integers. It is called ***Index Generator***.

In [299]:
⍳9

If we have to extract the first 12 items of a vector `Contents`, we can write:

In [300]:
Contents[1 2 3 4 5 6 7 8 9 10 11 12]

It is, of course, much easier to write

In [301]:
Contents[⍳12]

The result can be combined with simple arithmetic operations. For example, suppose we need to produce the following list of 6 values: 115 122 129 136 143 150 (note the increments of 7). We can do this as follows:

In [302]:
⍳6

In [303]:
(⍳6)-1

In [304]:
7×(⍳6)-1

In [305]:
115+7×(⍳6)-1

More generally, any arithmetic series of integers can be produced by the following algorithm:

```{admonition} Algorithm 
:class: algorithm
`R ← Origin + Step × (⍳Length) - 1`
```

```{admonition} Special Case 
:class: note
If `⍳N` gives a vector of length `N`, then `⍳0` should give a vector of length `0`, right? A vector having length 0 is an empty vector. In particular, this will be a *numeric* empty vector.
```

Let us check:

In [306]:
⍳0

Nothing gets displayed above, so it does look like the numeric empty vector `⍬`, which we can confirm with *Match*:

In [307]:
⍬≡⍳0

This was useful in early versions of APL, before the introduction of the *Zilde* symbol (`⍬`).

The definition of `⍳N` given above reflects only a limited part of what this function can do; you will find more information in the *Specialist's Section* at the end of this chapter.

### Application 5

Sometimes, a programmer needs to remove duplicate items from a vector and there is a well-known idiomatic way to do this. The idiom applies equally to numeric, character and nested vectors. Let us begin with a numeric vector:

In [308]:
Vec ← 12 89 57 46 12 50 36 47 83 46 27 12

The algorithm is based on the comparison of two vectors:

In [309]:
⍳≢Vec

which gives the position of each item of `Vec`, and

In [310]:
Vec⍳Vec

which identifies the first index in `Vec` in which each element of `Vec` is found. This may be a bit more complex to understand, so take your time to digest it. We are using `⍳` to identify the positions of the items of `Vec` in `Vec` itself. But because *Index Of* only returns the first occurrences, we get, for each item, the position where this value appears for the first time.

If we then compare these two vectors, we get a Boolean vector indicating which items of `Vec` are their first occurrences and which are repetitions:

In [311]:
(⍳≢Vec)=Vec⍳Vec

Finally, we can use *Compress* to keep only the first occurrences, so that the algorithm is as follows:

```{admonition} Algorithm 
:class: algorithm
`((⍳≢Vector) = Vector ⍳ Vector) / Vector`
```


In [312]:
((⍳≢Vec)=Vec⍳Vec)/Vec

The result above has *no* duplicates.

It also works on character arrays and on nested arrays:

In [313]:
Text ← 'All men are created equal'
((⍳≢Text)=Text⍳Text)/Text

In [314]:
⊢Nested ← 'one' 1 (2 2⍴⍳4) 9 'nine' 'five' 'nine' 1 'two' (2 2⍴⍳4) 'two' 'one'

In [315]:
((⍳≢Nested)=Nested⍳Nested)/Nested

#### Unique

Although still useful as an example, the idiom described above is now completely obsolete, because there is a primitive function in APL that does exactly this.

The primitive, called ***Unique***, is represented by the symbol `∪` (<kbd>APL</kbd>+<kbd>V</kbd>) used as a monadic function:

In [316]:
∪ Vec

In [317]:
∪ Text

In [318]:
∪ Nested

The symbol `∪` is the mathematical symbol for set union and we will see this is *not* a coincidence.
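
As a quick sanity check, and assuming the same `Vec` as in the idiom above, we would expect the primitive and the idiom to agree:

```APL
Vec ← 12 89 57 46 12 50 36 47 83 46 27 12
∪ Vec                                 ⍝ 12 89 57 46 50 36 47 83 27
(∪ Vec) ≡ ((⍳≢Vec)=Vec⍳Vec)/Vec       ⍝ 1: both keep the first occurrences, in order
```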

### Application 6

Some applications of *Index Generator* are extremely basic, but so useful! Suppose that you invest a certain sum of money, €6,000 for example, and you expect an interest rate of 4% p.a. How is the investment expected to grow in the next 5 years?

We will have to calculate 1.04 to the power of 0, 1, 2, 3, 4 and 5, where the power 0 gives the initial sum. The *Index Generator* will help us:

In [319]:
6000×1.04*(⍳6)-1

## Ravel

The function ***Ravel*** is represented by the monadic use of comma (`,`). Applied to any array, it returns all its items as a vector.

Naturally, if the array is already a vector, *Ravel* does not change anything.

Let us see how it works on some matrices.

In [320]:
⎕RL ← 73
⊢Tests ← ?6 3⍴100

In [321]:
,Tests

The items of the matrix have been strung out and returned as a vector, row by row.

In [322]:
Chemistry

In [323]:
,Chemistry

A common use of ***Ravel*** is to transform a scalar into a 1-item vector. The difference between a scalar and a 1-item vector is not readily obvious, until you use it as an index into a matrix.

Suppose that you need to select a particular set of columns `Cols` from the matrix `Forecast`. As long as `Cols` contains more than one value, the result will be a matrix:

In [324]:
Cols ← 1 4 6
Forecast[;Cols]

But if `Cols` happens to have only a single value, which is a scalar, the result returned is a vector (we already mentioned this in the subsection on "Array Indexing"):

In [325]:
Cols ← 4
Forecast[;Cols]

The rank of the result may be critical if some other expression in your program expects a matrix.

To make certain that your indexing expression always returns a matrix, you must ensure that your index will always be a vector by using *Ravel*:

In [326]:
Forecast[;,Cols]

With `,Cols`, whatever the rank of `Cols` is, the result will always be a matrix.
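
The effect on the rank is easy to observe by looking at shapes. The matrix `M` below is a hypothetical stand-in for `Forecast`, used only so that the sketch can be run on its own.

```APL
M ← 3 4⍴⍳12     ⍝ hypothetical 3 by 4 matrix
⍴M[;4]          ⍝ 3: a scalar index gives a vector
⍴M[;,4]         ⍝ 3 1: a 1-item vector index gives a 1-column matrix
⍴M[;,1 4]       ⍝ 3 2: Ravel leaves a vector index unchanged
```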

*Ravel* can be associated with an *Axis* specifier; this will be discussed in the *Specialist's Section*.

## Empty Vectors and Black Holes

When we apply a scalar dyadic function to a vector and a scalar, the rule is that the result has the same size as the vector:

In [327]:
42 75 86 31 + 10

Above, a scalar added to a 4-item vector gives a 4-item vector, too.

Below, a 7-item vector compared to a scalar gives a 7-item vector:

In [328]:
'MAMMOTH' = 'M'

But what happens if the vector is empty? The rule says that the result must have the same size as the vector: therefore it should be empty, too, regardless of the (scalar) function that we used!

In [329]:
Hole ← ⍬
Hole + 3

Notice how nothing is displayed above: the result is empty.

The same thing happens if we multiply by 100, for example:

In [330]:
Hole × 100

In [331]:
Hole = 0

In [332]:
Hole = Hole

In [333]:
⍴Hole

In [334]:
Hole ≡ ⍬

If you were working on the terminal and getting all these empty results, you could compare `Hole` to `⍬` with *Match* and conclude `Hole` is an empty numeric vector.

Empty vectors look very much like black holes: they absorb everything (but only when used with scalar functions). This may lead to some unexpected results.

### Not All Empty Vectors Are Created Equal

Something that might throw off new APLers is the fact that there are infinitely many different empty vectors. The first two examples are `⍬` and `''`:

In [335]:
⍬

In [336]:
''

They look the same when displayed as results, but they are *not* the same:

In [337]:
⍬≡''
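
The difference does not lie in their shapes. As the following sketch suggests, both are vectors of length 0, so it must be the kind of items they would hold that sets them apart:

```APL
(⍴⍬) ≡ ⍴''      ⍝ 1: the two shapes are identical
(≢⍬) , ≢''      ⍝ 0 0: both have no items
```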

And this is just the tip of the iceberg. If you recall the rule for *Where* (monadic `⍸`), we know that `⍸Y` returns an empty vector if `Y` only contains 0s, but an empty vector of *what*?

If `Y` contains positive integers, then the result of `⍸Y` will contain some indices that can be used to index into `Y` and the shape of those indices depends on the shape of `Y`:

In [338]:
⍸ 1 0 1 0

In [339]:
⍸ 2 2⍴1 0 1 0

So in the first example we see a vector with 2 integers and in the second example we see a vector with 2 vectors, each containing 2 items.

If we change one of the 1s to a 0, here is what we will be getting:

In [340]:
⍸ 1 0 0 0

In [341]:
⍸ 2 2⍴1 0 0 0

Now we have a vector with 1 integer and a vector with 1 vector of 2 items.

If we change the final 1 to a 0, we will be getting a vector with 0 integers (an empty numeric vector `⍬`) and a vector with 0 vectors of 2 items. For APL, these are *not* the same thing:

In [342]:
⍸ 0 0 0 0

In [343]:
⍬≡⍸ 0 0 0 0

In [344]:
⍸ 2 2⍴0 0 0 0

In [345]:
⍬≡⍸ 2 2⍴0 0 0 0        ⍝ different empty vectors

```{admonition} Tip 
:class: tip
When you are dealing with empty vectors, you can use the incantation `⎕SE.Dyalog.Utils.repObj` to inspect their actual structure.
```

To illustrate this incantation, we will see what it returns for the empty vectors we saw above:

In [346]:
⎕SE.Dyalog.Utils.repObj ''

In [347]:
⎕SE.Dyalog.Utils.repObj ⍬

In [348]:
⎕SE.Dyalog.Utils.repObj ⍸ 0 0 0 0

In [349]:
⎕SE.Dyalog.Utils.repObj ⍸ 2 2⍴0 0 0 0

What the code cell above tells us is that `⍸ 2 2⍴0 0 0 0` matches `0⍴⊂2⍴0`, which you can check by copying & pasting the result above, or using <kbd>APL</kbd>+<kbd>Z</kbd> to type `⊂`:

In [350]:
(0⍴⊂2⍴0)≡(⍸ 2 2⍴0 0 0 0)

(You will learn about what `⊂` does in a future chapter, don't worry.)

```{figure} ../res/Hierarchy.png
---
name: Hierarchy
---

```


## Exercises

```{admonition} Warning 
:class: warning
*The following exercises are designed to train you, not the computer.*

*For this reason, we suggest that you try to answer them on a sheet of paper, not on your computer. When you are sure of your answer, you can test it on the computer.*
```

**Exercise**:

Can you evaluate the following expressions?

```APL
3 × 2 + 6 ≠ 3 × 2
12 6 27 ⌊ 11 + ⍳3
4 5 6 ⌈ 4 + 2 5 9 > 1 6 8
7 ⌊ 25 6 17 - (2 × 3) + 9 3 5
((8 + 6) × 2 + 1) × 3 - 6 ÷ 3
(⍴4⌈5) + 4⌈5
1 ⊢ 2 ⊣ 3 ⊢ 4 ⊣ 5 ⊢ 6 ⊣ 7
1 ⊣ 2 ⊢ 3 ⊣ 4 ⊢ 5 ⊣ 6 ⊢ 7
1 2 3 ⊢ 4 5 6
```

**Exercise**:

Try to evaluate the following expressions. *Be careful*: they are not as simple as might first appear!

```APL
2 2+2 2
2+2 2+2
2+2,2+2
2,2+2,2
```

**Exercise**:

Given the vector `A ← 8 2 7 5`, compare the results obtained from the following sets of expressions:

```APL
1+⍴A
⍴A+1
```

and

```APL
1+⍳⍴A
⍳¯1+⍴A
⍳⍴A-1
```

**Exercise**:

Using your knowledge of the order of evaluation in APL, re-write the following expressions without using parentheses.

```APL
((⍳4)-1)⌈3
7⌊(⍳9)⌈3
1+((⍳5)=1 4 3 2 5)×5
```

**Exercise**:

Given a variable `A`, find an expression which returns the answer 1 if `A` is a scalar, and 0 if it is not.

**Exercise**:

Given two scalars `A` and `B`, write an expression which gives 7 if `A` is greater than or equal to `B` and 3 if `A` is smaller than `B`.

**Exercise**:

Given two scalars `A` and `B`, find an expression which returns:
 - an empty vector, if `A` is zero, whatever the value of `B`;
 - 0, if `B` is zero, but `A` is not;
 - 3, if neither `A` nor `B` is zero.

**Exercise**:

Unfortunately, your keyboard has been damaged and your `∧` and `∨` keys no longer work. Which other symbols could you use to replace them?

You can test your solutions on these vectors:

In [351]:
L ← 0 0 1 1
R ← 0 1 0 1

**Exercise**:

Given these three vectors:

In [352]:
G ← 1 1 1 0 0 1
M ← 0 0 1 1 0 1
D ← 1 0 1 0 1 0

Evaluate the following expressions:

```APL
G∨D
~G∧D
~G∨~D
D∧~G
G∧M∨D
(~D)∧(~G)
(M⌈G)=(M⌊D)
(M⌊G)≠(M⌈D)
```

**Exercise**:

Evaluate the following expressions:

```APL
0 < 0 ≤ 0 = 0 ≥ 0 > 0
'sugar' ∊ 'salt'
11 ≠ '11'
'14' ⍳ '41'
```

**Exercise**:

Write an expression to compute the number of times the letter "e" appears in a character vector. You can test it with this vector:

In [353]:
Text ← 'The silence of the sea'

**Exercise**:

How many expressions can you write to retrieve the last element of a vector? Can you write one that uses the *Reduce* operator `/`?

You can test your expressions on these vectors:

In [361]:
Vec ← 1 2 3 4 5 6
Vec ← (1 2) 3 (4 5) 6
Vec ← 0 (5⍴0) (3 4⍴0) (1 2 3⍴0)

**Exercise**:

We have conducted some experiments on a variable `Z`:

```APL
      2 ⍴ Z
1 7
      +⌿ Z
20
      Z = 9
0
0
1
0
```

What is the value of `Z`?

**Exercise**:

We have conducted some experiments on a variable `Z`:

```APL
      Z = 0
0 1 0 0
1 0 0 1
      +/[2] Z
20 6
      +/[1] Z
8 7 6 5
```

What is the value of `Z`?

**Exercise**:

What are all the **positions** of the letter "e" in the character vector `Text` used in the earlier exercise about counting the letter "e"?

**Exercise**:

Given a vector `Vec` of any size and type (numeric or character or even nested), try to extract the items of `Vec` which are in the odd positions (the 1st, the 3rd, the 5th, ...).

You can test your solution on these vectors:

In [354]:
Vec ← 1 2 3 4 5 6 7 8 9
Vec ← 'CROANGGARYANTLULLBAPTWIZOSNSSB'
Vec ← (1 2) (2 3) (4 5 6) (6 9 42 1024)

**Exercise**:

How many numbers are there in the variable `Prod` used in this chapter?

**Exercise**:

How is it possible to remove all the values which do not fall between 20 (inclusive) and 30 (exclusive) from a given vector?

**Exercise**:

Notice how the results of these two expressions look the same:

In [363]:
≢Vec

In [364]:
⍴Vec

Can you tell if/how they are different?

**Exercise**:

Can you compute the *Depth* (`≡`) of these arrays:

```APL
42
42 41 40 39
2 2⍴42 41 40 39
1 (2 3) (4 5)
(1 2) (3 4 5) (6 7 8 9)
1 (2 (3 (4 (5 (6 7)))))
(2 2⍴4) (2 2⍴⍳47) (3 6⍴⍳18)
```

**Exercise**:

In a vector, we would like to replace all the values that are smaller than 20 by 20, and replace all the values that are greater than 30 by 30. How can we do that?

**Exercise**:

The following 6 expressions cannot be executed, but instead generate error messages; can you say why?

```APL
3+(5-(6+2)×4
121÷(⍳4)-3
(¯X+5)*2
⍴4 5 6+2 3-1
⍳3-5
⍸1 0 2 ¯1
```

**Exercise**:

Write an APL expression which produces a vector of 17 numbers, the first being 23, with each subsequent number being equal to the preceding one plus 11.

**Exercise**:

In a shop, each product is identified by a code. You are given a list of the codes and the corresponding prices:

In [357]:
PCodes ← 56 66 19 37 44 20 18 23 68 70 82
Prices ←  9 27 10 15 12  5  8  9 98  7 22

A customer gives you a list of items they intend to buy as a vector of code/quantity pairs: code, quantity, code, quantity, and so on.

In [358]:
Wannabuy ← 37 1 70 20 19 2 82 5 23 10

Can you evaluate their bill?

Note that this cannot be done easily in a single (and readable) APL expression, and you will therefore need to write several expressions.

The correct total should be 375.
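
The stated total can be checked by hand. In this sketch, the unit prices `15 7 10 22 9` were looked up manually in `PCodes` and `Prices` for the five codes in `Wannabuy`, so the lookup itself, which is the point of the exercise, is still left to do.

```APL
      ⍝ unit prices (found by hand) times the quantities 1 20 2 5 10
      +/ 15 7 10 22 9 × 1 20 2 5 10
375
```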

**Exercise**:

We have organised a lottery, and we have created five vectors:

In [359]:
⎕RL ← 42
Tickets ← ?1000⍴999999
Sold ← Tickets[(800+?200)?1000]
Ours ← Sold[200?≢Sold]
Winners ← Tickets[100?1000]
Prizes ← ?100⍴1000

 - `Tickets` has the numbers of all the existing tickets;
 - `Sold` has the numbers of the tickets which have been sold;
 - `Ours` has the numbers of the tickets we bought ourselves;
 - `Winners` has the numbers of the winning tickets, and
 - `Prizes` has the prize amount associated with each respective number in the `Winners` vector.
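
The expressions that build these vectors rely on the two uses of `?`. As a minimal sketch on a smaller scale, assuming the default index origin (`⎕IO←1`): monadic `?` (*Roll*) draws independent numbers, so repeats are possible, while dyadic `?` (*Deal*) draws distinct numbers. Results are not shown because they depend on the random seed and on the interpreter.

```APL
      ⎕RL ← 42        ⍝ seed the generator so that results are repeatable
      ?5⍴10           ⍝ Roll: 5 independent numbers from 1 to 10, repeats possible
      3?10            ⍝ Deal: 3 distinct numbers from 1 to 10
      (2+?3)?10       ⍝ Deal: between 3 and 5 distinct numbers from 1 to 10
```

In the same way, `(800+?200)?1000` deals between 801 and 1000 distinct positions, so `Sold` holds that many ticket numbers.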
 
And now, try to answer the following 4 questions:

 1. What are the numbers of the unsold tickets?
 2. Are there some winning tickets which have not been sold?
 3. How many winning tickets do **we** have?
 4. How much did we win?

**Exercise**:

Can you calculate all the divisors of an integer number `N`?

Here are some results to check your answer:

| `N` | Divisors |
| :-: | :- |
| 1 | 1 |
| 2 | 1 2 |
| 3 | 1 3 |
| 24 | 1 2 3 4 6 8 12 24 |
| 1337 | 1 7 191 1337 |
| 1234321 | 1 11 101 121 1111 10201 12221 112211 1234321 |

**Solutions**: the solutions can be found at the end of this chapter.