# Operators

## Definitions

### Operators & Derived Functions

We have already seen some operators: _reduce_ (described [here](./Some-Primitive-Functions.ipynb#Reduce)), _axis_ (described [here](./Some-Primitive-Functions.ipynb#Axis-Specification)), and _each_ (described [here](./Nested-Arrays-Continued.ipynb#Each)).
Let us define precisely what they are:

 - there are built-in (_primitive_) operators and user-defined operators;
 - an _operator_ is similar to a function, but rather than working on arrays to produce a result which is also an array, an operator works on functions (and sometimes, arrays) to produce a new function;
 - the new function generated by the operator and its argument(s) is called a _derived function_. The derived function can be applied to arrays in the same way as any other function;
 - the arguments passed to the operator are often referred to as _operands_, to distinguish them from the arguments to the derived function;
 - monadic operators take a single operand on their **left**. This is in contrast to monadic functions, which take their argument on the right;
 - dyadic operators have two operands, one on each side. The operands to an operator are usually functions, but it is not uncommon for user-defined operators to take one function and one array operand;
 - the derived function, in turn, can be monadic, dyadic, or ambivalent; and
 - neither the functions supplied as operands to an operator, nor the resultant derived function, can be niladic.

For example, in the expression below, the operator _reduce_ (`/`) _operates on_ the function _plus_ to produce the derived function _plus reduce_.
This derived function is then applied to `3 5 6` to produce the result `14`:

In [1]:
+/3 5 6

<!-- begin beware style=warning -->
***Beware***:

 > Do not be confused by the fact that some symbols, such as `/` and `\`, are used to represent both a function and an operator.
<!-- end -->

Let us compare two expressions:

 1. using `/` as the function _compress_, which takes two array arguments:

In [2]:
1 1 0 1 0 / 6 2 9 4 5

 2. using `/` as the operator _reduce_, which takes a function as a left operand:

In [3]:
+/ 6 2 9 4 5

The association of `+` with `/` creates a _derived function_ which could be parenthesised as `(+/)`, even though it is not necessary to do so.

For clarification, we can define a synonym for the derived function:

In [4]:
Sum ← +/

Until now, we have only considered `+/` (or `Sum`) as a monadic derived function:

In [5]:
Sum 6 2 9 4 5

But we shall soon see that it may also be used as a dyadic function:

In [6]:
2 Sum 6 2 9 4 5

So, we can say that this _derived function_ is _ambivalent_.

### Sequences of Operators

_Derived functions_ behave exactly like plain primitive functions.
So, they can be the operand of a second (and a third, ...) operator:

In [7]:
+/¨ (3 4 6)(4 9 7 1)(3 1)

The left operand of the operator _each_ `¨` is the derived function `+/`, so we could have written:

In [8]:
Sum¨ (3 4 6)(4 9 7 1)(3 1)

Now, suppose that we no longer want to add up vectors, but three small matrices instead:

In [9]:
⎕← A ← 2 3⍴⍳6

In [10]:
⎕← B ← 4 2⍴1 0 0 1 0 1 1 0

In [11]:
⎕← C ← 3 5⍴8 3 4 2 0 0 3 5 1 7 3 6 2 1 7

Because they are matrices, we must specify the axis along which we add them up.
Of course, we could use the two symbols `/` and `⌿`, but if the arrays had been of a higher rank, an explicit axis specification might have been necessary.
It could also be that we just prefer to use the explicit axis specification.
If so, a third level of operator⁽*⁾ can be added:

<!-- begin footnote -->
***Note***:

 > Although the _axis specification_ shares some properties with operators, it is a special syntactical element and not really an operator. See [below](#Axis) for more information.
<!-- end -->

In [12]:
+/[2]¨A B C

In [13]:
+/[1]¨A B C

When in doubt about which function is the operand of which operator, try parenthesising the expression step by step, to make it clearer where each derived function comes from.

In the two examples above, the first operator is `/`, then the second "operator" is `[]`, and the third operator is `¨`:

In [14]:
(((+/)[2])¨)A B C

### List of Built-in Operators

Dyalog APL has a rich set of built-in operators.
You will find a full list with detailed syntax and examples in [the Appendix](./Appendices.ipynb#Dyalog-APL-Operators).

## More About Some Operators You Already Know

### Reduce

Up to now, we have used the operator _reduce_ with rather basic functions (`+`, `×`, `⌈`, `∧`), but it can also be used, less obviously, with functions like _reshape_, _compress_, and _replicate_.
In these cases, the derived function typically takes a 2-item nested vector as its argument, and the effect is to insert the function (the operand to the operator) between the two items of this vector.

Just remember that, since `+/ (2 4 3)(7 1 5)` is equivalent to `⊂(2 4 3) + (7 1 5)`, then `⍴/ (2 4 3)(7 1 5)` is also equivalent to `⊂(2 4 3) ⍴ (7 1 5)`.

Here is an example of a _reduction by reshape_:

In [15]:
⍴/(2 5)(3 1 9 4 1 0 7)

This _looks_ very similar to

In [16]:
2 5⍴3 1 9 4 1 0 7

But we can see the results are different.
The result of the _reduction by reshape_ is not a matrix, but a scalar containing a _nested_ matrix, for the reason already seen in [a previous section](./Nested-Arrays-Continued.ipynb#Reduction): the reduction of a vector always gives a scalar.
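A quick way to confirm this is to inspect the rank and depth of the result. The lines below are only a sketch, assuming the default migration level (so that `⊃` discloses) and using `r` as an illustrative variable name:

```APL
r ← ⍴/(2 5)(3 1 9 4 1 0 7)   ⍝ reduction by reshape
⍴⍴r                          ⍝ 0: r has rank 0, so it is a scalar
≡r                           ⍝ 2: a scalar enclosing a simple matrix
⍴⊃r                          ⍝ 2 5: shape of the enclosed matrix
(⊃r) ≡ 2 5⍴3 1 9 4 1 0 7     ⍝ 1: disclosing recovers the plain reshape
```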

Now, here is a _reduction by compression_, another by _replication_, and one by _index of_:

In [17]:
//(1 1 0 1 0 1 1)'Strange'

In [18]:
//(1 1 0 4 0 1 2)'Strange'

In [19]:
⍳/(2 6 1 7)(2 4⍴3 7 8 4 2 5 6 0)

### n-Wise Reduce

#### Elementary Definition

The derived functions of _reduce_ can also be used dyadically, that is, with two arguments.
When the left argument is present, the form is called _n-wise reduce_.

When applied to vectors, _n-wise reduce_ has the syntax `r ← n F/ vector`, where `F` denotes a dyadic function.

This special kind of _reduce_ splits the vector into overlapping slices of length equal to `n`, reduces each slice using the specified function `F`, and then catenates the results together.

So, for example, `2 ×/ 8 10 7 2 6 11` starts by creating overlapping slices of length `2`:

In [20]:
(8 10)(10 7)(7 2)(2 6)(6 11)

Then, we apply the reduction `×/` on each slice and catenate everything:

In [21]:
(×/8 10),(×/10 7),(×/7 2),(×/2 6),(×/6 11)

You can verify that this is, indeed, the result we get:

In [22]:
2 ×/ 8 10 7 2 6 11

As another example, we can explain the result of

In [23]:
3 +/ 8 10 7 2 6 11

because we create overlapping slices of length `3`, apply _plus-reduce_ on each slice, and then catenate everything together:

In [24]:
(+/8 10 7),(+/10 7 2),(+/7 2 6),(+/2 6 11)

The length of the result vector is `(1+≢vector)-n`.
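The length formula can be checked directly on the vector used above. This is a small sketch in which `vector` and `n` are illustrative names:

```APL
vector ← 8 10 7 2 6 11
n ← 3
n +/ vector                          ⍝ 25 19 15 19
(≢ n +/ vector) = (1+≢vector)-n      ⍝ 1: both sides are 4
```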

In the two examples above, we did not really need to do the catenation explicitly, because the results of applying `×/` and `+/` on each slice were simple scalars.
However, if we try an example where the reduction gives a nested result, we will see that we _do_ need to catenate everything together.

Here is an example of an _n-wise catenate reduction_:

In [25]:
2 ,/ 8 10 7 2 6 11

If we create the overlapping slices by hand and do not catenate the results together, we get a result that is nested too deep:

In [26]:
(,/8 10)(,/10 7)(,/7 2)(,/2 6)(,/6 11)

Thus, we do have to catenate the results after applying the reduction to each slice:

In [27]:
(,/8 10),(,/10 7),(,/7 2),(,/2 6),(,/6 11)

#### Full Definition

The general syntax is `r ← n F/[axis] array`, where `F` stands for any dyadic function.

 - the `array` is split into slices along the specified `axis`;
 - the left argument `n` can be positive (as in the examples above), zero, or negative;
 - if `n` is positive, _reduce_ is applied to slices of length equal to `n`;
 - if `n` is zero, the result is an array with the same shape as `array`, except that its length along the axis selected by `axis` is incremented by 1; the result is filled with the _identity item_ for the function `F`. This is explained in [a Specialist Section](#application-to-n-wise-reduction); and
 - if `n` is negative, each slice is reversed before _reduce_ is applied.

Here are some examples which use the matrix below:

In [28]:
tam ← 3 5⍴2 3 5 8 8 4 6 2 5 9 1 4 9 7 8

Find the largest items of 2 adjacent columns:

In [29]:
2 ⌈/ tam

Add up pairs of adjacent rows:

In [30]:
2 +⌿ tam

Return a matrix with one more column, filled with zeroes (identity item of addition):

In [31]:
0 +/ tam

Return a matrix with one more row, filled with ones (identity item of multiplication):

In [32]:
0 ×/[1] tam

Obtain the differences between adjacent values `(14-11)(15-14)...`:

In [33]:
¯2 -/ 11 14 15 21 23 30 28 34

### Axis

Strictly speaking, _axis_ is not an operator.
It has different syntax (consisting of two brackets enclosing a numeric value to the right of a function) and applies in different ways depending on the function that it modifies (its operand).
However, applying a function "with _axis_" does apply a transformation and produces a derived function, and it is common to think of _axis_ as an operator.

It is possible to use _axis_ with any of the scalar dyadic functions.
This can be useful, for example, to add the items of a vector to each of the rows of a matrix, or multiply the columns of a matrix by different values:

In [34]:
tam

In [35]:
tam +[1] 8 6 9

In [36]:
tam ×[2] 2 5 0 2 1

The list of all scalar dyadic functions is given in [an appendix](./Appendices.ipynb#Scalar-Functions).

The following functions can also use _axis_:

| Monadic Function | Description |
| :- | :- |
| `↑` and `↓` | _mix_ and _split_ |
| `⌽` `⊖` | _reverse_ |
| `,` | _ravel with axis_ |
| `⊂` | _enclose with axis_, _partitioned enclose_ |
| `⊂` and `⊃` | if `⎕ML>1`, APL2-like _split_ and _mix_, cf. [this section](./Nested-Arrays-Continued.ipynb#Compatibility-and-Migration-Level) |

| Dyadic Function | Description |
| :- | :- |
| `+` `×` `⌈` `∧` `≤` etc... | all scalar dyadic functions |
| `↑` and `↓` | _take_ and _drop_ |
| `/` and `⌿` | _compress_ and _replicate_ |
| `\` and `⍀` | _expand_ and _scan_ (see next section) |
| `⌽` `⊖` | _rotate_ |
| `,` `⍪` | _catenate_ |
| `,` `⍪` | _laminate_ |
| `⊂` | _partitioned enclose_ |

## Scan

### Definition

_Scan_ is represented by the symbol `\` or `⍀`.
Its most general syntax is `r ← F\[axis] array`, where `F` stands for any appropriate dyadic function.

To understand how it works, let us apply it to a vector.

The nth item of `F\vector` is equal to `F/n↑vector`. A worked example follows:

In [37]:
+\ 3 6 1 8 5

 1. the 1st item is equal to `+/3`, which is `3`;
 2. the 2nd item is equal to `+/3 6`, which is `9`;
 3. the 3rd item is equal to `+/3 6 1`, which is `10`;
 4. the 4th item is equal to `+/3 6 1 8`, which is `18`; and
 5. the 5th item is equal to `+/3 6 1 8 5`, which is `23`.

Try to use a reasoning similar to the one above to understand the result of the _times scan_ shown below:

In [38]:
×\ 3 6 1 8 5

Warning!
It would be a mistake to always try to deduce the value of each item in the result from its immediate left neighbour.
While this works for associative functions like addition and multiplication, it does not work for non-associative functions like subtraction.

In fact, the result of `-\ 3 6 1 8 5` is **not** `3 ¯3 ¯4 ¯12 ¯17`, but something else entirely:

In [39]:
-\ 3 6 1 8 5

 1. the 1st item is equal to `-/3`, which is `3`;
 2. the 2nd item is equal to `-/3 6`, which is `¯3`;
 3. the 3rd item is equal to `-/3 6 1`, which is `¯2`, the result of `3-6-1` and **not** `(3-6)-1`;
 4. the 4th item is equal to `-/3 6 1 8`, which is `¯10`; and
 5. the 5th item is equal to `-/3 6 1 8 5`, which is `¯5`.

So, be careful when using _scan_ with non-associative functions such as subtraction.
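The definition of _scan_ given earlier can be turned into a check. This is only a sketch, assuming the default index origin (`⎕IO←1`) and an illustrative variable `v`; it builds each item as the reduction of the first `n` items and compares the result with the primitive:

```APL
v ← 3 6 1 8 5
{-/⍵↑v}¨⍳≢v                ⍝ 3 ¯3 ¯2 ¯10 ¯5
(-\v) ≡ {-/⍵↑v}¨⍳≢v        ⍝ 1
(+\v) ≡ {+/⍵↑v}¨⍳≢v        ⍝ 1
```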

When applied to matrices or higher-rank arrays, _scan_ works along the specified axis.
If the axis specification is omitted, `\` works along the _last_ axis and `⍀` works along the _first_ axis.

In [40]:
+\[2]tam  ⍝ same as +\tam

In [41]:
+\[1]tam  ⍝ same as +⍀tam

### Scan with Binary Values

_Scan_ is very useful when applied to binary values.

In [42]:
∨\ 0 0 0 0 1 1 0 1 0 0 1 1

Because the function _or_ gives the result `1` as soon as one of its arguments is `1`, _or-scan_ repeats the first `1` up to the end of the vector.
In a way, you can see `∨\` as a knife spreading butter from left to right, and the `1` is the butter.

Other interesting patterns can be obtained by changing the function used.
For example, you can get the effect of "a knife spreading butter", where the `0` is the butter, if you use `∧\` instead of `∨\`:

In [43]:
∧\ 1 1 1 1 0 1 1 0 0 1 1 0

In the example above, as soon as _and-scan_ finds a `0`, everything else turns into a `0`.

The _less-than-scan_ marks the position of the first `1` and the _less-than-or-equal-scan_ marks the position of the first `0`:

In [44]:
<\ 0 0 0 0 1 1 0 1 0 0 1 1

In [45]:
≤\ 1 1 1 1 0 1 1 0 0 1 1 0

### Applications

_Scan_ can be used to solve common problems in a very simple way:

#### Inflate Values

Someone forecasts investments in a foreign country for the next five years:

In [46]:
inv ← 2000 5000 6000 4000 2000

But the country in question suffers from inflation, and the inflation rates are forecasted as follows:

In [47]:
inf ← 2.6 2.9 3.4 3.1 2.7

The cumulative sequence of these inflation rates can be calculated by multiplying them all with a _multiply-scan_:

In [48]:
7 3⍕ ×\ 1+inf÷100

Now, the investments expressed in "future values" would be:

In [49]:
9 2⍕ inv × ×\1+inf÷100

Finally, the year after year cumulative investment may be obtained by a _plus-scan_:

In [50]:
9 2⍕ +\ inv × ×\1+inf÷100

As you can see, we employed two _scans_ in the same expression.

#### Remove Leading/Trailing Blanks

One often has to remove leading (or trailing) blanks from a character vector.
We can use the _or-scan_ to do it.
The details of the method are shown here:

In [51]:
lb ← '    Remove my 4 leading blanks.'
lb≠' '

In [52]:
∨\ lb≠' '

In [53]:
(∨\ lb≠' ')/lb

This can be coded into a small utility function:

In [54]:
CutBlanks ← {(∨\' '≠⍵)/⍵}

This expression is recognised by Dyalog APL as an _idiom_ and is thus very fast.

To remove trailing blanks, it would suffice to reverse the vector, remove leading blanks as above, and then reverse it back again.
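The reverse-cut-reverse approach just described could be sketched as follows. The names `CutLeading` and `CutTrailing` are illustrative only, and the sample text is made up for the demonstration:

```APL
CutLeading  ← {(∨\' '≠⍵)/⍵}          ⍝ same definition as CutBlanks above
CutTrailing ← {⌽ CutLeading ⌽⍵}      ⍝ reverse, cut leading blanks, reverse back
'[',(CutTrailing 'Trailing blanks   '),']'   ⍝ [Trailing blanks]
```

The brackets are only there to make the absence of trailing blanks visible in the output.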

## Outer Product

### Definition

Imagine that you have calculated the multiplication table for the integers 1 to 9; you could present it like this:

$$\begin{array}{|c|rrrrrrrrr|}
\hline
\color{red}\times & 1 & 2 & 3 & 4 & 5 & 6 & \color{red} 7 & 8 & 9 \\
\hline
1 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
2 & 2 & 4 & 6 & 8 & 10 & 12 & 14 & 16 & 18 \\
\color{red} 3 & 3 & 6 & 9 & 12 & 15 & 18 & \color{red} {21} & 24 & 27 \\
4 & 4 & 8 & 12 & 16 & 20 & 24 & 28 & 32 & 36 \\
\text{etc.} & \text{etc.} & & & & & & & & \\
9 & 9 & 18 & 27 & 36 & 45 & 54 & 63 & 72 & 81 \\
\hline
\end{array}$$

The task of calculating this table consists of taking pairs of items of two vectors (the column and row headings) and combining them with the function at the top left.
For example, `3` times `7` gives `21` (highlighted in red above).
Once the operation has been repeated for all the possible pairs, one obtains what is called, in APL, the _outer product_.

We can change the values and replace the multiplications with additions:

$$\begin{array}{|c|rrrrrr|}
\hline
\color{red}+ & 8 & 5 & 15 & \color{red} 9 & 11 & 40\\
\hline
5 & 13 & 10 & 20 & 14 & 16 & 45 \\
\color{red} 4 & 12 & 9 & 19 & \color{red} 13 & 15 & 44 \\
10 & 18 & 15 & 25 & 19 & 21 & 50 \\
3 & 11 & 8 & 18 & 12 & 14 & 43 \\
\hline
\end{array}$$

The _outer product_ operator looks like `∘.F`, where `F` is an appropriate dyadic function.
The symbol `∘` is a _jot_ and you can type it with <kbd>APL</kbd>+<kbd>j</kbd>.

For example, for the multiplication table you can use `∘.×`:

In [55]:
(⍳9) ∘.× ⍳9

And for the addition table above, you can use `∘.+`:

In [56]:
5 4 10 3 ∘.+ 8 5 15 9 11 40

As you can see, the _outer product_ is also slightly special, in that the operand function `F` goes on the right of `∘.`, and not on the left.

Also, notice that the left column of the table is the left argument vector to the function derived from the _outer product_ and the top row is the right argument vector.
In fact, in the expression `r ← left ∘.F right`, the shape of the result `r` satisfies `(⍴r) ≡ (⍴left),⍴right`.
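The shape rule can be verified on the addition table. This is a minimal sketch, assuming the default index origin (`⎕IO←1`); `left`, `right`, and `r` are illustrative names:

```APL
left  ← 5 4 10 3
right ← 8 5 15 9 11 40
r ← left ∘.+ right
⍴r                        ⍝ 4 6
(⍴r) ≡ (⍴left),⍴right     ⍝ 1
r[2;4]                    ⍝ 13: item 2 of left plus item 4 of right (4+9)
```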

### Extensions

#### Other Functions

The function used in an _outer product_ can be any primitive or user-defined dyadic function, so _outer product_ is an operator of amazing power.

Imagine you have written a little function to calculate the length of the hypotenuse of a right-angled triangle from the lengths of the other 2 sides, given as the left and right arguments:

In [57]:
Hypo ← {((⍺*2)+(⍵*2))*0.5}
3 Hypo 4

You can test it on a number of combinations of lengths in one expression like this:

In [58]:
8 3⍕ 3 6 12 ∘.Hypo 4 1 8 7 5

Now, let us have some fun with relational functions:

In [59]:
(⍳5) ∘.= (⍳5)

In [60]:
(⍳5) ∘.< (⍳5)

In [61]:
(⍳5) ∘.≥ (⍳5)

We shall study some applications of _outer product_ like `∘.<` or `∘.⌊` in [a later subsection](#Applications-of-Outer-Product).

Whenever the function `F` does not produce a scalar, the _outer product_ `∘.F` produces a nested array. This is the case with _outer products_ like `∘.⍴`, `∘.,`, or `∘./`:

In [62]:
3 4 2 ∘.⍴ 6 3 7

In [63]:
3 0 2 ∘./ 5 1 7

In [64]:
3 1 2 ∘., 6 3 0 7

In [65]:
3 2 4 ∘.↑ 5 8 4

#### Other Shapes and Types of Data

So far, we have applied _outer product_ to numeric vectors; it can, of course, also be used with character data and higher-rank arrays.
When applied to higher rank arrays, the result becomes very big quickly, because each item of the left array has to be combined with each item of the right one.

Remember, in the expression `r ← left ∘.F right`, the shape of `r` is equal to `(⍴left),(⍴right)`.

In [66]:
⎕← left ← ↑'DIMITRI' 'GUNTHER'

In [67]:
right ← 'VERONICA'

Now, we wish to study the result of `left ∘.= right`.
To help you visualise the comparisons being made, we catenate the left argument matrix on the left of the result and catenate the right argument vector on top:

In [68]:
(2 9⍴' ',right)⍪[2]left,left ∘.= right

The left argument is a matrix with shape `2 7` and the right argument is a vector with shape `8`, so the result is a 3D array with shape `2 7 8`.
Each of the two major cells corresponds to comparing one of the names in the left argument to the name `'VERONICA'`.

### Applications of Outer Product

#### Exhaustive Search

Because _outer product_ uses a dyadic function to combine **all** items of the left argument with **all** items of the right argument, _outer product_ is often used when some kind of exhaustive computation needs to be done.
One such example is that of exhaustive search.

As an example, suppose you want to figure out if there is a way to add one number from the vector `5 1 16 42 63 7 10` to another number from the vector `24 45 18 31 29 43 67` to get `73`.

With _outer product_, this is quite an easy question to answer:

In [81]:
5 1 16 42 63 7 10∘.+24 45 18 31 29 43 67

In [82]:
73∊5 1 16 42 63 7 10∘.+24 45 18 31 29 43 67

If you flip the arguments to _membership_ and use _where_, you can find the position(s) where `73` is:

In [83]:
⍸(5 1 16 42 63 7 10∘.+24 45 18 31 29 43 67)∊73

This pattern of exhaustive computation is fairly common. Although it is rarely the most computationally efficient way of solving a problem, it is usually fast enough for a first prototype.

#### Draw a Bar Chart

Imagine that you have to represent a list of values with a bar chart.
Perhaps you will use dedicated graphical software, and you'd be right, but just have a look at this elegant solution, which again uses an _outer product_.

Here is the list of values that we want to chart:

In [78]:
nums ← 1 3 0 7 9 8 5 4 2 3 1

Let us first calculate the vertical scale.
It is made of the integers from 9 down to 1 and can be obtained by:

In [79]:
⌽⍳⌈/nums

Then, let us compare this scale to the values; an _outer product_ will build columns of `1`s up to the correct height:

In [80]:
(⌽⍳⌈/nums) ∘.≤ nums

Finally, to draw the graph, we can index a two-character vector, exactly as we did in [a previous section](./Data-and-Variables.ipynb#The-Shape-of-the-Result):

In [81]:
' ⎕'[1+(⌽⍳⌈/nums)∘.≤nums]

#### Decreasing Refunding

Some students have spent money to buy expensive books for their studies:

In [82]:
exp ← 740 310 1240 620 800 460 1060

Their university agrees to refund them, but places the following limits on the refunding rates:

| Expense range | Refund rate |
| :- | :- |
| 0 - 500 | 80% |
| 500 - 900 | 50% |
| 900+ | 0% |

We could say exactly the same thing in a somewhat different way:

Expenses from 0 to 900 have a refund rate of 50% and expenses up to 500 get an **additional** 30%.

Even if this rule may seem strange, both methods give the same result.
For example, a student who spent 740€ would get:

 - using the initial table, 80% of 500 plus 50% of 240:

In [83]:
(0.8×500)+0.5×240

 - using the reworded rule, 50% of 740 plus 30% of 500:

In [84]:
(0.5×740)+0.3×500

Now, let us limit the expenses to the given maxima:

In [85]:
exp ∘.⌊ 900 500

The first column of the result above contains the expenses capped at 900 (which get a refund of 50%) and the second column contains the expenses capped at 500 (which get an additional 30%).

So, according to our modified rule, we must pay 50% of the first column plus 30% of the second, which we can do by multiplying the columns with `0.5 0.3` (using an _axis_ operator) and then adding them:

In [86]:
+/ (exp∘.⌊900 500) ×[2] 0.5 0.3

And the total refund is, of course:

In [87]:
+/ +/(exp∘.⌊900 500)×[2]0.5 0.3

If we laminate the original vector, we can see the expenses and the refunding:

In [88]:
exp,[.5] +/(exp∘.⌊900 500)×[2]0.5 0.3

!["APLer applying sunscreen outside."](../res/Outer_Product.png)

### Outer Product Exercise

<!-- exercise ex-complex-refunding -->
***Exercise 1***:

Let us try to generalise the method used above to compute refunds.

In our example, we had chosen a very simple case, because we had only two slices, and all students used the same scale.
Let us now imagine a slightly more complex case:

 - the students are classified in three categories, which have different refunding rates; and
 - we now have four different expense ranges.

The new conditions are expressed with the traditional notation in the table below:

| Category \ Range | 0 to 600 | 600 to 1.100 | 1.100 to 1.500 | 1.500 to 2.000 |
| :- | :- | :- | :- | :- |
| Category 1 | 100% | 100% | 80% | 50% |
| Category 2 | 100% |  70% | 30% | 10% |
| Category 3 |  80% |  60% | 20% |  5% |

```APL
limits ← 600 1100 1500 2000
rates ← 3 4⍴100 100 80 50 100 70 30 10 80 60 20 5
```

Define a function `Refund` to solve this problem.
The function should take a single right argument: a 4-item vector containing the limits vector, the rates matrix, the categories vector, and the expenses vector, in that order.
The return value should be a vector with the refunded amount to each student.

Using loops is strictly prohibited and may be punished with high severity!

Make use of the variables `expenses` and `categories` defined below and verify your solution by comparing it to the values shown below.
<!-- end -->

In [101]:
⎕RL ← 73
expenses ← ?350⍴2500
categories ← ?350⍴3
⎕← 2 10↑categories,[.5]expenses

In [150]:
10↑Refund limits rates categories expenses

In [151]:
+/Refund limits rates categories expenses

## Inner Product

_Inner product_ is a generalisation of what mathematicians call _matrix product_, a tool considered by most students as extremely abstract, full of bizarre notations, like $\sum a_{ij}b_{jk}$, and obviously far removed from everyday problems.
You will discover that:

 - the concept is really simple, nearly obvious; and
 - it can be applied to many real life problems.

A simple example will help us.

### A Concrete Situation

A company intends to open a series of hotels and resorts in four countries.
This requires serious investments over a period of five years.
The following table shows these investments (in millions of dollars, of course!):

| Country vs Year | Year 1 | Year 2 | Year 3 | Year 4 | Year 5 |
| :- | -: | -: | -: | -: | -: |
| Greece | 120 | 100 | 40 | 20 | 0 |
| Brazil | 200 | 150 | 100 | 120 | 200 |
| Egypt | 50 | 120 | 220 | 350 | 600 |
| Argentina | 0 | 80 | 100 | 110 | 120 |

These figures are contained in a matrix called `invest`:

In [1]:
invest ← 120 100 40 20 0 200 150 100 120 200 50 120
⎕← invest ← 4 5⍴invest,220 350 600 0 80 100 110 120

These investments will be supported by the company itself plus two banks, each taking a certain percentage of the total, depending on the evaluation of each project.
The following table shows how the risks are shared:

| Stakeholder vs Country | Greece | Brazil | Egypt | Argentina |
| :- | -: | -: | -: | -: |
| Bank 1 | 50 | 10 | 20 | 30 |
| Bank 2 | 20 | 60 | 40 | 30 |
| Company | 30 | 30 | 40 | 40 |

Those percentages are contained in a matrix named `percent`:

In [131]:
⎕← percent ← 3 4⍴50 10 20 30 20 60 40 30 30 30 40 40

We would like to calculate, year by year, how much each of the 3 partners is engaged in this project.
For example, let us try to evaluate the contribution of Bank 2 during Year 3:

| Country | Project valuation | Stake | Total invested |
| :- | -: | -: | -: |
| Greece | 40 | 20% | 8 |
| Brazil | 100 | 60% | 60 |
| Egypt | 220 | 40% | 88 |
| Argentina | 100 | 30% | 30 |

The total invested is, thus, 186.

This could have been obtained by the sum of four products:

In [132]:
+/ percent[2;] × invest[;3]÷100

In order to calculate the total invested for each partner and each year, we should repeat that algorithm for all the rows of `percent`, and all the columns of `invest`: this is precisely what an _inner product_ does.

And because it _adds_ a series of _products_, it will be expressed by a dot (the operator) between a plus and a multiply sign, like this:

In [133]:
percent +.× invest÷100

In <!--figure-->the presentation below<!--Inner_Product_Diagram-->, we have detailed the elementary products which lead to the calculation for bank 2 in year 3:

![Diagram representing the _inner product_ operation.](res/Inner_Product_Diagram.png)

<!--figure-->This figure<!--Inner_Product_Diagram--> has a great advantage: it clearly shows the relations that exist between the 3 matrices:

 - the left argument has as many columns as the right one has rows; and
 - the result has as many rows as the left argument and as many columns as the right one.

As you can see, the item `result[x;y]` is calculated from row `x` of the left argument (`⍺[x;]` in dfn notation) and column `y` of the right argument (`⍵[;y]` in dfn notation).
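The relation between one item of the result and the corresponding row and column can be checked directly. The sketch below redefines `invest` and `percent` with the values given earlier so that it is self-contained; the variable name `result` is illustrative, and the default index origin (`⎕IO←1`) is assumed:

```APL
invest  ← 4 5⍴120 100 40 20 0 200 150 100 120 200 50 120 220 350 600 0 80 100 110 120
percent ← 3 4⍴50 10 20 30 20 60 40 30 30 30 40 40
result ← percent +.× invest÷100
⍴result                                        ⍝ 3 5
result[2;3]                                    ⍝ 186
result[2;3] = +/ percent[2;] × invest[;3]÷100  ⍝ 1
```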

These rules will be generalised in the next section.

### Definition of Inner Product

The syntax of _inner product_ is `r ← x f.g y`, where the _inner product_ is represented by a dot (`.`) and `f` and `g` represent two appropriate dyadic functions (either primitive or user-defined).

The arguments may be arrays of any rank: scalars, vectors, matrices, or higher-rank arrays.
The shape of the arguments and the shape of the result follow very simple rules:

 - the length of the last dimension of the left argument must be equal to the length of the first dimension of the right argument (in other words, `(¯1↑⍴x)≡1↑⍴y`); and
 - the shape of the result is the catenation of the arguments' shapes, in which the common dimension has disappeared (in other words, `(⍴r)≡(¯1↓⍴x),1↓⍴y`).

Of course, as usual, scalars are repeated to fit the appropriate size.

Let us represent scalars by `s`, vectors by `v`, matrices by `m`, and higher-rank arrays by `a`.
The table below shows the shape of the result of some _inner products_:

| `r ← x f.g y` | `⍴x` | `⍴y` | `⍴r` |
| :- | -: | -: | -: |
| `a ← a f.g a` | `2 3 8` | `8 5 4` | `2 3 5 4` |
| `a ← a f.g a` | `2 3 8` | `8 3 2` | `2 3 3 2` |
| `m ← m f.g m` | `3 5` | `5 8` | `3 8` |
| `v ← m f.g v` | `4 7` | `7` | `4` |
| `v ← v f.g m` | `4` | `4 7` | `7` |
| `s ← v f.g v` | `10` | `10` | `⍬` |
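The first and last rows of this table can be reproduced with a short sketch. The arrays `x` and `y` are arbitrary illustrative data, and `+.×` stands in for any suitable `f.g`:

```APL
x ← 2 3 8⍴⍳48
y ← 8 5 4⍴⍳160
(¯1↑⍴x) ≡ 1↑⍴y           ⍝ 1: the arguments are compatible
r ← x +.× y
⍴r                       ⍝ 2 3 5 4
(⍴r) ≡ (¯1↓⍴x),1↓⍴y      ⍝ 1
⍴ 1 2 3 +.× 4 5 6        ⍝ empty: two vectors give a scalar
```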

### Typical Uses of Inner Products

#### Two Simple Problems

Many students imagine that matrix products are complex things, reserved for mathematicians, and far removed from everyday life.
This opinion should be reconsidered: very simple problems can be solved using inner product.

`hms` is a variable which contains a time interval in hours, minutes, and seconds:

In [135]:
hms ← 3 44 29

We would like to convert it into seconds.
We shall see three methods just now, and a fourth method will be given in another chapter.

A horrible solution:

In [136]:
(3600×hms[1]) + (60×hms[2]) + hms[3]

A good APL solution:

In [137]:
+/ 3600 60 1 × hms

An excellent solution with inner product:

In [138]:
3600 60 1 +.× hms

The second and third solutions are equivalent in terms of number of characters typed and similar in performance.
However, **it is recommended** that you use the third one: it will help you become familiar with _inner product_ so that after a certain period, it will become part of your toolkit as an APL programmer.

Here is a very similar example.
Two vectors represent the prices of a certain number of goods and the quantities we bought:

In [140]:
price ← 6 4.2 1.5 8.9 31 18
qty   ← 2 6   3   5   1  0.5

To calculate how much we paid, we can use the beginner's solution, or a solution with a simple _inner product_; they give the same result, of course:

In [141]:
+/ price × qty

In [142]:
price +.× qty

Just to show how it works, <!--figure-->the figure below<!--Inner_Product_Diagram_2--> contains an explanatory diagram similar to the one we used for our Banks/Investments example.

![Diagram explaining the behaviour of an _inner product_ between two vectors](res/Inner_Product_Diagram_2.png)

#### A Useful Family

Used with comparison functions, _inner product_ offers 18 extremely useful derived functions.

Here is a vector `ages` containing the ages of 400 persons:

In [143]:
⎕RL ← 73
⎕← 20↑ages ← ?400⍴100  ⍝ We display the first 20 ages only.

In the same way as we did in [a previous section](./Some-Primitive-Functions.ipynb#Reduction-of-Binary-Data), we can answer some elementary questions:

Are all these people younger than 65?

In [144]:
∧/ ages < 65

Is there at least one person younger than 20?

In [145]:
∨/ ages < 20

How many people are younger than 20?

In [146]:
+/ ages < 20

We can now replace _reduce_ in the previous examples by _inner product_, like this:

Are all these people younger than 65?

In [147]:
ages ∧.< 65

Is there at least one person younger than 20?

In [148]:
ages ∨.< 20

How many people are younger than 20?

In [149]:
ages +.< 20

Clever, isn't it?

These expressions can be read as:

 - `∧.<` means "all smaller" – are the ages **all smaller** than 65?
 - `∨.<` means "at least one is smaller" – is there **at least** one age **smaller** than 20?
 - `+.<` means "how many are smaller" – **how many** ages are **smaller** than 20?

In those three expressions, we have combined `∧`, `∨`, and `+` with `<`.
We could just as well combine them with all the comparison symbols, giving 18 different _inner products_, as shown in this table:

In [151]:
'∧∨+' ∘.{⍺,'.',⍵} '<≤=≥>≠'

#### A Special Case of a Comparison Inner Product

In this family of _inner products_, `∧.=` is particularly interesting, because it answers the question "are all those values equal?".
For example, applied to vectors of the same length:

In [152]:
'customer' ∧.= 'customer'

In [153]:
'customer' ∧.= 'cucumber'

Let us use this property to search for a word in a matrix of words:

In [155]:
⎕← words ← 8 7⍴'CONTACTCOLUMNSFORTUNEPRODUCTCOLONELPROVIDEMACHINETYPICAL'

If we combine this 8 by 7 matrix with a 7-item vector, compatibility rules are obeyed, and the result will be an 8-item vector:

In [158]:
words ∧.= 'PRODUCT'

The shape of `words` is `8 7` and the shape of `'PRODUCT'` is `7`, so the common dimension disappears, and the result has shape `8`.

Now, let us search for three words:

In [159]:
⎕← three ← 3 7⍴'MACHINECOMFORTPRODUCT'

In [160]:
words ∧.= ⍉three

We must transpose the matrix to be compliant with the compatibility rules, and the result shows what word was found in what row of `words`.

If we wanted to know which words were found, we could add an _or-reduction_:

In [163]:
∨⌿ words∧.=⍉three

If we wanted to know which of the rows in `words` contain words in `three`, we could have used another _or-reduction_, but along the other axis:

In [164]:
∨/ words∧.=⍉three

It may also be useful to search for the positions of said matches, and we can use _index of_ for that:

In [165]:
words ⍳ three

The converse to the expression `∧.=` is `∨.≠`.
(That means that `∧.=` and `∨.≠` always return opposite Boolean values.)
It looks for _different_ values instead of looking for _equal_ values.
Let us look at one simple example:

In [166]:
words ∨.≠ ⍉three

A `1` means that this word in `three` does not match the word in this row of `words`.
So, if a row contains all `1`s, the word in that row does not match any of the words in `three`.
Using _and-reduce_ along the second axis pinpoints the rows of `words` for which this is true:

In [168]:
∧/ words∨.≠⍉three

With a compression, we can see the words that are not found in `three`:

In [169]:
(∧/words∨.≠⍉three) ⌿ words
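The claim that `∧.=` and `∨.≠` always return opposite Boolean values can be checked on the same data. This is only a sketch; `words` and `three` are redefined here so that it is self-contained:

```APL
words ← 8 7⍴'CONTACTCOLUMNSFORTUNEPRODUCTCOLONELPROVIDEMACHINETYPICAL'
three ← 3 7⍴'MACHINECOMFORTPRODUCT'
(words ∧.= ⍉three) ≡ ~ words ∨.≠ ⍉three          ⍝ 1
∨/ words ∧.= ⍉three                              ⍝ 0 0 0 1 0 0 1 0
(∨/ words ∧.= ⍉three) ≡ ~ ∧/ words ∨.≠ ⍉three    ⍝ 1
```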

## Solutions

The following solutions we propose are not necessarily the “best” ones; perhaps you will find other solutions that we have never considered. APL is a very rich language, and due to the general nature of its primitive functions and operators there are always plenty of different ways to express different solutions to a given problem. Which one is “the best” depends on many things, for example the level of experience of the programmer, the importance of system performance, the required behaviour in border cases, the requirement to meet certain programming standards and also personal preferences. This is one of the reasons why APL is so pleasant to teach and to learn!

We advise you to try and solve the exercises before reading the solutions!

<!-- solution ex-complex-refunding -->
***Solution 1***:

We start by defining a numeric vector with the upper limits and a matrix with the refund rates:

In [142]:
limits ← 600 1100 1500 2000
⎕← 4 0⍕ rates ← 3 4⍴100 100 80 50 100 70 30 10 80 60 20 5

The next step is rewriting the traditional refund rate table in the modified version.
For example, for category 1 students, expenses up to 2.000 are refunded at 50%, and then there is a 30% additional refund for expenses up to 1.500.
Why is that?
Because the refund rate for expenses up to 1.500 is 80%, and not 50%, meaning there is a difference of 30%.

By using an _n-wise reduction_, we can subtract each pair of adjacent columns and get these rates:

In [143]:
⎕← modifiedRates ← 2 -/ rates,0

In the expression above, we added a column of zeroes to the right of `rates`.
This can be interpreted as "all expenses above 2.000 have a refund rate of 0%".
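To see that the modified rates agree with the traditional table, we can work through one case by hand. The sketch below uses a hypothetical category 1 student with an expense of 1300, a value chosen only for illustration:

```APL
limits ← 600 1100 1500 2000
rates ← 3 4⍴100 100 80 50 100 70 30 10 80 60 20 5
modifiedRates ← 2 -/ rates,0
modifiedRates[1;]                             ⍝ 0 20 30 50
⍝ Traditional table: 100% of 600, 100% of the next 500, 80% of the last 200
(1×600)+(1×500)+0.8×200                       ⍝ 1260
⍝ Modified rates applied to the expense capped at each limit
+/ (1300⌊limits) × modifiedRates[1;]÷100      ⍝ 1260
```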

Now, we cap each student's expenses at the upper limit of each range:

In [144]:
capped ← expenses ∘.⌊ limits

In [145]:
10↑expenses,' ',capped

Finally, we are ready to multiply each student's capped expenses with the appropriate rates.
We just have to be careful to take into account each student's category to use the appropriate rates:

In [146]:
refunds ← +/capped × modifiedRates[categories;] ÷ 100
⎕← 10↑refunds
⎕← +/refunds

To wrap up, we can put everything into a function:

In [152]:
∇ refunds ← Refund (limits rates categories expenses)
    ;modifiedRates ;capped
    modifiedRates ← 2 -/ rates,0
    capped ← expenses ∘.⌊ limits
    refunds ← +/capped × modifiedRates[categories;] ÷ 100
∇

In [150]:
10↑Refund limits rates categories expenses

In [151]:
+/Refund limits rates categories expenses