# User Defined Functions

## Landmarks

### Some Definitions

In previous chapters we made a distinction between the functions and operators which are part of APL, like `+`, `×`, `⌈` and `/` (we refer to them as *primitives*), and the functions and operators which are created by the user and are represented not by a symbol but by a name, like `Average` (we say they are *user-defined*).

We also made an important distinction between *functions*, which apply to data and which return data, and *operators*, which apply to functions or data and produce *derived* functions (see [the definition of Reduce](./Some-Primitive-Functions.ipynb#Definition)).

This means that we can distinguish between 4 major categories of processing tools:

| Category | Name | Examples | Refer to |
| :- | :- | :-: | :-: |
| Built-in tools | Primitive functions | `+` `×` `⌈` `⍴` | Previous chapters |
| | Primitive operators | `/` | [Chapter on Operators](./Operators.ipynb) |
| User-defined tools | **User-defined functions** | `Average` | This chapter |
| | User-defined operators | | [Section on User-Defined Operators](./Operators.ipynb#User-Defined-Operators) |

This chapter is devoted to user-defined *functions*. The subject of user-defined *operators* will be covered later.

We can further categorise user-defined functions according to the way they process data. Firstly we can distinguish between ***Direct*** and ***Procedural*** **functions**.

 - ***Direct*** functions (commonly referred to as ***dfns***) are defined in a very formal manner.
   - They are usually designed for pure calculation, without any external or user interfaces. *Dfns* do not allow loops except by recursion and have limited options for conditional programming.
   
 - ***Procedural*** functions (commonly referred to as ***tradfns***, short for *traditional functions*) are less formal and look much more like programs written in other languages.
   - They provide greater flexibility for building major applications which involve user interfaces, access to files or databases and other external interfaces. *Tradfns* may take no arguments and behave like scripts.
   
Even though you may write entire systems with dfns, you might prefer to restrict their use to encapsulating statements that, together, perform some meaningful operation on the data given.

The second distinction we can make concerns the number of arguments a user-defined function can have.

 - ***Dyadic*** functions take two arguments which are placed on either side of the function (`X f Y`);
 - ***Monadic*** functions take a single argument which is placed to the right of the function (`f Y`);
 - ***Niladic*** functions take no argument at all;
 - ***Ambivalent*** functions are dyadic functions whose left argument is optional.
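A small sketch may make these categories concrete. The names `Double`, `Add` and `NthRoot` are hypothetical, and the `⋄` statement separator and the `⍺←` default (explained later in this chapter) are used only to keep each definition on one line. No niladic example is given because a dfn always takes at least a right argument; niladic functions have to be written as tradfns.

```apl
Double  ← {2×⍵}           ⍝ monadic: uses only the right argument ⍵
Add     ← {⍺+⍵}           ⍝ dyadic: uses both ⍺ and ⍵
NthRoot ← {⍺←2 ⋄ ⍵*÷⍺}    ⍝ ambivalent: ⍺ defaults to 2 when omitted

Double 21       ⍝ 42
3 Add 4         ⍝ 7
NthRoot 16      ⍝ 4   (square root, default left argument)
4 NthRoot 16    ⍝ 2   (fourth root)
```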

### Configure Your Environment

Dyalog APL has a highly configurable development and debugging environment, designed to fit the requirements of very different kinds of programmers. This environment is controlled by configuration parameters; let us determine which context will suit you best.

#### What Do You Need?

All you need (except for love) is:

 - a window in which to type expressions that you want to be executed (white Session window);
 - one or more windows in which to create/modify user-defined functions (grey Edit windows);
 - one or more windows to debug execution errors (black Trace window).
 
The colours above refer to the positions of the windows in <!--figure-->the figures below<!--IDE_Window_Layout,IDE_Window_Horizontal_Layout,IDE_Floating_Window_Layout-->.

The default configuration is consistent with other software development tools: the session window can be divided into three resizable parts, as shown <!--figure-->here<!--IDE_Window_Layout-->:

![The default window configuration](res/IDE_Window_Layout.png)

This configuration provides a single Edit window and a single Trace window, each of which is "docked" along one of the Session window borders. You can dock these windows along any of the Session window sides. For example, <!--figure-->the figure below<!--IDE_Window_Horizontal_Layout--> shows a configuration with three horizontal panes, highly suitable for entering and editing very long statements.

![The Edit and Trace windows in a horizontal layout](res/IDE_Window_Horizontal_Layout.png)

The Edit window supports the *Multiple Document Interface* (MDI). This means that you can work on more than one function at a time.

 - On the Windows interpreter you can use the "Window" menu to *Tile* and *Cascade*, or you can maximise any one of the functions to concentrate solely upon it.
 - If you are using RIDE the default behaviour is to open a tab per item you are editing.
 
If you are working on a relatively small screen you may find that you prefer to work with "floating" windows in a layout similar to the one shown <!--figure-->here<!--IDE_Floating_Window_Layout-->:

!["floating" windows layout](res/IDE_Floating_Window_Layout.png)

 - On the Windows interpreter, you can either
   - grab the border of a sub-window (Edit or Trace) and then drag and drop it in the middle of the session window, as an independent floating window,
   - or enable the "Classic Dyalog mode", which can be set under "Options" ⇨ "Configure..." ⇨ "Trace/Edit" as shown in <!--figure-->the figure below<!--IDE_Configure_Classic_Mode-->.

![Option to set "Classic Dyalog mode"](res/IDE_Configure_Classic_Mode.png)

 - If you are using RIDE you can go to "Edit" ⇨ "Preferences" ⇨ "Windows" and enable "Floating windows" as shown in <!--figure-->the next figure<!--RIDE_Configure_Floating_Windows-->:
 
![Enable "floating" windows in RIDE](res/RIDE_Configure_Floating_Windows.png)

Working with floating windows has the added benefit of allowing you to have a stack of trace windows (as opposed to a single trace window), showing which functions call which other. This will be explored in [the section on configuring the trace tools](./First-Aid-Kit.ipynb#Choose-Your-Configuration).

#### A Text Editor; What For?

Some dfns can be defined by a single expression and so are easy to define inside the session. We used this technique before to define a function named `Average` and here we do it again:

In [1]:
Average ← {(+/⍵)÷≢⍵}

However, as one defines more complex functions, it can become more complicated to define dfns in the session window.

For one, the ability to define multi-line dfns in the session was only made available with Dyalog 18.0. In these notebooks you can see that multi-line dfns are defined in cells that start with `]dinput`:

In [2]:
]dinput
Average ← {
    (+/⍵)÷≢⍵
}

In the Windows interpreter an expression like the one above might result in a `SYNTAX ERROR`, as you type `Average ← {`, then hit <kbd>Enter</kbd> to change line, and then the interpreter tries to execute the line you entered, instead of allowing you to continue the definition of `Average`. To change this behaviour and allow for multi-line dfns, you can go to "Options" ⇨ "Configure..." ⇨ "Session" and check the experimental multi-line input box at the bottom.

If you are using RIDE and this capability is not ON by default, you can turn it ON by setting the `DYALOG_LINEEDITOR_MODE` environment variable to `1` in the connection menu, as demonstrated in <!--figure-->the next figure<!--RIDE_Enable_Multiline_Input-->.

![Configuring RIDE to allow for multi-line input](res/RIDE_Enable_Multiline_Input.png)

Secondly, even if multi-line input is enabled, you have to be very careful not to press <kbd>Enter</kbd> and ***fix*** the dfn by mistake. This is not a terrible blunder, but it does cost some time and breaks your train of thought. In APL, to ***fix*** a file means to define the names (variables, functions and operators) in the file and to make them available in the session. Hence, to fix a function means to assign it to a name and make it available for use in the session window.

For these reasons, it is more appropriate to edit multi-line dfns and tradfns in a suitable text editor. The built-in editors for the Windows interpreter and for RIDE are likely to be suitable for you, but other alternatives exist. You can find an enumeration of most of the available alternatives over at [the APL Wiki](https://aplwiki.com/wiki/Text_editors). We will also cover this in more depth in [the chapter about source code management](./Source-Code-Management.ipynb).

## Simple Dfns

### Definition

Dfns are a set of statements enclosed by curly braces `{}`, so a simple dfn is typically created with the syntax `Name ← { definition }` where

 - `Name` is the function name. It is followed by a definition, delimited by a pair of curly braces `{` and `}`. This definition may make use of one or two variables named `⍵` and `⍺`, which represent the values to be processed. `⍵` and `⍺` are called ***arguments*** of the function;
 - `⍵` (<kbd>APL</kbd>+<kbd>w</kbd>) is a generic symbol which represents the right argument of the function;
 - `⍺` (<kbd>APL</kbd>+<kbd>a</kbd>) is a generic symbol which represents the left argument if the function is dyadic.
 
Here is an example monadic dfn:

In [3]:
Average ← {(+/⍵)÷(≢⍵)}

And here are two dyadic dfns, and an example showing how they can be used:

In [4]:
Plus ← {⍺ + ⍵}
Times ← {⍺ × ⍵}
3 Times 7 Plus 9

Notice the final statement above is strictly equivalent to

In [5]:
3 × 7 + 9

as the order of evaluation is the same.
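To make the order of evaluation explicit, here is a short sketch that repeats the two definitions and adds a parenthesised variant for comparison. Like primitives, user-defined functions take everything to their right as their right argument, so the expression is evaluated from right to left unless parentheses say otherwise.

```apl
Plus  ← {⍺ + ⍵}
Times ← {⍺ × ⍵}

3 Times 7 Plus 9      ⍝ 48, evaluated as 3 × (7 + 9)
(3 Times 7) Plus 9    ⍝ 30, the parentheses force the multiplication first
```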

The arguments `⍵` and `⍺` are read-only (they cannot be reassigned) and are limited in scope to only being visible within the function itself. The only exception to the "read-only rule" is when providing a default left argument to the dfn, which we'll cover in [a bit](#Default-Left-Argument).

The developer does not need to declare anything about the shape or internal representation of the arguments and the result. This information is automatically obtained from the arrays provided as its arguments. This is similar to the behaviour of dynamically typed programming languages, such as Python or JavaScript. So, our functions can work on any arrays.

A scalar added to a matrix returns a matrix. No need to specify it:

In [6]:
12 Plus 2 3⍴⍳6

And a vector of integer numbers multiplied by a scalar fractional number returns a vector of fractional numbers:

In [7]:
7.3 Times 10 34 52 16
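As a quick check of the two claims above, the shape of each result can be inspected with `⍴`. This is only a sketch that repeats the definitions of `Plus` and `Times` so that it can be run on its own.

```apl
Plus  ← {⍺ + ⍵}
Times ← {⍺ × ⍵}

⍴ 12 Plus 2 3⍴⍳6           ⍝ 2 3: a scalar plus a matrix is a matrix of the same shape
⍴ 7.3 Times 10 34 52 16    ⍝ 4:   a scalar times a 4-item vector is a 4-item vector
```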

These simple dfns are well suited for pure calculations or straightforward array manipulation. For example, here is how we can calculate the hypotenuse of a right-angled triangle from the lengths of the two other sides:

In [8]:
Hypo ← {(+/⍵*2)*0.5}
Hypo 3 4

In [9]:
Hypo 12 5

### Unnamed Dfns

A dfn can be defined and then discarded immediately after it has been used, in which case it does not need a name. For example, the geometric mean of a set of $N$ values is defined as the $N^\text{th}$ root of their product. The function can be defined and used inline like this:

In [10]:
{(×/⍵)*÷≢⍵} 6 8 4 11 9 14 7

But because we didn't assign it to a name, it was discarded after being used and can't be used again. This kind of function is similar to *inline* or *lambda* functions in other languages.
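Unnamed dfns can be monadic or dyadic, just like named ones. The following sketch uses small inputs chosen so that the results are easy to verify by hand.

```apl
{(×/⍵)*÷≢⍵} 2 8    ⍝ 4: the square root of 2×8, i.e. the geometric mean of 2 and 8
2 {⍺+⍵} 3          ⍝ 5: an unnamed dyadic dfn takes its left argument as usual
```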

A special case is `{}`. This function does nothing, but when placed at the left of an expression it can be used to prevent the result of the expression from being displayed on the screen:

In [11]:
3 Plus 3

In [12]:
{} 3 Plus 3

### Modifying The Code

Single-line dfns may be modified using the function editor, as will be explained in the next section. They can also be redefined entirely, as many times as necessary, as shown:

In [13]:
Magic ← {⍺+⍵}

In [14]:
Magic ← {⍺÷+/⍵}

In [15]:
Magic ← {(+/⍺)-(+/⍵)}

We defined `Magic` and then changed it twice. Only the most recent definition will survive.
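A minimal way to observe this is to call the function with the same arguments before and after a redefinition. The sketch below reuses the first and the last of the three definitions; the arguments `10` and `4` are arbitrary.

```apl
Magic ← {⍺+⍵}
10 Magic 4              ⍝ 14, using the first definition

Magic ← {(+/⍺)-(+/⍵)}
10 Magic 4              ⍝ 6, only the latest definition is in effect
```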

Now we will delve into how to define and use more complex dfns. For this we will explore the built-in editor that comes with your interpreter and you will also learn some more syntax to empower your dfns.

## More on Dfns

The dfns we wrote in the previous section were very simple and consisted of a single statement. We will now use the text editor to define multi-line dfns.

### Characteristics

 - Generally, the opening and closing braces are placed alone on the first and last lines. This is not mandatory, it is just a convention;
 - dfns might be commented at will;
 - one can create as many variables as needed: they are automatically deemed to be local variables, which means they only exist while inside the dfn. Note that this is opposite to the default behaviour of tradfns, where all variables are global (cf. [this subsection on dfns](#Local-Variables) and [this subsection on tradfns](#Local-Names));
 - the arguments `⍵` and `⍺` retain the values passed to them as arguments and may not be changed. Any attempt to modify them causes a `SYNTAX ERROR` to be reported, except when *defaulting the left argument* (cf. [the subsection on "Default Left Argument"](#Default-Left-Argument));
 - as soon as an expression generates a result that is not assigned to a name or used in any other way, the function terminates, and the value of that expression is returned as the result of the function. If the function contains more lines they will not be executed (cf. [the subsection on "Returning the Result"](#Returning-the-Result));
 - traditional *control structures* and *branching* cannot be used in dfns (cf. [the subsection on "Guards"](#Guards)).
 
We will explore these characteristics in the following sections.

### A Working Example

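As an example, let us see how we could define a function to calculate the *harmonic mean* of a vector of numbers. Strictly speaking, the harmonic mean of a vector is the number of items in the vector divided by the sum of the inverses of those items. To keep the example short, the function in this section leaves out the count and simply computes the inverse of the sum of the inverses of the numbers in the vector. This will become clearer when you see the code.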

First of all, we must choose a name for our new function. We will choose to name it `HarmonicMean`.

Among the multiple ways of invoking the text editor, let us use a very simple one: type the command `)ED` followed by a space and the name of the function to create: `)ED HarmonicMean`.

Unless you already redefined the defaults, a mostly empty window should open to the right of the session. We include an example screenshot of such a window from the Windows interpreter in <!--figure-->the figure below<!--Win_Editor-->:

![The edit window opened by the command `)ED HarmonicMean`](res/Win_Editor.png)

Now that we have a window in which to define our dfn, we can go ahead and implement the harmonic mean. We shall split the process into a series of simple steps, as shown in <!--figure-->the next figure<!--Win_Editor_Harmonic_Mean-->: calculate the inverses, sum them, invert the sum.

![The `HarmonicMean` function defined in the edit window](res/Win_Editor_Harmonic_Mean.png)

We also define it in this notebook:

In [16]:
]dinput
HarmonicMean ← {
    inverses ← ÷⍵
    sum ← +/inverses
    ÷sum
}

Now that we know how to compute the harmonic mean of a vector of numbers, we just have to ***fix*** it so that we can use it in our session. ***Fixing*** a function is somewhat analogous to compilation in other programming languages, but can also be seen as a sort of "File save" followed by an "import" in interpreted languages. There are many ways to fix a function:

| Interpreter | *Fix* method |
| :- | :- |
| Windows interpreter | go to "File" ⇨ "Fix" |
| RIDE | right-click the edit window ⇨ "Fix" |
| both | press <kbd>Esc</kbd> (also closes the edit window) |
| both | define a custom keyboard shortcut |

Following one of the appropriate methods should make the `HarmonicMean` function available for use in the session window. You can make sure it worked by simply typing in the name of the function and pressing <kbd>Enter</kbd> in the session:

In [17]:
HarmonicMean

If instead of getting the source code of the function you get a `VALUE ERROR`, then the function wasn't properly fixed.

Now that we can compute the harmonic mean of a vector of numbers, we can answer questions like the following:

"*If I take 6 hours to paint a wall and you take 2 hours, how much time will we need to paint the wall if we do it together?*"

This type of question can be answered by taking the harmonic mean of the individual times:

In [18]:
HarmonicMean 6 2

So the two of us would take 1h30min to paint the whole wall.
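To see where this figure comes from, think in terms of rates: one painter completes $\frac{1}{6}$ of the wall per hour and the other $\frac{1}{2}$ of the wall per hour. Working together, the rates add up, and the time needed is the inverse of the combined rate:

$$\frac{1}{\frac{1}{6}+\frac{1}{2}} = \frac{1}{\frac{4}{6}} = \frac{6}{4} = 1.5$$

This is exactly the sequence of steps in the function: invert the times, sum the inverses, invert the sum.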

Similarly, if we had further help from two people who could paint the whole wall in 4 and 5 hours, respectively, the four of us would need

In [19]:
times ← 6 2 4 5
⊢hours ← HarmonicMean times

hours, or approximately

In [20]:
⌊60×hours

minutes.

After using your function for a bit you realise it is over-complicated, in the sense that it involves too many intermediate steps and you wish to get rid of those. If your edit window is still open, you can simply edit the function and fix it again. If the edit window was closed, you can type `)ED HarmonicMean` again or you can double-click the name `HarmonicMean` in the session. Both options will open the appropriate edit window.

After having done so, perhaps you rewrite your function to

In [21]:
]dinput
HarmonicMean ← {
    inverses ← ÷⍵
    ÷+/inverses
}

Then you fix it and use it again a couple of times:

In [22]:
HarmonicMean times

In [23]:
HarmonicMean 4 3 2 1 0

DOMAIN ERROR: Divide by zero
HarmonicMean[1] inverses←÷⍵
                         ∧


But now it resulted in an error, and the error message says `HarmonicMean[1] inverses←÷⍵`. This `HarmonicMean[1]` means the error was in line 1 of the `HarmonicMean` function. Right next to it, it also shows the part of the code that caused the error, but suppose you had a really long file, perhaps with multiple functions. How would you find the appropriate line in the first place?

Thankfully, both RIDE and the Windows interpreter have an option that can be set to display line numbers (cf. <!--figure-->the figure below<!--Editors_Toggle_Line_Numbers-->).

![Option to toggle line numbers in RIDE (to the right) and the Windows interpreter (to the left)](res/Editors_Toggle_Line_Numbers.png)



Different people like to comment their code in different ways, and naturally dfns can be commented. For illustrative purposes, consider the dfn that follows, which has comments before any statement, inline with some statements, between the statements and at the end of the dfn:

In [24]:
]dinput
HarmonicMean ← {
    ⍝ Monadic function to compute the harmonic mean of a vector
    inverses ← ÷⍵   ⍝ This inverts the numbers in the argument
    ⍝ and then
    ÷+/inverses     ⍝ we sum those inverses and return the inverse of the sum.
    ⍝ Of course this will give an error if 0 is in the input argument.
}

The comments do *not* affect the behaviour of the function:

In [25]:
HarmonicMean times

### Local Variables

Notice that our `HarmonicMean` function makes use of an intermediate variable, `inverses`. Let us check its value:

In [26]:
inverses

VALUE ERROR: Undefined name: inverses
      inverses
      ∧


We got a `VALUE ERROR` because `inverses` isn't defined. It is a ***local variable***, that is, a variable that lives within the dfn only while the dfn is being executed. As soon as we exit the dfn the variable stops existing.

The notion of *local* variable is opposed to the notion of ***global variable***, which is a variable that lives in the session and thus can be accessed from anywhere. Useful global variables are functions themselves, because defining them globally means they can be used from within other functions.

As an example, we already defined the functions `Average` and `HarmonicMean` in the session. Let us now define a dfn named `AMHM` that checks empirically a mathematical theorem: that the arithmetic mean is always larger than or equal to the harmonic mean of a set of positive numbers:

In [27]:
]dinput
AMHM ← {
    am ← Average ⍵
    hm ← HarmonicMean ⍵
    am ≥ hm
}

In [28]:
AMHM ⍳6

In [29]:
AMHM times

Notice how the definition of `AMHM` uses both the `Average` and `HarmonicMean` dfns without defining them inside `AMHM`. This works because they were previously defined in the session.

For larger applications, proper source code management is needed and you should make sure the functions `Average` and `HarmonicMean` have been fixed when you use them inside `AMHM`, but for illustrative purposes this is absolutely fine.

### Default Left Argument

It was mentioned above that the values of `⍺` and `⍵`, the variables that represent the arguments to a dfn, cannot be assigned to. The only exception to this is when specifying a default left argument. This is relevant because a dyadic dfn can always be used monadically, as from the syntactic point of view its left argument `⍺` is always optional. If the left argument is not present it is possible to assign a default value to `⍺` by means of a normal assignment. If `⍺` is given a value because the dfn was called dyadically, such assignment is skipped.

Consider a function which calculates the $N^\text{th}$ root of a number, but which is normally used to calculate square roots ($N = 2$). You can specify that the default value of the left argument (when omitted) is 2, as follows:

In [30]:
]dinput
Root ← {
    ⍺ ← 2
    ⍵*÷⍺
}

If we don't specify the left argument of `Root`, it computes the square root:

In [31]:
Root 625

But if we specify `⍺`, then the `⍺ ← 2` assignment is skipped:

In [32]:
4 Root 625

Because the assignment with `⍺←` is skipped entirely if `⍺` was provided, you should be careful with any side effects the expression to the right of `⍺←` might produce. We illustrate this with the following (silly) example:

In [33]:
]dinput
Silly ← {
    a ← 1        ⍝ This assignment always happens
    ⍺ ← a ← 2    ⍝ Not executed if ⍺ already has a value
    a            ⍝ Return a
}

In [34]:
Silly 0

Because we didn't provide a left argument, the `⍺ ← a ← 2` line is executed and `a` becomes 2.

On the other hand, if we provide a left argument the `⍺ ← a ← 2` line is skipped and `a` remains 1:

In [35]:
0 Silly 0

As for `⍵`, attempting to assign to `⍵` makes no sense: a dfn is always called monadically *or* dyadically, so the right argument is *always* present. Here's a function that computes the square root of `⍵`, except that first it tries to assign 10 to `⍵`:

In [36]:
]dinput
RootOf10 ← {
    ⍵ ← 10
    ⍵*0.5
}

Simply typing the name of the function shows its code:

In [37]:
RootOf10

And calling it monadically raises an error:

In [38]:
RootOf10 5

SYNTAX ERROR
RootOf10[1] ⍵←10
             ∧


### Returning the Result

We mentioned above that a dfn executes its statements until it reaches the first statement whose result is not assigned to a name. Here is a curious dfn with 4 statements:

In [39]:
]dinput
Count ← {
    1
    2
    3
    10÷0
}

Notice that all four statements are simple. If we run `Count`, what will the result be?

In [40]:
Count 73

The result we get is 1 because the first statement evaluates to 1 (obviously) and then we do nothing with it, so that is what the dfn returns. It doesn't matter what we wrote afterwards and it doesn't even matter that the very last statement would give a `DOMAIN ERROR`.

These superfluous statements should be avoided, as they will sooner or later cause unnecessary confusion.

As a basic debugging tool, it is possible to modify statements to display intermediate results:

In [41]:
]dinput
Count ← {
    ⎕←1
    ⎕←2
    ⎕←3
    10÷0
}

In [42]:
Count 73

1
2
3
DOMAIN ERROR: Divide by zero
Count[4] 10÷0
           ∧


Be careful: by using `⎕←` to display intermediate results, suddenly we are doing _something_ with the superfluous statements and they are all being executed (we even reached the error statement).

And even if we remove the statement that gives an error, the function will still return something other than the original 1:

In [43]:
]dinput
Count ← {
    ⎕←1
    ⎕←2
    3
}

In [44]:
Count 73

Now the function returned 3 instead of 1! So always be careful with which statement is actually giving the final result and avoid any extraneous statements.

## Preliminary exercises

You are ready to solve simple problems. We **strongly recommend** that you try to solve all the following exercises before you continue further in this chapter.

**Exercise 1**:

Write a dyadic function `Extract` which returns the first `⍺` items of any given vector `⍵`.

In [45]:
Extract ← {}
3 Extract 45 86 31 20 75 62 18  ⍝ should give 45 86 31
6 Extract 'can you do it?'      ⍝ should give 'can yo'

**Exercise 2**:

Write a dyadic function which ignores the first `⍺` items of any given vector `⍵` and only returns the remainder:

In [46]:
Ignore ← {}
3 Ignore 45 86 31 20 75 62 18   ⍝ should give 20 75 62 18
6 Ignore 'can you do it?'       ⍝ should give 'u do it?'

**Exercise 3**:

Write a monadic function which returns the items of a vector in reverse order:

In [47]:
Reverse ← {}
Reverse 'snoitalutargnoc'       ⍝ should give 'congratulations'
Reverse '!ti did uoY'           ⍝ should give 'You did it!'

**Exercise 4**:

Write a monadic function which appends row and column totals to a numeric matrix.

For example, if `mat` is the matrix

In [137]:
⊢mat ← 3 4⍴75 14 86 20 31 16 40 51 22 64 31 28

Then `Totalise mat` should give

In [138]:
⊢totMat ← 4 5⍴75 14 86 20 195 31 16 40 51 138 22 64 31 28 145 128 94 157 99 478

Notice that `mat` occupies the upper left corner of `totMat`:

In [139]:
totMat ∊ mat

**Exercise 5**:

Write a monadic function which returns the lengths of the words contained in a text vector:

In [51]:
Lengths ← {}
Lengths 'This seems to be a good solution'    ⍝ should give 4 5 2 2 1 4 8

**Exercise 6**:

Write a dyadic function which produces the series of integer values between the limits given by its two arguments:

In [52]:
To ← {}
17 To 29    ⍝ should give 17 18 19 20 21 22 23 24 25 26 27 28 29

**Exercise 7**:

Develop a monadic function which puts a frame around a text matrix. For the first version, just concatenate minus signs above and under the matrix, and vertical bars down both sides. Then, update the function to replace the four corners by four `+` signs. For example, with

In [140]:
⊢towns ← 6 10⍴'Canberra  Paris     WashingtonMoscow    Martigues Mexico    '

we want to have `Frame towns` return

```
+----------+
|Canberra  |
|Paris     |
|Washington|
|Moscow    |
|Martigues |
|Mexico    |
+----------+
```

Finally, you can improve the appearance of the result by changing the function to use line-drawing symbols. You enter line-drawing symbols by using `⎕UCS`: the horizontal and vertical lines are

In [54]:
⎕UCS 9472 9474

and the four corners are

In [55]:
⎕UCS 9484 9488 9492 9496

The final result should look like

In [56]:
⍝ ┌──────────┐
⍝ │Canberra  │
⍝ │Paris     │
⍝ │Washington│
⍝ │Moscow    │
⍝ │Martigues │
⍝ │Mexico    │
⍝ └──────────┘

**Exercise 8**:

It is very likely that the function you wrote for the previous exercise works on matrices but not on vectors. Can you make it work on both?

For example, `Frame 'We are not out of the wood'` should give

In [57]:
⍝ ┌──────────────────────────┐
⍝ │We are not out of the wood│
⍝ └──────────────────────────┘

**Exercise 9**:

Write a function which replaces a given letter by another one in a text vector. The letter to replace is given first; the replacing letter is given second, like this:

In [58]:
Switch1 ← {}
'tc' Switch1 'A bird in the hand is worth two in the bush'
⍝ A bird in che hand is worch cwo in che bush

**Exercise 10**:

Modify the previous function so that it commutes the two letters:

In [59]:
Switch2 ← {}
'ei' Switch2 'A bird in the hand is worth two in the bush'
⍝ A berd en thi hand es worth two en thi bush

## Dfns in Depth

In this section we will cover a couple of more advanced subtleties about dfns, and then we will move on to learn about tradfns.

### Guards

Previously we said that control structures cannot be used in dfns. However, it is possible to have a dfn conditionally calculate a result, by using a ***Guard***.

A *guard* is any expression which generates a one-item Boolean result, followed by a colon.

The expression placed to the right of a *guard* is executed only if the *guard* is true. In a dfn, this looks like `guard: expr` and is similar to a

```c
if (guard) {
    return expr;
}
```

of some traditional programming languages.

For example, this function will give a result equal to `'Positive'`, `'Zero'` or `'Negative'` if the argument `⍵` is respectively greater than, equal to, or smaller than zero:

In [1]:
]dinput
Sign ← {
    ⍵>0: 'Positive'
    ⍵=0: 'Zero'
    'Negative'
}

In [2]:
Sign 3

In [3]:
Sign ¯3.6

In [5]:
Sign 0
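Guards are also what makes recursion practical in dfns, which matters because, as mentioned at the start of this chapter, recursion is the only way for a dfn to loop. The following is a minimal sketch under two assumptions that go beyond what this chapter has introduced: the symbol `∇`, which inside a dfn refers to the dfn itself, and the `⋄` statement separator, used here to keep the definition on one line. The name `Factorial` is hypothetical.

```apl
Factorial ← {⍵≤1: 1 ⋄ ⍵×∇ ⍵-1}   ⍝ the guard stops the recursion when ⍵ reaches 1

Factorial 5     ⍝ 120
Factorial 1     ⍝ 1
```

Without the guard, the function would call itself forever; with it, the first statement returns 1 as soon as the argument is small enough, and the recursive statement is never reached.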

### Shy Result

Dfns can be written in such a way that they return a ***Shy result***. A *shy result* is a result which is returned, but not displayed by default.

Consider a function which deletes a file from disk and returns a result equal to 1 (file deleted) or 0 (file not found). Usually, one doesn't care if the file did not exist, so the result is not needed. But sometimes it may be important to check whether the file really existed and has been removed. So, sometimes a result is useless and sometimes it is useful... this is the reason why shy results have been invented.

In a dfn, a shy result happens when the last expression that is evaluated is assigned to a (local) name, as opposed to just leaving the result of the expression unassigned. For this to happen, one has to be careful to leave the closing curly brace `}` next to that final statement, instead of having `}` alone in a new line.

Here is the function above, written without guards and with a shy result:

In [19]:
Sign ← { s ← (3 8⍴'NegativeZero    Positive')[2+×⍵;] }

In [20]:
Sign 3

In [21]:
Sign ¯3.6

In [22]:
⎕← Sign 0

Notice what happens if we were to format `Sign` as we have formatted previous dfns:

In [23]:
]dinput
Sign ← {
    s ← (3 8⍴'NegativeZero    Positive')[2+×⍵;]
}

In [24]:
Sign 3

In [25]:
Sign ¯3.6

In [26]:
⎕← Sign 0

VALUE ERROR: No result was provided when the context expected one
      ⎕←Sign 0
        ∧


The `VALUE ERROR` we get above is a very subtle error. Because the `s ← ...` statement is an assignment, when executing the `Sign` dfn the interpreter goes on to execute the expression on the next line, but the next line has *no* expression and so the interpreter raises a `VALUE ERROR`. If we want to have a shy result on a multi-line dfn, we *must* have the final curly brace on the same line as the final statement.
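As a sketch of that rule, here is a multi-line variant of `Sign` whose closing brace shares a line with the final assignment. The local name `names` is introduced only for this illustration, and index origin 1 is assumed, as in the other examples of this chapter.

```apl
]dinput
Sign ← {
    names ← 3 8⍴'NegativeZero    Positive'
    s ← names[2+×⍵;] }
```

With this layout, `⎕← Sign ¯2` should display `Negative` instead of raising a `VALUE ERROR`, while a bare `Sign ¯2` still displays nothing because the result is shy.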

### Lexical Scoping

***Lexical scoping*** is the mechanism that turns local and global variables into relative notions. Dfns usually have access to global variables, but the variables that are "global" depend on where the dfn was written.

As a purely illustrative example, consider the function defined below:

In [3]:
]dinput
MultiplyBy10 ← {
    v ← 10            ⍝ define some variable
    TimesV ← {v×⍵}    ⍝ multiply something with v
    TimesV ⍵
}

In [5]:
MultiplyBy10 5

In [6]:
MultiplyBy10 10

Notice how the `MultiplyBy10` function takes your input and gives it to the `TimesV` function, which is defined as a function that "*takes its input (`⍵`) and multiplies it with `v`*". But what is `v`? We do not give a value to `v` inside `TimesV`, so when APL encounters the expression `v×⍵` it looks at its surroundings for the meaning of `v`. Because `v` was defined in the enclosing dfn as `v ← 10`, that is the value that is used.

Consider now a similar example, but with more occurrences of the variable `v`:

In [17]:
v ← 100          ⍝ (1)

In [18]:
]dinput
MultiplyBy10 ← {
    GiveMeV ← {
        v
    }
    ⎕← v         ⍝ (3)
    ⎕← GiveMeV 1 ⍝ (4)
    v ← 10       ⍝ (5)
    ⎕← v         ⍝ (6)
    ⎕← GiveMeV 1 ⍝ (7)
    v ← 10×⍵     ⍝ (8)
    v
}

In [19]:
⎕← v             ⍝ (2)
MultiplyBy10 3
⎕← v             ⍝ (9)

Let us go over the assignments and the outputs of the code above:

 - we start by defining the variable `v` in our session and we set it to 100 (1);
 - we then define a function named `MultiplyBy10` which happens to contain another dfn inside it;
 - then we print the value of the session variable `v` and we see its value is 100 (2);
 - then we call the dfn `MultiplyBy10` with argument `3` and
   - we define a new dfn named `GiveMeV`;
   - we print the value of `v` (3). `MultiplyBy10` doesn't know what `v` is and so it looks for it in the session and finds a `v` whose value is 100, because of (1);
   - we then call the function `GiveMeV` which simply returns `v` and we print it (4). `GiveMeV` doesn't know what `v` is, so it asks `MultiplyBy10`, which in turn asks the session, which knows of a `v` whose value is 100, because of (1);
   - we then define `v` to be 10 inside of `MultiplyBy10` (5), making `MultiplyBy10` aware of a variable `v`;
   - then we print `v` inside `MultiplyBy10` (6), which is 10 because we just defined it as such;
   - then we call `GiveMeV` again and we print its result (7). `GiveMeV` doesn't know what `v` is, so it asks `MultiplyBy10`, which *now knows* what `v` is: it is 10 because of (5);
   - and we finish executing the `MultiplyBy10` dfn by assigning 30 to `v` (8), which we then return from the dfn;
 - we leave the `MultiplyBy10` dfn call and 30 gets printed because that was the result of the dfn call;
 - finally we print `v` once more, and the session knows `v` is 100, so that is what we print (9).
 
That might look confusing, but I assure you it makes a lot of sense. Just go through the code calmly and make sure you understand what each part does separately. Then, simulate the execution of the code with some pen and paper and write down what you think should get printed at each step. Then read the explanation above and compare it to what you thought was supposed to happen. You will get used to lexical scoping in no time.
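To convince yourself that it is the place where a dfn was *written*, and not the place where it is *called*, that decides which `v` it sees, you can try a minimal sketch like the one below. The names `ShowV` and `Caller` are hypothetical and only serve this illustration.

```apl
v ← 100                         ⍝ session variable
ShowV ← {v}                     ⍝ written in the session: its "global" v is the session's v
Caller ← {v ← 10 ⋄ ShowV ⍵}     ⍝ has its own local v, then calls ShowV
Caller 0                        ⍝ 100: ShowV does not see the v local to Caller
```

Had `ShowV` been defined *inside* `Caller`, after the assignment `v ← 10`, the same call would have found the local `v` instead, exactly like `GiveMeV` at step (7).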

Lexical scoping can reveal itself to be extremely useful in languages where functions can return other functions, which is *not* the case with APL. Even so, lexical scoping will prove to be helpful later down the road: imagine a large(r) function inside which we define a small utility dfn to use with an operator, but we want the dfn to make use of things we have already computed inside the outer function. Lexical scoping kicks in at that point, allowing the inner dfn to access everything the outer function already computed.
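As a rough sketch of that situation, the hypothetical function `ScaleAndSum` below computes a value in the outer dfn and then uses it from a small inner dfn. The inner dfn is applied to every item with the *Each* operator `¨`, which is only assumed here and is covered in the chapter on operators.

```apl
ScaleAndSum ← {
    factor ← ⍺             ⍝ computed (here, simply received) by the outer dfn
    Scale ← {factor×⍵}     ⍝ inner utility dfn: finds factor in the enclosing dfn
    +/Scale¨⍵              ⍝ apply Scale to each item, then sum
}
10 ScaleAndSum 1 2 3       ⍝ 60
```

In a notebook the multi-line definition would need `]dinput`, as in the cells above; on a single line the statements can be separated with `⋄` instead.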

## Miscellaneous

### List of Variables and Functions

You can obtain a list of your variables by typing

In [141]:
)Vars

and you can obtain a list of your functions by typing

In [61]:
)Fns

### Use of the Result

To sum up what we have already seen, once a function has been written, its result can be:

 - Immediately displayed and lost:

In [62]:
HarmonicMean times

 - Included in an expression:

In [63]:
60×HarmonicMean times

 - Assigned to a variable:

In [64]:
hours ← HarmonicMean times

### Vector Representation

We saw that double-clicking on a function name invokes the editor, and allows the user to see the code. We can also type the name of the function and see its code:

In [65]:
HarmonicMean

However, in a printed document, the conventional representation of a function is as follows:

In [66]:
⎕VR 'HarmonicMean'

The function is delimited by a pair of `∇` symbols. This special symbol is named "***Del***" in English, or "***Carrot***" (because of its shape) in some French-speaking countries. You can type a *Del* with <kbd>APL</kbd>+<kbd>g</kbd>.

One can also obtain this representation (as a character array) using the built-in ***System function*** `⎕VR` (for "*Vector Representation*") of Dyalog APL. *System functions* are a special kind of function, provided with the development environment. The first character of their name is a ***Quad*** (`⎕`) which guarantees that they cannot conflict with user-defined names, and their names are also case-insensitive:

In [67]:
⎕VR 'Average'

In [68]:
⎕Vr 'Average'

In [69]:
⎕vr 'Average'

In [70]:
⎕vR 'Average'

Note that this is quite unusual in a programming language. The result of `⎕VR` is a character vector representing the source code of our function, which is now available for processing by other functions in the workspace!
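As a small sketch of what "available for processing" can mean, the lines below treat the source code as ordinary character data. The definition of `Average` shown here is only an assumption for the sake of the example, and no outputs are shown because the exact layout of the representation depends on how the function was defined.

```apl
Average ← {(+/⍵)÷≢⍵}     ⍝ assumed definition, for illustration only
src ← ⎕VR 'Average'      ⍝ the source code, as a character vector
≢src                     ⍝ how many characters the representation contains
+/src='⍵'                ⍝ how many times ⍵ occurs in the source
```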

System functions will be discussed in detail in [a later chapter](./System-Interfaces.ipynb).

### Invoking the Text Editor

Double-clicking a name which represents an existing item invokes the editor and displays its contents, using the colour scheme appropriate for the type of the item (function, character matrix, nested array, etc) defined via "Options" &#8680; "Colors..." if you are using the Windows interpreter or via "Edit" &#8680; "Preferences" &#8680; "Colours" if you are using RIDE.

You can also invoke the editor by pressing <kbd>Shift</kbd>+<kbd>Enter</kbd> when the input cursor is inside or adjacent to the name. This is perhaps the most convenient way as, when working in an APL session, you tend to use the keyboard much more than the mouse.

Let us define a character matrix with the uppercase Latin alphabet:

In [142]:
⊢charMat ← 2 13⍴⎕A

For some items (e.g. numeric matrices, some nested arrays) the editor is only good for viewing them, while for others such as functions, text vectors and text matrices, the editor can also be used to edit them. In <!--figure-->the example below<!--Win_Editor_Edit_Character_Matrix--> we have invoked the editor, and changed the contents of our `charMat` variable:

![Editing a character matrix with the built-in editor of the Windows interpreter](res/Win_Editor_Edit_Character_Matrix.png)

In <!--figure-->the figure<!--Win_Editor_Edit_Character_Matrix--> the edit window tells us that we modified the character matrix. We must now fix it, like we did with dfns before, for the variable to reflect its new value. RIDE doesn't tell you that the character matrix was modified, but you still need to fix it if you want the new changes to come into effect.

If we now fix the changes to `charMat`, it will become a matrix with 4 rows and 29 columns (the length of its longest row).

If, for some reason, you made a mistake, you can exit the edit window *without* fixing the changes by pressing <kbd>Shift</kbd>+<kbd>Esc</kbd>.

If a name is currently undefined (has no value), double-clicking or pressing <kbd>Shift</kbd>+<kbd>Enter</kbd> on that name invokes the editor on it as if it were a new function. This is one way to create a function.

You can also invoke the editor using the command `)ED` as we did before. By default, it opens a **function** definition, but you can explicitly specify the type of a new object by prefixing its name with a special character, as shown in the table below.

| Prefix | Example | Object Created |
| :-: | :- | :- |
| none | `)ed new` | Function | 
| `∇` | `)ed ∇ borscht` | Function |
| `-` | `)ed - papyrus` | Simple character matrix |
| `→` | `)ed → crouton` | Simple character vector |
| `∊` | `)ed ∊ grunt` | Nested vector of character vectors, with one sub-vector per line |

See also [the appendix](./Appendices.ipynb#Invoking-the-Editor) for additional prefixes.

It is possible to open several edit windows using a single command. For example, `)ed Tyrex -Moose` will open two edit windows: the first creates or edits a function named `Tyrex` and the second creates a character matrix named `Moose`.

If a prefix is specified for the name of an already existing object, the prefix is ignored and the editor is invoked according to the type of the existing object.

There are some other ways to invoke the editor:

 - use `⎕ED` instead of the command `)ED`. For example: `⎕ED 'Clown'`. `⎕ED` is a *System function*, a concept that will be discussed in [a future chapter](./System-Interfaces.ipynb);
 - type a name, or put the input cursor on an existing name, and activate the menu "Action" &#8680; "Edit";
 - for the Windows interpreter, type a name or put the input cursor on an existing name, and click the "Edit Object" button in the toolbar (cf. <!--figure-->the image below<!--Win_Toolbar_Edit_Object-->).
 
![The "Edit Object" button in the toolbar of the Windows interpreter](res/Win_Toolbar_Edit_Object.png)

## Tradfns

## Solutions

**Exercise 1**:

If `⍵` is the vector argument, we can use `⍵[...]` to index into `⍵` and then we can use the *Index Generator* primitive to generate the indices we need, which should be the integers from `1` to `⍺`... Except that if `⍺` is too big, we cannot generate indices larger than the length of the vector, so we also find the minimum between `⍺` and `≢⍵`. If we don't, we get an `INDEX ERROR` when indexing. Here is a possible implementation:

In [72]:
]dinput
Extract ← {
    ⍵[⍳⍺⌊≢⍵]
}

In [73]:
3 Extract 45 86 31 20 75 62 18

In [74]:
6 Extract 'can you do it?'

In [75]:
20 Extract 1 2 3
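The role of `⌊` in the last call can be traced step by step. This is just the body of `Extract` evaluated by hand with `⍺` equal to `20`, assuming the default `⎕IO←1` used throughout; the variable name `v` is only for illustration.

```apl
v ← 1 2 3
≢v            ⍝ 3
20⌊≢v         ⍝ 3: we never ask for more items than there are
⍳20⌊≢v        ⍝ 1 2 3
v[⍳20⌊≢v]     ⍝ 1 2 3
⍝ without the guard, v[⍳20] would ask for positions 4 to 20 and signal INDEX ERROR
```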

**Exercise 2**:

We can use a similar logic as that of the first exercise, except now we want to start the indices at `⍺+1` and go up until `≢⍵`. For this to happen, we first need to find out how many numbers we need. If a vector has `≢⍵` elements and we are going to drop `⍺` of them, we are going to be left with `(≢⍵)-⍺` elements. This means `⍳(≢⍵)-⍺` will generate the correct number of indices, but they will be starting at `1` and should start at `⍺+1`, so we just need to add `⍺` to that.

Finally, we just need to worry about what happens if `⍺` is too large, i.e. if we want to ignore too many elements. The reverse of that concern is, what happens if `(≢⍵)-⍺` is too small? Recall that `(≢⍵)-⍺` tells you how many elements you will want to keep. But that number must be at least `0` elements (i.e. "keep no elements") because it makes no sense to keep a negative number of elements. So we can just use `⌈` to find the maximum between `0` and `(≢⍵)-⍺`. If `⍺` is too large, `0⌈(≢⍵)-⍺` gives `0` and `⍳0` is the empty vector `⍬`, so the indexing will work just fine.

In [76]:
]dinput
Ignore ← {
    ⍵[⍺+⍳0⌈(≢⍵)-⍺]
}

In [77]:
3 Ignore 45 86 31 20 75 62 18

In [78]:
6 Ignore 'can you do it?'

In [79]:
20 Ignore 1 2 3
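The index arithmetic of `Ignore` can also be followed by hand. The sketch below evaluates the pieces of the function body for the first call above (`⍺` is `3`) and then shows what the `0⌈` guard does when `⍺` is `20`; it assumes `⎕IO←1`, and `v` is just an illustrative name.

```apl
v ← 45 86 31 20 75 62 18
(≢v)-3             ⍝ 4: how many items we keep
⍳0⌈(≢v)-3          ⍝ 1 2 3 4
3+⍳0⌈(≢v)-3        ⍝ 4 5 6 7: shifted to start at ⍺+1
v[3+⍳0⌈(≢v)-3]     ⍝ 20 75 62 18
0⌈(≢v)-20          ⍝ 0: never a negative count
≢⍳0                ⍝ 0: no indices, so nothing is selected
```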

**Exercise 3**:

Here is another exercise on index arithmetic. Here is what we want to happen with a vector argument of length 10:

 - generate the indices `1 2 3 4 5 6 7 8 9 10`
 - transform them into  `10 9 8 7 6 5 4 3 2 1`
 
We can do this if we do the correct subtraction:

In [80]:
11 - ⍳10 

But here `11` was a special number: it was `1+≢⍵`. So that is the general tactic we must employ:

In [81]:
]dinput
Reverse ← {
    ⍵[(1+≢⍵)-⍳≢⍵]
}

In [82]:
Reverse 'snoitalutargnoc'

In [83]:
Reverse '!ti did uoY'

**Exercise 4**:

This exercise can be solved by using the *Reduce* operator to sum: `+/`. Then we need to specify the axis we care about with `[1]` and `[2]`.

If we do `+/[1]` then we are reducing across the first axis, which means we get the sums along the columns:

In [144]:
⊢mat ← 3 4⍴75 14 86 20 31 16 40 51 22 64 31 28

In [145]:
+/[1]mat

We can then catenate the original matrix to these column sums vertically (by using `⍪`), and then use `+/[2]` to find the row sums and catenate them with `,`:

In [146]:
]dinput
Totalise ← {
    colSums ← +/[1]⍵
    r ← ⍵⍪colSums
    rowSums ← +/[2]r
    r,rowSums
}

In [148]:
Totalise mat

In [149]:
totMat ← 4 5⍴75 14 86 20 195 31 16 40 51 138 22 64 31 28 145 128 94 157 99 478
totMat ≡ Totalise mat
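If you want to check the two reductions separately, the sketch below shows them on the original matrix. The last two lines use the plain `+/` and the first-axis form `+⌿`, which are not needed for this exercise and are mentioned only as equivalent spellings.

```apl
mat ← 3 4⍴75 14 86 20 31 16 40 51 22 64 31 28
+/[1]mat              ⍝ 128 94 157 99   (one sum per column)
+/[2]mat              ⍝ 195 138 145     (one sum per row)
(+/[2]mat)≡+/mat      ⍝ 1: without an axis, +/ works along the last axis
(+/[1]mat)≡+⌿mat      ⍝ 1: +⌿ works along the first axis
```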

**Exercise 5**:

When reading this exercise, one should immediately realise that one is going to need to find *Where* the blank spaces are:

In [89]:
text ← 'This seems to be a good solution'

In [90]:
⍸' '=text

These indices tell where the blank spaces were in the character vector, and in between those indices are the indices that correspond to word characters:

 - the first word has indices `1 2 3 4`
   - then there is a space at position `5`
 - the second word has indices `6 7 8 9 10`
   - then there is a space at position `11`
 - ...
   - then there is a space at position `24`
 - the last word has indices `25 26 27 28 29 30 31 32`
 
The `32` above is `≢text`:

In [91]:
≢text

From the list above we can see that most words are *between* spaces, but the first and last words may not be between spaces. We can fix this by *forcing* the first and last words to be *between* spaces if we add a single `' '` to the beginning and to the end of our variable:

In [92]:
⍸' '=' ',text,' '

Now we have

 - the first space at position `1`
   - the first word in positions `2 3 4 5`
 - a space at position `6`
   - a word in positions `7 8 9 10 11`
 - ...
 - a space at position `25`
   - the last word in positions `26 27 28 29 30 31 32 33`
 - the final space at position `34`
 
So we can find the lengths of those runs of non-spaces by subtracting positions of consecutive spaces and then subtracting 1 from those, because `6-1` gives 5, but between 1 and 6 there are only 4 integers.

In [93]:
]dinput
Lengths ← {
    spaces ← ⍸' '=' ',⍵,' '
    idx ← ⍳(≢spaces)-1
    ¯1+spaces[1+idx]-spaces[idx]
}

In [94]:
Lengths 'This seems to be a good solution'
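The subtraction inside `Lengths` can be traced by hand on the positions of the spaces in the padded sentence. The sketch below simply repeats the last two lines of the function on literal values, assuming `⎕IO←1`.

```apl
spaces ← 1 6 12 15 18 20 25 34      ⍝ positions of the spaces in ' ',text,' '
idx ← ⍳(≢spaces)-1                  ⍝ 1 2 3 4 5 6 7
spaces[1+idx]                       ⍝ 6 12 15 18 20 25 34   (all but the first)
spaces[idx]                         ⍝ 1 6 12 15 18 20 25    (all but the last)
spaces[1+idx]-spaces[idx]           ⍝ 5 6 3 3 2 5 9
¯1+spaces[1+idx]-spaces[idx]        ⍝ 4 5 2 2 1 4 8
```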

The final step where we index into `spaces` to get "all but the last" and "all but the first" elements of `spaces` could have been done with your previous solutions:

In [95]:
]dinput
Lengths ← {
    spaces ← ⍸' '=' ',⍵,' '
    ¯1+(1 Ignore spaces)-((¯1+≢spaces) Extract spaces)
}

In [96]:
Lengths 'This seems to be a good solution'

Notice that doing `¯1+expr` is a little "trick" you can employ when you want to subtract 1 from `expr`, but `expr` would then need parentheses if you were to have it on the left of the `-` sign. For example, to subtract 1 from `≢spaces` you would have to do `(≢spaces)-1` but instead you can do `¯1+≢spaces`.

Finally, can you improve your solution to handle multiple consecutive spaces?

In [97]:
Lengths 'This only    has      five words   '

Seeing how your function works with multiple consecutive spaces probably gives the solution away: consecutive spaces will make a 0 appear in the final result, so we just have to remove those:

In [98]:
]dinput
Lengths ← {
    spaces ← ⍸' '=' ',⍵,' '
    r ← ¯1+(1 Ignore spaces)-((¯1+≢spaces) Extract spaces)
    (0≠r)/r
}

In [99]:
Lengths 'This only    has      five words   '
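To see where the zeros come from, here is a small hand trace on a tiny, made-up sentence with two consecutive spaces. It repeats the arithmetic of `Lengths` with plain indexing and assumes `⎕IO←1`; the names `idx` and `r` are just local helpers for this sketch.

```apl
spaces ← ⍸' '=' ','a  b',' '          ⍝ 1 3 4 6
idx ← ⍳(≢spaces)-1                    ⍝ 1 2 3
r ← ¯1+spaces[1+idx]-spaces[idx]      ⍝ 1 0 1: the 0 is the empty "word" between the two spaces
0≠r                                   ⍝ 1 0 1
(0≠r)/r                               ⍝ 1 1
```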

**Exercise 6**:

We have seen in [a previous section](./Some-Primitive-Functions.ipynb#Basic-Usage) how to create any arithmetic sequence of integers. This is just a special case of the algorithm given, with `Step ← 1`:

In [100]:
]dinput
To ← {
    ⍺+¯1+⍳(1+⍵-⍺)
}

In [101]:
17 To 29
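The body of `To` can be read from right to left. The sketch below evaluates it piece by piece for the call above, with `⍺` equal to `17` and `⍵` equal to `29`, assuming `⎕IO←1`.

```apl
1+29-17         ⍝ 13: how many integers there are from 17 to 29
¯1+⍳13          ⍝ 0 1 2 3 4 5 6 7 8 9 10 11 12
17+¯1+⍳13       ⍝ 17 18 19 20 21 22 23 24 25 26 27 28 29
```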

**Exercise 7**:

This exercise is easier than it might look because the primitives to catenate vertically and horizontally, `⍪` and `,`, know how to deal with a matrix and a single scalar:

In [150]:
⊢towns ← 6 10⍴'Canberra  Paris     WashingtonMoscow    Martigues Mexico    '

In [151]:
towns,'|'

In [152]:
towns⍪'-'

So we just have to frame the four sides and *then* change the corners:

In [155]:
]dinput
Frame ← {
    f ← '|',⍵,'|'
    f ← '-'⍪f⍪'-'
    (r c) ← ⍴f
    f[1 r;1 c] ← '+'
    f
}

In [156]:
Frame towns

Here we used the very convenient indexing notation `f[1 r;1 c]` that allows us to access the positions `1 1`, `1 c`, `r 1` and `r c` of the matrix `f`.

Modifying this function to use the appropriate line-drawing symbols just means swapping the `'|-+'` in the original function. Care must be taken, however, when assigning the corners. With `f[1 r;1 c] ← m` APL expects `m` to be a scalar *or* an array with the same shape as the selection on the left, and since `f[1 r;1 c]` is a 2 by 2 matrix, we have to reshape the vector with the corners into a 2 by 2 matrix as well:

In [157]:
]dinput
Frame ← {
    f ← (⎕UCS 9474) , ⍵ , (⎕UCS 9474)
    f ← (⎕UCS 9472) ⍪ f ⍪ (⎕UCS 9472)
    (r c) ← ⍴f
    f[1 r;1 c] ← 2 2⍴⎕UCS 9484 9488 9492 9496
    f
}

In [158]:
Frame towns
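The corner assignment can be tried in isolation on a small matrix. In the sketch below the matrix `m` and the letters `'abcd'` are made up for illustration, and `⎕IO←1` is assumed.

```apl
m ← 3 3⍴'.'
⍴m[1 3;1 3]                  ⍝ 2 2: the selection is a 2 by 2 matrix
m[1 3;1 3] ← 2 2⍴'abcd'      ⍝ so we assign a 2 by 2 matrix to it
m
⍝ a.b
⍝ ...
⍝ c.d
```

The first row of the reshaped vector goes to the top corners and the second row to the bottom corners, which is why the corner characters in `Frame` are listed in the order top-left, top-right, bottom-left, bottom-right.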

**Exercise 8**:

Well, what if what we wrote actually works for vectors? Let's give it a try:

In [159]:
Frame 'We are not out of the wood'

RANK ERROR
Frame[4] f[1 r;1 c]←2 2⍴⎕UCS 9484 9488 9492 9496
          ∧


A `RANK ERROR`? That makes no sense: after I frame `⍵` with the horizontal and vertical bars I have a framed matrix, so I just need to update the corners... right? Wrong! Here's what happens if you use `,` and `⍪` on a vector:

In [110]:
text ← 'We are not out of the wood'
(⎕UCS 9472) ⍪ (⎕UCS 9474) , text , (⎕UCS 9474) ⍪ (⎕UCS 9472)

Because `text` has shape

In [111]:
⍴text

the primitives `,` and `⍪` work the same way: a vector has a single axis, so both catenate along it and the result is still a vector. We need to turn input vectors into matrices with 1 row before we proceed with the framing process.

Let us define `shape ← ⍴⍵` as the shape of the input `Frame` gets. If `⍵` is a matrix, then `shape` is the appropriate shape, otherwise we need `⍵` to be reshaped into `1,shape`. In traditional programming languages we could use an `if-else` statement. However, dfns do not have support for such flow control structures and so we need to handle this matter in a different way.

A possibility is to create the vector `v←1,shape` and then index into it with care. If `⍵` is a vector, `v` has 2 elements and we want both. If `⍵` is a matrix, `v` has 3 elements and we want the elements in positions `2 3`. A way of generating the indices `1 2` if `⍵` is a vector and `2 3` if `⍵` is a matrix is with the expression `0 1+≢⍴⍵`:

| `⍵` | `≢⍴⍵` | `0 1+≢⍴⍵` |
| :- | :-: | :-: |
| vector | 1 | 1 2 |
| matrix | 2 | 2 3 |
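The table can be checked directly in the session. The names `vec` and `mat` and their contents below are made up for this sketch, which assumes `⎕IO←1`.

```apl
vec ← 'abc'
mat ← 2 3⍴'abcdef'
0 1+≢⍴vec               ⍝ 1 2
0 1+≢⍴mat               ⍝ 2 3
v ← 1,⍴vec              ⍝ 1 3
v[0 1+≢⍴vec]            ⍝ 1 3: a vector becomes a matrix with 1 row
v ← 1,⍴mat              ⍝ 1 2 3
v[0 1+≢⍴mat]            ⍝ 2 3: a matrix keeps its own shape
```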

When implementing the function we don't need to actually create `v`:

In [160]:
]dinput
Frame ← {
    shape ← (1,⍴⍵)[0 1+≢⍴⍵]
    f ← shape⍴⍵
    f ← (⎕UCS 9474) , f , (⎕UCS 9474)
    f ← (⎕UCS 9472) ⍪ f ⍪ (⎕UCS 9472)
    (r c) ← ⍴f
    f[1 r;1 c] ← 2 2⍴⎕UCS 9484 9488 9492 9496
    f
}

In [161]:
Frame text

In [162]:
Frame towns

**Exercise 9**:

The logic for solving this task resembles what we did in the first exercises. First we will find *Where* the first letter is, and then we will use indexing to put the second letter in those positions:

In [168]:
]dinput
Switch1 ← {
    r ← ⍵
    r[⍸⍺[1]=⍵] ← ⍺[2]
    r
}

In [167]:
'tc' Switch1 'A bird in the hand is worth two in the bush'

We take the intermediate step of doing `r ← ⍵` because we can't assign to `⍵` and so `⍵[⍸⍺[1]=⍵] ← ⍺[2]` wouldn't work.

**Exercise 10**:

A very obvious modification of the function above is to write

In [171]:
]dinput
Switch2 ← {
    r ← ⍵
    r[⍸⍺[1]=⍵] ← ⍺[2]
    r[⍸⍺[2]=⍵] ← ⍺[1]
    r
}

In [172]:
'ei' Switch2 'A bird in the hand is worth two in the bush'
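Note that both comparisons in `Switch2` are made against the original `⍵`, not against the partially modified `r`. The hypothetical variant `BadSwitch` below, written only to illustrate the pitfall, compares against `r` instead and ends up undoing part of its own work.

```apl
BadSwitch ← {
    r ← ⍵
    r[⍸⍺[1]=r] ← ⍺[2]      ⍝ every 'e' becomes 'i'
    r[⍸⍺[2]=r] ← ⍺[1]      ⍝ every 'i' becomes 'e', including the ones just created
    r
}
'ei' BadSwitch 'pie'       ⍝ pee
'ei' Switch2 'pie'         ⍝ pei
```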

However, a really elegant solution becomes possible if we use the *Index Of* primitive and the concept of changing the frame of reference we discussed previously (cf. [Changing The Frame of Reference](./Some-Primitive-Functions.ipynb#Changing-The-Frame-of-Reference)). We used this concept to convert lower case letters into upper case letters.

In [173]:
]dinput
Switch2 ← {
    pos ← (⍺,⍵)⍳⍵
    (⍺[2 1],⍵)[pos]
}

In [174]:
'ei' Switch2 'A bird in the hand is worth two in the bush'

What exactly is happening? Well, we are basically establishing the initial and final sets (as seen in [Changing The Frame of Reference](./Some-Primitive-Functions.ipynb#Changing-The-Frame-of-Reference)) as the sentence itself, but preceded by the two characters. For the initial set, we have them in their input order (the `(⍺,⍵)` above) but for the final set we swap them (the `⍺[2 1],⍵` above).

This establishes the following "conversion":

```
eiA bird in the hand is worth two in the bush
ieA bird in the hand is worth two in the bush
```

Whenever I have a character, I look for it in the first line, stopping as soon as I find it (that is what `⍳` does) and then I swap it with the corresponding character in the line below.

We can thus re-implement `Switch2` with more intermediate steps, to make this more obvious:

In [175]:
]dinput
Switch2 ← {
    ⎕← initialSet ← ⍺,⍵
    pos ← initialSet⍳⍵
    ⎕← finalSet ← ⍺[2 1],⍵
    finalSet[pos]
}

In [176]:
'ei' Switch2 'A bird in the hand is worth two in the bush'