# Execute & Format Control

## Execute

### Definition

_Execute_ is a monadic function represented by `⍎`, which you can type with <kbd>APL</kbd>+<kbd>;</kbd>.
Its dyadic use will be explained in the Specialist's Section at the end of this chapter.

_Execute_ takes a character vector (or scalar) as its argument.

If the character vector represents a valid APL expression, _Execute_ will just... execute it, as if it had been typed on the keyboard.
Otherwise, an error will be reported.

Take a look at the following example:

In [1]:
⎕← letters ← '5×6+2'

In [2]:
⍎letters

The argument can contain any valid expression:
 - numeric or character constants, or variables;
 - left arrows (assignment) or right arrows (branch);
 - primitive or defined functions and operators; and
 - calls to other _execute_ functions.
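
As a minimal sketch of the last item in the list, the expressions below pass _execute_ a character vector that itself contains a call to _execute_, and one that contains doubled quotes delimiting character constants. The particular strings are illustrative assumptions, not examples taken from this chapter.

```apl
⍝ The inner expression is ⍎'2+3'; the quotes are doubled inside the outer literal.
⍎'⍎''2+3'''

⍝ Character constants inside the argument also need doubled quotes.
⍎'''abc'',''def'''
```

The first expression evaluates `2+3` through two levels of _execute_; the second catenates the two character constants.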

Let us define a short function to call from within _execute_:

In [3]:
∇ r ← x Plus y
    r ← x + y
∇

In the expression below, _execute_ calls our `Plus` function and creates a new variable:

In [4]:
⍎'new ← 3 + 4 Plus 5'

In [5]:
new

If the expression returns a result, it will be used as the result of _execute_:

In [6]:
res ← ⍎'3 Plus 10'

In [7]:
res

We could just as well have written

In [8]:
⍎'res ← 3 Plus 10'

In [9]:
res

<!--begin beware style=warning-->
***Beware***:

 > If the argument does not return a result, it can still be executed, but _execute_ itself will not return a result; any attempt to assign it to a variable or to use it in any other way will cause a `VALUE ERROR`.
<!--end-->

Take a modification of our `Plus` function that returns no result:

In [10]:
∇ x PlusNoRes y;r
    r ← x+y
∇

These expressions all work:

In [11]:
⍎''
⍎'     '
⍎'3 PlusNoRes 5'

But trying to assign the result of any of those expressions will fail,

In [12]:
res ← ⍎''
res ← ⍎'     '
res ← ⍎'3 PlusNoRes 5'

VALUE ERROR: No result was provided when the context expected one
      res←⍎''
          ∧
VALUE ERROR: No result was provided when the context expected one
      res←⍎'     '
          ∧
⍎VALUE ERROR: No result was provided when the context expected one
      3 PlusNoRes 5
        ∧


because that is equivalent to writing the following:

In [13]:
res ← 
res ←       
res ← 3 PlusNoRes 5

SYNTAX ERROR: Missing right argument
      res←
         ∧
SYNTAX ERROR: Missing right argument
      res←
         ∧
VALUE ERROR: No result was provided when the context expected one
      res←3 PlusNoRes 5
            ∧


### Some Typical Uses

#### Convert Text into Numbers

_Execute_ may be used to convert characters into numbers.
One common application of _execute_ is to convert numeric data stored as character strings in a text file (for example, a `.csv` file) into binary numbers.
You can just read in a string such as `'123, 456, 789'` and execute it to obtain the corresponding 3-item vector:

In [14]:
⍎'123, 456, 789'

We saw in [the chapter about user-defined functions](./User-Defined-Functions.ipynb) that _format_ can be used to convert numbers to characters; the reverse can be done using _execute_.
This explains why those two functions are represented by "reversed" symbols, as shown in <!--figure-->the figure below<!--Dual_Execute_Format-->:

![Representation of the duality between the _execute_ and _format_ primitives.](res/Dual_Execute_Format.png)

There is, however, a major difference: _format_ can be applied to matrices, whereas _execute_ can only be applied to vectors.

In [15]:
birthdate ← 'October 14th, 1952'
+/ ⍎ ⎕←birthdate[9 10,13+⍳5]

Notice that the `'14 1952'` above is a character _vector_ of length 7, and not a vector of 2 character vectors.
In fact, compare

In [16]:
birthdate[9 10,13+⍳5]

In [17]:
≢birthdate[9 10,13+⍳5]

with

In [18]:
'14' '1952'

to which `⍎` _cannot_ be applied:

In [19]:
⍎'14' '1952'

DOMAIN ERROR
      ⍎'14' '1952'
      ∧


Because _execute_ can only be applied to vectors, a matrix of numeric characters can only be converted after it has been ravelled.
But to avoid characters of one row being attached to those of the previous row, it is necessary to catenate a blank character before ravelling.

As an example, take the matrix `mat` below:

In [20]:
⎕← mat ← 4 4⍴' 8451237 9332607'

If we ravel it and execute it, we get

In [21]:
⍎,mat

which is not what we want.
The correct conversion will be obtained by first catenating a blank space:

In [22]:
⍎,mat,' '

#### A Safer and Faster Solution

Using _execute_ to convert characters into numbers may cause errors if the characters do not represent valid numbers.
So, we strongly recommend that you instead use `⎕VFI` (for _Verify and Fix Input_).
This is a specialised _system function_ that performs the same conversion, but securely, and is about twice as fast as _execute_.
`⎕VFI` will be studied in [the chapter about system interfaces](./System-Interfaces.ipynb).
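
As a brief preview, and only as a hedged sketch (the input strings and the names `valid` and `nums` are illustrative assumptions), `⎕VFI` returns a 2-item vector: a Boolean vector flagging which tokens are well-formed numbers, and the converted values.

```apl
⍝ Split the 2-item result into validity flags and converted values.
(valid nums) ← ⎕VFI '123 456 abc 789'
valid          ⍝ 1 for each token that is a well-formed number, 0 otherwise
nums           ⍝ the converted values, with 0 in place of each invalid token
valid/nums     ⍝ keep only the values that were valid

⍝ An optional left argument lists the characters to treat as separators.
','⎕VFI '123,456,789'
```

Unlike _execute_, an invalid token such as `abc` is flagged rather than evaluated, so no error is signalled.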

#### Other Uses

_Execute_ can be used for many other purposes, including some that may be considered to be rather advanced programming techniques.
Some examples are provided in the Specialist's Section at the end of this chapter:
 - conditional execution (rather obsolete);
 - case selection (also obsolete);
 - dynamic variable creation.

Please bear in mind that these _execute_ use-cases aren't necessarily _recommended_ programming practices.

### Make Things Simple

The vector passed in to _execute_ is often constructed by catenating pieces of text, or tokens.

These tokens may contain quotes (which must then be doubled), commas, parentheses, etc.
But to build the final expression, you will also need quotes (to delimit the tokens), commas (to concatenate them), parentheses, and so on.

By now, the expression is becoming extremely complex.
It may be difficult to see if a comma is part of a token or is being used to concatenate two successive tokens, and this is only partly alleviated by the syntax colouring that modern IDEs provide.
It may be hard to see whether or not the parentheses and quotes are properly balanced.
If the final expression is wrong, fixing it might be difficult, and if it is correct, later modifying it or expanding it might be just as difficult.

To simplify maintenance, it is good practice to assign the text to a variable before executing it.
If the operation fails, for any reason, you can just display the variable to see if it looks correct.
For example, here is a statement involving _execute_:

In [23]:
size ← 43
⍎'tab',(⍕size),'←(4 ',(⍕size),'⍴'') '''

⍎SYNTAX ERROR
      tab43←(4 43⍴') '
                 ∧


That's rather obscure!
If any problem occurs, it can be difficult to spot the cause.

Let us insert a variable just before the _execute_ function:

In [24]:
⍎debug←'tab',(⍕size),'←(4 ',(⍕size),'⍴'') '''

⍎SYNTAX ERROR
      tab43←(4 43⍴') '
                 ∧


Now, it is easy to look at `debug` and see if its value is what we would expect:

In [25]:
debug

Obviously, this is not a correct statement, so it failed when we tried to execute it.
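
For comparison, here is a sketch of a corrected statement. It assumes that the intent was to create a 4-row matrix of blanks named `tab43`; that intent is a guess based on the text stored in `debug`, not something stated in this chapter.

```apl
size ← 43
⍝ Assumed intent: build the text  tab43←(4 43⍴' ')
debug ← 'tab',(⍕size),'←(4 ',(⍕size),'⍴'' '')'
debug            ⍝ inspect the text before executing it
⍎debug
⍴tab43           ⍝ the new variable should have shape 4 43
```

Keeping the text in `debug` means the same inspection step works whether or not the statement turns out to be correct.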

## The Format Primitive

The _format_ primitive function `⍕`, typed with <kbd>APL</kbd>+<kbd>'</kbd>, has already been briefly described in [a previous section](./User-Defined-Functions.ipynb#Format).
We shall cover it in more depth in this section.

### Monadic Format

Monadic _format_ converts any array, whatever its value, into its character representation.
This applies to numbers, characters, and nested arrays.
The result is exactly the same as you would see if you displayed the array on your screen, because APL internally uses monadic _format_ to display arrays.
The previous statement assumes that you have no options modifying your output; for example, with `]box on`, _format_ no longer produces exactly the same representation as displaying the array yourself:

In [26]:
⍕1 (2 3)

In [27]:
]box

In [28]:
1 (2 3)

Ignoring the effects of session modifiers such as `]box`, which will be covered in more detail later in this chapter (see [this section](#Output-User-Commands)), monadic _format_ is such that:
 - character arrays are not converted, they remain unchanged; and
 - numeric and nested arrays are converted into vectors or matrices of characters.

In [29]:
⎕←chemistry ← 3 5⍴'H2SO4CaCO3Fe2O3'

`chemistry` is a character matrix with shape `3 5`, and it is not modified by `⍕`:

In [30]:
⍴⎕←⍕chemistry

In [31]:
chemistry≡⍕chemistry

On the other hand, a numeric vector with, say, 3 items, becomes a (longer) character vector once converted:

In [32]:
≢52 69 76

In [33]:
≢⎕←⍕52 69 76

A nested matrix like `nesMat`,

In [34]:
⍴⎕←nesMat ← 2 3 ⍴ 'Dyalog' 44 'Hello' 27 (2 2 ⍴ 8 6 2 4) (2 3⍴1 2 0 0 0 5)

which we have already used before, becomes a character matrix that is 20 characters wide and has 3 rows, because `nesMat` contains two small matrices:

In [35]:
⍴⎕←⍕nesMat

### Dyadic Format

#### Definition

Dyadic _format_ applies **only** to numeric values; any attempt to apply it to characters will cause a `DOMAIN ERROR`.

The general syntax of _format_ is `descriptor⍕values`,
where `values` can be an array of any _rank_.
Dyadic _format_ converts numbers into text in a format that is described by the left argument, the format _descriptor_.
In its basic form, `descriptor` is made up of two numbers:
 - the first number indicates the number of characters to be assigned to each numeric value; or to put it another way, the width of the field in which each numeric value is to be represented; and
 - the second number indicates how many decimal digits will be displayed.

In [36]:
⎕RL ← 73
⎕←nm ← (?3 4⍴200000)÷100

The representation above is the normal display of the `nm` matrix, and it is also how monadic _format_ would present the matrix.

Compare that with the result below, where we represent each number right-aligned in a field that is 8 characters wide, with 2 decimal digits.

In [46]:
8 2⍕nm

The result has, of course, 3 rows and 32 columns (8 characters for each of the 4 columns):

In [47]:
⍴8 2⍕nm

We can also draw a basic ruler (with the help of a short dfn) below the formatted matrix to help you count the width of each field:

In [69]:
]dinput
ruler ← {
    c ← ¯1↑⍴⍵
    ⍵⍪c⍴4 1 4 1/'¯''¯|'
}

In [70]:
ruler 8 2⍕nm

In the next example we represent each number in a field that is 6 characters wide, right-aligned, and with no decimal digits.

In [74]:
ruler 6 0⍕nm

The result now has 3 rows and 24 columns:

In [72]:
⍴6 0⍕nm

<!--begin remark-->
***Remark***:

 > You can see that the numbers to be formatted are **rounded** rather than truncated when the specified format does not allow the full precision of the numbers to be shown.
<!--end-->
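
A small sketch of the rounding behaviour described in the remark, using values chosen here purely for illustration (they are not taken from `nm`): with one decimal digit, `0.26` and `0.24` should be shown as `0.3` and `0.2`, and with no decimal digits `1999.6` should be shown as `2000`.

```apl
5 1⍕0.26 0.24      ⍝ one decimal digit: the second decimal is rounded away
6 0⍕1999.6         ⍝ no decimal digits: rounded to the nearest integer
```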

#### Overflow

If a column is not wide enough to represent some of the numbers, these numbers will be replaced by asterisks.

Recall what `nm` looked like:

In [73]:
nm

We have seen above that `8 2⍕nm` looked good:

In [54]:
8 2⍕nm

If we reduce the width of the columns by 1, some values will now be adjacent to the values immediately to their left, making them difficult to read. For example, the second column will be adjacent to the left column:

In [57]:
7 2⍕nm

If we further reduce the width of the columns, the largest values will no longer fit in their allotted space and will be replaced by asterisks. Most of the other numbers are now adjacent to their neighbours.

In [59]:
6 2⍕nm

<!--begin remark-->
***Remark***:

 > To calculate the width required to represent a number you must account for the minus sign, the integer digits, the decimal point, and as many decimal digits as specified in the _descriptor_.
<!--end-->
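
The calculation in the remark can be sketched as a short dfn. `FieldWidth` is a hypothetical helper introduced here for illustration; it takes the number of decimal digits as its left argument and a single number as its right argument. It ignores the case where rounding adds an integer digit (for example, `999.999` shown with 2 decimals).

```apl
⍝ minus sign + integer digits + decimal point + decimal digits
FieldWidth ← {(⍵<0)+(1+⌊10⍟1⌈|⍵)+(⍺>0)+⍺}

2 FieldWidth 1234.5      ⍝ → 7, the width of 1234.50
2 FieldWidth ¯12.345     ⍝ → 6, the width of ¯12.35
0 FieldWidth 845         ⍝ → 3, no decimal point is needed
```

A field narrower than this width would overflow, while a field of exactly this width would leave no blank between the number and its left neighbour.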

#### Multiple Specifications

One can define a different format for each column of numbers.
Each format definition is made of 2 numbers, so if the matrix has `n` columns, the left argument must have `2×n` items:

In [75]:
ruler 8 2 10 0 9 4 8 2⍕nm

In this case, we formatted the first column with `8 2`, the second column with `10 0`, the third column with `9 4`, and the fourth column with `8 2` again.

If the format descriptor (the left argument) does not contain enough pairs of values, it will be repeated as many times as needed, provided that the width of the matrix is a multiple of the number of pairs.

In other words, in `desc⍕values`, the residue `(≢desc)|2×¯1↑⍴values` must be equal to 0, otherwise a `LENGTH ERROR` is reported.
For example,

In [78]:
8 2 10 0⍕nm

In the above, columns 3 and 4 reused the _descriptors_ of columns 1 and 2, respectively `8 2` and `10 0`.
This is equivalent to writing the repeated pairs by hand:

In [80]:
8 2 10 0 8 2 10 0⍕nm

We can also compare `8 2 10 0⍕nm` with the statement above about the residue.
We see that `≢desc` is `4` in this case, and we have

In [81]:
4|2×¯1↑⍴nm
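
Conversely, here is a sketch of a descriptor that breaks the rule. The matrix `m` is a hypothetical 4-column example; a 6-item descriptor (3 pairs) cannot be repeated evenly over 4 columns, so, according to the rule stated above, the last expression is expected to signal a `LENGTH ERROR`.

```apl
m ← 3 4⍴⍳12            ⍝ hypothetical matrix with 4 columns
6|2×¯1↑⍴m              ⍝ the residue is not 0
8 2 10 0 9 4⍕m         ⍝ 3 pairs for 4 columns: expected to fail
```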

#### Scalar Descriptor

When the _descriptor_ is reduced to a simple scalar, it specifies the number of decimal digits.
The columns are formatted in the smallest width compatible with the values they contain, plus one separating space.
For example:

In [82]:
2⍕nm

In the example above we see that numbers are displayed with 2 decimal digits and each column is separated from the preceding one (and from the left margin too!) by a single space.

This technique is convenient for experimental purposes, to have the most compact presentation possible, but you cannot control the total width of the final result with it.

## The `⎕FMT` System Function

The _format_ primitive function is inadequate 