# A Programmer's Guide To Octave

Adapted from Mike James [Programmers Guide to Octave](https://www.i-programmer.info/programming/other-languages/4779-a-programmers-guide-to-octave.html) by John Minter and converted into a Jupyter Notebook with an Octave Kernel.

N.B. - the section on Range indexes needs some work...

## Introduction

This guide gives you the essence of Octave. It is designed to answer the questions that arise when you first encounter the language with some knowledge of any other language, no matter how slight. It isn't an in-depth account and there are lots of topics that aren't even mentioned in passing. The aim is to get you started and to provide an orientation that will make exploring the rest of Octave easier.

Octave is an interpreted language that is pragmatic rather than theoretically pure. It gets the job done, but there aren't many interesting language features that are going to make you think hard about programming. However, if you have some math you want to get done, Octave will enable you to solve the problem fast. You can also think of it as the math companion of the R statistical language. Octave is ideal for general math or when you need to implement some general statistical algorithm and play with it.

## Getting Octave

One can download Octave binaries from the Octave [Download](https://www.gnu.org/software/octave/download.html) site. You will notice five tabs: **Source**, **GNU/Linux**, **macOS**, **BSD**, and **Windows**.

- The **Source** tab provides a link to the appropriate [ftp](https://ftpmirror.gnu.org/octave) mirror.

- The **GNU/Linux** tab notes that most distributions have Octave in their package managers. Those that support [**Flatpak**](https://flatpak.org/setup/) can get the software from [**Flathub**](https://flathub.org/apps/details/org.octave.Octave).

- The **macOS** tab notes that one can get the application from [Homebrew](https://brew.sh/), [MacPorts](https://www.macports.org/), [Fink](http://www.finkproject.org/), or as an [App Bundle](https://octave-app.org/Download.html). I am currently running the Octave 5.1.1 [Developer](http://octave-app.org/Developer-Downloads.html) App bundle on macOS Mojave.

- The **Windows** tab lists the current installers.

The installer will guide you through the steps needed to install Octave and you should accept the defaults unless you have a good reason not to.

The latest version of Octave comes with a GUI complete with a REPL that allows you to type in commands and see the result at once.

![command window](inc/cmd-win-1.png)

## Programming

Programming in Octave is slightly different from most languages - as it is a persistent programming environment. What this means is that you work at a command prompt and any variables you create persist for the session. Anything you type is evaluated as soon as you press the return key and the result is displayed - unless you finish the line with a semi-colon when the output is suppressed.

Note that if you type `A=10` you get output:

In [1]:
A=10

A =  10


If you later type `A` Octave returns the value:

In [2]:
A

A =  10


Once you have created a variable at the command line, it stores whatever value you have given it until you close Octave down or use the clear command.

If you type `clear` then all variables are removed from the environment, and if you then type `A` you will see an error message along the lines of `error: 'A' undefined`.
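
As a minimal sketch of this persistence, the snippet below creates a variable, checks that it exists, clears it and checks again. The names `beforeClear` and `afterClear` are made up for the illustration, and it assumes nothing else called `A` (such as a file `A.m`) is visible to Octave.

```octave
% the trailing semicolon suppresses the echo
A = 10;
% exist returns 1 when the name refers to a variable
beforeClear = exist("A")
% clear followed by a name removes just that variable; plain clear removes them all
clear A
% exist returns 0 when nothing with that name is defined
afterClear = exist("A")
```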

This retained or persistent mode is great for interactive programming. For example, you can read or type in a data matrix and then try out commands to compute whatever you are trying to get from it.

This is a good way to learn Octave, but at some point you will want to write something that looks more like a traditional program that you load and run. This is very easy, but it is important to realize that it all works within the persistent environment: code that you read in from a file is treated just like code you type in. That is, after the code has run, the environment has changed according to what variables the code created and modified. This means you can run one program on some data you typed in and then run another program to process the results of the first.

If this very interactive style of working worries you - don't fret because you can create programs that work in isolation simply by starting them off with the clear command.

## Writing Programs

So how do you create an Octave program?

The answer is you can simply put the code that you would have typed at the command line into a file with a name that ends in .m.

The GUI comes with an editor, but you can use any editor you like; NotePad and NotePad++ are good choices. You can invoke the editor within Octave by typing `edit filename`. Which editor you get depends on the system and how it has been configured. If you don't change the default configuration you will get the built-in GUI editor, which is what I recommend using, at least to get started.

To run an Octave program you can simply type its name, without the .m extension, at the command prompt. Of course the file has to be stored in the current directory; the `ls` command will list the contents of the current directory and the `cd` command will change the directory in the usual way.

If you have created a program in a file and when you type the name Octave cannot find it and reports an "undefined" error then it is most likely that the file isn't in the current directory.

The new GUI editor provides features such as syntax highlighting and you can use it to directly run a program and even set breakpoints for debugging. 

For example, start the editor using File, New, New Script and enter the commands:

```
A=10;
A
```

![Octave's editor](inc/editor.png)

You can run it by clicking on the Save and Run icon (the gear wheel with the arrow) and giving the file the name `test.m`.

The only question is where is the output?

The answer is in the Command Window, which you can switch to using the tabs at the bottom of the screen. A better idea is to drag the Command Window to a new location and tile it so that you can see both it and the editor.

When you do see the command window you will see displayed:

![command window](inc/cmd-win-2.png)

Notice that after the program has run, A is still defined and has the value 10 so you can carry on using it by typing in commands at the command prompt or by running other programs that use a variable called A.

You can see what variables have been defined in the Workspace window along with information about their type:


In [3]:
A

A =  10



Also notice that Octave is case sensitive, so  if you saved the file as "Test.m"  you have to type "Test" whenever you want it;  "test", "TEST", "tesT" or any other variation just doesn't work.

If the file contains a function - see later - then you can call the function just by typing the file name followed by parentheses and any arguments. For example, if Test.m contains a function you can write `Test()` to run the function.

## Matrices

The central data structure in Octave is the matrix and the sooner we get to grips with it the better. All data in Octave is a matrix; even a single number is a 1x1 matrix.

You define a variable by using it, i.e. by assigning it a value. Variables are not typed and you can store anything in a variable at any time as long as it makes sense.

Numbers can be integer, floating point and complex. Complex values use i or j for the imaginary component.

For example:


```
A=1      % integer value
A=0.1    % floating point
A=1E3    % floating point in exponent notation
A=1+2i   % complex
```

Note:  `i` has to trail the value in a complex number - `1+2i`, **not** `1+i2`, and there can be no spaces between the i and the number...

By default all numeric values in Octave are stored as double precision values (**including integers**). There are built-in functions for working with real and complex values.

There is also a special missing data value, `NA`, which can be used to implement statistical procedures that recognize missing data.

Logical values are represented by 1 as **true** and 0 as **false**. There are also strings that can be stored in variables and manipulated, and these work in the way you would expect.
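
The following sketch inspects a few of these values with the built-in functions `class`, `iscomplex` and `isna`. The variable names `x`, `missing` and `s` are chosen only for the illustration.

```octave
class(1)            % double: a plain integer is stored as a double
class(0.1)          % double
class(1+2i)         % double as well, with an imaginary part
iscomplex(1+2i)     % 1, i.e. true
x = [1, NA, 3];     % a vector with one missing value
missing = isna(x)   % 0 1 0: only the second entry is missing
class(true)         % logical
s = "hello";
class(s)            % char: strings are arrays of characters
```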

What makes Octave special is the ease with which you can create matrices and work with them.

A matrix is defined using square brackets to contain a list of numbers. Matrices are one or two-dimensional. Working with multidimensional structures is possible but more complicated.

When typing in a matrix the comma separator means move on one column and the semicolon means move on one row.

So for example:

In [4]:
A=[1,2,3;4,5,6]

A =

   1   2   3
   4   5   6



defines a 2x3 matrix.

Matrices cannot be irregular - if the first row has three values then the subsequent rows must have three values.
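
As a small sketch of what happens when the rows do not match, the snippet below tries to stack a three-element row on a two-element row and catches the resulting error. The names `row1`, `row2`, `M` and `failed` are illustrative only, and the exact wording of the message may differ between Octave versions.

```octave
row1 = [1,2,3];
row2 = [4,5];          % one element short
failed = false;
try
  M = [row1; row2];    % rows of different length cannot be stacked
catch err
  failed = true;
  disp(err.message)    % reports a dimension mismatch
end_try_catch
```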

Obviously,

In [5]:
A=[1,2,3]

A =

   1   2   3



is a row vector and

In [6]:
A=[1;2;3]

A =

   1
   2
   3



is a column vector.

## Arithmetic with Matrices

The important thing about matrices in Octave is that the arithmetic operators work as you would expect from math rather than programming.

As long as the matrices and scalars involved in the expression are conformable then the operation will be a matrix operation.

For example:

In [7]:
A=[1,2,3;4,5,6]
B=[7,8;9,10;11,12]
A*B

A =

   1   2   3
   4   5   6

B =

    7    8
    9   10
   11   12

ans =

    58    64
   139   154



will multiply the two matrices together. If 

In [8]:
C=10

C =  10


then

In [9]:
C*A

ans =

   10   20   30
   40   50   60



is a scalar multiplication of each element of A by 10.

The only variation in the way matrices are treated is that you can opt for an element-by-element operation by putting a dot before the operator.

For example,

In [10]:
A=[1,2,3;4,5,6]
B=[2,4,7;1,2,3]
A.*B


A =

   1   2   3
   4   5   6

B =

   2   4   7
   1   2   3

ans =

    2    8   21
    4   10   18



performs an element-by-element multiplication of the two matrices and not a matrix multiplication i.e. a<sub>ij</sub>b<sub>ij</sub>.

There are two specifically matrix operations that we need to know about: the transpose and the inverse. The single quote performs a complex (conjugate) transpose, e.g.

In [11]:
A=[1,2,3]'

A =

   1
   2
   3



is a column vector.

For a real matrix the single quote is a simple transpose. If the matrix is complex the single quote also takes the complex conjugate (i.e. it is the Hermitian transpose). If you want a simple transpose of a complex matrix then use dot single quote, `.'`.
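
The difference between the two transposes only shows up with complex data. Here is a short sketch, using made-up names `Z`, `H`, `T` and `R`:

```octave
Z = [1+2i, 3-4i];    % a 1x2 complex row vector
H = Z'               % conjugate transpose: [1-2i; 3+4i]
T = Z.'              % plain transpose:     [1+2i; 3-4i]
R = [1, 2, 3];
isequal(R', R.')     % 1: for real data the two operators agree
```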

The second is the inverse which is more complicated. You can use the inverse function to find the inverse of any square non-singular matrix. For example:

In [12]:
A = [1,2;3,4]
B = inverse(A)
A*B

A =

   1   2
   3   4

B =

  -2.00000   1.00000
   1.50000  -0.50000

ans =

   1.00000   0.00000
   0.00000   1.00000



displays the identity matrix.

There is another way to use the inverse via an operator.

The expression `x\y` is the left division of `y` by `x` and is equivalent to

```
inverse(x)*y
```

The advantage of using this notation is that the inverse isn't actually used in the calculation.

The expression `x/y` is the right division of `x` by `y` and it is equivalent to

```
x*inverse(y)
```

Again the inverse matrix is never computed and generalized inverses are used if necessary.
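
As a sketch of how the division operators relate to the inverse, the snippet below solves a small system both ways. It uses `inv`, which is assumed here to compute the same thing as the `inverse` function used above, and the names `M`, `b`, `c`, `x1`, `x2`, `y1` and `y2` are illustrative only.

```octave
M = [1,2;3,4];
b = [5;6];
x1 = M\b          % solves M*x = b: -4 and 4.5
x2 = inv(M)*b     % same answer via the explicit inverse
c = [5,6];
y1 = c/M          % solves y*M = c: -1 and 2
y2 = c*inv(M)     % same answer via the explicit inverse
```

Small rounding differences between the two routes are possible, so compare results with a tolerance rather than testing for exact equality.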

## Indexing

In an ideal world we would just define some matrices and get on with combining them using matrix arithmetic. In practice matrices are often the wrong shape for the job and we need to get at sub-matrices or even single elements.

This is what indexing is all about - specifying sub-matrices.

The indexing operator is `()` and if you specify index values for each dimension of a matrix then things work as you would expect. For example:

In [13]:
A(1,2)

ans =  2


is the value in row 1 column 2. 

You can assign a new value to a single element e.g.

In [14]:
A(1,2)=3

A =

   1   3
   3   4



If you supply just one index for a 2D matrix then it is treated as a single 1D column vector obtained by stacking up each column of the original matrix.
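
A short sketch of single-index access, using a separate matrix `M` so that `A` is left alone:

```octave
M = [1,2;3,4];
M(2)            % 3: the second element going down the first column
M(3)            % 2: the first element of the second column
M([1,2,3,4])    % 1 3 2 4: every element in column order
```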

To define sub-matrices you need to use more complicated indexing. There are two approaches - vectors of simple indexes or ranges.

## Vector Indexes
A vector of indexes just picks out the combined set of elements that each index would pick out. For example:

In [15]:
A([1,2],1)

ans =

   1
   3



picks out `A(1,1)` and `A(2,1)` and the result is a column vector because you have specified part of a column of the original matrix.

It doesn't matter if the vector of indexes is a row or column vector. That is `A([1;2],1)` is the same as `A([1,2],1)`.

You can use vector indexes in both index positions. For example:

In [16]:
A([1,2],[1,2])

ans =

   1   3
   3   4



picks out `A(1,1)`, `A(2,1)`, `A(1,2)` and `A(2,2)` which is returned as a `2x2` matrix because this is the "shape" of the sub matrix in the original array.

The most important thing to notice about vector indexes is that they allow you to pick out non-contiguous columns and rows. For example:

In [17]:
A = [1,2,3;4,5,6;7,8,9]
A([1,3],[1,2])

A =

   1   2   3
   4   5   6
   7   8   9

ans =

   1   2
   7   8



also picks out a `2x2` matrix but it takes the intersection of the first and third rows and the first and second columns.

In general if you write something like:

```
Matrix(v1, v2)
```

where v1 and v2 are vectors then this gives a matrix made up of the rows specified by v1 and the columns specified by v2.

You can use variables within vector indexing, for example:

In [18]:
v1=[1,3]
v2=[1,2]
A(v1,v2)

v1 =

   1   3

v2 =

   1   2

ans =

   1   2
   7   8



works and is the same as:

In [19]:
A([1,3],[1,2])

ans =

   1   2
   7   8



Also:

In [20]:
s=1
A([s,3],[1,2])

s =  1
ans =

   1   2
   7   8



Notice that any sub-matrix you specify can be retrieved or you can assign a matrix of the same size to it. For example:

In [21]:
A([1,3],[1,2])=[0,0;0,0]

A =

   0   0   3
   4   5   6
   0   0   9



zeros the intersection of the first and third rows and the first and second columns.

## Range indexes

Vector indexes work well when you want to select a small number of rows and columns but they are a lot of work if the number increases. For example how would you select the first 1000 elements of column 1? Clearly


```
A([1,2,3,4....],1)
```

isn't going to be easy typing.

The solution is to use a range. A range represents a numerical range of values. For example, 1:10 generates the numbers 1 to 10. You can also specify a step size, so for example 1:2:7 (or 1:2:8) generates 1,3,5,7. Using a range you can pick out lots of rows and columns very easily but only if they have a simple numerical pattern.
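
A few ranges typed at the prompt make the rules concrete. The names `r1` to `r4` are used only for this sketch:

```octave
r1 = 1:5          % 1 2 3 4 5
r2 = 1:2:7        % 1 3 5 7
r3 = 1:2:8        % also 1 3 5 7: the range never goes past the end value
r4 = 10:-3:1      % 10 7 4 1: a negative step counts down
numel(1:1000)     % 1000 values without typing them out
```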

For example to pick out the first 1000 elements in column 1 you would write:

```
A=(1:2000)';   % the transpose makes A a 2000x1 column vector
A(1:1000,1)
```

You can write:


```
A([1:1000],1)
```

and get the same 1000 rows, but the square brackets are unnecessary because a range is already interpreted as a row vector.

If you would like every other row of the first 1000 you can use:

```
A([1:2:1000],1)
```

In general a range is specified as

```
start:increment:end
```
and if you leave out the increment it is assumed to be 1 and the range is

```
start:end
```

The increment can be negative.

You can also specify a default range as just a : which means the entire possible range. So

```
A(:,1)
```

means all the rows and column 1 i.e. all of column 1 and

```
A(1,:)
```

means all of the columns and row 1 and finally

```
A(:,:)
```

is the same as A.


You can also use A(:) to mean all of the rows and columns returned as a single column vector in column order.
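
The colon forms are easiest to see on a small matrix. This sketch uses a 2x3 matrix `M` and made-up result names:

```octave
M = [1,2,3;4,5,6];
col1 = M(:,1)              % all rows of column 1: [1; 4]
row1 = M(1,:)              % all columns of row 1: 1 2 3
same = isequal(M(:,:), M)  % 1: M(:,:) is M itself
stacked = M(:)             % one column in column order: 1 4 2 5 3 6
```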

As with vector indexes you can use variables to specify the start, stop and increment values.

You can also assign to the selected sub-matrix if what you are assigning has the same size as the sub-matrix.

Finally you can mix vector and range indexing.

If you want to pick out the first 1000 rows and the 1500th row you can write:

```
A([1:1000,1500],:)
```

Also notice that an index can occur more than once in vector, range and mixed indexing. If this happens the column or row is included more than once. This can be used to construct larger matrices from smaller ones - more on this in the next section. For example:

```
B=[1,2,3]
C=B([1,1,1,1],:)
```

makes C a 4x3 matrix with identical rows.


## Defining Matrices

Often you need to define regular matrices, i.e. that have a regular pattern of entries. For example:

In [22]:
A=eye(3)

A =

Diagonal Matrix

   1   0   0
   0   1   0
   0   0   1



sets up a 3x3 identity matrix usually represented by I, hence the name of the function. In general eye(m,n) is a unit diagonal m x n matrix.

You can set up a matrix of 1s using ones(n) or ones(m,n) for an n x n or m x n matrix. There is also zeros, which works in the same way.
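
A quick sketch of these constructors, with illustrative names `U`, `Z`, `E` and `F`:

```octave
U = ones(2,3)      % 2x3 matrix of 1s
Z = zeros(2)       % 2x2 matrix of 0s
E = eye(2,3)       % 1s on the main diagonal, 0s elsewhere
F = 7*ones(2,2)    % scaling ones gives any constant matrix, here all 7s
```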

There are various other functions that will create other constant, periodic or well known matrices. These you can look up in the documentation and they are not difficult to use.

There is one other general idiom used a lot in Octave to build larger matrices from smaller ones. You can build a bigger matrix by putting together, i.e. concatenating, smaller matrices. For this to work the sub-matrices have to be the right size to fit together to make a matrix without any holes or ragged edges.

For example if

In [23]:
A=[1,2]
B=[3,4]

A =

   1   2

B =

   3   4



then

In [24]:
C=[A;B]

C =

   1   2
   3   4



i.e. a 2x2 matrix.
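
The comma and semicolon work the same way for blocks as for single numbers. A small sketch, using the made-up names `P`, `Q`, `side`, `stack` and `wide`:

```octave
P = [1,2];
Q = [3,4];
side = [P, Q]            % side by side: 1 2 3 4
stack = [P; Q]           % one above the other: a 2x2 matrix
wide = [stack, [5;6]]    % a 2x1 column fits beside a 2x2 block, giving 2x3
```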

This can become confusing but as long as you keep a clear head and think of it as fitting a jigsaw together it all works.

For example if C is the 2x2 matrix given above then:

In [25]:
D=[C,C;C,C]

D =

   1   2   1   2
   3   4   3   4
   1   2   1   2
   3   4   3   4



is a 4x4 square matrix made up of four copies of C.

## Functions

There aren't many places that Octave differs from other languages but its treatment of functions is worth noting.

At its simplest a function is defined using:

```
function name
 Octave commands
endfunction
```

You can define a function interactively at the command prompt or in a file of the same name. You can include multiple functions within a file but if you want to load the function by using its name the file and the function have to have the same name.

If you want to include parameters you can in the usual way. It is the way return values are specified that makes Octave slightly different from the norm. In most languages you finish a function with a return value statement which not only brings the function to a close but returns the value specified. This is not how Octave works.

In Octave you specify the return variables at the start of the function. That is

```
function variable= name(parameters) 
 Octave commands
endfunction
```

When the function finishes any value stored in the specified variable is returned as the result of the function.

You can also exit a function using the return command but you don't have to specify the return value as this is simply the value in the return variable.

For example:

```
function myResult=myFunction(myParameter) 
 myResult=myParameter;
endfunction
```

With this definition the command

```
value=myFunction(1)
```

stores 1 in value because myParameter is set to 1 and the function sets myResult to whatever myParameter is and this is the variable that has the return value.
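
The `return` command mentioned above can be sketched as follows. The function name `clampPositive` is hypothetical, and the leading `1;` is only there so that the code also works if saved as a script file, which must not begin with a function definition.

```octave
1;  % leading statement so a saved script is not mistaken for a function file

function result = clampPositive(x)
  result = 0;        % default held in the return variable
  if x <= 0
    return;          % leave early; the caller receives result as it stands
  endif
  result = x;
endfunction

low = clampPositive(-5)    % 0
high = clampPositive(7)    % 7
```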

A function to add two numbers or matrices would be:

In [26]:
function c=add(a,b)
 c=a+b; 
endfunction

answer=add(1,2)

answer =  3


You can specify multiple return variables as a row vector.

For example if you want a function that returns the sum and difference of two values you could write:

In [27]:
function [s,d] =sumdiff(a,b)
 s=a+b;
 d=a-b;
endfunction

To use this, assign the results to variables in a row vector, e.g.

In [28]:
[x,y]=sumdiff(1,2)

x =  3
y = -1


stores 3 in x and -1 in y.

There are some utility functions that modify the way multiple return values work but they are easy to work out once you have seen the basic idea.

All variables in functions are local to the function and if you don't assign a value to a return variable you will get an error message.
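
Both points can be checked with a short sketch. The function names `localDemo` and `noResult` are hypothetical, and the wording of the error message depends on the Octave version.

```octave
total = 100;                 % a workspace variable

function r = localDemo()     % hypothetical name for this sketch
  total = 1;                 % local: unrelated to the workspace total
  r = total + 1;
endfunction

function r = noResult()      % hypothetical: never assigns r
endfunction

r = localDemo()              % 2
total                        % still 100

gotError = false;
try
  v = noResult();            % asking for the missing result raises an error
catch err
  gotError = true;
  disp(err.message)
end_try_catch
```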

## Vectorization

In most languages for loops are the way to get iterative work done. Octave has for loops, if statements and everything else you would expect to find in a programming language but they tend not to be used as much. The reason is that the predominant mode of programming is to use matrix/vector operations.

Functions are automatically applied to entire vectors or matrices element-by-element. For example:


In [29]:
sin([1,2,3])

ans =

   0.84147   0.90930   0.14112



returns a vector equal to `[sin(1),sin(2),sin(3)]` - this would normally need a for loop.

Even user defined functions work in this way without you having to do anything extra, as long as they are built from operations that themselves work element-by-element.

If a function has two or more parameters then the operation is still performed on the elements taken a set at a time.

For example even the sumdiff function we defined earlier will work with vectors:

In [30]:
[x,y]=sumdiff([1,2,3],[4,5,6])

x =

   5   7   9

y =

  -3  -3  -3



and you will discover that x is `[1+4,2+5,3+6]` and y is `[1-4,2-5,3-6]` .

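If you try this with vectors of different sizes you will get a "nonconformant arguments" error unless the sizes are compatible, for example a scalar with a vector, or a row vector with a column vector. In those cases the smaller operand is expanded to match the larger one - look up "broadcasting" in the documentation.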
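
Here is a minimal sketch of how operands of different sizes behave in a reasonably recent Octave. A row vector combines with a column vector or a scalar, while two rows of different length raise an error. All names are illustrative.

```octave
row = [1,2,3];
col = [10;20];
sums = row + col         % 2x3 result: [11 12 13; 21 22 23]
shifted = row + 100      % the scalar pairs with every element: 101 102 103

short = [4,5];
mismatch = false;
try
  bad = row + short;     % lengths 3 and 2 differ and neither is 1
catch err
  mismatch = true;
  disp(err.message)      % reports nonconformant arguments
end_try_catch
```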

Finally the most difficult vectorization for general programmers to absorb is the inner product.

If you want to work out the sum of x<sub>i</sub>y<sub>i</sub>, i.e. the sum of the element-wise product of two vectors, you don't need to use a loop. Instead you simply convert to an inner product form:

```
result=x*y'
```

This transposes y and then multiplies the two together, which for row vectors x and y is, in matrix terms, the sum of the elementwise product of the row by the column. Notice that this is not the same as x'*y, which is a matrix called the outer product.
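
A small worked sketch, assuming x and y are row vectors of the same length. The names `inner`, `check` and `outer` are illustrative.

```octave
x = [1,2,3];
y = [4,5,6];
inner = x*y'          % 32, i.e. 1*4 + 2*5 + 3*6
check = sum(x.*y)     % 32 again, written as an explicit element-wise sum
outer = x'*y          % 3x3 matrix whose (i,j) entry is x(i)*y(j)
```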

You can even use the inner product to sum a vector. For example:

In [31]:
x=[1,2,3];
y=ones(1,3);
s=x*y'

s =  6



This works because y is set to a row vector of 1s and thus we simply sum up the elements of x.

Again no for loops were harmed in this calculation.

Why do we bother - simply because vectorized forms are faster to work out and fit more naturally with the math.
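
To get a feel for the speed difference you can time both forms with `tic` and `toc`. This is only a rough sketch: the vector length is arbitrary, the names are made up, and the timings depend entirely on your machine and Octave version.

```octave
n = 100000;
v = rand(1, n);            % random test data

tic;
loopTotal = 0;
for k = 1:n
  loopTotal = loopTotal + v(k)^2;   % sum of squares, one element at a time
endfor
loopTime = toc;

tic;
vecTotal = v*v';           % the same sum of squares as an inner product
vecTime = toc;

printf("loop: %g s, vectorized: %g s\n", loopTime, vecTime);
```

The two totals should agree up to rounding, so it is the elapsed times that are worth comparing.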

## Where next?

There is a lot of goodness in Octave but this article has covered most of the ideas that a general programmer would find difficult at first.

You need to find out about the control instructions - for loop, if, etc. You also need to find out about strings and other data structures. Octave also has excellent plotting commands and lots of libraries that will simplify your math tasks. In particular the optimization functions are extremely general and make it possible to write programs that implement things like neural networks using custom measures of goodness of fit. 

It is worth mentioning that most Octave programs make minimal use of loops and this is perhaps the one thing that is difficult for a programmer familiar with another language to master. Writing an Octave program is more like writing a mathematical expression within some code to get data and present results.

Now that you have made a start on programming in Octave, the rest is, in the main, just a matter of reading the documentation. Always remember that Octave and MATLAB are more or less identical.