# Week 2: Octave Tutorial

## Basic Operations

```octave
% Elementary Maths

5 + 6 
% ans = 11

3 - 2
% ans = 1

5 * 8
% ans = 40

1 / 2
% ans = 0.50000

2^6
% ans = 64

% comparison operators are also available: >, <, >=, <=

% Logical Operations

1 == 2
% ans = 0

1 ~= 2
% ans = 1

1 && 0
% ans = 0

1 || 0
% ans = 1

xor(1, 0)
% ans = 1

% Formatting

PS1('>> '); % changes terminal prompt

a = 3 % prints out a = 3
% a = 3 
a = 3; % semicolon suppresses output

a = pi;
a
% a = 3.1416
disp(a); % disp prints the value without the variable name
% 3.1416

disp(sprintf('2 decimals: %0.2f', a))
% 2 decimals: 3.14

format long
a
% a = 3.14159265358979
format short
a
% a = 3.1416

% Vectors & Matrices

A = [1 2; 3 4; 5 6]
% A = 
%     1 2
%     3 4
%     5 6

v = [1 2 3]
% v = 
%     1 2 3

v = [1; 2; 3]
% v = 
%     1
%     2
%     3

v = 1:0.2:2
% v = 
%     1.0000 1.2000 1.4000 1.6000 1.8000 2.0000

v = 1:3
% v = 
%    1 2 3

ones(2, 3)
% ans = 
%     1 1 1
%     1 1 1

c = 2*ones(2, 3)
% c =
%     2 2 2
%     2 2 2

w = zeros(1, 3)
% w = 
%     0 0 0

w = rand(1, 3)
% w =
%     0.91477 0.14359 0.84860

w = randn(1, 3) % drawn from a Gaussian distribution with mean 0 and variance (and standard deviation) 1
% w =
%     -0.33517 1.26847 -0.28211

hist(w) % plots a histogram.

eye(3) % 3x3 identity matrix
% ans =
% Diagonal Matrix
%     1 0 0
%     0 1 0
%     0 0 1
```

## Moving Data Around

```octave
% Sizes

A = [1 2; 3 4; 5 6];

size(A)
% ans =
%     3 2

size(A, 1) % num rows
% ans = 3

size(A, 2) % num columns
% ans = 2

v = [1 2 3 4];

length(v) % gives you size of longest dimension
% ans = 4

% Loading and Saving External Data

load featuresX.dat
load('featuresX.dat') % equivalent commands for loading a file; data in variable called featuresX

who % shows variables currently in memory

whos % same as who, but more detailed

save hello.mat v; % saves variable v into file hello.mat (binary in MATLAB; Octave's default is its own text format unless -binary is given)
save hello.txt v -ascii; % saves variable v into file hello.txt as plain ASCII text

clear % deletes all variables in workspace

% Indexing

A = [1 2; 3 4; 5 6]; % clear removed A, so define it again

A(3, 2) % element in 3rd row and 2nd column
% ans = 6

A(2, :) % all columns in second row
% ans = 
%     3 4

A(:, 2) % All rows in second column
% ans =
%     2
%     4
%     6

A([1 3], :) % First and third rows, all columns
% ans =
%     1 2
%     5 6

A(:, 2) = [10; 11; 12]
% A =
%     1 10
%     3 11
%     5 12

% Appending

A = [A, [100; 101; 102]] % append a column vector to the right
% A =
%     1 10 100
%     3 11 101
%     5 12 102
    
A(:) % put all elements of A into a single vector
% ans =
%     1
%     3
%     5
%     10
%     11
%     12
%     100
%     101
%     102

A = [1 2; 3 4; 5 6];
B = [11 12; 13 14; 15 16];
C = [A B]
% C = 
%     1 2 11 12
%     3 4 13 14
%     5 6 15 16

C = [A; B]
% C =
%     1  2
%     3  4
%     5  6
%     11 12
%     13 14
%     15 16
```
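
The `save`, `clear` and `load` commands above can be combined into a round trip. The following is a minimal sketch, assuming a writable temporary directory; the file name comes from `tempname()` and is purely illustrative, and the `-text` option is passed explicitly so the result does not depend on the default save format.

```octave
% Round trip: save a variable, clear it, then load it back.
v = [1 2 3 4];

% Hypothetical temporary file name (not from the notes).
fname = [tempname() '.txt'];

save('-text', fname, 'v');   % function form of: save -text <file> v
clear v;                     % v no longer exists in the workspace
was_cleared = ~exist('v', 'var');

load(fname);                 % restores the variable v from the file
disp(v);

delete(fname);               % remove the temporary file
```

The function form `save(options, file, 'name')` is used here only because the file name is held in a variable; at the prompt, the command form shown earlier is equivalent.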

## Computing on Data

```octave
A = [1 2; 3 4; 5 6];
B = [11 12; 13 14; 15 16];
C = [1 1; 2 2];

A * C % matrix multiplication (3x2 times 2x2 gives 3x2)
% ans =
%     5    5
%    11   11
%    17   17

A .* B % element-wise multiplication
% ans =
%    11   24
%    39   56
%    75   96

A .^ 2 % element-wise square
% ans =
%     1    4
%     9   16
%    25   36

v = [1; 2; 3];

1 ./ v
% ans =
%     1.00000
%     0.50000
%     0.33333

log(v) % element-wise
% ans =
%     0.00000
%     0.69315
%     1.09861

% exp(v), abs(v), -v are all element-wise

v + ones(length(v), 1)
% ans =
%     2
%     3
%     4

A
% A =
%     1 2
%     3 4
%     5 6
    
A' % transpose
% ans =
%     1 3 5
%     2 4 6

a = [1 15 2 0.5];

max(a)
% ans = 15

[val, ind] = max(a)
% val = 15
% ind = 2

a < 3
% ans =
%     1 0 1 1

find(a < 3)
% ans =
%     1 3 4

A = magic(3) % Magic square: all rows, columns and diagonals sum up to the same thing.
% A =
%     8 1 6
%     3 5 7
%     4 9 2

[r,c] = find(A >= 7)
% r =
%     1
%     3
%     2
% c =
%     1
%     2
%     3
A(1,1)
% ans = 8
A(3,2)
% ans = 9
A(2,3)
% ans = 7

sum(a)
% ans = 18.500
prod(a)
% ans = 15
floor(a)
% ans = 1 15 2 0
ceil(a)
% ans = 1 15 2 1

max(rand(2), rand(2)) % takes the largest, element-wise
% ans =
%    0.69557   0.77169
%    0.72736   0.53449

max(A, [], 1) % column-wise max
% ans =
%     8 9 7

max(A, [], 2) % row-wise max
% ans =
%     8
%     7
%     9

max(max(A))
% ans = 9
max(A(:)) % two ways of finding largest element in a matrix.
% ans = 9

sum(A, 1) % column-wise sum
% ans =
%    15   15   15
sum(A, 2) % row-wise sum
% ans =
%    15
%    15
%    15
sum(sum(A .* eye(3))) % diagonal sum
% ans = 15
sum(sum(A .* flipud(eye(3)))) % other diagonal sum
% ans = 15

pinv(A) % pseudo-inverse; equals the ordinary inverse when A is invertible, as here
% ans =
%    0.147222  -0.144444   0.063889
%   -0.061111   0.022222   0.105556
%   -0.019444   0.188889  -0.102778
```
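
`pinv` computes the pseudo-inverse, which coincides with the ordinary inverse for an invertible square matrix and is still defined for non-square ones. The sketch below checks both cases numerically; the variable names `P`, `B`, `err_square` and `err_tall` are illustrative and not part of the original notes.

```octave
% Square, invertible case: A * pinv(A) should be close to the identity.
A = magic(3);
P = pinv(A);
err_square = max(max(abs(A * P - eye(3))));
disp(err_square);   % a tiny floating-point residual

% Non-square case: B is 3x2, so inv(B) is undefined, but pinv(B) exists.
% Because B has full column rank, pinv(B) * B is close to the 2x2 identity.
B = [1 2; 3 4; 5 6];
err_tall = max(max(abs(pinv(B) * B - eye(2))));
disp(err_tall);
```

The residuals are not exactly zero because of floating-point rounding, so comparisons against the identity should use a tolerance rather than `==`.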

## Plotting Data

![fig 1](https://beths3test.s3.amazonaws.com/machine-learning-notes/FIGURE-1.png)
![fig 2](https://beths3test.s3.amazonaws.com/machine-learning-notes/FIGURE-2.png)
![fig 3](https://beths3test.s3.amazonaws.com/machine-learning-notes/FIGURE-3.png)
![fig 4-5](https://beths3test.s3.amazonaws.com/machine-learning-notes/FIGURE-4-5.png)
![fig 6](https://beths3test.s3.amazonaws.com/machine-learning-notes/FIGURE-6.png)
![fig 7](https://beths3test.s3.amazonaws.com/machine-learning-notes/FIGURE-7.png)

```octave
% Plot sin
t = [0:0.01:0.98];
y1 = sin(2*pi*4*t);
plot(t, y1);
title('FIGURE 1');

% Plot cos
y2 = cos(2*pi*4*t);
plot(t, y2);
title('FIGURE 2');

% Plot both
plot(t, y1);
hold on;
plot(t, y2, 'r');
xlabel('time');
ylabel('value');
legend('sin', 'cos');
title('FIGURE 3');
print -dpng 'FIGURE-3.png'; % save
close

% Multiple figures
figure(1); plot(t, y1);
figure(2); plot(t, y2);

% Subplots
subplot(1,2,1); % Divides plot into a 1x2 grid, access first element.
plot(t, y1);
title('FIGURE 4');
subplot(1, 2, 2);
plot(t, y2);
title('FIGURE 5');

clf; % clears a figure
A = magic(5);
imagesc(A); % 5 x 5 grid of colours where colours correspond to different values.
imagesc(A), colorbar, colormap gray; % comma chaining of function calls
```

## Control Statements: for, while, if statement

```octave
% for loop
v = zeros(10, 1);
for i = 1:10, % can also use break and continue
    v(i) = 2^i;
end;
v
% v =
%     2
%     4
%     8
%     16
%     32
%     64
%     128
%     256
%     512
%     1024

% while loop
i = 1;
while i <= 5,
    v(i) = 100;
    i = i + 1;
end;
v
% v =
%     100
%     100
%     100
%     100
%     100
%     64
%     128
%     256
%     512
%     1024

% if and break
i = 1;
while true,
    v(i) = 999;
    i = i + 1;
    if i == 6,
        break;
    end;
end;
v
% v =
%     999
%     999
%     999
%     999
%     999
%     64
%     128
%     256
%     512
%     1024

% if / else
v(1) = 2;
if v(1) == 1,
    disp('The value is one');
elseif v(1) == 2,
    disp('The value is two');
else
    disp('The value is not one or two');
end;
% The value is two

% exit Octave: either of these commands ends the session
% exit
% quit

% Functions

% Create a file named after the function. The file must be in the current
% directory or on the Octave search path (add a directory with addpath(path)).
% in squareThisNumber.m
function y = squareThisNumber(x)
    y = x^2; % y is the return value, x is the argument
end

% return multiple values (in squareAndCubeThisNumber.m)
function [y1, y2] = squareAndCubeThisNumber(x)
    y1 = x^2;
    y2 = x^3;
end

% at the prompt
[a, b] = squareAndCubeThisNumber(5);
a
% a = 25
b
% b = 125
```
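
The `for` loop comment above mentions `break` and `continue` without showing them inside a `for` loop. A minimal sketch of both follows; the variable name `odds` and the cut-off value 7 are made up for illustration.

```octave
% Collect the odd numbers from 1 to 10, stopping at the first one above 7.
odds = [];
for i = 1:10
    if mod(i, 2) == 0
        continue;       % skip even numbers and go to the next iteration
    end
    if i > 7
        break;          % leave the loop entirely
    end
    odds(end + 1) = i;  % append i to the result
end
disp(odds);
% prints: 1 3 5 7
```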

## Vectorization

The hypothesis function for linear regression can be thought of as:

$$ h_\theta(x) = \sum^n_{j=0}\theta_jx_j $$

This might be implemented in Octave with the following unvectorized code:

```octave
prediction = 0.0;
for j = 1:(n + 1), % Octave indexes from 1, so theta(j) holds theta_{j-1}
    prediction = prediction + theta(j) * x(j);
end;
```

Or it can be thought of as:

$$ h_\theta(x) = \theta^Tx $$

This might be implemented in Octave with the following vectorized code:

```octave
prediction = theta' * x;
```
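
To see that the two forms agree, the sketch below runs both on the same input. The values of `n`, `theta` and `x` are hypothetical example data, with `x(1) = 1` playing the role of $x_0$.

```octave
% Hypothetical example: n = 2 features plus the intercept term x0 = 1.
n = 2;
theta = [1; 2; 3];   % column vector of length n + 1
x = [1; 4; 5];       % column vector of length n + 1, x(1) is x0

% Unvectorized: accumulate theta_j * x_j term by term.
prediction_loop = 0.0;
for j = 1:(n + 1)
    prediction_loop = prediction_loop + theta(j) * x(j);
end

% Vectorized: a single inner product.
prediction_vec = theta' * x;

disp(prediction_loop);   % 24
disp(prediction_vec);    % 24
```

Both give 1*1 + 2*4 + 3*5 = 24. The vectorized form requires `theta` and `x` to be column vectors of the same length.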

Similarly, gradient descent can be thought of as:

$$ \theta_j := \theta_j - \alpha\frac{1}{m}\sum^m_{i=1}(h_\theta(x^{(i)}) - y^{(i)})x_j^{(i)} $$

But a vectorized form might look like:

$$ \theta := \theta - \alpha\delta \\
\text{where: } \\
\delta = \frac{1}{m}\sum^m_{i=1}(h_\theta(x^{(i)}) - y^{(i)})x^{(i)}
$$

In this implementation, $\theta$ and $\delta$ are both vectors in $\mathbb{R}^{n + 1}$.
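
One way to sketch a single vectorized update step in Octave is shown below. It assumes the training examples are stacked as rows of a design matrix `X` (size $m \times (n + 1)$, first column all ones) and the targets as a column vector `y`. These names, the toy data and the learning rate are assumptions for illustration, not taken from the notes.

```octave
% Hypothetical toy data: m = 3 examples, one feature plus the intercept column.
X = [1 1; 1 2; 1 3];   % m x (n + 1) design matrix
y = [1; 2; 3];         % m x 1 targets
theta = [0; 0];        % (n + 1) x 1 parameters
alpha = 0.1;           % learning rate (arbitrary choice)
m = length(y);

% X * theta gives all m predictions at once; X' * errors sums
% error_i * x^(i) over the examples, matching the definition of delta.
errors = X * theta - y;
delta = (1 / m) * (X' * errors);

% Simultaneous update of every component of theta.
theta = theta - alpha * delta;
disp(theta);
```

With these values the errors are `[-1; -2; -3]`, so `delta` is `[-2; -14/3]` and the updated `theta` is approximately `[0.2; 0.4667]`. Repeating the last three statements in a loop would give the full gradient descent iteration.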