# Multivariate Linear Regression - [Week 2](https://www.coursera.org/learn/machine-learning/home/week/2)

## Previous Hypothesis equation for a single feature (Linear Regression with one variable) was this...


$h_\theta(x) = \theta_0 + \theta_1x$

## Similarly, the equation for multiple features (Multivariate Linear Regression) begins like this ...

$h_\theta(x) = \theta_0 + \theta_1x_1 + \theta_2x_2 + ... + \theta_nx_n$

## This can be reduced down to this by defining $x_0 = 1$ and treating $\theta$ and $x$ as vectors (See lecture slides for how we got here)...

$h_\theta(x) = \theta^Tx$
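
A minimal sketch of the vectorized hypothesis, assuming a leading feature $x_0 = 1$ so that $\theta_0$ is part of the product. The numbers and variable names are made up for illustration and are not from the lecture...

```octave
% Hypothetical parameters and one training example (illustrative values only).
theta = [1; 2; 3];      % theta_0, theta_1, theta_2
x     = [1; 4; 5];      % x_0 = 1 (intercept term), x_1 = 4, x_2 = 5

h = theta' * x          % 1*1 + 2*4 + 3*5 = 24

% For a whole dataset, stack the examples as rows of X (first column all ones).
X = [1 4 5; 1 0 1];
H = X * theta           % one prediction per row: [24; 4]
```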

## New algorithm for Gradient Descent (repeat until convergence, updating every $\theta_j$ for $j = 0, ..., n$ simultaneously)...

$\theta_j := \theta_j - \alpha\frac{1}{m}\sum_{i=1}^m(h_\theta(x^{(i)}) - y^{(i)})x^{(i)}_j$
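
The update rule above can be sketched in vectorized Octave. This is only a sketch on toy data: the dataset, the learning rate ```alpha```, and the iteration count are assumptions chosen for illustration...

```octave
% Toy data, made up for illustration: y = 1 + 2*x exactly.
X = [1 1; 1 2; 1 3];    % m = 3 examples, first column is x_0 = 1
y = [3; 5; 7];
theta = zeros(2, 1);    % start from theta = 0
alpha = 0.1;            % learning rate (assumed value)
m = length(y);

for iter = 1:2000
  errors = X * theta - y;                        % h_theta(x^(i)) - y^(i) for every i
  theta  = theta - (alpha / m) * (X' * errors);  % updates every theta_j simultaneously
end

theta   % approaches [1; 2]
```

Computing ```X' * errors``` in one step is what makes the update simultaneous: all of the partial derivatives use the old ```theta```.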

## For Gradient Descent in Practice you can use [```feature scaling```](https://www.coursera.org/learn/machine-learning/supplement/CTA0D/gradient-descent-in-practice-i-feature-scaling) and [```mean normalization```](https://www.coursera.org/learn/machine-learning/supplement/CTA0D/gradient-descent-in-practice-i-feature-scaling) to apply the same scale and allow for faster convergence...

### Feature Scaling...each feature should ideally end up in roughly one of these ranges...

$-1 \le x_i \le 1$

$-0.5 \le x_i \le 0.5$

Do this by dividing each feature value by the range (max minus min) of that feature in the dataset.  For instance, if predicting house prices and $x_1$ = the size in $feet^2$ and $x_2$ = the number of bedrooms...

$x_1 = \frac{\text{size}(feet^2)}{\text{range of sizes in dataset}}$

$x_2 = \frac{\text{number of bedrooms}}{\text{range of bedrooms in dataset}}$

Basically defined as (where $S_i$ is the range, max minus min, of feature $i$)...

$x_i := \frac{x_i}{S_i}$

### Mean Normalization...subtract the average value $\mu_i$ of the feature from each value and divide by the range $S_i$ (the standard deviation also works)...

$x_i := \frac{x_i - \mu_i}{S_i}$
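
A minimal sketch of mean normalization in Octave, assuming a made-up housing dataset. The values and the variable names ```mu```, ```S```, and ```X_norm``` are illustrative assumptions...

```octave
% Hypothetical data: column 1 = size in square feet, column 2 = bedrooms.
X = [2104 3; 1600 3; 2400 4; 1416 2];

mu = mean(X);             % per-column average (1 x n row vector)
S  = max(X) - min(X);     % per-column range; std(X) is a common alternative

% Subtract the column mean and divide by the column range.
% Broadcasting applies the row vectors mu and S to every row of X.
X_norm = (X - mu) ./ S
```

After this, every column of ```X_norm``` has mean 0 and spans a range of exactly 1.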

## Computing Parameters Analytically...

### Normal Equation...

$\theta =  (X^TX)^{-1}X^Ty$
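
A minimal sketch of the normal equation in Octave on toy data. The dataset is made up, and using ```pinv``` rather than ```inv``` is an assumption here so that the line still runs if $X^TX$ happens to be singular...

```octave
% Toy data, made up for illustration: y = 1 + 2*x exactly.
X = [1 1; 1 2; 1 3];    % first column is the intercept feature x_0 = 1
y = [3; 5; 7];

% Normal equation: theta = (X'X)^-1 X'y
theta = pinv(X' * X) * X' * y    % approximately [1; 2]
```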

### *NOTE!!!* Doesn't require feature scaling or a learning rate $\alpha$, but computing $(X^TX)^{-1}$ costs roughly $O(n^3)$, so it gets slow once the number of features gets beyond about 10,000.  Use Gradient Descent if that's the case.

## Octave Tutorial...

### Create an all ones matrix with dimensions of 3 x 1...

In [63]:
ones(3,1)

ans =

   1
   1
   1



### Create a 3 x 3 identity matrix...

In [5]:
eye(3)

ans =

Diagonal Matrix

   1   0   0
   0   1   0
   0   0   1



### Ad-hoc Math...

In [6]:
1 + 1

ans =  2


### Suppress output with ```;``` but tell Octave to display by listing the variable...

In [9]:
a = sqrt(9);
a

a =  3


### Create a matrix...

In [10]:
X = [1,2; 3,4; 5,6]

X =

   1   2
   3   4
   5   6



### Octave is case-sensitive...

In [11]:
size(x)

error: 'x' undefined near line 1 column 6
error: evaluating argument list element number 1


### Get the dimensions of X...

In [12]:
size(X)

ans =

   3   2



### Get the size of a single dimension of X (here dimension 1, the number of rows)...

In [14]:
size(X,1)

ans =  3


### Gets the length of the longest dimension...

In [15]:
length(X)

ans =  3


### This is confusing for matrices, but this works better for vectors...

In [18]:
V = [1,2,3,4,5];
length(V)

ans =  5
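
Because ```length``` is ambiguous for matrices, ```size``` with a dimension argument or ```numel``` is usually less surprising. A short sketch, reusing the same 3 x 2 matrix as above...

```octave
X = [1,2; 3,4; 5,6];

rows = size(X, 1)     % 3
cols = size(X, 2)     % 2
n    = numel(X)       % 6, the total number of elements
[m, k] = size(X)      % m = 3, k = 2
```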


### The ```help``` documentation confirms as much...

In [16]:
help length

'length' is a built-in function from the file libinterp/corefcn/data.cc

 -- Built-in Function: length (A)
     Return the length of the object A.

     The length is 0 for empty objects, 1 for scalars, and the number of
     elements for vectors.  For matrix objects, the length is the number
     of rows or columns, whichever is greater (this odd definition is
     used for compatibility with MATLAB).

     See also: numel, size.


Additional help for built-in functions and operators is
available in the online version of the manual.  Use the command
'doc <topic>' to search the manual index.

Help and information about Octave is also available on the WWW
at http://www.octave.org and via the help@octave.org
mailing list.


### Get existing variables (although the command name is weird IMHO)...

In [21]:
who

Variables in the current scope:

V    X    a    ans



### Get variables with more details...

In [25]:
whos

Variables in the current scope:

   Attr Name        Size                     Bytes  Class
   ==== ====        ====                     =====  ===== 
        V           1x5                         40  double
        X           3x2                         48  double
        ans         1x1                          8  double

Total is 12 elements using 96 bytes



### Get rid of variable a...

In [24]:
clear a

In [None]:
pwd

In [29]:
ls

MultivariateLinearRegression (Week 2).ipynb
README.md
testvector.mat


### Save a variable to disk...

In [28]:
save testvector.mat V

### Get the file contents...

In [31]:
type testvector.mat

# Created by Octave 3.8.0, Fri Apr 10 23:20:34 2020 EDT <bdavis@macpro.local>
# name: V
# type: matrix
# rows: 1
# columns: 5
 1 2 3 4 5





### Load V back in...

In [36]:
clear V
load testvector.mat

In [37]:
whos

Variables in the current scope:

   Attr Name        Size                     Bytes  Class
   ==== ====        ====                     =====  ===== 
        V           1x5                         40  double
        X           3x2                         48  double
        ans         1x1                          8  double

Total is 12 elements using 96 bytes



In [38]:
V

V =

   1   2   3   4   5



### Save as ASCII-formatted (human readable) text...

In [39]:
save testvector.txt V -ascii

In [40]:
X

X =

   1   2
   3   4
   5   6



### Get only values in the 2nd column; note the use of the colon...

In [42]:
X(:,2)

ans =

   2
   4
   6



### Append values...in this case another column...

In [43]:
X = [X, [100;101;102]]

X =

     1     2   100
     3     4   101
     5     6   102



In [45]:
size(X)

ans =

   3   3



### Put all elements of X in a column vector...

In [46]:
X(:)

ans =

     1
     3
     5
     2
     4
     6
   100
   101
   102



In [48]:
who

Variables in the current scope:

V    X    ans



In [49]:
whos

Variables in the current scope:

   Attr Name        Size                     Bytes  Class
   ==== ====        ====                     =====  ===== 
        V           1x5                         40  double
        X           3x3                         72  double
        ans         9x1                         72  double

Total is 23 elements using 184 bytes



In [50]:
A = [1,2;3,4;5,6]

A =

   1   2
   3   4
   5   6



In [51]:
B = [11 12; 13 14; 15 16]

B =

   11   12
   13   14
   15   16



In [52]:
C = [1 1; 2 2]

C =

   1   1
   2   2



### Multiply 2 matrices together...

In [53]:
A*C

ans =

    5    5
   11   11
   17   17



### Multiply element-wise.  A ```.``` in front of an operator means it is applied to each pair of corresponding elements...

In [54]:
A .* B

ans =

   11   24
   39   56
   75   96



In [55]:
v = [1; 2; 3]

v =

   1
   2
   3



In [56]:
1 ./ v

ans =

   1.00000
   0.50000
   0.33333



### Take the natural logarithm of each element...

In [57]:
log(v)

ans =

   0.00000
   0.69315
   1.09861



### Raise ```e``` to the power of each element...

In [58]:
exp(v)

ans =

    2.7183
    7.3891
   20.0855



### Get the absolute value...

In [59]:
abs(v)

ans =

   1
   2
   3



In [60]:
-(v)

ans =

  -1
  -2
  -3



### Add one to each element...

In [61]:
v + 1

ans =

   2
   3
   4



In [62]:
A

A =

   1   2
   3   4
   5   6



### Get the transpose of A ($A^T$)...

In [64]:
A'

ans =

   1   3   5
   2   4   6



### Get the max value...

In [65]:
a = [1 15 2 0.5]

a =

    1.00000   15.00000    2.00000    0.50000



In [66]:
val = max(a)

val =  15


```max()``` can also return a second output with the index of the max value...

In [67]:
[val, ind] = max(a)

val =  15
ind =  2


Taking the max of a matrix returns the max value of each column...again sort of strange behavior unless you expect column-wise operations...

In [68]:
max(A)

ans =

   5   6



Element-wise comparison returning booleans...

In [69]:
a < 3

ans =

   1   0   1   1
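
The boolean result can also be used as a mask to select elements. A short sketch using the same vector ```a```; the names ```mask```, ```small```, and ```count``` are made up for illustration...

```octave
a = [1 15 2 0.5];

mask  = a < 3        % logical vector: 1 0 1 1
small = a(mask)      % keeps the elements where mask is true: 1 2 0.5
count = sum(mask)    % number of matches: 3
```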



### Magics...

Magic squares have the special property that the sums of each row, column, and diagonal all equal the same number...

In [71]:
help magic

'magic' is a function from the file /usr/local/octave/3.8.0/share/octave/3.8.0/m/special-matrix/magic.m

 -- Function File: magic (N)

     Create an N-by-N magic square.  A magic square is an arrangement of
     the integers '1:n^2' such that the row sums, column sums, and
     diagonal sums are all equal to the same value.

     Note: N must be greater than 2 for the magic square to exist.



In [73]:
A = magic(3)

A =

   8   1   6
   3   5   7
   4   9   2



Return the row and column indices where the values match the condition...

In [74]:
[r,c] = find(A >= 7)

r =

   1
   3
   2

c =

   1
   2
   3
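
With a single output, ```find``` returns linear (column-major) indices instead, which can be fed straight back into the matrix. A short sketch on the same 3 x 3 magic square...

```octave
A = magic(3);

idx  = find(A >= 7)   % linear indices, counted down the columns: 1; 6; 8
vals = A(idx)         % the matching values: 8; 9; 7
```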



### Other operations...

In [77]:
a

a =

    1.00000   15.00000    2.00000    0.50000



Sum all of the elements...

In [75]:
sum(a)

ans =  18.500


Multiply all of the elements together...

In [76]:
prod(a)

ans =  15


Round each element up to the nearest integer...

In [79]:
ceil(a)

ans =

    1   15    2    1



Round each element down to the nearest integer...

In [80]:
floor(a)

ans =

    1   15    2    0



Create a random 3 x 3 matrix...

In [81]:
rand(3)

ans =

   0.70554   0.98675   0.81799
   0.71289   0.41434   0.93955
   0.67318   0.47208   0.77214



Take the max element-wise value of 2 3x3 matrices...

In [82]:
max(rand(3), rand(3))

ans =

   0.76846   0.47103   0.29120
   0.93325   0.45812   0.87762
   0.93984   0.41801   0.38614



In [83]:
A

A =

   8   1   6
   3   5   7
   4   9   2



Take the max of each column...the ```[]``` is a placeholder for the second argument and the ```1``` selects dimension 1

In [84]:
max(A,[],1)

ans =

   8   9   7



Take the max of each row...the ```[]``` is a placeholder for the second argument and the ```2``` selects dimension 2

In [85]:
max(A,[],2)

ans =

   8
   7
   9



Default is column-wise which is the same thing as ```max(A,[],1)```...

In [86]:
max(A)

ans =

   8   9   7



To get the global max, there are two ways...

In [87]:
max(max(A))

ans =  9


*or* convert the matrix to a vector...

In [88]:
A(:)

ans =

   8
   3
   4
   1
   5
   9
   6
   7
   2



In [89]:
max(A(:))

ans =  9
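
To also find where the global max sits, the index output of ```max``` can be converted back to a row and column with ```ind2sub```. A short sketch, assuming ```A``` is still the 3 x 3 magic square...

```octave
A = magic(3);

[val, idx] = max(A(:));            % idx is a linear, column-major index
[r, c] = ind2sub(size(A), idx)     % r = 3, c = 2, where the 9 is
```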


## More fun with magics...😁

In [90]:
A = magic(9)

A =

   47   58   69   80    1   12   23   34   45
   57   68   79    9   11   22   33   44   46
   67   78    8   10   21   32   43   54   56
   77    7   18   20   31   42   53   55   66
    6   17   19   30   41   52   63   65   76
   16   27   29   40   51   62   64   75    5
   26   28   39   50   61   72   74    4   15
   36   38   49   60   71   73    3   14   25
   37   48   59   70   81    2   13   24   35



### Verify sums of each column are the same...

In [91]:
sum(A,1)

ans =

   369   369   369   369   369   369   369   369   369



### Verify sums of each row are the same...

In [92]:
sum(A,2)

ans =

   369
   369
   369
   369
   369
   369
   369
   369
   369



### Verify sums of each diagonal are the same...

Create an identity matrix of the same dimensions...

In [94]:
eye(9)

ans =

Diagonal Matrix

   1   0   0   0   0   0   0   0   0
   0   1   0   0   0   0   0   0   0
   0   0   1   0   0   0   0   0   0
   0   0   0   1   0   0   0   0   0
   0   0   0   0   1   0   0   0   0
   0   0   0   0   0   1   0   0   0
   0   0   0   0   0   0   1   0   0
   0   0   0   0   0   0   0   1   0
   0   0   0   0   0   0   0   0   1



Multiply A element-wise by the identity matrix, which zeroes out everything except the diagonal values...

In [95]:
A .* eye(9)

ans =

   47    0    0    0    0    0    0    0    0
    0   68    0    0    0    0    0    0    0
    0    0    8    0    0    0    0    0    0
    0    0    0   20    0    0    0    0    0
    0    0    0    0   41    0    0    0    0
    0    0    0    0    0   62    0    0    0
    0    0    0    0    0    0   74    0    0
    0    0    0    0    0    0    0   14    0
    0    0    0    0    0    0    0    0   35



Sum each column, which should just return the diagonal value in each column...

In [96]:
sum(A.*eye(9))

ans =

   47   68    8   20   41   62   74   14   35



Sum the sums...

In [97]:
sum(sum(A.*eye(9)))

ans =  369


Can also sum the other diagonal (the anti-diagonal) by flipping the identity matrix with ```flipud```...

In [100]:
sum(sum(A.*flipud(eye(9))))

ans =  369


In [99]:
flipud(eye(9))

ans =

Permutation Matrix

   0   0   0   0   0   0   0   0   1
   0   0   0   0   0   0   0   1   0
   0   0   0   0   0   0   1   0   0
   0   0   0   0   0   1   0   0   0
   0   0   0   0   1   0   0   0   0
   0   0   0   1   0   0   0   0   0
   0   0   1   0   0   0   0   0   0
   0   1   0   0   0   0   0   0   0
   1   0   0   0   0   0   0   0   0
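
A shorter route to the same diagonal sums, sketched here as an alternative to the identity-matrix masks above, is to pull the diagonals out directly with ```diag``` and ```fliplr```...

```octave
A = magic(9);

main_diag = sum(diag(A))            % 369
anti_diag = sum(diag(fliplr(A)))    % 369
t = trace(A)                        % trace is the main diagonal sum: 369
```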



## Inverses...

In [102]:
A = magic(3)

A =

   8   1   6
   3   5   7
   4   9   2



### Get the inverse matrix of A ($A^{-1}$) using ```pinv```, the pseudoinverse, which equals the true inverse when A is invertible...

In [103]:
pinv(A)

ans =

   0.147222  -0.144444   0.063889
  -0.061111   0.022222   0.105556
  -0.019444   0.188889  -0.102778



In [104]:
temp = pinv(A)

temp =

   0.147222  -0.144444   0.063889
  -0.061111   0.022222   0.105556
  -0.019444   0.188889  -0.102778



### Multiplying $A^{-1}$ by A returns the identity matrix, up to floating-point rounding (hence the ```-0.00000``` entries)...

In [105]:
temp * A

ans =

   1.00000   0.00000  -0.00000
  -0.00000   1.00000   0.00000
   0.00000   0.00000   1.00000
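
Since the product is only the identity up to rounding, comparing with a tolerance is safer than ```==```. A minimal sketch, where the tolerance ```1e-10``` is an arbitrary assumption...

```octave
A  = magic(3);
I3 = eye(3);

% Largest absolute deviation of pinv(A)*A from the identity matrix.
err_identity = max(max(abs(pinv(A) * A - I3)));
is_identity  = err_identity < 1e-10     % 1 (true) when only rounding error remains

% For an invertible matrix, inv and pinv agree up to rounding.
err_inverse  = max(max(abs(inv(A) - pinv(A))));
same_inverse = err_inverse < 1e-10      % 1 (true)
```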

