# Linear Regression (continued)

In a [previous report](), we covered linear regression and the batch gradient descent algorithm in some detail, and worked through an example of simple linear regression, where a single feature is used to fit a line to the data. Here we consider multiple linear regression, where there are at least two features. Geometrically, this means fitting a plane (for two features) or, more generally, a hyperplane to the data.
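
To make the geometric picture concrete, the two-feature model can be written out explicitly. The symbols below ($\theta_j$ for the parameters, $x_j$ for the features, $n$ for the number of features) are notation assumed here for illustration:

$$
h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2
$$

With $n$ features the same form extends to

$$
h_\theta(x) = \theta_0 + \sum_{j=1}^{n} \theta_j x_j ,
$$

which describes a line for $n = 1$, a plane for $n = 2$, and a hyperplane for larger $n$.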

The case of two features is pictured below.

<img src="least_squares_plane.png">
<br/>

...

In [1]:
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

In [50]:
# build the path portably instead of concatenating with '/'
filepath = os.path.join(os.getcwd(), 'ex1data2.txt')
df = pd.read_csv(filepath, names=['size', 'bedrooms', 'price'])
df.head()

|   | size | bedrooms | price  |
|---|------|----------|--------|
| 0 | 2104 | 3        | 399900 |
| 1 | 1600 | 3        | 329900 |
| 2 | 2400 | 3        | 369000 |
| 3 | 1416 | 2        | 232000 |
| 4 | 3000 | 4        | 539900 |


In [52]:
# column means and standard deviations
print('means: \n', df.mean())
print()
print('standard deviations: \n', df.std())

means: 
 size          2000.680851
bedrooms         3.170213
price       340412.659574
dtype: float64

standard deviations: 
 size           794.702354
bedrooms         0.760982
price       125039.899586
dtype: float64


In [48]:
# standardize each column: subtract its mean, divide by its standard deviation
df = (df - df.mean()) / df.std()
df.head()

|   | size      | bedrooms  | price     |
|---|-----------|-----------|-----------|
| 0 | 0.13001   | -0.223675 | 0.475747  |
| 1 | -0.50419  | -0.223675 | -0.084074 |
| 2 | 0.502476  | -0.223675 | 0.228626  |
| 3 | -0.735723 | -1.537767 | -0.867025 |
| 4 | 1.257476  | 1.090417  | 1.595389  |
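
The standardization step can be checked directly: after subtracting the mean and dividing by the standard deviation, each column should have mean close to 0 and standard deviation close to 1. The sketch below is a minimal illustration that uses only the first five rows shown earlier, so its statistics differ from those of the full data set. The names `sample`, `mu`, `sigma`, and `new_house` are hypothetical and not part of the original notebook.

```python
import pandas as pd

# First five rows of the data set shown above (size, bedrooms, price).
sample = pd.DataFrame({
    'size': [2104, 1600, 2400, 1416, 3000],
    'bedrooms': [3, 3, 3, 2, 4],
    'price': [399900, 329900, 369000, 232000, 539900],
})

# Keep the statistics so that new inputs can be scaled the same way later.
mu = sample.mean()
sigma = sample.std()  # pandas uses the sample standard deviation (ddof=1)

scaled = (sample - mu) / sigma

# After standardization every column has mean ~0 and standard deviation ~1.
print(scaled.mean().round(6))
print(scaled.std().round(6))

# Hypothetical new house: scale its features with the stored statistics.
features = ['size', 'bedrooms']
new_house = pd.Series({'size': 1650, 'bedrooms': 3})
new_scaled = (new_house - mu[features]) / sigma[features]
print(new_scaled)
```

Storing `mu` and `sigma` before overwriting the data matters because any input passed to the fitted model later has to be scaled with the same values.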


In [53]:
# insert column of ones
df.insert(0, 'ones', 1)
df.head()

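|   | ones | size      | bedrooms  | price     |
|---|------|-----------|-----------|-----------|
| 0 | 1    | 0.13001   | -0.223675 | 0.475747  |
| 1 | 1    | -0.50419  | -0.223675 | -0.084074 |
| 2 | 1    | 0.502476  | -0.223675 | 0.228626  |
| 3 | 1    | -0.735723 | -1.537767 | -0.867025 |
| 4 | 1    | 1.257476  | 1.090417  | 1.595389  |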
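
The column of ones lets the intercept be handled like any other parameter: with it in place, a single matrix product gives the prediction for every row. The sketch below illustrates this using the standardized rows shown earlier. The names `sample`, `X`, `y`, and `theta` are assumptions, and the parameter values are arbitrary, chosen only for illustration rather than fitted to the data.

```python
import numpy as np
import pandas as pd

# Standardized first five rows, copied from the standardized output shown earlier.
sample = pd.DataFrame({
    'size': [0.130010, -0.504190, 0.502476, -0.735723, 1.257476],
    'bedrooms': [-0.223675, -0.223675, -0.223675, -1.537767, 1.090417],
    'price': [0.475747, -0.084074, 0.228626, -0.867025, 1.595389],
})
sample.insert(0, 'ones', 1)

# Design matrix X (one row per example, one column per parameter) and target vector y.
X = sample[['ones', 'size', 'bedrooms']].to_numpy()
y = sample['price'].to_numpy()

# Hypothetical parameters [theta_0, theta_1, theta_2], not fitted values.
theta = np.array([0.5, 1.0, -0.25])

# Because the first column is all ones, X @ theta already includes the intercept theta_0.
predictions = X @ theta
explicit = theta[0] + theta[1] * sample['size'] + theta[2] * sample['bedrooms']

print(np.allclose(predictions, explicit))
```

Writing the model as `X @ theta` keeps the prediction vectorized, which is convenient for a batch gradient descent update that works on all rows at once.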
