# Machine Learning With Python: Linear Regression With Multiple Variables
## Sample problem: predicting home prices in Monroe, New Jersey (USA)

Below is a table of home prices in Monroe Twp, NJ. Here the price depends on area (square feet), number of bedrooms, and age of the home (in years). Given these prices, we have to predict the prices of new homes based on area, bedrooms, and age.

![image.png](attachment:image.png)

#### Given these home prices, find the price of a home that has:

###### 3000 sq ft area, 3 bedrooms, 40 years old

###### 2500 sq ft area, 4 bedrooms, 5 years old

#### We will use regression with multiple variables here. Price can be calculated using the following equation:

![image.png](attachment:image.png)

## Here area, bedrooms, and age are called independent variables or features, whereas price is the dependent variable
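
Written out, a linear model over these three features takes the general form below. The symbol names $m_1$, $m_2$, $m_3$ (one coefficient per feature) and $b$ (the intercept) are naming assumptions made here for illustration; they correspond to what scikit-learn exposes as `coef_` and `intercept_` after fitting.

$$
\text{price} = m_1 \cdot \text{area} + m_2 \cdot \text{bedrooms} + m_3 \cdot \text{age} + b
$$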

In [3]:
import pandas as pd
import numpy as np
from sklearn import linear_model

In [4]:
df = pd.read_csv('homeprices.csv')
df

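|   | area | bedrooms | age | price |
|---|------|----------|-----|--------|
| 0 | 2600 | 3.0 | 20 | 550000 |
| 1 | 3000 | 4.0 | 15 | 565000 |
| 2 | 3200 | NaN | 18 | 610000 |
| 3 | 3600 | 3.0 | 30 | 595000 |
| 4 | 4000 | 5.0 | 8 | 760000 |
| 5 | 4100 | 6.0 | 8 | 810000 |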
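
The file `homeprices.csv` itself is not reproduced in the text. As a rough sketch, an equivalent DataFrame can be rebuilt from the table printed above. The inline CSV string below is an assumption about the file's layout, inferred only from that output.

```python
import io
import pandas as pd

# Assumed file contents, reconstructed from the displayed DataFrame.
# The empty field in the third data row is the missing bedrooms value.
csv_text = """area,bedrooms,age,price
2600,3,20,550000
3000,4,15,565000
3200,,18,610000
3600,3,30,595000
4000,5,8,760000
4100,6,8,810000
"""

df = pd.read_csv(io.StringIO(csv_text))
print(df)
print(df.dtypes)  # bedrooms is read as a float column because it contains NaN
```

Reading from an in-memory string keeps the example reproducible without the original file.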


In [6]:
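# Handling NaN: use the median of the known bedroom counts as the fill value
import math
# floor the median so the fill value is a whole number of bedrooms
median_bedrooms = math.floor(df.bedrooms.median())
median_bedrooms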

4
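
Before imputing, it can help to confirm how many values are actually missing and what the median works out to. The sketch below uses a hypothetical inline copy of the bedrooms column (taken from the table above) instead of the CSV file.

```python
import math
import numpy as np
import pandas as pd

# Hypothetical inline copy of the bedrooms column shown above.
bedrooms = pd.Series([3.0, 4.0, np.nan, 3.0, 5.0, 6.0], name="bedrooms")

missing_count = int(bedrooms.isna().sum())   # number of NaN entries
median_value = bedrooms.median()             # NaN is skipped by default
median_bedrooms = math.floor(median_value)   # whole number of bedrooms

print(missing_count, median_value, median_bedrooms)  # prints: 1 4.0 4
```

The five known values are 3, 3, 4, 5, and 6, so the median is 4.0 and the floor leaves it at 4.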

In [7]:
df.bedrooms = df.bedrooms.fillna(median_bedrooms)
df

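|   | area | bedrooms | age | price |
|---|------|----------|-----|--------|
| 0 | 2600 | 3.0 | 20 | 550000 |
| 1 | 3000 | 4.0 | 15 | 565000 |
| 2 | 3200 | 4.0 | 18 | 610000 |
| 3 | 3600 | 3.0 | 30 | 595000 |
| 4 | 4000 | 5.0 | 8 | 760000 |
| 5 | 4100 | 6.0 | 8 | 810000 |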


In [8]:
reg = linear_model.LinearRegression()

In [9]:
# Train on the three features; price is the target
reg.fit(df[['area', 'bedrooms', 'age']], df.price)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)

In [10]:
reg.coef_

array([  112.06244194, 23388.88007794, -3231.71790863])

In [11]:
reg.intercept_

221323.00186540408
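
The coefficients come back in the same order as the feature columns passed to `fit`. As a small sketch, they can be labeled by feature name to make them easier to read. The numbers are copied from the outputs above, and the variable names `features` and `coef_by_feature` are introduced here only for illustration.

```python
import pandas as pd

# Values copied from the reg.coef_ and reg.intercept_ outputs above.
features = ["area", "bedrooms", "age"]
coef = [112.06244194, 23388.88007794, -3231.71790863]
intercept = 221323.00186540408

# Pair each coefficient with the feature it belongs to.
coef_by_feature = pd.Series(coef, index=features)
print(coef_by_feature)
print("intercept:", intercept)
```

Each coefficient is the change in predicted price for a one-unit increase in that feature with the others held fixed: roughly +112 per square foot, +23,389 per bedroom, and -3,232 per year of age.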

In [12]:
reg.predict([[3000,3,40]])

array([498408.25158031])
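
The prediction above can be checked by hand: it is the dot product of the coefficients with the feature values, plus the intercept. The sketch below hard-codes the rounded values printed by `reg.coef_` and `reg.intercept_`, so the result agrees with `reg.predict` only up to that rounding.

```python
import numpy as np

# Values copied from the reg.coef_ and reg.intercept_ outputs above.
coef = np.array([112.06244194, 23388.88007794, -3231.71790863])
intercept = 221323.00186540408

# 3000 sq ft, 3 bedrooms, 40 years old
home = np.array([3000, 3, 40])

# price = m1*area + m2*bedrooms + m3*age + b
manual_price = float(np.dot(coef, home) + intercept)
print(round(manual_price, 2))  # prints: 498408.25
```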

In [13]:
reg.predict([[2500,4,5]])

array([578876.03748933])
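
Both homes can also be scored in a single call by passing a DataFrame with the same column names used for training. Recent scikit-learn versions warn when a model fitted on a DataFrame is given a plain list, and this form avoids that warning. The sketch below is self-contained, so it rebuilds the training data from the table above (with the missing bedrooms value already filled with 4); the names `train`, `features`, and `new_homes` are assumptions made for illustration.

```python
import pandas as pd
from sklearn import linear_model

# Training data copied from the table above, after filling the NaN with 4.
train = pd.DataFrame({
    "area": [2600, 3000, 3200, 3600, 4000, 4100],
    "bedrooms": [3, 4, 4, 3, 5, 6],
    "age": [20, 15, 18, 30, 8, 8],
    "price": [550000, 565000, 610000, 595000, 760000, 810000],
})
features = ["area", "bedrooms", "age"]

reg = linear_model.LinearRegression()
reg.fit(train[features], train["price"])

# The two homes from the problem statement, one row each.
new_homes = pd.DataFrame({
    "area": [3000, 2500],
    "bedrooms": [3, 4],
    "age": [40, 5],
})
predictions = reg.predict(new_homes[features])
print(predictions)
```

The two entries of `predictions` should match the separate `reg.predict` results above, in the same order as the rows of `new_homes`.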