# Deliverables

Problem statement: What is the recipe for a valuable home in Ames, Iowa? If I were to build a house on an empty lot, where should my focus be?

In [4]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.linear_model import LinearRegression, Ridge, RidgeCV, Lasso, LassoCV
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.preprocessing import StandardScaler, PolynomialFeatures
from scipy import stats
import math
import statsmodels.api as sm

In [6]:
train = pd.read_csv('../datasets/train.csv')

In [7]:
train.drop(columns='Pool QC', inplace=True)

In [8]:
# In these columns a missing value means the feature is absent, so label it 'None'.
# Note: 'Garage Yr Blt' is numeric, so filling it with 'None' turns it into a text column.
none_cols = ['Garage Type', 'Garage Yr Blt', 'Garage Finish', 'Garage Qual',
             'Garage Cond', 'Alley', 'Fireplace Qu', 'Fence', 'Misc Feature',
             'Bsmt Qual', 'Bsmt Cond', 'Bsmt Exposure', 'BsmtFin Type 2',
             'BsmtFin Type 1', 'Mas Vnr Type']
# Assign the result back instead of calling fillna(inplace=True) on a column selection,
# which newer pandas versions may not propagate to the DataFrame.
train[none_cols] = train[none_cols].fillna('None')

In [9]:
train['Lot Frontage'] = train.groupby('Neighborhood')['Lot Frontage'].transform(lambda x: x.fillna(x.mean())) 
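
The cell above fills each missing `Lot Frontage` with the mean of its own neighborhood. A minimal sketch of the same pattern on a toy frame (the neighborhood labels and values are made up for illustration, not taken from the Ames data):

```python
import numpy as np
import pandas as pd

# Toy data: hypothetical neighborhoods and frontages, not Ames values
toy = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'B', 'B'],
    'Lot Frontage': [60.0, np.nan, 80.0, 40.0, np.nan],
})

# transform returns a Series aligned to the original rows, so each NaN
# is replaced by the mean of the non-missing values in its own group
toy['Lot Frontage'] = (
    toy.groupby('Neighborhood')['Lot Frontage']
       .transform(lambda x: x.fillna(x.mean()))
)

filled = toy['Lot Frontage'].tolist()
print(filled)
```

A neighborhood with no recorded frontage at all would still come out as NaN under this approach; in the notebook those rows are caught by the later `train.fillna(0)`.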

In [14]:
train['Overall Garage'] = train['Garage Area'] * train['Garage Cars']
# 'Fireplace Qu_Ex' and 'Fireplace Qu_Gd' only exist after pd.get_dummies (a later cell),
# so build the same 0/1 indicator directly from the raw column.
train['Overall Great Fireplace'] = train['Fireplace Qu'].isin(['Ex', 'Gd']).astype(int) + train['Fireplaces']
train['TotalSF'] = train['2nd Flr SF'] + train['BsmtFin SF 1'] + train['BsmtFin SF 2'] + train['Bsmt Unf SF'] + train['1st Flr SF']
# Above-grade half baths count as 0.5; basement half baths are counted as 1 here.
train['Bathrooms'] = train['Bsmt Half Bath'] + train['Full Bath'] + (train['Half Bath'] * .5) + train['Bsmt Full Bath']
train['Year Built/Remod'] = train['Year Built'] + train['Year Remod/Add']
train['Total Porch SF'] = train['Enclosed Porch'] + train['Screen Porch'] + train['Open Porch SF'] + train['3Ssn Porch'] + train['Wood Deck SF']

In [15]:
train.fillna(0,inplace=True)

In [16]:
train = pd.get_dummies(train)
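
`pd.get_dummies` replaces every text column with one 0/1 column per category, named `<column>_<category>`. That is where names such as `Foundation_PConc` and `Exter Qual_Gd` in the feature list come from. A small sketch on a made-up frame (the rows are hypothetical):

```python
import pandas as pd

# Toy frame with one text column and one numeric column (made-up rows)
toy = pd.DataFrame({
    'Fireplace Qu': ['Ex', 'None', 'Gd'],
    'Fireplaces': [1, 0, 2],
})

encoded = pd.get_dummies(toy)
cols = list(encoded.columns)
print(cols)
```

The numeric column passes through unchanged, while the original text column is dropped in favor of its indicator columns. Any feature that refers to an indicator column therefore has to be built after this call.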

In [17]:
X = train[['TotalSF', 'Overall Qual', 'Gr Liv Area',
          'Overall Garage', 'Year Built/Remod', 'Bathrooms',
           'TotRms AbvGrd', 'Foundation_PConc',
          'Overall Great Fireplace', 'Exter Qual_Gd', 'Heating QC_Ex',
          'Total Porch SF']]

In [18]:
y = train['SalePrice']

In [19]:
X_train, X_test, y_train, y_test = train_test_split(X,y)

In [20]:
lr = LinearRegression()

In [21]:
lr.fit(X_train, y_train)

LinearRegression()
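
The notebook fits the model on the training split but does not show a score on the held-out split. A sketch of how that check could look, using synthetic stand-in data so it runs on its own (the data, the `random_state` value, and the `*_demo` / `Xd_*` names are assumptions for illustration, not part of the original analysis):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (hypothetical; not the Ames features)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
y_demo = X_demo @ np.array([3.0, -2.0, 1.0]) + 5.0 + rng.normal(scale=0.1, size=200)

# Fixing random_state makes the split, and therefore the scores, repeatable
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    X_demo, y_demo, random_state=42
)

demo_lr = LinearRegression().fit(Xd_train, yd_train)

# R^2 and RMSE on rows the model never saw during fitting
r2_test = demo_lr.score(Xd_test, yd_test)
rmse_test = mean_squared_error(yd_test, demo_lr.predict(Xd_test)) ** 0.5
print(f"test R^2: {r2_test:.3f}, test RMSE: {rmse_test:.3f}")
```

In the notebook itself the equivalent calls would be `lr.score(X_test, y_test)` and `mean_squared_error(y_test, lr.predict(X_test)) ** 0.5`, which reports RMSE in dollars.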

#### With all else held constant ...

In [22]:
list(zip(X.columns, lr.coef_))

[('TotalSF', 30.22909230789904),
 ('Overall Qual', 16373.713811474783),
 ('Gr Liv Area', 12.55392867894122),
 ('Overall Garage', 16.794933404261204),
 ('Year Built/Remod', 263.64780584937586),
 ('Bathrooms', 4199.589012219453),
 ('TotRms AbvGrd', -1064.6615558431065),
 ('Foundation_PConc', 4429.741905537656),
 ('Overall Great Fireplace', 9040.014843749264),
 ('Exter Qual_Gd', -12830.937193669943),
 ('Heating QC_Ex', 6213.155860560595),
 ('Total Porch SF', 21.00596804370724)]
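
Each coefficient is the change in predicted `SalePrice` for a one-unit increase in that feature with the others held fixed. A minimal sketch of that arithmetic, using a few coefficients copied from the output above and rounded to two decimals (`price_change` is a hypothetical helper, not part of the notebook):

```python
# Coefficients copied from the fitted model's output, rounded to two decimals
coefs = {
    'TotalSF': 30.23,
    'Overall Qual': 16373.71,
    'Foundation_PConc': 4429.74,
    'Heating QC_Ex': 6213.16,
}

def price_change(changes):
    """Sum coefficient * change over the features that differ (hypothetical helper)."""
    return sum(coefs[name] * delta for name, delta in changes.items())

# 1,000 extra square feet of total floor area
per_1000_sqft = price_change({'TotalSF': 1000})

# One more quality point, plus poured concrete foundation and excellent heating
upgrade = price_change({'Overall Qual': 1, 'Foundation_PConc': 1, 'Heating QC_Ex': 1})

print(round(per_1000_sqft), round(upgrade))
```

Because the features are on different scales (square feet versus a 0/1 flag), the raw coefficients show dollars per unit, not which feature matters most overall.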

In [23]:
train['Lot Area'].mean()  # Average lot area in square feet

10065.20819112628

## Conclusion

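In Ames, Iowa, the overall quality of a home plays a huge part in its final price: with all else held constant, each additional point of Overall Qual is associated with roughly $16,400 in sale price. Floor area matters too. The model values each square foot of TotalSF (basement plus first and second floors) at about $30.23, so every 1,000 sq ft adds around $30,000. This rate applies to floor area, not to the lot, which averages 10,065 sq ft in Ames and is not one of the model's features. A poured concrete foundation (about $4,400), an excellent-quality heating system (about $6,200), and a good or excellent fireplace (about $9,000 for each one-unit increase in the combined fireplace feature) also raise the price. For a new build, focusing on overall quality, floor area, a poured concrete foundation, excellent heating, and a quality fireplace should give the home the most value.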