## Wrangling Test Set Data

In this notebook, the test data set is wrangled in the same way as the training data. Because the test set has nulls in a few fields that are complete in the training set, a few extra imputation steps are needed.

In [1]:
# import packages needed for wrangling

import pandas as pd
import numpy as np
from sklearn.preprocessing import RobustScaler

In [2]:
# read in all of the data files

df_test = pd.read_csv('data/test.csv')
df_train = pd.read_csv('data/train.csv')
df_train_imputed = pd.read_csv('data/train_imputed.csv')

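# join test and training data to ensure all columns in training data end up in the test set as well
# (DataFrame.append was removed in pandas 2.0; pd.concat with sort=True keeps the alphabetical column order)
df = pd.concat([df_test, df_train], sort=True).drop('SalePrice', axis=1)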

In [3]:
# examine the percentage of nulls in each column

df_nulls = (df.isnull().sum() / len(df)) * 100
df_nulls = df_nulls.drop(df_nulls[df_nulls == 0].index).sort_values(ascending=False)
pd.DataFrame(df_nulls)

Column,Percent null
PoolQC,99.657417
MiscFeature,96.402878
Alley,93.216855
Fence,80.438506
FireplaceQu,48.646797
LotFrontage,16.649538
GarageQual,5.447071
GarageCond,5.447071
GarageFinish,5.447071
GarageYrBlt,5.447071
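
The null report above can be wrapped in a small helper so that it is easy to rerun after each imputation step. This is a minimal sketch on a toy frame; the function name `null_percentages` and the toy values are illustrative and do not come from the notebook.

```python
import numpy as np
import pandas as pd

def null_percentages(frame: pd.DataFrame) -> pd.Series:
    """Return the percentage of nulls per column, omitting columns with none."""
    pct = frame.isnull().mean() * 100
    return pct[pct > 0].sort_values(ascending=False)

# toy data, not taken from the housing files
toy = pd.DataFrame({
    "PoolQC": [np.nan, np.nan, np.nan, "Gd"],
    "LotFrontage": [65.0, np.nan, 80.0, 70.0],
    "LotArea": [8450, 9600, 11250, 9550],
})
null_report = null_percentages(toy)
print(null_report)
```

Using `isnull().mean()` gives the same fraction as dividing `isnull().sum()` by `len(df)`.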


In [4]:
df_imputed = df.copy()

In [5]:
# according to the data description, null values mean that these features are not present. We will fill them with "None"

df_imputed[['PoolQC','MiscFeature','Alley','Fence','FireplaceQu']] = df_imputed[['PoolQC','MiscFeature','Alley','Fence','FireplaceQu']].fillna("None")
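
As a small illustration of the step above, the sketch below fills nulls in several categorical columns at once with the string "None", so that "feature not present" becomes an explicit category. The toy frame and the `absent_cols` name are hypothetical.

```python
import numpy as np
import pandas as pd

# toy data: a null here stands for "the house has no alley / no fence"
toy = pd.DataFrame({
    "Alley": ["Grvl", np.nan, "Pave"],
    "Fence": [np.nan, "MnPrv", np.nan],
})

absent_cols = ["Alley", "Fence"]
toy[absent_cols] = toy[absent_cols].fillna("None")
print(toy)
```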

In [6]:
# for the lot frontage, the mean value is inserted into the null entries

df_imputed['LotFrontage'] = df_imputed['LotFrontage'].fillna(value=df_imputed.LotFrontage.mean())
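
A quick worked example of mean imputation, assuming a toy series rather than the real `LotFrontage` column. `Series.mean()` skips nulls by default, so the fill value is the mean of the observed entries only.

```python
import numpy as np
import pandas as pd

lot_frontage = pd.Series([60.0, np.nan, 80.0, 100.0], name="LotFrontage")

# mean of the observed values: (60 + 80 + 100) / 3 = 80
filled = lot_frontage.fillna(lot_frontage.mean())
print(filled.tolist())
```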

In [7]:
# The categorical features for garages were left blank if the house did not have a garage, so they will be filled with "None"

df_imputed[['GarageType','GarageFinish','GarageQual','GarageCond']] = df_imputed[['GarageType','GarageFinish','GarageQual','GarageCond']].fillna("None")

In [8]:
# The numerical features for garages were left blank if the house did not have a garage, so they will be filled with zeroes

df_imputed['GarageYrBlt'] = df_imputed['GarageYrBlt'].fillna(0)

In [9]:
# fill nulls in the categorical basement features with "None"

df_imputed[['BsmtQual', 'BsmtCond', 'BsmtExposure', 'BsmtFinType1', 'BsmtFinType2']] = df_imputed[['BsmtQual', 'BsmtCond', 'BsmtExposure', 'BsmtFinType1', 'BsmtFinType2']].fillna('None')

In [10]:
# fill nulls in the masonry veneer type with "None" and in its area with zero

df_imputed["MasVnrType"] = df_imputed["MasVnrType"].fillna("None")
df_imputed["MasVnrArea"] = df_imputed["MasVnrArea"].fillna(0)

In [11]:
# fill nulls in Electrical with the most common value

df_imputed['Electrical'] = df_imputed['Electrical'].fillna(df_imputed['Electrical'].mode()[0])

In [12]:
df_imputed.isnull().sum().sum()

22

Unlike the training set, the test set has nulls in a few additional fields, which account for the 22 values still missing. Let's examine which fields those are.

In [13]:
df_nulls = (df_imputed.isnull().sum() / len(df_imputed)) * 100
df_nulls = df_nulls.drop(df_nulls[df_nulls == 0].index).sort_values(ascending=False)
pd.DataFrame(df_nulls)

Column,Percent null
MSZoning,0.137033
Utilities,0.068517
Functional,0.068517
BsmtHalfBath,0.068517
BsmtFullBath,0.068517
TotalBsmtSF,0.034258
SaleType,0.034258
KitchenQual,0.034258
GarageCars,0.034258
GarageArea,0.034258


Most of the remaining nulls are in categorical columns and will be filled with each column's most common value (the mode). The numeric basement and garage fields are blank when the home has no basement or garage, so those will be filled with zeroes.
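
The two rules described here can be sketched as one helper. This is only an illustration on toy data: the function `fill_remaining_nulls` and its parameters are hypothetical, and the notebook itself applies the fills one column at a time.

```python
import numpy as np
import pandas as pd

def fill_remaining_nulls(frame, mode_cols, zero_cols):
    """Fill mode_cols with each column's most frequent value and zero_cols with 0."""
    out = frame.copy()
    for col in mode_cols:
        # Series.mode() ignores nulls and returns the modes sorted; take the first
        out[col] = out[col].fillna(out[col].mode()[0])
    out[zero_cols] = out[zero_cols].fillna(0)
    return out

# toy data, not taken from the housing files
toy = pd.DataFrame({
    "MSZoning": ["RL", "RL", np.nan, "RM"],
    "GarageCars": [2.0, np.nan, 1.0, 0.0],
})
toy_filled = fill_remaining_nulls(toy, mode_cols=["MSZoning"], zero_cols=["GarageCars"])
print(toy_filled)
```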

In [14]:
df_imputed['MSZoning'] = df_imputed['MSZoning'].fillna(df_imputed['MSZoning'].mode()[0])

In [15]:
df_imputed['Functional'] = df_imputed['Functional'].fillna(df_imputed['Functional'].mode()[0])

In [16]:
df_imputed[['BsmtHalfBath', 'BsmtFullBath', 'TotalBsmtSF', 'BsmtUnfSF', 'BsmtFinSF2', 'BsmtFinSF1']] = df_imputed[['BsmtHalfBath', 'BsmtFullBath', 'TotalBsmtSF', 'BsmtUnfSF', 'BsmtFinSF2', 'BsmtFinSF1']].fillna(0)

In [17]:
df_imputed['Utilities'] = df_imputed['Utilities'].fillna(df_imputed['Utilities'].mode()[0])

In [18]:
df_imputed['SaleType'] = df_imputed['SaleType'].fillna(df_imputed['SaleType'].mode()[0])

In [19]:
df_imputed[['GarageArea', 'GarageCars']] = df_imputed[['GarageArea', 'GarageCars']].fillna(0)

In [20]:
df_imputed['KitchenQual'] = df_imputed['KitchenQual'].fillna(df_imputed['KitchenQual'].mode()[0])

In [21]:
df_imputed['Exterior2nd'] = df_imputed['Exterior2nd'].fillna(df_imputed['Exterior2nd'].mode()[0])

In [22]:
df_imputed['Exterior1st'] = df_imputed['Exterior1st'].fillna(df_imputed['Exterior1st'].mode()[0])

In [23]:
df_imputed.isnull().sum().sum()

0

In [24]:
# map each ordinal categorical level to an integer so that its ordering is preserved
cond_nums = {
    'LotShape': {'IR3':0, 'IR2':1, 'IR1':2, 'Reg':3},
    'LandSlope': {'Gtl':0, 'Mod':1, 'Sev':2},
    'ExterQual': {'Po':1, 'Fa':2, 'TA':3, 'Gd':4, 'Ex':5},
    'ExterCond': {'Po':1, 'Fa':2, 'TA':3, 'Gd':4, 'Ex':5},
    'BsmtQual': {'None':0, 'Po':1, 'Fa':2, 'TA':3, 'Gd':4, 'Ex':5},
    'BsmtCond': {'None':0, 'Po':1, 'Fa':2, 'TA':3, 'Gd':4, 'Ex':5},
    'BsmtExposure': {'None':0, 'No':1, 'Mn':2, 'Av':3, 'Gd':4},
    'BsmtFinType1': {'None':0, 'Unf':1, 'LwQ':2, 'Rec':3, 'BLQ':4, 'ALQ':5, 'GLQ':6},
    'BsmtFinType2': {'None':0, 'Unf':1, 'LwQ':2, 'Rec':3, 'BLQ':4, 'ALQ':5, 'GLQ':6},
    'HeatingQC': {'Po':1, 'Fa':2, 'TA':3, 'Gd':4, 'Ex':5},
    'CentralAir': {'N':0, 'Y':1},
    'KitchenQual': {'Po':1, 'Fa':2, 'TA':3, 'Gd':4, 'Ex':5},
    'Functional': {'Sal':0, 'Sev':1, 'Maj2':2, 'Maj1':3, 'Mod':4, 'Min2':5, 'Min1':6, 'Typ':7},
    'FireplaceQu': {'None':0, 'Po':1, 'Fa':2, 'TA':3, 'Gd':4, 'Ex':5},
    'GarageFinish': {'None':0, 'Unf':1, 'RFn':2, 'Fin':3},
    'GarageQual': {'None':0, 'Po':1, 'Fa':2, 'TA':3, 'Gd':4, 'Ex':5},
    'GarageCond': {'None':0, 'Po':1, 'Fa':2, 'TA':3, 'Gd':4, 'Ex':5},
    'PavedDrive': {'N':0, 'P':1, 'Y':2},
    'PoolQC': {'None':0, 'Fa':2, 'TA':3, 'Gd':4, 'Ex':5},
}

In [25]:
df_imputed.replace(cond_nums, inplace=True)
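
To make the ordinal encoding concrete, here is a minimal sketch on a toy frame. Note that it swaps in a per-column `Series.map` instead of the notebook's `DataFrame.replace`: the result is the same for values covered by the mapping, but `map` turns any level missing from the mapping into NaN, which makes unexpected categories easy to spot. The names `toy` and `toy_map` are illustrative.

```python
import pandas as pd

# toy data; Street is a nominal column and is deliberately left out of the mapping
toy = pd.DataFrame({
    "ExterQual": ["TA", "Gd", "Ex"],
    "CentralAir": ["Y", "N", "Y"],
    "Street": ["Pave", "Pave", "Grvl"],
})
toy_map = {
    "ExterQual": {"Po": 1, "Fa": 2, "TA": 3, "Gd": 4, "Ex": 5},
    "CentralAir": {"N": 0, "Y": 1},
}

encoded = toy.copy()
for col, mapping in toy_map.items():
    # each level becomes its integer rank; unmapped levels would become NaN
    encoded[col] = toy[col].map(mapping)
print(encoded)
```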

In [26]:
df_imputed_cat = pd.get_dummies(df_imputed.select_dtypes(include=[object]))
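
The reason for joining the test and training data before this step can be seen in a small sketch. If a category appears only in the training rows, one-hot encoding the test rows alone would not produce a column for it. The two toy frames below are hypothetical.

```python
import pandas as pd

# toy data: "NoSeWa" occurs only in the training rows
train = pd.DataFrame({"Utilities": ["AllPub", "NoSeWa"]})
test = pd.DataFrame({"Utilities": ["AllPub", "AllPub"]})

test_only = pd.get_dummies(test)
combined = pd.get_dummies(pd.concat([test, train]))

print(list(test_only.columns))
print(list(combined.columns))
```

Encoding the combined frame yields one indicator column per category seen in either set, so the test features line up with the training features.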

In [27]:
pd.options.display.max_columns = None
df_imputed_cat.head()

Unnamed: 0,Alley_Grvl,Alley_None,Alley_Pave,BldgType_1Fam,BldgType_2fmCon,BldgType_Duplex,BldgType_Twnhs,BldgType_TwnhsE,Condition1_Artery,Condition1_Feedr,Condition1_Norm,Condition1_PosA,Condition1_PosN,Condition1_RRAe,Condition1_RRAn,Condition1_RRNe,Condition1_RRNn,Condition2_Artery,Condition2_Feedr,Condition2_Norm,Condition2_PosA,Condition2_PosN,Condition2_RRAe,Condition2_RRAn,Condition2_RRNn,Electrical_FuseA,Electrical_FuseF,Electrical_FuseP,Electrical_Mix,Electrical_SBrkr,Exterior1st_AsbShng,Exterior1st_AsphShn,Exterior1st_BrkComm,Exterior1st_BrkFace,Exterior1st_CBlock,Exterior1st_CemntBd,Exterior1st_HdBoard,Exterior1st_ImStucc,Exterior1st_MetalSd,Exterior1st_Plywood,Exterior1st_Stone,Exterior1st_Stucco,Exterior1st_VinylSd,Exterior1st_Wd Sdng,Exterior1st_WdShing,Exterior2nd_AsbShng,Exterior2nd_AsphShn,Exterior2nd_Brk Cmn,Exterior2nd_BrkFace,Exterior2nd_CBlock,Exterior2nd_CmentBd,Exterior2nd_HdBoard,Exterior2nd_ImStucc,Exterior2nd_MetalSd,Exterior2nd_Other,Exterior2nd_Plywood,Exterior2nd_Stone,Exterior2nd_Stucco,Exterior2nd_VinylSd,Exterior2nd_Wd Sdng,Exterior2nd_Wd Shng,Fence_GdPrv,Fence_GdWo,Fence_MnPrv,Fence_MnWw,Fence_None,Foundation_BrkTil,Foundation_CBlock,Foundation_PConc,Foundation_Slab,Foundation_Stone,Foundation_Wood,GarageType_2Types,GarageType_Attchd,GarageType_Basment,GarageType_BuiltIn,GarageType_CarPort,GarageType_Detchd,GarageType_None,Heating_Floor,Heating_GasA,Heating_GasW,Heating_Grav,Heating_OthW,Heating_Wall,HouseStyle_1.5Fin,HouseStyle_1.5Unf,HouseStyle_1Story,HouseStyle_2.5Fin,HouseStyle_2.5Unf,HouseStyle_2Story,HouseStyle_SFoyer,HouseStyle_SLvl,LandContour_Bnk,LandContour_HLS,LandContour_Low,LandContour_Lvl,LotConfig_Corner,LotConfig_CulDSac,LotConfig_FR2,LotConfig_FR3,LotConfig_Inside,MSZoning_C (all),MSZoning_FV,MSZoning_RH,MSZoning_RL,MSZoning_RM,MasVnrType_BrkCmn,MasVnrType_BrkFace,MasVnrType_None,MasVnrType_Stone,MiscFeature_Gar2,MiscFeature_None,MiscFeature_Othr,MiscFeature_Shed,MiscFeature_TenC,Neighborhood_Blmngtn,Neighborhood_Blueste,Neighborhood_BrDale,Neighborhood_BrkSide,Neighborhood_ClearCr,Neighborhood_CollgCr,Neighborhood_Crawfor,Neighborhood_Edwards,Neighborhood_Gilbert,Neighborhood_IDOTRR,Neighborhood_MeadowV,Neighborhood_Mitchel,Neighborhood_NAmes,Neighborhood_NPkVill,Neighborhood_NWAmes,Neighborhood_NoRidge,Neighborhood_NridgHt,Neighborhood_OldTown,Neighborhood_SWISU,Neighborhood_Sawyer,Neighborhood_SawyerW,Neighborhood_Somerst,Neighborhood_StoneBr,Neighborhood_Timber,Neighborhood_Veenker,RoofMatl_ClyTile,RoofMatl_CompShg,RoofMatl_Membran,RoofMatl_Metal,RoofMatl_Roll,RoofMatl_Tar&Grv,RoofMatl_WdShake,RoofMatl_WdShngl,RoofStyle_Flat,RoofStyle_Gable,RoofStyle_Gambrel,RoofStyle_Hip,RoofStyle_Mansard,RoofStyle_Shed,SaleCondition_Abnorml,SaleCondition_AdjLand,SaleCondition_Alloca,SaleCondition_Family,SaleCondition_Normal,SaleCondition_Partial,SaleType_COD,SaleType_CWD,SaleType_Con,SaleType_ConLD,SaleType_ConLI,SaleType_ConLw,SaleType_New,SaleType_Oth,SaleType_WD,Street_Grvl,Street_Pave,Utilities_AllPub,Utilities_NoSeWa
0,0,1,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,1,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,1,1,0
1,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,1,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,1,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,1,1,0
2,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,1,1,0
3,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,1,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,1,0,0,0,1,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,1,1,0
4,0,1,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,1,1,0


In [28]:
# take an explicit copy so the assignments below do not raise pandas' SettingWithCopyWarning
df_num = df_imputed.select_dtypes(include=[np.number]).copy()

scaler = RobustScaler()

# scale each numeric column independently using its own median and interquartile range
for column in df_num:
    df_num[column] = scaler.fit_transform(df_num[column].values.reshape(-1, 1)).ravel()


In [29]:
df_num.head()

Unnamed: 0,1stFlrSF,2ndFlrSF,3SsnPorch,BedroomAbvGr,BsmtCond,BsmtExposure,BsmtFinSF1,BsmtFinSF2,BsmtFinType1,BsmtFinType2,BsmtFullBath,BsmtHalfBath,BsmtQual,BsmtUnfSF,CentralAir,EnclosedPorch,ExterCond,ExterQual,FireplaceQu,Fireplaces,FullBath,Functional,GarageArea,GarageCars,GarageCond,GarageFinish,GarageQual,GarageYrBlt,GrLivArea,HalfBath,HeatingQC,Id,KitchenAbvGr,KitchenQual,LandSlope,LotArea,LotFrontage,LotShape,LowQualFinSF,MSSubClass,MasVnrArea,MiscVal,MoSold,OpenPorchSF,OverallCond,OverallQual,PavedDrive,PoolArea,PoolQC,ScreenPorch,TotRmsAbvGrd,TotalBsmtSF,WoodDeckSF,YearBuilt,YearRemodAdd,YrSold
0,-0.363636,0.0,0.0,-1.0,0.0,0.0,0.136426,144.0,-0.2,1.0,0.0,0.0,-1.0,-0.336752,0.0,0.0,0.0,0.0,-0.25,-1.0,-1.0,0.0,0.976562,-1.0,0.0,-1.0,0.0,-0.363636,-0.887449,0.0,-1.0,0.000685,0.0,0.0,0.0,0.530059,0.594122,0.0,0.0,-0.6,0.0,0.0,0.0,-0.371429,1.0,-0.5,0.0,0.0,0.0,120.0,-0.5,-0.210216,0.833333,-0.252632,-0.820513,1.0
1,0.482893,0.0,0.0,0.0,0.0,0.0,0.757162,0.0,0.2,0.0,0.0,0.0,-1.0,-0.104274,0.0,0.0,0.0,0.0,-0.25,-1.0,-1.0,0.0,-0.65625,-1.0,0.0,-1.0,0.0,-0.431818,-0.186235,1.0,-1.0,0.001371,0.0,1.0,0.0,1.176442,0.649678,-1.0,0.0,-0.6,0.66055,12500.0,0.0,0.142857,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.667976,2.339286,-0.315789,-0.897436,1.0
2,-0.301075,0.995739,0.0,0.0,0.0,0.0,0.57708,0.0,0.4,0.0,0.0,0.0,0.0,-0.564103,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.007812,0.0,0.0,1.0,0.0,0.454545,0.299595,1.0,-0.5,0.002056,0.0,0.0,0.0,1.069648,0.260789,-1.0,0.0,0.2,0.0,0.0,-0.75,0.114286,0.0,-0.5,0.0,0.0,0.0,0.0,0.0,-0.119843,1.261905,0.505263,0.128205,1.0
3,-0.304985,0.963068,0.0,0.0,0.0,0.0,0.319236,0.0,0.4,0.0,0.0,0.0,-1.0,-0.244444,0.0,0.0,0.0,0.0,0.75,0.0,0.0,0.0,-0.039062,0.0,0.0,1.0,0.0,0.477273,0.259109,1.0,0.0,0.002742,0.0,1.0,0.0,0.128299,0.483011,-1.0,0.0,0.2,0.122324,0.0,0.0,0.142857,1.0,0.0,0.0,0.0,0.0,0.0,0.5,-0.123772,2.142857,0.526316,0.128205,1.0
4,0.387097,0.0,0.0,-1.0,0.0,0.0,-0.143247,0.0,0.2,0.0,0.0,0.0,0.0,0.940171,0.0,0.0,0.0,1.0,-0.25,-1.0,0.0,0.0,0.101562,0.0,0.0,0.0,0.0,0.340909,-0.265587,0.0,0.0,0.003427,0.0,1.0,0.0,-1.086999,-1.461433,-1.0,0.0,1.4,0.0,0.0,-1.25,0.8,0.0,1.0,0.0,0.0,0.0,144.0,-0.5,0.571709,0.0,0.4,-0.025641,1.0
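
For reference, `RobustScaler` with its default settings subtracts the median and divides by the interquartile range (75th minus 25th percentile). The sketch below checks this by hand on toy values. It also shows that when the interquartile range is zero, scikit-learn leaves the scale at 1, which is probably why a few mostly-zero columns above still show raw-looking values such as 144.0 and 12500.0.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

# toy column with one large outlier
values = np.array([1.0, 2.0, 3.0, 4.0, 100.0]).reshape(-1, 1)
scaled = RobustScaler().fit_transform(values).ravel()

# the same computation by hand: (x - median) / (Q3 - Q1)
median = np.median(values)
q1, q3 = np.percentile(values, [25, 75])
manual = ((values - median) / (q3 - q1)).ravel()

# toy column that is mostly zero: median 0 and interquartile range 0
mostly_zero = np.array([0.0, 0.0, 0.0, 0.0, 144.0]).reshape(-1, 1)
unscaled = RobustScaler().fit_transform(mostly_zero).ravel()

print(scaled, manual, unscaled)
```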


In [30]:
df_imputed = pd.concat([df_num, df_imputed_cat], axis=1)

In [31]:
df_imputed.head()

Unnamed: 0,1stFlrSF,2ndFlrSF,3SsnPorch,BedroomAbvGr,BsmtCond,BsmtExposure,BsmtFinSF1,BsmtFinSF2,BsmtFinType1,BsmtFinType2,BsmtFullBath,BsmtHalfBath,BsmtQual,BsmtUnfSF,CentralAir,EnclosedPorch,ExterCond,ExterQual,FireplaceQu,Fireplaces,FullBath,Functional,GarageArea,GarageCars,GarageCond,GarageFinish,GarageQual,GarageYrBlt,GrLivArea,HalfBath,HeatingQC,Id,KitchenAbvGr,KitchenQual,LandSlope,LotArea,LotFrontage,LotShape,LowQualFinSF,MSSubClass,MasVnrArea,MiscVal,MoSold,OpenPorchSF,OverallCond,OverallQual,PavedDrive,PoolArea,PoolQC,ScreenPorch,TotRmsAbvGrd,TotalBsmtSF,WoodDeckSF,YearBuilt,YearRemodAdd,YrSold,Alley_Grvl,Alley_None,Alley_Pave,BldgType_1Fam,BldgType_2fmCon,BldgType_Duplex,BldgType_Twnhs,BldgType_TwnhsE,Condition1_Artery,Condition1_Feedr,Condition1_Norm,Condition1_PosA,Condition1_PosN,Condition1_RRAe,Condition1_RRAn,Condition1_RRNe,Condition1_RRNn,Condition2_Artery,Condition2_Feedr,Condition2_Norm,Condition2_PosA,Condition2_PosN,Condition2_RRAe,Condition2_RRAn,Condition2_RRNn,Electrical_FuseA,Electrical_FuseF,Electrical_FuseP,Electrical_Mix,Electrical_SBrkr,Exterior1st_AsbShng,Exterior1st_AsphShn,Exterior1st_BrkComm,Exterior1st_BrkFace,Exterior1st_CBlock,Exterior1st_CemntBd,Exterior1st_HdBoard,Exterior1st_ImStucc,Exterior1st_MetalSd,Exterior1st_Plywood,Exterior1st_Stone,Exterior1st_Stucco,Exterior1st_VinylSd,Exterior1st_Wd Sdng,Exterior1st_WdShing,Exterior2nd_AsbShng,Exterior2nd_AsphShn,Exterior2nd_Brk Cmn,Exterior2nd_BrkFace,Exterior2nd_CBlock,Exterior2nd_CmentBd,Exterior2nd_HdBoard,Exterior2nd_ImStucc,Exterior2nd_MetalSd,Exterior2nd_Other,Exterior2nd_Plywood,Exterior2nd_Stone,Exterior2nd_Stucco,Exterior2nd_VinylSd,Exterior2nd_Wd Sdng,Exterior2nd_Wd Shng,Fence_GdPrv,Fence_GdWo,Fence_MnPrv,Fence_MnWw,Fence_None,Foundation_BrkTil,Foundation_CBlock,Foundation_PConc,Foundation_Slab,Foundation_Stone,Foundation_Wood,GarageType_2Types,GarageType_Attchd,GarageType_Basment,GarageType_BuiltIn,GarageType_CarPort,GarageType_Detchd,GarageType_None,Heating_Floor,Heating_GasA,Heating_GasW,Heating_Grav,Heating_OthW,Heating_Wall,HouseStyle_1.5Fin,HouseStyle_1.5Unf,HouseStyle_1Story,HouseStyle_2.5Fin,HouseStyle_2.5Unf,HouseStyle_2Story,HouseStyle_SFoyer,HouseStyle_SLvl,LandContour_Bnk,LandContour_HLS,LandContour_Low,LandContour_Lvl,LotConfig_Corner,LotConfig_CulDSac,LotConfig_FR2,LotConfig_FR3,LotConfig_Inside,MSZoning_C (all),MSZoning_FV,MSZoning_RH,MSZoning_RL,MSZoning_RM,MasVnrType_BrkCmn,MasVnrType_BrkFace,MasVnrType_None,MasVnrType_Stone,MiscFeature_Gar2,MiscFeature_None,MiscFeature_Othr,MiscFeature_Shed,MiscFeature_TenC,Neighborhood_Blmngtn,Neighborhood_Blueste,Neighborhood_BrDale,Neighborhood_BrkSide,Neighborhood_ClearCr,Neighborhood_CollgCr,Neighborhood_Crawfor,Neighborhood_Edwards,Neighborhood_Gilbert,Neighborhood_IDOTRR,Neighborhood_MeadowV,Neighborhood_Mitchel,Neighborhood_NAmes,Neighborhood_NPkVill,Neighborhood_NWAmes,Neighborhood_NoRidge,Neighborhood_NridgHt,Neighborhood_OldTown,Neighborhood_SWISU,Neighborhood_Sawyer,Neighborhood_SawyerW,Neighborhood_Somerst,Neighborhood_StoneBr,Neighborhood_Timber,Neighborhood_Veenker,RoofMatl_ClyTile,RoofMatl_CompShg,RoofMatl_Membran,RoofMatl_Metal,RoofMatl_Roll,RoofMatl_Tar&Grv,RoofMatl_WdShake,RoofMatl_WdShngl,RoofStyle_Flat,RoofStyle_Gable,RoofStyle_Gambrel,RoofStyle_Hip,RoofStyle_Mansard,RoofStyle_Shed,SaleCondition_Abnorml,SaleCondition_AdjLand,SaleCondition_Alloca,SaleCondition_Family,SaleCondition_Normal,SaleCondition_Partial,SaleType_COD,SaleType_CWD,SaleType_Con,SaleType_ConLD,SaleType_ConLI,SaleType_ConLw,SaleType_New,SaleType_Oth,SaleType_WD,Street_Grvl,Street_Pave,Utilities_AllPub,Utilities_NoSeWa
0,-0.363636,0.0,0.0,-1.0,0.0,0.0,0.136426,144.0,-0.2,1.0,0.0,0.0,-1.0,-0.336752,0.0,0.0,0.0,0.0,-0.25,-1.0,-1.0,0.0,0.976562,-1.0,0.0,-1.0,0.0,-0.363636,-0.887449,0.0,-1.0,0.000685,0.0,0.0,0.0,0.530059,0.594122,0.0,0.0,-0.6,0.0,0.0,0.0,-0.371429,1.0,-0.5,0.0,0.0,0.0,120.0,-0.5,-0.210216,0.833333,-0.252632,-0.820513,1.0,0,1,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,1,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,1,1,0
1,0.482893,0.0,0.0,0.0,0.0,0.0,0.757162,0.0,0.2,0.0,0.0,0.0,-1.0,-0.104274,0.0,0.0,0.0,0.0,-0.25,-1.0,-1.0,0.0,-0.65625,-1.0,0.0,-1.0,0.0,-0.431818,-0.186235,1.0,-1.0,0.001371,0.0,1.0,0.0,1.176442,0.649678,-1.0,0.0,-0.6,0.66055,12500.0,0.0,0.142857,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.667976,2.339286,-0.315789,-0.897436,1.0,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,1,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,1,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,1,1,0
2,-0.301075,0.995739,0.0,0.0,0.0,0.0,0.57708,0.0,0.4,0.0,0.0,0.0,0.0,-0.564103,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.007812,0.0,0.0,1.0,0.0,0.454545,0.299595,1.0,-0.5,0.002056,0.0,0.0,0.0,1.069648,0.260789,-1.0,0.0,0.2,0.0,0.0,-0.75,0.114286,0.0,-0.5,0.0,0.0,0.0,0.0,0.0,-0.119843,1.261905,0.505263,0.128205,1.0,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,1,1,0
3,-0.304985,0.963068,0.0,0.0,0.0,0.0,0.319236,0.0,0.4,0.0,0.0,0.0,-1.0,-0.244444,0.0,0.0,0.0,0.0,0.75,0.0,0.0,0.0,-0.039062,0.0,0.0,1.0,0.0,0.477273,0.259109,1.0,0.0,0.002742,0.0,1.0,0.0,0.128299,0.483011,-1.0,0.0,0.2,0.122324,0.0,0.0,0.142857,1.0,0.0,0.0,0.0,0.0,0.0,0.5,-0.123772,2.142857,0.526316,0.128205,1.0,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,1,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,1,0,0,0,1,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,1,1,0
4,0.387097,0.0,0.0,-1.0,0.0,0.0,-0.143247,0.0,0.2,0.0,0.0,0.0,0.0,0.940171,0.0,0.0,0.0,1.0,-0.25,-1.0,0.0,0.0,0.101562,0.0,0.0,0.0,0.0,0.340909,-0.265587,0.0,0.0,0.003427,0.0,1.0,0.0,-1.086999,-1.461433,-1.0,0.0,1.4,0.0,0.0,-1.25,0.8,0.0,1.0,0.0,0.0,0.0,144.0,-0.5,0.571709,0.0,0.4,-0.025641,1.0,0,1,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,1,1,0


In [32]:
df_imputed.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 2919 entries, 0 to 1459
Columns: 230 entries, 1stFlrSF to Utilities_NoSeWa
dtypes: float64(56), uint8(174)
memory usage: 1.8 MB


In [33]:
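# reindex to the columns of the imputed training DataFrame: same order,
# extra columns (e.g. Id) are dropped and missing ones (SalePrice) are added as NaN
col_order = df_train_imputed.columns.tolist()
df_imputed = df_imputed.reindex(columns=col_order)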
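
A small sketch of what `reindex(columns=...)` does, on a hypothetical one-row frame: columns are put in the reference order, columns absent from the reference list are dropped, and reference columns absent from the frame are created and filled with NaN. This is consistent with the blank `SalePrice` field and the missing `Id` column in the output further down.

```python
import pandas as pd

# hypothetical reference order, standing in for the training columns
reference_cols = ["LotArea", "SalePrice", "Street_Pave"]

toy = pd.DataFrame({"Id": [1461], "Street_Pave": [1], "LotArea": [11622]})
aligned = toy.reindex(columns=reference_cols)
print(aligned)
```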

In [34]:
# keep only the test rows (the first 1459), dropping the appended training rows

df_imputed = df_imputed.head(len(df_test))
df_imputed.head()

Unnamed: 0,MSSubClass,LotFrontage,LotArea,LotShape,LandSlope,OverallQual,OverallCond,YearBuilt,YearRemodAdd,MasVnrArea,ExterQual,ExterCond,BsmtQual,BsmtCond,BsmtExposure,BsmtFinType1,BsmtFinSF1,BsmtFinType2,BsmtFinSF2,BsmtUnfSF,TotalBsmtSF,HeatingQC,CentralAir,1stFlrSF,2ndFlrSF,LowQualFinSF,GrLivArea,BsmtFullBath,BsmtHalfBath,FullBath,HalfBath,BedroomAbvGr,KitchenAbvGr,KitchenQual,TotRmsAbvGrd,Functional,Fireplaces,FireplaceQu,GarageYrBlt,GarageFinish,GarageCars,GarageArea,GarageQual,GarageCond,PavedDrive,WoodDeckSF,OpenPorchSF,EnclosedPorch,3SsnPorch,ScreenPorch,PoolArea,PoolQC,MiscVal,MoSold,YrSold,SalePrice,MSZoning_C (all),MSZoning_FV,MSZoning_RH,MSZoning_RL,MSZoning_RM,Street_Grvl,Street_Pave,Alley_Grvl,Alley_None,Alley_Pave,LandContour_Bnk,LandContour_HLS,LandContour_Low,LandContour_Lvl,Utilities_AllPub,Utilities_NoSeWa,LotConfig_Corner,LotConfig_CulDSac,LotConfig_FR2,LotConfig_FR3,LotConfig_Inside,Neighborhood_Blmngtn,Neighborhood_Blueste,Neighborhood_BrDale,Neighborhood_BrkSide,Neighborhood_ClearCr,Neighborhood_CollgCr,Neighborhood_Crawfor,Neighborhood_Edwards,Neighborhood_Gilbert,Neighborhood_IDOTRR,Neighborhood_MeadowV,Neighborhood_Mitchel,Neighborhood_NAmes,Neighborhood_NPkVill,Neighborhood_NWAmes,Neighborhood_NoRidge,Neighborhood_NridgHt,Neighborhood_OldTown,Neighborhood_SWISU,Neighborhood_Sawyer,Neighborhood_SawyerW,Neighborhood_Somerst,Neighborhood_StoneBr,Neighborhood_Timber,Neighborhood_Veenker,Condition1_Artery,Condition1_Feedr,Condition1_Norm,Condition1_PosA,Condition1_PosN,Condition1_RRAe,Condition1_RRAn,Condition1_RRNe,Condition1_RRNn,Condition2_Artery,Condition2_Feedr,Condition2_Norm,Condition2_PosA,Condition2_PosN,Condition2_RRAe,Condition2_RRAn,Condition2_RRNn,BldgType_1Fam,BldgType_2fmCon,BldgType_Duplex,BldgType_Twnhs,BldgType_TwnhsE,HouseStyle_1.5Fin,HouseStyle_1.5Unf,HouseStyle_1Story,HouseStyle_2.5Fin,HouseStyle_2.5Unf,HouseStyle_2Story,HouseStyle_SFoyer,HouseStyle_SLvl,RoofStyle_Flat,RoofStyle_Gable,RoofStyle_Gambrel,RoofStyle_Hip,RoofStyle_Mansard,RoofStyle_Shed,RoofMatl_ClyTile,RoofMatl_CompShg,RoofMatl_Membran,RoofMatl_Metal,RoofMatl_Roll,RoofMatl_Tar&Grv,RoofMatl_WdShake,RoofMatl_WdShngl,Exterior1st_AsbShng,Exterior1st_AsphShn,Exterior1st_BrkComm,Exterior1st_BrkFace,Exterior1st_CBlock,Exterior1st_CemntBd,Exterior1st_HdBoard,Exterior1st_ImStucc,Exterior1st_MetalSd,Exterior1st_Plywood,Exterior1st_Stone,Exterior1st_Stucco,Exterior1st_VinylSd,Exterior1st_Wd Sdng,Exterior1st_WdShing,Exterior2nd_AsbShng,Exterior2nd_AsphShn,Exterior2nd_Brk Cmn,Exterior2nd_BrkFace,Exterior2nd_CBlock,Exterior2nd_CmentBd,Exterior2nd_HdBoard,Exterior2nd_ImStucc,Exterior2nd_MetalSd,Exterior2nd_Other,Exterior2nd_Plywood,Exterior2nd_Stone,Exterior2nd_Stucco,Exterior2nd_VinylSd,Exterior2nd_Wd Sdng,Exterior2nd_Wd Shng,MasVnrType_BrkCmn,MasVnrType_BrkFace,MasVnrType_None,MasVnrType_Stone,Foundation_BrkTil,Foundation_CBlock,Foundation_PConc,Foundation_Slab,Foundation_Stone,Foundation_Wood,Heating_Floor,Heating_GasA,Heating_GasW,Heating_Grav,Heating_OthW,Heating_Wall,Electrical_FuseA,Electrical_FuseF,Electrical_FuseP,Electrical_Mix,Electrical_SBrkr,GarageType_2Types,GarageType_Attchd,GarageType_Basment,GarageType_BuiltIn,GarageType_CarPort,GarageType_Detchd,GarageType_None,Fence_GdPrv,Fence_GdWo,Fence_MnPrv,Fence_MnWw,Fence_None,MiscFeature_Gar2,MiscFeature_None,MiscFeature_Othr,MiscFeature_Shed,MiscFeature_TenC,SaleType_COD,SaleType_CWD,SaleType_Con,SaleType_ConLD,SaleType_ConLI,SaleType_ConLw,SaleType_New,SaleType_Oth,SaleType_WD,SaleCondition_Abnorml,SaleCondition_AdjLand,SaleCondition_Alloca,SaleCondition_Family,SaleCondition_Normal,SaleCondition_Partial
0,-0.6,0.594122,0.530059,0.0,0.0,-0.5,1.0,-0.252632,-0.820513,0.0,0.0,0.0,-1.0,0.0,0.0,-0.2,0.136426,1.0,144.0,-0.336752,-0.210216,-1.0,0.0,-0.363636,0.0,0.0,-0.887449,0.0,0.0,-1.0,0.0,-1.0,0.0,0.0,-0.5,0.0,-1.0,-0.25,-0.363636,-1.0,-1.0,0.976562,0.0,0.0,0.0,0.833333,-0.371429,0.0,0.0,120.0,0.0,0.0,0.0,0.0,1.0,,0,0,1,0,0,0,1,0,1,0,0,0,0,1,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0
1,-0.6,0.649678,1.176442,-1.0,0.0,0.0,1.0,-0.315789,-0.897436,0.66055,0.0,0.0,-1.0,0.0,0.0,0.2,0.757162,0.0,0.0,-0.104274,0.667976,-1.0,0.0,0.482893,0.0,0.0,-0.186235,0.0,0.0,-1.0,1.0,0.0,0.0,1.0,0.0,0.0,-1.0,-0.25,-0.431818,-1.0,-1.0,-0.65625,0.0,0.0,0.0,2.339286,0.142857,0.0,0.0,0.0,0.0,0.0,12500.0,0.0,1.0,,0,0,0,1,0,0,1,0,1,0,0,0,0,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0
2,0.2,0.260789,1.069648,-1.0,0.0,-0.5,0.0,0.505263,0.128205,0.0,0.0,0.0,0.0,0.0,0.0,0.4,0.57708,0.0,0.0,-0.564103,-0.119843,-0.5,0.0,-0.301075,0.995739,0.0,0.299595,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.454545,1.0,0.0,0.007812,0.0,0.0,0.0,1.261905,0.114286,0.0,0.0,0.0,0.0,0.0,0.0,-0.75,1.0,,0,0,0,1,0,0,1,0,1,0,0,0,0,1,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0
3,0.2,0.483011,0.128299,-1.0,0.0,0.0,1.0,0.526316,0.128205,0.122324,0.0,0.0,-1.0,0.0,0.0,0.4,0.319236,0.0,0.0,-0.244444,-0.123772,0.0,0.0,-0.304985,0.963068,0.0,0.259109,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.5,0.0,0.0,0.75,0.477273,1.0,0.0,-0.039062,0.0,0.0,0.0,2.142857,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,,0,0,0,1,0,0,1,0,1,0,0,0,0,1,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0
4,1.4,-1.461433,-1.086999,-1.0,0.0,1.0,0.0,0.4,-0.025641,0.0,1.0,0.0,0.0,0.0,0.0,0.2,-0.143247,0.0,0.0,0.940171,0.571709,0.0,0.0,0.387097,0.0,0.0,-0.265587,0.0,0.0,0.0,0.0,-1.0,0.0,1.0,-0.5,0.0,-1.0,-0.25,0.340909,0.0,0.0,0.101562,0.0,0.0,0.0,0.0,0.8,0.0,0.0,144.0,0.0,0.0,0.0,-1.25,1.0,,0,0,0,1,0,0,1,0,1,0,0,1,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0
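
Because the test rows were placed first when the two sets were joined, the test portion can be recovered by position. The sketch below uses two hypothetical toy frames and slices by `len(test)` rather than a hard-coded row count, so it keeps working if the file sizes change.

```python
import pandas as pd

# toy stand-ins for the test and training files
test = pd.DataFrame({"Id": [1461, 1462], "LotArea": [11622, 14267]})
train = pd.DataFrame({"Id": [1, 2, 3], "LotArea": [8450, 9600, 11250]})

combined = pd.concat([test, train])

# the first len(test) rows are exactly the test rows, in their original order
test_rows = combined.iloc[:len(test)]
print(test_rows)
```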


In [42]:
# SalePrice was re-added (all NaN) by the reindex above; the test set has no target, so drop it
df_imputed = df_imputed.drop('SalePrice', axis=1)

In [43]:
df_imputed.to_csv('data/test_imputed.csv', index=False)

In [45]:
df_imputed_VIF = df_imputed.drop([ 
        'BsmtFinSF1',
        '1stFlrSF',
        'MSZoning_C (all)',
        'Street_Grvl',
        'Alley_Grvl',
        'LandContour_Bnk',
        'Utilities_AllPub',
        'LotConfig_Corner',
        'Neighborhood_Blmngtn',
        'Condition1_Artery',
        'Condition2_Artery',
        'BldgType_1Fam',
        'HouseStyle_1.5Fin',
        'RoofStyle_Flat',
        'RoofMatl_CompShg',
        'Exterior1st_AsbShng',
        'Exterior1st_CBlock',
        'Exterior2nd_AsbShng',
        'MasVnrType_BrkCmn',
        'Foundation_BrkTil',
        'Heating_Floor',
        'Electrical_FuseA',
        'GarageType_2Types',
        'Fence_GdPrv',
        'MiscFeature_Gar2',
        'SaleType_COD',
        'SaleCondition_Normal',
        'MiscFeature_None',
        'GarageYrBlt',
        'Heating_GasA',
        'Condition2_Norm',
        'RoofStyle_Gable',
        'Street_Pave',
        'Exterior1st_VinylSd',
        'MSZoning_RL',
        'GarageType_Attchd',
        'MasVnrType_None',
        'SaleType_New',
        'Exterior2nd_MetalSd',
        'Alley_None',
        '2ndFlrSF',
        'SaleType_WD',
        'Condition1_Norm',
        'MSSubClass',
        'LandContour_Lvl',
        'Exterior2nd_CmentBd',
        'Fence_None',
        'GarageCond',
        'Exterior2nd_VinylSd',
        'Electrical_SBrkr',
        'Exterior1st_HdBoard',
        'GarageType_None',
        'YearBuilt',
        'HouseStyle_1Story',
        'PoolArea',
        'Foundation_PConc',
        'GrLivArea',
        'Exterior2nd_Wd Sdng',
        'GarageCars',
        'BsmtQual',
        'Fireplaces',
        'Neighborhood_Somerst',
        'TotalBsmtSF',
        'Neighborhood_OldTown',
        'ExterQual',
        'LotConfig_Inside',
        'FullBath'], axis=1)
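
The notebook does not show how this drop list was produced. The `_VIF` suffix suggests it came from a variance inflation factor (VIF) analysis of the training data, but that is an assumption. Under that assumption, the sketch below shows one way to compute VIFs with NumPy alone, using the fact that the VIFs of a set of features are the diagonal entries of the inverse of their correlation matrix. The helper name and the toy data are hypothetical.

```python
import numpy as np
import pandas as pd

def variance_inflation_factors(frame: pd.DataFrame) -> pd.Series:
    """VIF per column, read off the diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(frame.to_numpy(dtype=float), rowvar=False)
    vif = np.diag(np.linalg.inv(corr))
    return pd.Series(vif, index=frame.columns)

# toy data: the second column is nearly a copy of the first, the third is independent
rng = np.random.default_rng(0)
area = rng.normal(size=200)
toy = pd.DataFrame({
    "GrLivArea": area,
    "1stFlrSF": area + rng.normal(scale=0.1, size=200),
    "LotArea": rng.normal(size=200),
})
vifs = variance_inflation_factors(toy)
print(vifs)
```

Columns that are close to a linear combination of the others get a large VIF, and a common approach is to drop the worst offender and recompute until all remaining values fall below a chosen threshold.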

In [46]:
df_imputed_VIF.to_csv('data/test_imputed_VIF.csv', index=False)