#### Michael Perrine
#### DSC 550 Data Mining
#### Professor Werner
#### Week 7 Assignment

<h1><center>Dimensionality Reduction and Feature Selection</center></h1>

### Part 1: PCA and Variance Threshold in a Linear Regression

In [28]:
# Import libraries
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
import warnings
from sklearn.preprocessing import OneHotEncoder
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, root_mean_squared_error

In [2]:
# This code suppresses warning messages
warnings.filterwarnings("ignore")

1. Import the housing data and ensure that the data is loaded properly

In [3]:
# This code imports the data and validates it is loaded properly
housing = pd.read_csv(r"housing.csv")
housing.head()

Unnamed: 0,Id,MSSubClass,MSZoning,LotFrontage,LotArea,Street,Alley,LotShape,LandContour,Utilities,...,PoolArea,PoolQC,Fence,MiscFeature,MiscVal,MoSold,YrSold,SaleType,SaleCondition,SalePrice
0,1,60,RL,65.0,8450,Pave,,Reg,Lvl,AllPub,...,0,,,,0,2,2008,WD,Normal,208500
1,2,20,RL,80.0,9600,Pave,,Reg,Lvl,AllPub,...,0,,,,0,5,2007,WD,Normal,181500
2,3,60,RL,68.0,11250,Pave,,IR1,Lvl,AllPub,...,0,,,,0,9,2008,WD,Normal,223500
3,4,70,RL,60.0,9550,Pave,,IR1,Lvl,AllPub,...,0,,,,0,2,2006,WD,Abnorml,140000
4,5,60,RL,84.0,14260,Pave,,IR1,Lvl,AllPub,...,0,,,,0,12,2008,WD,Normal,250000
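Beyond inspecting `head()`, a few quick checks can help confirm that a file loaded as expected. The sketch below is only illustrative: it reads a small in-memory CSV (a hypothetical sample in the same layout, not the real `housing.csv`), and the helper name `summarize_load` is an assumption, not part of the assignment.

```python
import io
import pandas as pd

# Hypothetical three-row sample standing in for housing.csv
sample_csv = io.StringIO(
    "Id,MSZoning,LotFrontage,LotArea,SalePrice\n"
    "1,RL,65.0,8450,208500\n"
    "2,RL,80.0,9600,181500\n"
    "3,RL,,11250,223500\n"
)

def summarize_load(frame):
    """Return basic facts that help confirm a CSV loaded as expected."""
    return {
        "shape": frame.shape,                                         # (rows, columns)
        "n_numeric": frame.select_dtypes(include="number").shape[1],  # numeric columns
        "n_missing": int(frame.isnull().sum().sum()),                 # total NaN cells
    }

sample = pd.read_csv(sample_csv)
summary = summarize_load(sample)
print(summary)
```

On the real data, the same helper could be called as `summarize_load(housing)`.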


2. Drop the "Id" column and any features that are missing more than 40% of their values.

In [4]:
# This code drops the Id column
housing = housing.drop(columns=["Id"])

In [5]:
# This code removes columns with more than 40% of their data missing
new_housing = housing.drop(columns=housing.columns[housing.isnull().mean() > 0.40])
new_housing.head()

Unnamed: 0,MSSubClass,MSZoning,LotFrontage,LotArea,Street,LotShape,LandContour,Utilities,LotConfig,LandSlope,...,EnclosedPorch,3SsnPorch,ScreenPorch,PoolArea,MiscVal,MoSold,YrSold,SaleType,SaleCondition,SalePrice
0,60,RL,65.0,8450,Pave,Reg,Lvl,AllPub,Inside,Gtl,...,0,0,0,0,0,2,2008,WD,Normal,208500
1,20,RL,80.0,9600,Pave,Reg,Lvl,AllPub,FR2,Gtl,...,0,0,0,0,0,5,2007,WD,Normal,181500
2,60,RL,68.0,11250,Pave,IR1,Lvl,AllPub,Inside,Gtl,...,0,0,0,0,0,9,2008,WD,Normal,223500
3,70,RL,60.0,9550,Pave,IR1,Lvl,AllPub,Corner,Gtl,...,272,0,0,0,0,2,2006,WD,Abnorml,140000
4,60,RL,84.0,14260,Pave,IR1,Lvl,AllPub,FR2,Gtl,...,0,0,0,0,0,12,2008,WD,Normal,250000
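To see which columns a missing-value threshold removes, it can help to compute the missing share per column first. The following is a minimal sketch on a hypothetical toy frame (the values are made up for illustration), using the same 40% cutoff as step 2.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: "Alley" is 60% missing, "LotFrontage" 20%, "LotArea" 0%
toy = pd.DataFrame({
    "Alley": ["Grvl", np.nan, np.nan, np.nan, "Pave"],
    "LotFrontage": [65.0, 80.0, np.nan, 60.0, 84.0],
    "LotArea": [8450, 9600, 11250, 9550, 14260],
})

missing_share = toy.isnull().mean()  # fraction of NaN values per column
to_drop = missing_share[missing_share > 0.40].index.tolist()
trimmed = toy.drop(columns=to_drop)

print(missing_share)
print("Dropped:", to_drop)
```

Keeping `to_drop` as a named list makes it easy to report exactly which features were removed.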


3. For numerical columns, fill in any missing data with the median value.

In [6]:
# This code displays the dimensions of the data frame
new_housing.shape

(1460, 74)

In [7]:
# This code displays columns that are missing values
new_housing.isnull().sum()

MSSubClass         0
MSZoning           0
LotFrontage      259
LotArea            0
Street             0
                ... 
MoSold             0
YrSold             0
SaleType           0
SaleCondition      0
SalePrice          0
Length: 74, dtype: int64

In [8]:
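# This code loops over the numeric columns and fills missing values with each column's median
for column in new_housing.select_dtypes(include='number').columns:
    median_value = new_housing[column].median()
    # Assign the result back; calling fillna(inplace=True) on a selected column may not update the frame
    new_housing[column] = new_housing[column].fillna(median_value)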

In [9]:
# This code displays missing values and confirms that the missing values have been replaced
new_housing.isnull().sum()

MSSubClass       0
MSZoning         0
LotFrontage      0
LotArea          0
Street           0
                ..
MoSold           0
YrSold           0
SaleType         0
SaleCondition    0
SalePrice        0
Length: 74, dtype: int64
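The median fill in step 3 can also be written without an explicit loop, since `fillna` accepts a Series of per-column fill values. This is a sketch on a hypothetical toy frame (values invented for illustration), not the notebook's data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with two numeric columns and one categorical column
toy = pd.DataFrame({
    "LotFrontage": [65.0, 80.0, np.nan, 60.0, 84.0],
    "MasVnrArea": [196.0, np.nan, 162.0, 0.0, 350.0],
    "MSZoning": ["RL", "RL", None, "RM", "RL"],
})

num_cols = toy.select_dtypes(include="number").columns
medians = toy[num_cols].median()  # one median per numeric column

filled = toy.copy()
filled[num_cols] = filled[num_cols].fillna(medians)  # aligned by column name

print(filled)
```

Only the numeric columns are touched, so the missing `MSZoning` value is left for the categorical step.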

4. For categorical columns, fill in any missing data with the most common value (mode).

In [None]:
# This code checks for missing data; the truncated output shows only the first and last five columns
new_housing.isna().sum()

MSSubClass       0
MSZoning         0
LotFrontage      0
LotArea          0
Street           0
                ..
MoSold           0
YrSold           0
SaleType         0
SaleCondition    0
SalePrice        0
Length: 74, dtype: int64
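Step 4 calls for filling gaps in the categorical columns with the most common value. A minimal sketch of that fill, on a hypothetical toy frame with invented values, might look like the following. Because a truncated `isna().sum()` display shows only the first and last few columns, the sketch also counts the remaining gaps directly.

```python
import pandas as pd

# Hypothetical toy frame: two categorical columns with gaps, one numeric column
toy = pd.DataFrame({
    "BsmtQual": ["Gd", "TA", None, "Gd", "Ex"],
    "GarageType": ["Attchd", None, "Detchd", "Attchd", None],
    "LotArea": [8450, 9600, 11250, 9550, 14260],
})

# Treat every non-numeric column as categorical
cat_cols = toy.select_dtypes(exclude="number").columns

filled = toy.copy()
for col in cat_cols:
    most_common = filled[col].mode(dropna=True).iloc[0]  # mode of the observed values
    filled[col] = filled[col].fillna(most_common)

remaining = int(filled.isnull().sum().sum())  # total NaN cells left in the frame
print(filled)
print("Remaining missing values:", remaining)
```

Summing the missing counts over the whole frame gives a single number that a truncated display cannot hide.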

5. Convert the categorical columns to dummy variables.

In [None]:
# This code creates a new data frame that isolates the categorical columns
# and displays the first 5 rows
new_housing_1 = new_housing.select_dtypes(include=["object"])
new_housing_1.head()

Unnamed: 0,MSZoning,Street,LotShape,LandContour,Utilities,LotConfig,LandSlope,Neighborhood,Condition1,Condition2,...,Electrical,KitchenQual,Functional,GarageType,GarageFinish,GarageQual,GarageCond,PavedDrive,SaleType,SaleCondition
0,RL,Pave,Reg,Lvl,AllPub,Inside,Gtl,CollgCr,Norm,Norm,...,SBrkr,Gd,Typ,Attchd,RFn,TA,TA,Y,WD,Normal
1,RL,Pave,Reg,Lvl,AllPub,FR2,Gtl,Veenker,Feedr,Norm,...,SBrkr,TA,Typ,Attchd,RFn,TA,TA,Y,WD,Normal
2,RL,Pave,IR1,Lvl,AllPub,Inside,Gtl,CollgCr,Norm,Norm,...,SBrkr,Gd,Typ,Attchd,RFn,TA,TA,Y,WD,Normal
3,RL,Pave,IR1,Lvl,AllPub,Corner,Gtl,Crawfor,Norm,Norm,...,SBrkr,Gd,Typ,Detchd,Unf,TA,TA,Y,WD,Abnorml
4,RL,Pave,IR1,Lvl,AllPub,FR2,Gtl,NoRidge,Norm,Norm,...,SBrkr,Gd,Typ,Attchd,RFn,TA,TA,Y,WD,Normal


In [None]:
# This code shows the unique labels in each column
for column in new_housing_1.columns:
    print(column, ':', len(new_housing_1[column].unique()), "labels" )

MSZoning : 5 labels
Street : 2 labels
LotShape : 4 labels
LandContour : 4 labels
Utilities : 2 labels
LotConfig : 5 labels
LandSlope : 3 labels
Neighborhood : 25 labels
Condition1 : 9 labels
Condition2 : 8 labels
BldgType : 5 labels
HouseStyle : 8 labels
RoofStyle : 6 labels
RoofMatl : 8 labels
Exterior1st : 15 labels
Exterior2nd : 16 labels
ExterQual : 4 labels
ExterCond : 5 labels
Foundation : 6 labels
BsmtQual : 5 labels
BsmtCond : 5 labels
BsmtExposure : 5 labels
BsmtFinType1 : 7 labels
BsmtFinType2 : 7 labels
Heating : 6 labels
HeatingQC : 5 labels
CentralAir : 2 labels
Electrical : 6 labels
KitchenQual : 4 labels
Functional : 7 labels
GarageType : 7 labels
GarageFinish : 4 labels
GarageQual : 6 labels
GarageCond : 6 labels
PavedDrive : 3 labels
SaleType : 9 labels
SaleCondition : 6 labels


In [None]:
# This code creates a data frame with dummy variables and displays the first 5 rows
df = pd.get_dummies(new_housing_1, drop_first= True)
df.head()

Unnamed: 0,MSZoning_FV,MSZoning_RH,MSZoning_RL,MSZoning_RM,Street_Pave,LotShape_IR2,LotShape_IR3,LotShape_Reg,LandContour_HLS,LandContour_Low,...,SaleType_ConLI,SaleType_ConLw,SaleType_New,SaleType_Oth,SaleType_WD,SaleCondition_AdjLand,SaleCondition_Alloca,SaleCondition_Family,SaleCondition_Normal,SaleCondition_Partial
0,False,False,True,False,True,False,False,True,False,False,...,False,False,False,False,True,False,False,False,True,False
1,False,False,True,False,True,False,False,True,False,False,...,False,False,False,False,True,False,False,False,True,False
2,False,False,True,False,True,False,False,False,False,False,...,False,False,False,False,True,False,False,False,True,False
3,False,False,True,False,True,False,False,False,False,False,...,False,False,False,False,True,False,False,False,False,False
4,False,False,True,False,True,False,False,False,False,False,...,False,False,False,False,True,False,False,False,True,False


In [None]:
# This code drops 28 of the 37 original categorical columns from the housing data frame
new_housing.drop(['MSZoning','Street','LotShape','LandContour','Utilities',
                'LotConfig', 'LandSlope', 'Neighborhood', 'Condition1',
                'Condition2', 'BldgType', 'HouseStyle', 'RoofStyle', 'RoofMatl',
                'Exterior1st', 'Exterior2nd', 'ExterQual', 'ExterCond',
                'Foundation', 'BsmtCond', 'BsmtExposure', 'BsmtFinType1',
                'BsmtFinType2', 'Heating', 'GarageCond', 'PavedDrive',
                'SaleType', 'SaleCondition'], axis = 1, inplace = True)

In [None]:
# This code locates the remaining categorical columns
new_housing_2 = new_housing.select_dtypes(include=["object"])
new_housing_2.head()


Unnamed: 0,BsmtQual,HeatingQC,CentralAir,Electrical,KitchenQual,Functional,GarageType,GarageFinish,GarageQual
0,Gd,Ex,Y,SBrkr,Gd,Typ,Attchd,RFn,TA
1,Gd,Ex,Y,SBrkr,TA,Typ,Attchd,RFn,TA
2,Gd,Ex,Y,SBrkr,Gd,Typ,Attchd,RFn,TA
3,TA,Gd,Y,SBrkr,Gd,Typ,Detchd,Unf,TA
4,Gd,Ex,Y,SBrkr,Gd,Typ,Attchd,RFn,TA


In [None]:
# This code creates a second data frame with dummy variables and displays the first 5 rows
df1 = pd.get_dummies(new_housing_2, drop_first=True)
df1.head()

Unnamed: 0,BsmtQual_Fa,BsmtQual_Gd,BsmtQual_TA,HeatingQC_Fa,HeatingQC_Gd,HeatingQC_Po,HeatingQC_TA,CentralAir_Y,Electrical_FuseF,Electrical_FuseP,...,GarageType_Basment,GarageType_BuiltIn,GarageType_CarPort,GarageType_Detchd,GarageFinish_RFn,GarageFinish_Unf,GarageQual_Fa,GarageQual_Gd,GarageQual_Po,GarageQual_TA
0,False,True,False,False,False,False,False,True,False,False,...,False,False,False,False,True,False,False,False,False,True
1,False,True,False,False,False,False,False,True,False,False,...,False,False,False,False,True,False,False,False,False,True
2,False,True,False,False,False,False,False,True,False,False,...,False,False,False,False,True,False,False,False,False,True
3,False,False,True,False,True,False,False,True,False,False,...,False,False,False,True,False,True,False,False,False,True
4,False,True,False,False,False,False,False,True,False,False,...,False,False,False,False,True,False,False,False,False,True


In [None]:
# This code drops the remaining categorical columns from the new_housing dataframe
new_housing.drop(['BsmtQual', 'HeatingQC', 'CentralAir', 'Electrical', 'KitchenQual',
                'Functional', 'GarageType', 'GarageFinish', 'GarageQual'], axis = 1, inplace = True)

In [None]:
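# This code merges all three data frames into one and converts the boolean dummies to 1/0
# Note: df already holds dummies for every categorical column, so the df1 columns appear twice
new_housing = pd.concat([new_housing, df, df1], axis = 1).replace({True : 1, False : 0})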
new_housing.head()

Unnamed: 0,MSSubClass,LotFrontage,LotArea,OverallQual,OverallCond,YearBuilt,YearRemodAdd,MasVnrArea,BsmtFinSF1,BsmtFinSF2,...,GarageType_Basment,GarageType_BuiltIn,GarageType_CarPort,GarageType_Detchd,GarageFinish_RFn,GarageFinish_Unf,GarageQual_Fa,GarageQual_Gd,GarageQual_Po,GarageQual_TA
0,60,65.0,8450,7,5,2003,2003,196.0,706,0,...,0,0,0,0,1,0,0,0,0,1
1,20,80.0,9600,6,8,1976,1976,0.0,978,0,...,0,0,0,0,1,0,0,0,0,1
2,60,68.0,11250,7,5,2001,2002,162.0,486,0,...,0,0,0,0,1,0,0,0,0,1
3,70,60.0,9550,7,5,1915,1970,0.0,216,0,...,0,0,0,1,0,1,0,0,0,1
4,60,84.0,14260,8,5,2000,2000,350.0,655,0,...,0,0,0,0,1,0,0,0,0,1


In [None]:
# This code shows the dimensions of the new_housing data frame
new_housing.shape

(1460, 262)
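The encoding in step 5 can also be done in a single `pd.get_dummies` call on the full frame, which avoids building, dropping, and re-concatenating separate frames and makes duplicated dummy columns easy to detect. The sketch below uses a hypothetical toy frame with invented values; `dtype=int` asks for 0/1 columns directly instead of booleans.

```python
import pandas as pd

# Hypothetical toy frame: one numeric column and two categorical columns
toy = pd.DataFrame({
    "LotArea": [8450, 9600, 11250, 9550],
    "Street": ["Pave", "Pave", "Grvl", "Pave"],
    "CentralAir": ["Y", "N", "Y", "Y"],
})

cat_cols = toy.select_dtypes(exclude="number").columns.tolist()

# Encode only the categorical columns; numeric columns pass through unchanged
encoded = pd.get_dummies(toy, columns=cat_cols, drop_first=True, dtype=int)

# Any column name that occurs more than once signals a duplicated feature
dupes = encoded.columns[encoded.columns.duplicated()].tolist()

print(encoded)
print("Duplicate columns:", dupes)
```

An empty `dupes` list confirms that each dummy variable enters the model only once.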

6. Split the data into a training and test set, where the SalePrice column is the target.

In [30]:
# This series of code splits the data between a training and testing set
X = new_housing.drop(columns=["SalePrice"])
y = new_housing['SalePrice']
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=19, test_size=0.20)

In [None]:
# This code displays the first 5 rows in the X data frame
X.head()

Unnamed: 0,MSSubClass,LotFrontage,LotArea,OverallQual,OverallCond,YearBuilt,YearRemodAdd,MasVnrArea,BsmtFinSF1,BsmtFinSF2,...,GarageType_Basment,GarageType_BuiltIn,GarageType_CarPort,GarageType_Detchd,GarageFinish_RFn,GarageFinish_Unf,GarageQual_Fa,GarageQual_Gd,GarageQual_Po,GarageQual_TA
0,60,65.0,8450,7,5,2003,2003,196.0,706,0,...,0,0,0,0,1,0,0,0,0,1
1,20,80.0,9600,6,8,1976,1976,0.0,978,0,...,0,0,0,0,1,0,0,0,0,1
2,60,68.0,11250,7,5,2001,2002,162.0,486,0,...,0,0,0,0,1,0,0,0,0,1
3,70,60.0,9550,7,5,1915,1970,0.0,216,0,...,0,0,0,1,0,1,0,0,0,1
4,60,84.0,14260,8,5,2000,2000,350.0,655,0,...,0,0,0,0,1,0,0,0,0,1


In [None]:
# This code displays the first 5 values of the y series
y.head()

0    208500
1    181500
2    223500
3    140000
4    250000
Name: SalePrice, dtype: int64

7. Run a linear regression and report the R2-value and RMSE on the test set.

In [31]:
# This code creates the linear regression model
lr = LinearRegression()

In [32]:
# This code fits the linear regression to the training data
lr.fit(X_train, y_train)

In [None]:
# This code generates predictions for the test set
y_pred = lr.predict(X_test)

In [None]:
# This code calculates the RMSE for the regression on the test set
root_mean_squared_error(y_test, y_pred)

57991.347929554184

In [None]:
# This code calculates the R-squared for the regression
r2_score(y_test, y_pred)

0.13156622022341757
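For reference, both metrics can be reproduced by hand: RMSE is the square root of the mean squared error, and R2 is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses NumPy only; the helper names `rmse` and `r_squared` and the four price values are hypothetical, chosen so the arithmetic is easy to follow.

```python
import numpy as np

def rmse(y_true, y_hat):
    """Root mean squared error: sqrt(mean((y - y_hat)^2))."""
    y_true = np.asarray(y_true, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_hat) ** 2)))

def r_squared(y_true, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y_true - y_hat) ** 2)          # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation around the mean
    return float(1.0 - ss_res / ss_tot)

# Hypothetical prices: every prediction is off by exactly 10,000
actual = [200000, 180000, 220000, 140000]
predicted = [210000, 170000, 230000, 150000]

toy_rmse = rmse(actual, predicted)
toy_r2 = r_squared(actual, predicted)
print(toy_rmse, toy_r2)
```

RMSE is expressed in the units of the target (dollars here), while R2 is unitless, so the two are best read together.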

8. Fit and transform the training features with a PCA so that 90% of the variance is retained

In [None]:
# This code builds the standard scaler
scaleStandard = StandardScaler()

In [None]:
# This code fits the scaler on the X_train data and standardizes it
X_train = scaleStandard.fit_transform(X_train)
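As a rough illustration of step 8, scikit-learn's `PCA` accepts a float between 0 and 1 for `n_components` and keeps the smallest number of components whose cumulative explained variance reaches that share. The sketch below runs on synthetic data (an assumption made so it runs without the housing file) and fits both the scaler and the PCA on the training rows only, then reuses them on the test rows.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(19)

# Synthetic stand-in: 200 rows, 10 features driven by 3 latent factors plus noise
latent = rng.normal(size=(200, 3))
features = latent @ rng.normal(size=(3, 10)) + 0.1 * rng.normal(size=(200, 10))
train, test = features[:160], features[160:]

# Fit the scaler and the PCA on the training rows only
scaler = StandardScaler().fit(train)
pca = PCA(n_components=0.90).fit(scaler.transform(train))

# Apply the already-fitted transforms to both splits
train_pca = pca.transform(scaler.transform(train))
test_pca = pca.transform(scaler.transform(test))

print("Components kept:", pca.n_components_)
print("Variance retained:", pca.explained_variance_ratio_.sum())
```

Calling only `transform` on the test rows keeps information from the test set out of the fitted scaler and components.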

In [None]:
# This code creates a new data frame without the Sale price column
new_housing_1 = new_housing.drop(['SalePrice'], axis=1)

In [None]:
# This code prints a list of the columns in the new_housing_1 data frame
print(new_housing_1.columns.tolist())

['MSSubClass', 'LotFrontage', 'LotArea', 'OverallQual', 'OverallCond', 'YearBuilt', 'YearRemodAdd', 'MasVnrArea', 'BsmtFinSF1', 'BsmtFinSF2', 'BsmtUnfSF', 'TotalBsmtSF', '1stFlrSF', '2ndFlrSF', 'LowQualFinSF', 'GrLivArea', 'BsmtFullBath', 'BsmtHalfBath', 'FullBath', 'HalfBath', 'BedroomAbvGr', 'KitchenAbvGr', 'TotRmsAbvGrd', 'Fireplaces', 'GarageYrBlt', 'GarageCars', 'GarageArea', 'WoodDeckSF', 'OpenPorchSF', 'EnclosedPorch', '3SsnPorch', 'ScreenPorch', 'PoolArea', 'MiscVal', 'MoSold', 'YrSold', 'MSZoning_FV', 'MSZoning_RH', 'MSZoning_RL', 'MSZoning_RM', 'Street_Pave', 'LotShape_IR2', 'LotShape_IR3', 'LotShape_Reg', 'LandContour_HLS', 'LandContour_Low', 'LandContour_Lvl', 'Utilities_NoSeWa', 'LotConfig_CulDSac', 'LotConfig_FR2', 'LotConfig_FR3', 'LotConfig_Inside', 'LandSlope_Mod', 'LandSlope_Sev', 'Neighborhood_Blueste', 'Neighborhood_BrDale', 'Neighborhood_BrkSide', 'Neighborhood_ClearCr', 'Neighborhood_CollgCr', 'Neighborhood_Crawfor', 'Neighborhood_Edwards', 'Neighborhood_Gilbert',

In [None]:
# This code creates a new data frame from the scaled X_train data and displays the first 5 rows
# The column names are taken from new_housing_1 rather than typed out by hand
X_train = pd.DataFrame(X_train, columns=new_housing_1.columns)
X_train.head()

Unnamed: 0,MSSubClass,LotFrontage,LotArea,OverallQual,OverallCond,YearBuilt,YearRemodAdd,MasVnrArea,BsmtFinSF1,BsmtFinSF2,BsmtUnfSF,TotalBsmtSF,1stFlrSF,2ndFlrSF,LowQualFinSF,GrLivArea,BsmtFullBath,BsmtHalfBath,FullBath,HalfBath,BedroomAbvGr,KitchenAbvGr,TotRmsAbvGrd,Fireplaces,GarageYrBlt,GarageCars,GarageArea,WoodDeckSF,OpenPorchSF,EnclosedPorch,3SsnPorch,ScreenPorch,PoolArea,MiscVal,MoSold,YrSold,MSZoning_FV,MSZoning_RH,MSZoning_RL,MSZoning_RM,Street_Pave,LotShape_IR2,LotShape_IR3,LotShape_Reg,LandContour_HLS,LandContour_Low,LandContour_Lvl,Utilities_NoSeWa,LotConfig_CulDSac,LotConfig_FR2,LotConfig_FR3,LotConfig_Inside,LandSlope_Mod,LandSlope_Sev,Neighborhood_Blueste,Neighborhood_BrDale,Neighborhood_BrkSide,Neighborhood_ClearCr,Neighborhood_CollgCr,Neighborhood_Crawfor,Neighborhood_Edwards,Neighborhood_Gilbert,Neighborhood_IDOTRR,Neighborhood_MeadowV,Neighborhood_Mitchel,Neighborhood_NAmes,Neighborhood_NPkVill,Neighborhood_NWAmes,Neighborhood_NoRidge,Neighborhood_NridgHt,Neighborhood_OldTown,Neighborhood_SWISU,Neighborhood_Sawyer,Neighborhood_SawyerW,Neighborhood_Somerst,Neighborhood_StoneBr,Neighborhood_Timber,Neighborhood_Veenker,Condition1_Feedr,Condition1_Norm,Condition1_PosA,Condition1_PosN,Condition1_RRAe,Condition1_RRAn,Condition1_RRNe,Condition1_RRNn,Condition2_Feedr,Condition2_Norm,Condition2_PosA,Condition2_PosN,Condition2_RRAe,Condition2_RRAn,Condition2_RRNn,BldgType_2fmCon,BldgType_Duplex,BldgType_Twnhs,BldgType_TwnhsE,HouseStyle_1.5Unf,HouseStyle_1Story,HouseStyle_2.5Fin,HouseStyle_2.5Unf,HouseStyle_2Story,HouseStyle_SFoyer,HouseStyle_SLvl,RoofStyle_Gable,RoofStyle_Gambrel,RoofStyle_Hip,RoofStyle_Mansard,RoofStyle_Shed,RoofMatl_CompShg,RoofMatl_Membran,RoofMatl_Metal,RoofMatl_Roll,RoofMatl_Tar&Grv,RoofMatl_WdShake,RoofMatl_WdShngl,Exterior1st_AsphShn,Exterior1st_BrkComm,Exterior1st_BrkFace,Exterior1st_CBlock,Exterior1st_CemntBd,Exterior1st_HdBoard,Exterior1st_ImStucc,Exterior1st_MetalSd,Exterior1st_Plywood,Exterior1st_Stone,Exterior1st_Stucco,Exterio
r1st_VinylSd,Exterior1st_Wd Sdng,Exterior1st_WdShing,Exterior2nd_AsphShn,Exterior2nd_Brk Cmn,Exterior2nd_BrkFace,Exterior2nd_CBlock,Exterior2nd_CmentBd,Exterior2nd_HdBoard,Exterior2nd_ImStucc,Exterior2nd_MetalSd,Exterior2nd_Other,Exterior2nd_Plywood,Exterior2nd_Stone,Exterior2nd_Stucco,Exterior2nd_VinylSd,Exterior2nd_Wd Sdng,Exterior2nd_Wd Shng,ExterQual_Fa,ExterQual_Gd,ExterQual_TA,ExterCond_Fa,ExterCond_Gd,ExterCond_Po,ExterCond_TA,Foundation_CBlock,Foundation_PConc,Foundation_Slab,Foundation_Stone,Foundation_Wood,BsmtQual_Fa,BsmtQual_Gd,BsmtQual_TA,BsmtCond_Gd,BsmtCond_Po,BsmtCond_TA,BsmtExposure_Gd,BsmtExposure_Mn,BsmtExposure_No,BsmtFinType1_BLQ,BsmtFinType1_GLQ,BsmtFinType1_LwQ,BsmtFinType1_Rec,BsmtFinType1_Unf,BsmtFinType2_BLQ,BsmtFinType2_GLQ,BsmtFinType2_LwQ,BsmtFinType2_Rec,BsmtFinType2_Unf,Heating_GasA,Heating_GasW,Heating_Grav,Heating_OthW,Heating_Wall,HeatingQC_Fa,HeatingQC_Gd,HeatingQC_Po,HeatingQC_TA,CentralAir_Y,Electrical_FuseF,Electrical_FuseP,Electrical_Mix,Electrical_SBrkr,KitchenQual_Fa,KitchenQual_Gd,KitchenQual_TA,Functional_Maj2,Functional_Min1,Functional_Min2,Functional_Mod,Functional_Sev,Functional_Typ,GarageType_Attchd,GarageType_Basment,GarageType_BuiltIn,GarageType_CarPort,GarageType_Detchd,GarageFinish_RFn,GarageFinish_Unf,GarageQual_Fa,GarageQual_Gd,GarageQual_Po,GarageQual_TA,GarageCond_Fa,GarageCond_Gd,GarageCond_Po,GarageCond_TA,PavedDrive_P,PavedDrive_Y,SaleType_CWD,SaleType_Con,SaleType_ConLD,SaleType_ConLI,SaleType_ConLw,SaleType_New,SaleType_Oth,SaleType_WD,SaleCondition_AdjLand,SaleCondition_Alloca,SaleCondition_Family,SaleCondition_Normal,SaleCondition_Partial,BsmtQual_Fa.1,BsmtQual_Gd.1,BsmtQual_TA.1,HeatingQC_Fa.1,HeatingQC_Gd.1,HeatingQC_Po.1,HeatingQC_TA.1,CentralAir_Y.1,Electrical_FuseF.1,Electrical_FuseP.1,Electrical_Mix.1,Electrical_SBrkr.1,KitchenQual_Fa.1,KitchenQual_Gd.1,KitchenQual_TA.1,Functional_Maj2.1,Functional_Min1.1,Functional_Min2.1,Functional_Mod.1,Functional_Sev.1,Functional_Typ.1,GarageType_Attchd.1,Garage
Type_Basment.1,GarageType_BuiltIn.1,GarageType_CarPort.1,GarageType_Detchd.1,GarageFinish_RFn.1,GarageFinish_Unf.1,GarageQual_Fa.1,GarageQual_Gd.1,GarageQual_Po.1,GarageQual_TA.1
0,-0.867509,0.898971,0.080493,2.030692,-0.508216,1.002825,0.827234,0.564376,1.314169,-0.291359,-0.006947,1.243695,1.273195,-0.790231,-0.120143,0.282163,1.10532,-0.232833,0.774047,-0.744987,0.167762,-0.222835,0.280581,0.593242,0.965225,1.60173,1.026478,1.286049,-0.021601,-0.366942,-0.121247,-0.265156,-0.065435,-0.087919,0.266967,-0.608569,-0.213677,-0.114059,0.516538,-0.416976,0.065568,-0.178331,-0.083045,-1.324657,-0.195505,-0.153829,0.330472,-0.029273,-0.258199,-0.18586,-0.050746,-1.634918,-0.215859,-0.088121,-0.029273,-0.106092,-0.200178,-0.135309,2.969028,-0.188311,-0.269339,-0.230654,-0.159565,-0.110144,-0.193133,-0.42262,-0.077648,-0.234742,-0.170514,-0.246686,-0.292407,-0.128593,-0.228588,-0.20248,-0.256307,-0.138554,-0.16512,-0.083045,-0.242755,0.391166,-0.071858,-0.121531,-0.077648,-0.125109,-0.029273,-0.058621,-0.058621,0.083045,0.0,-0.029273,0.0,0.0,-0.041416,-0.153829,-0.193133,-0.178331,-0.2872,-0.092928,1.0,-0.083045,-0.083045,-0.676888,-0.167836,-0.211477,-1.878066,-0.077648,1.987247,-0.077648,-0.029273,0.131991,-0.029273,0.0,-0.029273,-0.077648,-0.065568,-0.065568,-0.029273,-0.041416,-0.18586,0.0,-0.211477,-0.411302,-0.029273,-0.415561,-0.28545,-0.041416,-0.131991,1.332046,-0.407024,-0.135309,-0.050746,-0.071858,-0.135309,0.0,-0.211477,-0.40129,-0.088121,-0.407024,-0.029273,-0.320823,-0.050746,-0.135309,1.354604,-0.39262,-0.170514,-0.101885,1.412399,-1.267731,-0.135309,-0.341518,-0.029273,0.377964,-0.860631,1.093384,-0.138554,-0.071858,-0.041416,-0.153829,1.170115,-0.886528,-0.222297,-0.029273,0.347737,-0.312641,-0.28545,-1.367399,-0.330472,1.544509,-0.228588,-0.327275,-0.643462,-0.16512,-0.088121,-0.180871,-0.193133,0.408452,0.150888,-0.106092,-0.071858,-0.041416,-0.058621,-0.188311,-0.454545,-0.029273,-0.626206,0.274784,-0.147893,-0.050746,0.0,0.319197,-0.17576,1.229564,-0.988085,-0.050746,-0.144841,-0.156721,-0.106092,-0.029273,0.272978,0.817662,-0.101885,-0.246686,-0.077648,-0.593171,-0.646124,-0.822042,-0.180871,-0.097506,-0.029273,0.344636,-0.15
6721,-0.077648,-0.065568,0.328876,-0.147893,0.30933,-0.029273,0.0,-0.077648,-0.058621,-0.058621,-0.305995,-0.041416,0.383859,-0.041416,-0.083045,-0.121531,0.46275,-0.30933,-0.153829,1.170115,-0.886528,-0.188311,-0.454545,-0.029273,-0.626206,0.274784,-0.147893,-0.050746,0.0,0.319197,-0.17576,1.229564,-0.988085,-0.050746,-0.144841,-0.156721,-0.106092,-0.029273,0.272978,0.817662,-0.101885,-0.246686,-0.077648,-0.593171,-0.646124,-0.822042,-0.180871,-0.097506,-0.029273,0.344636
1,-0.867509,-0.448159,-0.392026,-0.791257,-0.508216,-0.310254,-1.11275,-0.576525,-0.965702,-0.291359,0.726593,-0.377887,-0.687972,-0.790231,-0.120143,-1.171664,-0.818694,-0.232833,-1.037751,-0.744987,-1.034359,-0.222835,-0.926487,-0.946551,-0.739057,-1.011148,-0.772861,-0.768497,-0.731637,-0.366942,-0.121247,-0.265156,-0.065435,-0.087919,0.639955,0.89261,-0.213677,-0.114059,0.516538,-0.416976,0.065568,-0.178331,-0.083045,0.754912,-0.195505,-0.153829,0.330472,-0.029273,-0.258199,-0.18586,-0.050746,0.611652,-0.215859,-0.088121,-0.029273,-0.106092,-0.200178,-0.135309,-0.336811,-0.188311,-0.269339,-0.230654,-0.159565,-0.110144,-0.193133,2.366193,-0.077648,-0.234742,-0.170514,-0.246686,-0.292407,-0.128593,-0.228588,-0.20248,-0.256307,-0.138554,-0.16512,-0.083045,-0.242755,-2.556459,-0.071858,8.228358,-0.077648,-0.125109,-0.029273,-0.058621,-0.058621,0.083045,0.0,-0.029273,0.0,0.0,-0.041416,-0.153829,-0.193133,-0.178331,-0.2872,-0.092928,1.0,-0.083045,-0.083045,-0.676888,-0.167836,-0.211477,-1.878066,-0.077648,1.987247,-0.077648,-0.029273,0.131991,-0.029273,0.0,-0.029273,-0.077648,-0.065568,-0.065568,-0.029273,-0.041416,-0.18586,0.0,-0.211477,-0.411302,-0.029273,2.406387,-0.28545,-0.041416,-0.131991,-0.750725,-0.407024,-0.135309,-0.050746,-0.071858,-0.135309,0.0,-0.211477,-0.40129,-0.088121,2.456857,-0.029273,-0.320823,-0.050746,-0.135309,-0.738223,-0.39262,-0.170514,-0.101885,-0.708015,0.788811,-0.135309,-0.341518,-0.029273,0.377964,1.161938,-0.914592,-0.138554,-0.071858,-0.041416,-0.153829,-0.854617,1.127995,-0.222297,-0.029273,0.347737,-0.312641,-0.28545,0.731316,-0.330472,-0.647455,-0.228588,-0.327275,1.554093,-0.16512,-0.088121,-0.180871,-0.193133,0.408452,0.150888,-0.106092,-0.071858,-0.041416,-0.058621,-0.188311,2.2,-0.029273,-0.626206,-3.639217,-0.147893,-0.050746,0.0,0.319197,-0.17576,-0.813296,1.012059,-0.050746,-0.144841,-0.156721,-0.106092,-0.029273,0.272978,-1.222999,-0.101885,-0.246686,-0.077648,1.685854,-0.646124,1.216483,-0.180871,-0.097506,-0.029273,0.344
636,-0.156721,-0.077648,-0.065568,0.328876,-0.147893,0.30933,-0.029273,0.0,-0.077648,-0.058621,-0.058621,-0.305995,-0.041416,0.383859,-0.041416,-0.083045,-0.121531,0.46275,-0.30933,-0.153829,-0.854617,1.127995,-0.188311,2.2,-0.029273,-0.626206,-3.639217,-0.147893,-0.050746,0.0,0.319197,-0.17576,-0.813296,1.012059,-0.050746,-0.144841,-0.156721,-0.106092,-0.029273,0.272978,-1.222999,-0.101885,-0.246686,-0.077648,1.685854,-0.646124,1.216483,-0.180871,-0.097506,-0.029273,0.344636
2,-0.867509,-0.04402,-0.311714,-0.791257,1.307062,-0.540043,-1.452247,0.224759,0.976171,-0.291359,-0.839189,0.07403,-0.171075,-0.790231,-0.120143,-0.788485,1.10532,-0.232833,-1.037751,-0.744987,0.167762,-0.222835,-0.322953,-0.946551,-1.037307,-1.011148,-0.99207,-0.768497,-0.731637,-0.366942,-0.121247,-0.265156,-0.065435,-0.087919,0.266967,0.142021,-0.213677,-0.114059,0.516538,-0.416976,0.065568,-0.178331,-0.083045,-1.324657,-0.195505,-0.153829,0.330472,-0.029273,-0.258199,-0.18586,-0.050746,0.611652,-0.215859,-0.088121,-0.029273,-0.106092,-0.200178,-0.135309,-0.336811,-0.188311,-0.269339,-0.230654,-0.159565,-0.110144,-0.193133,2.366193,-0.077648,-0.234742,-0.170514,-0.246686,-0.292407,-0.128593,-0.228588,-0.20248,-0.256307,-0.138554,-0.16512,-0.083045,-0.242755,0.391166,-0.071858,-0.121531,-0.077648,-0.125109,-0.029273,-0.058621,-0.058621,0.083045,0.0,-0.029273,0.0,0.0,-0.041416,-0.153829,-0.193133,-0.178331,-0.2872,-0.092928,1.0,-0.083045,-0.083045,-0.676888,-0.167836,-0.211477,-1.878066,-0.077648,1.987247,-0.077648,-0.029273,0.131991,-0.029273,0.0,-0.029273,-0.077648,-0.065568,-0.065568,-0.029273,-0.041416,-0.18586,0.0,-0.211477,-0.411302,-0.029273,-0.415561,-0.28545,-0.041416,-0.131991,-0.750725,2.456857,-0.135309,-0.050746,-0.071858,-0.135309,0.0,-0.211477,-0.40129,-0.088121,-0.407024,-0.029273,-0.320823,-0.050746,-0.135309,-0.738223,2.546994,-0.170514,-0.101885,-0.708015,0.788811,-0.135309,-0.341518,-0.029273,0.377964,1.161938,-0.914592,-0.138554,-0.071858,-0.041416,-0.153829,-0.854617,1.127995,-0.222297,-0.029273,0.347737,-0.312641,-0.28545,0.731316,-0.330472,-0.647455,-0.228588,-0.327275,-0.643462,-0.16512,-0.088121,-0.180871,-0.193133,0.408452,0.150888,-0.106092,-0.071858,-0.041416,-0.058621,-0.188311,-0.454545,-0.029273,1.59692,0.274784,-0.147893,-0.050746,0.0,0.319197,-0.17576,-0.813296,1.012059,-0.050746,-0.144841,-0.156721,-0.106092,-0.029273,0.272978,0.817662,-0.101885,-0.246686,-0.077648,-0.593171,-0.646124,1.216483,-0.180871,-0.097506,-0.029273,0.3446
36,-0.156721,-0.077648,-0.065568,0.328876,-0.147893,0.30933,-0.029273,0.0,-0.077648,-0.058621,-0.058621,-0.305995,-0.041416,0.383859,-0.041416,-0.083045,-0.121531,0.46275,-0.30933,-0.153829,-0.854617,1.127995,-0.188311,-0.454545,-0.029273,1.59692,0.274784,-0.147893,-0.050746,0.0,0.319197,-0.17576,-0.813296,1.012059,-0.050746,-0.144841,-0.156721,-0.106092,-0.029273,0.272978,0.817662,-0.101885,-0.246686,-0.077648,-0.593171,-0.646124,1.216483,-0.180871,-0.097506,-0.029273,0.344636
3,3.109234,0.225406,0.118819,-0.791257,-1.415855,-0.211773,-0.967251,-0.576525,0.844847,-0.291359,-0.834703,-0.056672,-0.32057,-0.790231,-0.120143,-0.899307,1.10532,-0.232833,-1.037751,1.250758,0.167762,-0.222835,-0.322953,-0.946551,-0.611236,0.295291,0.122241,-0.768497,-0.731637,-0.366942,-0.121247,-0.265156,-0.065435,-0.087919,-0.851996,1.6432,-0.213677,-0.114059,0.516538,-0.416976,0.065568,-0.178331,-0.083045,0.754912,-0.195505,-0.153829,0.330472,-0.029273,-0.258199,-0.18586,-0.050746,0.611652,-0.215859,-0.088121,-0.029273,-0.106092,-0.200178,-0.135309,-0.336811,-0.188311,-0.269339,-0.230654,-0.159565,-0.110144,-0.193133,-0.42262,-0.077648,-0.234742,-0.170514,-0.246686,-0.292407,-0.128593,4.374692,-0.20248,-0.256307,-0.138554,-0.16512,-0.083045,-0.242755,0.391166,-0.071858,-0.121531,-0.077648,-0.125109,-0.029273,-0.058621,-0.058621,0.083045,0.0,-0.029273,0.0,0.0,-0.041416,6.500712,-0.193133,-0.178331,-0.2872,-0.092928,1.0,-0.083045,-0.083045,-0.676888,-0.167836,-0.211477,-1.878066,-0.077648,1.987247,-0.077648,-0.029273,0.131991,-0.029273,0.0,-0.029273,-0.077648,-0.065568,-0.065568,-0.029273,-0.041416,-0.18586,0.0,-0.211477,-0.411302,-0.029273,-0.415561,3.503245,-0.041416,-0.131991,-0.750725,-0.407024,-0.135309,-0.050746,-0.071858,-0.135309,0.0,-0.211477,2.491962,-0.088121,-0.407024,-0.029273,-0.320823,-0.050746,-0.135309,-0.738223,-0.39262,-0.170514,-0.101885,-0.708015,0.788811,-0.135309,-0.341518,-0.029273,0.377964,-0.860631,1.093384,-0.138554,-0.071858,-0.041416,-0.153829,-0.854617,1.127995,-0.222297,-0.029273,0.347737,-0.312641,3.503245,-1.367399,3.025975,-0.647455,-0.228588,-0.327275,-0.643462,-0.16512,-0.088121,-0.180871,-0.193133,0.408452,0.150888,-0.106092,-0.071858,-0.041416,-0.058621,-0.188311,-0.454545,-0.029273,-0.626206,0.274784,-0.147893,-0.050746,0.0,0.319197,-0.17576,-0.813296,1.012059,-0.050746,-0.144841,-0.156721,-0.106092,-0.029273,0.272978,0.817662,-0.101885,-0.246686,-0.077648,-0.593171,-0.646124,1.216483,-0.180871,-0.097506,-0.029273,0.344636
,-0.156721,-0.077648,-0.065568,0.328876,-0.147893,0.30933,-0.029273,0.0,-0.077648,-0.058621,-0.058621,-0.305995,-0.041416,0.383859,-0.041416,-0.083045,-0.121531,0.46275,-0.30933,-0.153829,-0.854617,1.127995,-0.188311,-0.454545,-0.029273,-0.626206,0.274784,-0.147893,-0.050746,0.0,0.319197,-0.17576,-0.813296,1.012059,-0.050746,-0.144841,-0.156721,-0.106092,-0.029273,0.272978,0.817662,-0.101885,-0.246686,-0.077648,-0.593171,-0.646124,1.216483,-0.180871,-0.097506,-0.029273,0.344636
4,2.407455,-2.064715,-0.831201,-0.08577,-0.508216,0.214977,-0.336757,-0.576525,-0.296165,-0.291359,-0.058541,-0.464282,-0.786791,0.581646,-0.120143,-0.116043,-0.818694,-0.232833,0.774047,1.250758,0.167762,-0.222835,0.280581,0.593242,-0.057344,0.295291,-0.170037,-0.560644,-0.731637,-0.366942,-0.121247,-0.265156,-0.065435,-0.087919,0.266967,1.6432,-0.213677,-0.114059,0.516538,-0.416976,0.065568,-0.178331,-0.083045,0.754912,-0.195505,-0.153829,0.330472,-0.029273,-0.258199,-0.18586,-0.050746,0.611652,-0.215859,-0.088121,-0.029273,-0.106092,-0.200178,-0.135309,-0.336811,-0.188311,-0.269339,-0.230654,-0.159565,-0.110144,-0.193133,-0.42262,12.878554,-0.234742,-0.170514,-0.246686,-0.292407,-0.128593,-0.228588,-0.20248,-0.256307,-0.138554,-0.16512,-0.083045,-0.242755,0.391166,-0.071858,-0.121531,-0.077648,-0.125109,-0.029273,-0.058621,-0.058621,0.083045,0.0,-0.029273,0.0,0.0,-0.041416,-0.153829,-0.193133,5.607535,-0.2872,-0.092928,-1.0,-0.083045,-0.083045,1.477349,-0.167836,-0.211477,0.532463,-0.077648,-0.503209,-0.077648,-0.029273,0.131991,-0.029273,0.0,-0.029273,-0.077648,-0.065568,-0.065568,-0.029273,-0.041416,-0.18586,0.0,-0.211477,-0.411302,-0.029273,-0.415561,3.503245,-0.041416,-0.131991,-0.750725,-0.407024,-0.135309,-0.050746,13.916417,-0.135309,0.0,-0.211477,-0.40129,-0.088121,-0.407024,-0.029273,-0.320823,-0.050746,-0.135309,-0.738223,-0.39262,-0.170514,-0.101885,-0.708015,0.788811,-0.135309,-0.341518,-0.029273,0.377964,1.161938,-0.914592,-0.138554,-0.071858,-0.041416,-0.153829,1.170115,-0.886528,-0.222297,-0.029273,0.347737,-0.312641,-0.28545,0.731316,-0.330472,-0.647455,-0.228588,-0.327275,-0.643462,-0.16512,-0.088121,-0.180871,-0.193133,0.408452,0.150888,-0.106092,-0.071858,-0.041416,-0.058621,5.310367,-0.454545,-0.029273,-0.626206,0.274784,-0.147893,-0.050746,0.0,0.319197,-0.17576,-0.813296,1.012059,-0.050746,-0.144841,-0.156721,-0.106092,-0.029273,0.272978,0.817662,-0.101885,-0.246686,-0.077648,-0.593171,-0.646124,1.216483,-0.180871,-0.097506,-0.029273,0.344636
,-0.156721,-0.077648,-0.065568,0.328876,-0.147893,0.30933,-0.029273,0.0,-0.077648,-0.058621,-0.058621,-0.305995,-0.041416,0.383859,-0.041416,-0.083045,-0.121531,0.46275,-0.30933,-0.153829,1.170115,-0.886528,5.310367,-0.454545,-0.029273,-0.626206,0.274784,-0.147893,-0.050746,0.0,0.319197,-0.17576,-0.813296,1.012059,-0.050746,-0.144841,-0.156721,-0.106092,-0.029273,0.272978,0.817662,-0.101885,-0.246686,-0.077648,-0.593171,-0.646124,1.216483,-0.180871,-0.097506,-0.029273,0.344636


In [None]:
# This code displays the summary statistics for the data frame
X_train.describe().round(3)

Unnamed: 0,MSSubClass,LotFrontage,LotArea,OverallQual,OverallCond,YearBuilt,YearRemodAdd,MasVnrArea,BsmtFinSF1,BsmtFinSF2,BsmtUnfSF,TotalBsmtSF,1stFlrSF,2ndFlrSF,LowQualFinSF,GrLivArea,BsmtFullBath,BsmtHalfBath,FullBath,HalfBath,BedroomAbvGr,KitchenAbvGr,TotRmsAbvGrd,Fireplaces,GarageYrBlt,GarageCars,GarageArea,WoodDeckSF,OpenPorchSF,EnclosedPorch,3SsnPorch,ScreenPorch,PoolArea,MiscVal,MoSold,YrSold,MSZoning_FV,MSZoning_RH,MSZoning_RL,MSZoning_RM,Street_Pave,LotShape_IR2,LotShape_IR3,LotShape_Reg,LandContour_HLS,LandContour_Low,LandContour_Lvl,Utilities_NoSeWa,LotConfig_CulDSac,LotConfig_FR2,LotConfig_FR3,LotConfig_Inside,LandSlope_Mod,LandSlope_Sev,Neighborhood_Blueste,Neighborhood_BrDale,Neighborhood_BrkSide,Neighborhood_ClearCr,Neighborhood_CollgCr,Neighborhood_Crawfor,Neighborhood_Edwards,Neighborhood_Gilbert,Neighborhood_IDOTRR,Neighborhood_MeadowV,Neighborhood_Mitchel,Neighborhood_NAmes,Neighborhood_NPkVill,Neighborhood_NWAmes,Neighborhood_NoRidge,Neighborhood_NridgHt,Neighborhood_OldTown,Neighborhood_SWISU,Neighborhood_Sawyer,Neighborhood_SawyerW,Neighborhood_Somerst,Neighborhood_StoneBr,Neighborhood_Timber,Neighborhood_Veenker,Condition1_Feedr,Condition1_Norm,Condition1_PosA,Condition1_PosN,Condition1_RRAe,Condition1_RRAn,Condition1_RRNe,Condition1_RRNn,Condition2_Feedr,Condition2_Norm,Condition2_PosA,Condition2_PosN,Condition2_RRAe,Condition2_RRAn,Condition2_RRNn,BldgType_2fmCon,BldgType_Duplex,BldgType_Twnhs,BldgType_TwnhsE,HouseStyle_1.5Unf,HouseStyle_1Story,HouseStyle_2.5Fin,HouseStyle_2.5Unf,HouseStyle_2Story,HouseStyle_SFoyer,HouseStyle_SLvl,RoofStyle_Gable,RoofStyle_Gambrel,RoofStyle_Hip,RoofStyle_Mansard,RoofStyle_Shed,RoofMatl_CompShg,RoofMatl_Membran,RoofMatl_Metal,RoofMatl_Roll,RoofMatl_Tar&Grv,RoofMatl_WdShake,RoofMatl_WdShngl,Exterior1st_AsphShn,Exterior1st_BrkComm,Exterior1st_BrkFace,Exterior1st_CBlock,Exterior1st_CemntBd,Exterior1st_HdBoard,Exterior1st_ImStucc,Exterior1st_MetalSd,Exterior1st_Plywood,Exterior1st_Stone,Exterior1st_Stucco,Exterio
r1st_VinylSd,Exterior1st_Wd Sdng,Exterior1st_WdShing,Exterior2nd_AsphShn,Exterior2nd_Brk Cmn,Exterior2nd_BrkFace,Exterior2nd_CBlock,Exterior2nd_CmentBd,Exterior2nd_HdBoard,Exterior2nd_ImStucc,Exterior2nd_MetalSd,Exterior2nd_Other,Exterior2nd_Plywood,Exterior2nd_Stone,Exterior2nd_Stucco,Exterior2nd_VinylSd,Exterior2nd_Wd Sdng,Exterior2nd_Wd Shng,ExterQual_Fa,ExterQual_Gd,ExterQual_TA,ExterCond_Fa,ExterCond_Gd,ExterCond_Po,ExterCond_TA,Foundation_CBlock,Foundation_PConc,Foundation_Slab,Foundation_Stone,Foundation_Wood,BsmtQual_Fa,BsmtQual_Gd,BsmtQual_TA,BsmtCond_Gd,BsmtCond_Po,BsmtCond_TA,BsmtExposure_Gd,BsmtExposure_Mn,BsmtExposure_No,BsmtFinType1_BLQ,BsmtFinType1_GLQ,BsmtFinType1_LwQ,BsmtFinType1_Rec,BsmtFinType1_Unf,BsmtFinType2_BLQ,BsmtFinType2_GLQ,BsmtFinType2_LwQ,BsmtFinType2_Rec,BsmtFinType2_Unf,Heating_GasA,Heating_GasW,Heating_Grav,Heating_OthW,Heating_Wall,HeatingQC_Fa,HeatingQC_Gd,HeatingQC_Po,HeatingQC_TA,CentralAir_Y,Electrical_FuseF,Electrical_FuseP,Electrical_Mix,Electrical_SBrkr,KitchenQual_Fa,KitchenQual_Gd,KitchenQual_TA,Functional_Maj2,Functional_Min1,Functional_Min2,Functional_Mod,Functional_Sev,Functional_Typ,GarageType_Attchd,GarageType_Basment,GarageType_BuiltIn,GarageType_CarPort,GarageType_Detchd,GarageFinish_RFn,GarageFinish_Unf,GarageQual_Fa,GarageQual_Gd,GarageQual_Po,GarageQual_TA,GarageCond_Fa,GarageCond_Gd,GarageCond_Po,GarageCond_TA,PavedDrive_P,PavedDrive_Y,SaleType_CWD,SaleType_Con,SaleType_ConLD,SaleType_ConLI,SaleType_ConLw,SaleType_New,SaleType_Oth,SaleType_WD,SaleCondition_AdjLand,SaleCondition_Alloca,SaleCondition_Family,SaleCondition_Normal,SaleCondition_Partial,BsmtQual_Fa.1,BsmtQual_Gd.1,BsmtQual_TA.1,HeatingQC_Fa.1,HeatingQC_Gd.1,HeatingQC_Po.1,HeatingQC_TA.1,CentralAir_Y.1,Electrical_FuseF.1,Electrical_FuseP.1,Electrical_Mix.1,Electrical_SBrkr.1,KitchenQual_Fa.1,KitchenQual_Gd.1,KitchenQual_TA.1,Functional_Maj2.1,Functional_Min1.1,Functional_Min2.1,Functional_Mod.1,Functional_Sev.1,Functional_Typ.1,GarageType_Attchd.1,Garage
Type_Basment.1,GarageType_BuiltIn.1,GarageType_CarPort.1,GarageType_Detchd.1,GarageFinish_RFn.1,GarageFinish_Unf.1,GarageQual_Fa.1,GarageQual_Gd.1,GarageQual_Po.1,GarageQual_TA.1
count,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0,1168.0
mean,-0.0,-0.0,0.0,-0.0,-0.0,-0.0,0.0,0.0,-0.0,0.0,0.0,0.0,0.0,0.0,0.0,-0.0,-0.0,0.0,0.0,-0.0,0.0,0.0,0.0,0.0,0.0,0.0,-0.0,0.0,-0.0,-0.0,0.0,-0.0,-0.0,-0.0,0.0,-0.0,0.0,-0.0,0.0,-0.0,-0.0,-0.0,0.0,-0.0,-0.0,0.0,-0.0,0.0,0.0,0.0,0.0,0.0,-0.0,-0.0,0.0,0.0,-0.0,-0.0,-0.0,0.0,-0.0,-0.0,-0.0,-0.0,0.0,-0.0,0.0,-0.0,0.0,0.0,0.0,-0.0,0.0,-0.0,0.0,0.0,0.0,-0.0,-0.0,0.0,-0.0,0.0,0.0,-0.0,-0.0,-0.0,-0.0,0.0,0.0,-0.0,0.0,0.0,0.0,0.0,0.0,-0.0,-0.0,0.0,0.0,-0.0,0.0,0.0,-0.0,0.0,0.0,0.0,-0.0,0.0,0.0,-0.0,0.0,0.0,0.0,0.0,-0.0,-0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-0.0,0.0,0.0,-0.0,0.0,-0.0,-0.0,0.0,-0.0,-0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-0.0,0.0,0.0,0.0,-0.0,-0.0,0.0,-0.0,0.0,0.0,0.0,-0.0,0.0,0.0,-0.0,0.0,0.0,0.0,0.0,-0.0,0.0,0.0,0.0,0.0,-0.0,0.0,-0.0,-0.0,-0.0,-0.0,0.0,-0.0,-0.0,0.0,0.0,0.0,0.0,-0.0,0.0,-0.0,0.0,-0.0,0.0,0.0,-0.0,0.0,0.0,0.0,0.0,-0.0,0.0,0.0,0.0,0.0,-0.0,0.0,0.0,0.0,0.0,-0.0,0.0,0.0,0.0,-0.0,0.0,-0.0,-0.0,-0.0,-0.0,-0.0,0.0,0.0,0.0,0.0,-0.0,0.0,0.0,0.0,-0.0,-0.0,0.0,0.0,0.0,0.0,0.0,-0.0,-0.0,0.0,0.0,0.0,0.0,0.0,-0.0,0.0,0.0,-0.0,0.0,0.0,0.0,0.0,-0.0,0.0,0.0,0.0,0.0,-0.0,0.0,0.0,0.0,0.0,-0.0,0.0,0.0,0.0,-0.0,0.0,-0.0,-0.0,-0.0,-0.0
std,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,1.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
min,-0.868,-2.199,-0.913,-3.613,-4.139,-3.265,-1.695,-0.577,-0.966,-0.291,-1.279,-2.358,-2.107,-0.79,-0.12,-2.224,-0.819,-0.233,-2.85,-0.745,-3.439,-0.223,-2.737,-0.947,-2.955,-2.318,-2.179,-0.768,-0.732,-0.367,-0.121,-0.265,-0.065,-0.088,-1.971,-1.359,-0.214,-0.114,-1.936,-0.417,-15.251,-0.178,-0.083,-1.325,-0.196,-0.154,-3.026,-0.029,-0.258,-0.186,-0.051,-1.635,-0.216,-0.088,-0.029,-0.106,-0.2,-0.135,-0.337,-0.188,-0.269,-0.231,-0.16,-0.11,-0.193,-0.423,-0.078,-0.235,-0.171,-0.247,-0.292,-0.129,-0.229,-0.202,-0.256,-0.139,-0.165,-0.083,-0.243,-2.556,-0.072,-0.122,-0.078,-0.125,-0.029,-0.059,-0.059,-12.042,0.0,-0.029,0.0,0.0,-0.041,-0.154,-0.193,-0.178,-0.287,-0.093,-1.0,-0.083,-0.083,-0.677,-0.168,-0.211,-1.878,-0.078,-0.503,-0.078,-0.029,-7.576,-0.029,0.0,-0.029,-0.078,-0.066,-0.066,-0.029,-0.041,-0.186,0.0,-0.211,-0.411,-0.029,-0.416,-0.285,-0.041,-0.132,-0.751,-0.407,-0.135,-0.051,-0.072,-0.135,0.0,-0.211,-0.401,-0.088,-0.407,-0.029,-0.321,-0.051,-0.135,-0.738,-0.393,-0.171,-0.102,-0.708,-1.268,-0.135,-0.342,-0.029,-2.646,-0.861,-0.915,-0.139,-0.072,-0.041,-0.154,-0.855,-0.887,-0.222,-0.029,-2.876,-0.313,-0.285,-1.367,-0.33,-0.647,-0.229,-0.327,-0.643,-0.165,-0.088,-0.181,-0.193,-2.448,-6.627,-0.106,-0.072,-0.041,-0.059,-0.188,-0.455,-0.029,-0.626,-3.639,-0.148,-0.051,0.0,-3.133,-0.176,-0.813,-0.988,-0.051,-0.145,-0.157,-0.106,-0.029,-3.663,-1.223,-0.102,-0.247,-0.078,-0.593,-0.646,-0.822,-0.181,-0.098,-0.029,-2.902,-0.157,-0.078,-0.066,-3.041,-0.148,-3.233,-0.029,0.0,-0.078,-0.059,-0.059,-0.306,-0.041,-2.605,-0.041,-0.083,-0.122,-2.161,-0.309,-0.154,-0.855,-0.887,-0.188,-0.455,-0.029,-0.626,-3.639,-0.148,-0.051,0.0,-3.133,-0.176,-0.813,-0.988,-0.051,-0.145,-0.157,-0.106,-0.029,-3.663,-1.223,-0.102,-0.247,-0.078,-0.593,-0.646,-0.822,-0.181,-0.098,-0.029,-2.902
25%,-0.868,-0.448,-0.294,-0.791,-0.508,-0.573,-0.87,-0.577,-0.966,-0.291,-0.754,-0.595,-0.718,-0.79,-0.12,-0.744,-0.819,-0.233,-1.038,-0.745,-1.034,-0.223,-0.926,-0.947,-0.654,-1.011,-0.645,-0.768,-0.732,-0.367,-0.121,-0.265,-0.065,-0.088,-0.479,-0.609,-0.214,-0.114,0.517,-0.417,0.066,-0.178,-0.083,-1.325,-0.196,-0.154,0.33,-0.029,-0.258,-0.186,-0.051,-1.635,-0.216,-0.088,-0.029,-0.106,-0.2,-0.135,-0.337,-0.188,-0.269,-0.231,-0.16,-0.11,-0.193,-0.423,-0.078,-0.235,-0.171,-0.247,-0.292,-0.129,-0.229,-0.202,-0.256,-0.139,-0.165,-0.083,-0.243,0.391,-0.072,-0.122,-0.078,-0.125,-0.029,-0.059,-0.059,0.083,0.0,-0.029,0.0,0.0,-0.041,-0.154,-0.193,-0.178,-0.287,-0.093,-1.0,-0.083,-0.083,-0.677,-0.168,-0.211,0.532,-0.078,-0.503,-0.078,-0.029,0.132,-0.029,0.0,-0.029,-0.078,-0.066,-0.066,-0.029,-0.041,-0.186,0.0,-0.211,-0.411,-0.029,-0.416,-0.285,-0.041,-0.132,-0.751,-0.407,-0.135,-0.051,-0.072,-0.135,0.0,-0.211,-0.401,-0.088,-0.407,-0.029,-0.321,-0.051,-0.135,-0.738,-0.393,-0.171,-0.102,-0.708,-1.268,-0.135,-0.342,-0.029,0.378,-0.861,-0.915,-0.139,-0.072,-0.041,-0.154,-0.855,-0.887,-0.222,-0.029,0.348,-0.313,-0.285,-1.367,-0.33,-0.647,-0.229,-0.327,-0.643,-0.165,-0.088,-0.181,-0.193,0.408,0.151,-0.106,-0.072,-0.041,-0.059,-0.188,-0.455,-0.029,-0.626,0.275,-0.148,-0.051,0.0,0.319,-0.176,-0.813,-0.988,-0.051,-0.145,-0.157,-0.106,-0.029,0.273,-1.223,-0.102,-0.247,-0.078,-0.593,-0.646,-0.822,-0.181,-0.098,-0.029,0.345,-0.157,-0.078,-0.066,0.329,-0.148,0.309,-0.029,0.0,-0.078,-0.059,-0.059,-0.306,-0.041,0.384,-0.041,-0.083,-0.122,0.463,-0.309,-0.154,-0.855,-0.887,-0.188,-0.455,-0.029,-0.626,0.275,-0.148,-0.051,0.0,0.319,-0.176,-0.813,-0.988,-0.051,-0.145,-0.157,-0.106,-0.029,0.273,-1.223,-0.102,-0.247,-0.078,-0.593,-0.646,-0.822,-0.181,-0.098,-0.029,0.345
50%,-0.166,-0.044,-0.094,-0.086,-0.508,0.051,0.439,-0.577,-0.139,-0.291,-0.208,-0.156,-0.201,-0.79,-0.12,-0.087,-0.819,-0.233,0.774,-0.745,0.168,-0.223,-0.323,0.593,0.028,0.295,0.013,-0.768,-0.337,-0.367,-0.121,-0.265,-0.065,-0.088,-0.106,0.142,-0.214,-0.114,0.517,-0.417,0.066,-0.178,-0.083,0.755,-0.196,-0.154,0.33,-0.029,-0.258,-0.186,-0.051,0.612,-0.216,-0.088,-0.029,-0.106,-0.2,-0.135,-0.337,-0.188,-0.269,-0.231,-0.16,-0.11,-0.193,-0.423,-0.078,-0.235,-0.171,-0.247,-0.292,-0.129,-0.229,-0.202,-0.256,-0.139,-0.165,-0.083,-0.243,0.391,-0.072,-0.122,-0.078,-0.125,-0.029,-0.059,-0.059,0.083,0.0,-0.029,0.0,0.0,-0.041,-0.154,-0.193,-0.178,-0.287,-0.093,0.0,-0.083,-0.083,-0.677,-0.168,-0.211,0.532,-0.078,-0.503,-0.078,-0.029,0.132,-0.029,0.0,-0.029,-0.078,-0.066,-0.066,-0.029,-0.041,-0.186,0.0,-0.211,-0.411,-0.029,-0.416,-0.285,-0.041,-0.132,-0.751,-0.407,-0.135,-0.051,-0.072,-0.135,0.0,-0.211,-0.401,-0.088,-0.407,-0.029,-0.321,-0.051,-0.135,-0.738,-0.393,-0.171,-0.102,-0.708,0.789,-0.135,-0.342,-0.029,0.378,-0.861,-0.915,-0.139,-0.072,-0.041,-0.154,-0.855,-0.887,-0.222,-0.029,0.348,-0.313,-0.285,0.731,-0.33,-0.647,-0.229,-0.327,-0.643,-0.165,-0.088,-0.181,-0.193,0.408,0.151,-0.106,-0.072,-0.041,-0.059,-0.188,-0.455,-0.029,-0.626,0.275,-0.148,-0.051,0.0,0.319,-0.176,-0.813,-0.988,-0.051,-0.145,-0.157,-0.106,-0.029,0.273,0.818,-0.102,-0.247,-0.078,-0.593,-0.646,-0.822,-0.181,-0.098,-0.029,0.345,-0.157,-0.078,-0.066,0.329,-0.148,0.309,-0.029,0.0,-0.078,-0.059,-0.059,-0.306,-0.041,0.384,-0.041,-0.083,-0.122,0.463,-0.309,-0.154,-0.855,-0.887,-0.188,-0.455,-0.029,-0.626,0.275,-0.148,-0.051,0.0,0.319,-0.176,-0.813,-0.988,-0.051,-0.145,-0.157,-0.106,-0.029,0.273,0.818,-0.102,-0.247,-0.078,-0.593,-0.646,-0.822,-0.181,-0.098,-0.029,0.345
75%,0.302,0.405,0.117,0.62,0.399,0.97,0.924,0.357,0.593,-0.291,0.529,0.528,0.601,0.872,-0.12,0.509,1.105,-0.233,0.774,1.251,0.168,-0.223,0.281,0.593,0.923,0.295,0.506,0.575,0.357,-0.367,-0.121,-0.265,-0.065,-0.088,0.64,0.893,-0.214,-0.114,0.517,-0.417,0.066,-0.178,-0.083,0.755,-0.196,-0.154,0.33,-0.029,-0.258,-0.186,-0.051,0.612,-0.216,-0.088,-0.029,-0.106,-0.2,-0.135,-0.337,-0.188,-0.269,-0.231,-0.16,-0.11,-0.193,-0.423,-0.078,-0.235,-0.171,-0.247,-0.292,-0.129,-0.229,-0.202,-0.256,-0.139,-0.165,-0.083,-0.243,0.391,-0.072,-0.122,-0.078,-0.125,-0.029,-0.059,-0.059,0.083,0.0,-0.029,0.0,0.0,-0.041,-0.154,-0.193,-0.178,-0.287,-0.093,1.0,-0.083,-0.083,1.477,-0.168,-0.211,0.532,-0.078,-0.503,-0.078,-0.029,0.132,-0.029,0.0,-0.029,-0.078,-0.066,-0.066,-0.029,-0.041,-0.186,0.0,-0.211,-0.411,-0.029,-0.416,-0.285,-0.041,-0.132,1.332,-0.407,-0.135,-0.051,-0.072,-0.135,0.0,-0.211,-0.401,-0.088,-0.407,-0.029,-0.321,-0.051,-0.135,1.355,-0.393,-0.171,-0.102,1.412,0.789,-0.135,-0.342,-0.029,0.378,1.162,1.093,-0.139,-0.072,-0.041,-0.154,1.17,1.128,-0.222,-0.029,0.348,-0.313,-0.285,0.731,-0.33,1.545,-0.229,-0.327,1.554,-0.165,-0.088,-0.181,-0.193,0.408,0.151,-0.106,-0.072,-0.041,-0.059,-0.188,-0.455,-0.029,1.597,0.275,-0.148,-0.051,0.0,0.319,-0.176,1.23,1.012,-0.051,-0.145,-0.157,-0.106,-0.029,0.273,0.818,-0.102,-0.247,-0.078,1.686,1.548,1.216,-0.181,-0.098,-0.029,0.345,-0.157,-0.078,-0.066,0.329,-0.148,0.309,-0.029,0.0,-0.078,-0.059,-0.059,-0.306,-0.041,0.384,-0.041,-0.083,-0.122,0.463,-0.309,-0.154,1.17,1.128,-0.188,-0.455,-0.029,1.597,0.275,-0.148,-0.051,0.0,0.319,-0.176,1.23,1.012,-0.051,-0.145,-0.157,-0.106,-0.029,0.273,0.818,-0.102,-0.247,-0.078,1.686,1.548,1.216,-0.181,-0.098,-0.029,0.345
max,3.109,10.913,20.819,2.736,3.122,1.265,1.215,7.914,11.185,9.062,3.961,11.177,8.935,3.923,11.195,7.747,4.953,8.54,2.586,3.247,6.178,8.6,4.505,3.673,1.306,2.908,4.296,6.083,7.521,8.531,15.806,7.688,16.597,31.187,2.132,1.643,4.68,8.767,0.517,2.398,0.066,5.608,12.042,0.755,5.115,6.501,0.33,34.161,3.873,5.38,19.706,0.612,4.633,11.348,34.161,9.426,4.996,7.39,2.969,5.31,3.713,4.336,6.267,9.079,5.178,2.366,12.879,4.26,5.865,4.054,3.42,7.776,4.375,4.939,3.902,7.217,6.056,12.042,4.119,0.391,13.916,8.228,12.879,7.993,34.161,17.059,17.059,0.083,0.0,34.161,0.0,0.0,24.145,6.501,5.178,5.608,3.482,10.761,1.0,12.042,12.042,1.477,5.958,4.729,0.532,12.879,1.987,12.879,34.161,0.132,34.161,0.0,34.161,12.879,15.251,15.251,34.161,24.145,5.38,0.0,4.729,2.431,34.161,2.406,3.503,24.145,7.576,1.332,2.457,7.39,19.706,13.916,7.39,0.0,4.729,2.492,11.348,2.457,34.161,3.117,19.706,7.39,1.355,2.547,5.865,9.815,1.412,0.789,7.39,2.928,34.161,0.378,1.162,1.093,7.217,13.916,24.145,6.501,1.17,1.128,4.498,34.161,0.348,3.199,3.503,0.731,3.026,1.545,4.375,3.056,1.554,6.056,11.348,5.529,5.178,0.408,0.151,9.426,13.916,24.145,17.059,5.31,2.2,34.161,1.597,0.275,6.762,19.706,0.0,0.319,5.69,1.23,1.012,19.706,6.904,6.381,9.426,34.161,0.273,0.818,9.815,4.054,12.879,1.686,1.548,1.216,5.529,10.256,34.161,0.345,6.381,12.879,15.251,0.329,6.762,0.309,34.161,0.0,12.879,17.059,17.059,3.268,24.145,0.384,24.145,12.042,8.228,0.463,3.233,6.501,1.17,1.128,5.31,2.2,34.161,1.597,0.275,6.762,19.706,0.0,0.319,5.69,1.23,1.012,19.706,6.904,6.381,9.426,34.161,0.273,0.818,9.815,4.054,12.879,1.686,1.548,1.216,5.529,10.256,34.161,0.345


In [None]:
# This code creates the PCA
pca1 = PCA()

In [None]:
# This code fits the PCA to the X_train data and transforms it
X_pca1 = pca1.fit_transform(X_train)

In [None]:
# This code displays the explained variance ratio for each principal component
pca1.explained_variance_ratio_

In [None]:
# This code recreates the PCA with a 90 percent explained-variance threshold
pca2 = PCA(0.90)

In [None]:
# This code fits the 90 percent PCA to the X_train data and transforms it
X_pca2 = pca2.fit_transform(X_train)


9. How many features are in the PCA-transformed matrix?

In [None]:
# This code displays the shape (rows, components) of the PCA-transformed matrix
X_pca2.shape

(1168, 122)
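
The second value in the shape is the number of components PCA kept to reach the 90 percent threshold. As a minimal sketch of how that count comes about, the cell below runs the same `PCA(0.90)` call on a small synthetic matrix; `X_demo` and `pca_demo` are illustrative names and the data is random, not the housing data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a standardized training matrix (NOT the housing data)
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 5))
mixing = rng.normal(size=(5, 30))
X_demo = latent @ mixing + 0.1 * rng.normal(size=(200, 30))
X_demo = StandardScaler().fit_transform(X_demo)

# A float between 0 and 1 asks PCA to keep just enough components
# to explain at least that fraction of the total variance
pca_demo = PCA(n_components=0.90)
X_demo_pca = pca_demo.fit_transform(X_demo)

# Cumulative explained variance of the components that were kept
cumulative = np.cumsum(pca_demo.explained_variance_ratio_)

print("components kept:", pca_demo.n_components_)
print("columns in transformed matrix:", X_demo_pca.shape[1])
print("variance explained:", round(float(cumulative[-1]), 3))
```

The fitted attribute `n_components_` always matches the column count of the transformed matrix, so either one answers the question.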

10. Transform but DO NOT fit the test features with the same PCA.

11. Repeat step 7 with your PCA transformed data.
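
A minimal sketch of steps 10 and 11 follows, assuming step 7 means fitting a linear regression and reporting R2 and RMSE on the test set (as the imports at the top of the notebook suggest). The data is synthetic and every variable name here is illustrative; in the notebook the same pattern would apply to the scaled housing features.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic regression data standing in for the housing features and SalePrice
rng = np.random.default_rng(1)
X_all = rng.normal(size=(300, 20))
y_all = X_all[:, :3].sum(axis=1) + 0.1 * rng.normal(size=300)
Xtr, Xte, ytr, yte = train_test_split(X_all, y_all, test_size=0.2, random_state=0)

# Fit the scaler and the PCA on the training features only
scaler = StandardScaler().fit(Xtr)
pca = PCA(n_components=0.90).fit(scaler.transform(Xtr))

# Step 10: the test features are transformed with the already fitted objects
Xtr_pca = pca.transform(scaler.transform(Xtr))
Xte_pca = pca.transform(scaler.transform(Xte))

# Step 11: linear regression on the PCA features, scored on the test set
model = LinearRegression().fit(Xtr_pca, ytr)
pred = model.predict(Xte_pca)
r2 = r2_score(yte, pred)
rmse = np.sqrt(mean_squared_error(yte, pred))

print("test R2:", round(r2, 3))
print("test RMSE:", round(float(rmse), 3))
```

Calling `transform` rather than `fit_transform` on the test set keeps the components learned from the training data, so no information from the test set leaks into the model.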

12. Take your original training features (from step 6) and apply a min-max scaler to them.

13. Find the min-max scaled features in your training set that have a variance above 0.1.

14. Transform but DO NOT fit the test features with the same steps applied in steps 12 and 13.
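
One possible way to carry out the min-max scaling, the variance filter, and the test-set transform is sketched below with `MinMaxScaler` and `VarianceThreshold`. The four synthetic columns are assumptions chosen to show which kinds of features survive a 0.1 cutoff; none of the names or data come from the housing set.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold
from sklearn.preprocessing import MinMaxScaler

# Synthetic columns: two roughly balanced dummies, one rare dummy, one continuous
rng = np.random.default_rng(2)
n = 400
balanced = rng.integers(0, 2, size=(n, 2)).astype(float)
rare = (rng.random(n) < 0.02).astype(float).reshape(-1, 1)
continuous = rng.normal(size=(n, 1))
X = np.hstack([balanced, rare, continuous])
Xtr, Xte = X[:300], X[300:]

# Fit the min-max scaler on the training features only
minmax = MinMaxScaler().fit(Xtr)

# Fit the variance filter on the scaled training features
selector = VarianceThreshold(threshold=0.1).fit(minmax.transform(Xtr))

# Apply both fitted steps to the training and test features
Xtr_hv = selector.transform(minmax.transform(Xtr))
Xte_hv = selector.transform(minmax.transform(Xte))

# Boolean mask of the columns that passed the threshold
kept = selector.get_support()
print("kept columns:", kept)
print("train shape:", Xtr_hv.shape, "test shape:", Xte_hv.shape)
```

After min-max scaling, a balanced 0/1 column has a variance near 0.25, while rare dummies and most continuous columns fall well below 0.1, which is why this filter tends to keep only the evenly split indicator features.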

15. Repeat step 7 with the high variance data.

16. Summarize your findings.

### Part 2 Categorical Feature Selection

1. Import the data as a data frame and ensure it is loaded correctly.

2. Convert the categorical features (all of them) to dummy variables.

3. Split the data into a training and test set.

4. Fit a decision tree classifier on the training set.

5. Report the accuracy and create a confusion matrix for the model prediction on the test set.
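
The dummy encoding, split, tree fit, and evaluation can be sketched as follows. The Part 2 dataset is not shown here, so this example builds a small hypothetical categorical data frame (`color`, `shape`, `odor`) whose target depends only on `odor`; all names and values are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical categorical data (NOT the assignment's dataset)
rng = np.random.default_rng(3)
n = 500
df = pd.DataFrame({
    "color": rng.choice(["red", "green", "blue"], size=n),
    "shape": rng.choice(["round", "flat"], size=n),
    "odor": rng.choice(["none", "foul"], size=n),
})
# Hypothetical target: fully determined by the odor column
y = (df["odor"] == "foul").astype(int)

# Convert every categorical feature to dummy variables
X = pd.get_dummies(df)

# Split, then fit the decision tree on the training set only
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.2, random_state=0)
tree = DecisionTreeClassifier(random_state=0).fit(Xtr, ytr)

# Accuracy and confusion matrix for the predictions on the test set
pred = tree.predict(Xte)
acc = accuracy_score(yte, pred)
cm = confusion_matrix(yte, pred)

print("accuracy:", acc)
print(cm)
```

Because the toy target is a deterministic function of one feature, the tree classifies the test set perfectly here; real data will generally give a lower accuracy and off-diagonal counts in the confusion matrix.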

6. Create a visualization of the decision tree.

7. Use a χ²-statistic selector to pick the five best features for this data.

8. Which five features were selected in step 7? Hint: Use the get_support function.
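
A minimal sketch of the selection and the `get_support` lookup, using `SelectKBest` with `chi2` on synthetic dummy columns. The `signal_*` and `noise_*` column names are made up for the example; in the notebook the selector would be fit on the training dummies and target instead.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, chi2

# Synthetic 0/1 features: five that track the target, five that are pure noise
rng = np.random.default_rng(4)
n = 400
y = rng.integers(0, 2, size=n)
cols = {}
for i in range(5):
    flip = rng.random(n) < 0.1          # signal columns agree with y about 90% of the time
    cols[f"signal_{i}"] = np.where(flip, 1 - y, y)
for i in range(5):
    cols[f"noise_{i}"] = rng.integers(0, 2, size=n)
X = pd.DataFrame(cols)

# chi2 requires non-negative features, which dummy variables satisfy
selector = SelectKBest(score_func=chi2, k=5).fit(X, y)

# get_support returns a boolean mask aligned with the input columns
selected = X.columns[selector.get_support()].tolist()
print(selected)

# Reduced feature matrix containing only the five selected columns
X_best = selector.transform(X)
```

Indexing the column names with the mask is what turns the selector's output back into readable feature names; the same fitted selector can then `transform` the test features for step 9.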

9. Repeat steps 4 and 5 with the five best features selected in step 7.

10. Summarize your findings.