# House Price Prediction - Comprehensive Data Science Project
### Data Science Project for Exploratory Data Analysis (EDA), Statistical Inference, Feature Engineering, and Machine Learning

#### Dataset downloaded from Kaggle
[<span style='color:#1f77b4'>**House Prices - Advanced Regression Techniques**</span>](https://www.kaggle.com/c/house-prices-advanced-regression-techniques/data)

### Life cycle of the Project:
- Understand the Problem
- Exploratory Data Analysis (Part 1)
  - whole picture of the data
  - univariate analysis

- Exploratory Data Analysis (Part 2)
  - Variable Correlation/Dependency Analysis
  - inferential statistics with hypothesis testing

- <span style='color:blue'>**Feature Engineering**</span>
- Feature Importance Analysis
- Machine Learning Model Building, evaluation, and deployment

# Feature Engineering Phase
**Based on the results, conclusions, and insights from the Exploratory Data Analysis (EDA) phase, this notebook builds a custom feature engineering pipeline that prepares the training and test data for machine learning model building.**

### Feature Engineering Outline:
**1. before pipeline**
- remove 4 outliers from train.csv (**reset index**)

**2. separate feature (predictor) variables and target (response) variable in train.csv**   
- separate X and y: X_main, y_main

**3. feature preprocessing pipeline**
- [data type transformation](#2.2.1)   
  for feature `MSSubClass` and `CentralAir`  
  - `MSSubClass` is a nominal feature, from int to str  
  - `CentralAir` is a nominal feature, from str ('Y'/'N') to int (1/0)
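- [missing values](#2.2.2): replace NAs that carry a meaning (e.g. NA in `GarageType` means 'No Garage')
- [missing values](#2.2.3): impute the remaining NAs (group median, mode, or `YearBuilt` for `GarageYrBlt`)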
- [create new features](#2.2.4)      
  - 3 year features,   
  - `TotalBath` from discrete features  
  - new nominal features from continuous features  
  - new continuous features via interaction
- [log1p transformation for some continuous features](#2.2.5)  
  - some original continuous features  
  - new continuous features from interaction
- [range binning and ordinal encoding with pd.cut()](#2.2.6)     
  for some discrete features and 2 ordinal features  
  - some discrete features (including the newly created `TotalBath`)
  - 2 ordinal features (`OverallQual` and `OverallCond`)
- [range binning for some ordinal features with dtype 'o'](#2.2.7)
- [ordinal encoding with OrdinalEncoder() for some ordinal features](#2.2.8)
- [category binning for some nominal features](#2.2.9)
- [onehot encoding for non 0/1 nominal features](#2.2.10)
- [drop features](#2.2.11)      
  - `Id`, `Utilities`, `MoSold`, 4 original year features
- [MinMax Scaling](#2.2.12)

**4. use customized pipeline transformers to fit and transform data in train.csv, only transform data in test.csv**

**5. concat target variable y_main to preprocessed X_main**   
- y_main contains original target values, no log transformation


<a name="0"></a>
# Content Outline
- [1. Packages, Datasets, and Preparation](#1)
  - [Packages](#1.1)
  - [Version Information](#1.2)
  - [Datasets](#1.3)
  - [Preparation](#1.4)
- [2. Create custom scikit-learn Transformers](#2)
  - [2.1 Feature Lists and Dictionaries Used for Setting Up Custom Transformers](#2.1)
  - [2.2 Define Custom Transformers for Feature Engineering pipeline](#2.2)
    - [data type transformation](#2.2.1)
    - [missing data replacement](#2.2.2)
    - [missing data imputation](#2.2.3)
    - [create new features](#2.2.4)
    - [log1p transformation](#2.2.5)
    - [range binning and ordinal encoding for some discrete and ordinal features](#2.2.6)
    - [binning for ordinal features with dtype 'o'](#2.2.7)
    - [ordinal encoding for ordinal features with dtype 'o'](#2.2.8)
    - [category binning for some nominal features](#2.2.9)
    - [onehot encoding for non 0/1 nominal features](#2.2.10)
    - [drop some features](#2.2.11)
    - [MinMax scaling](#2.2.12)
- [3. Feature Engineering Pipeline and Data Preprocessing](#3)



<a name="1"></a>
# 1. Packages, Datasets, and Preparation
- [Packages](#1.1)
- [Version Information](#1.2)
- [Datasets](#1.3)
- [Preparation](#1.4)

<a name="1.1"></a>
### Packages

In [2]:
###### import modules ####################################
%matplotlib inline

import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import OrdinalEncoder, OneHotEncoder, MinMaxScaler

from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.utils.validation import check_is_fitted
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

import session_info
import warnings
import json

# warnings.filterwarnings('ignore') ## ignore warnings
pd.set_option('display.max_columns', None) ## display all the columns of the dataframe

<a name="1.2"></a>
### version information for all imported modules
- use `session_info` to output version information for modules loaded in the current session, Python, the OS, and the CPU.
- version information is provided to increase reproducibility

In [3]:
session_info.show(std_lib=True, html=False)

-----
io                  NA
json                2.0.9
matplotlib          3.6.2
numpy               1.23.4
pandas              1.5.1
seaborn             0.12.1
session_info        1.0.0
sklearn             1.1.3
-----
IPython             8.6.0
jupyter_client      7.4.5
jupyter_core        5.0.0
jupyterlab          3.5.0
notebook            6.5.2
-----
Python 3.10.2 (tags/v3.10.2:a58ebcc, Jan 17 2022, 14:12:15) [MSC v.1929 64 bit (AMD64)]
Windows-10-10.0.22000-SP0
-----
Session information updated at 2022-11-14 10:48


<a name="1.3"></a>
### Datasets

In [4]:
###### import datasets: train.csv, test.csv #######################
main_df = pd.read_csv('datasets/train.csv')
test_df = pd.read_csv('datasets/test.csv')

In [5]:
main_copy = main_df.copy()
test_copy = test_df.copy()

<a name="1.4"></a>
### Preparation
- basic preprocessing before pipeline

#### remove outliers

In [6]:
###### remove outliers from training data #############################
main_copy = main_copy[main_copy.GrLivArea <= 4000].reset_index(drop=True)
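
Resetting the index matters because `pd.concat(..., axis = 1)`, used later to re-attach `y_main`, aligns rows by index label rather than by position. A minimal sketch of what can go wrong when one side has been rebuilt from a NumPy array, using a hypothetical toy frame (`toy` is made up for illustration, not the Kaggle data):

```python
import pandas as pd

# Hypothetical toy data standing in for train.csv
toy = pd.DataFrame({'GrLivArea': [1500, 4500, 2000],
                    'SalePrice': [200, 160, 250]})

# Filtering keeps the original labels 0 and 2, leaving a gap at 1
kept = toy[toy.GrLivArea <= 4000]

# A frame rebuilt from a NumPy array gets a fresh RangeIndex 0..1
rebuilt = pd.DataFrame(kept[['GrLivArea']].to_numpy(), columns=['GrLivArea'])

# concat aligns on index labels, so labels 0, 1, 2 all appear and NaNs creep in
misaligned = pd.concat([rebuilt, kept['SalePrice']], axis=1)

# After reset_index(drop=True) both sides share labels 0..1 and line up
kept_reset = kept.reset_index(drop=True)
aligned = pd.concat([rebuilt, kept_reset['SalePrice']], axis=1)

print(misaligned)
print(aligned)
```

In the misaligned frame the row count grows from 2 to 3 and each column gains a missing value, which is easy to overlook on a data set with more than a thousand rows.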

#### separate feature (predictor) variables and target (response) variable

In [7]:
###### separate predictors (X_main) and target (y_main) in the training data ############################
X_main, y_main = main_copy.drop('SalePrice', axis=1), main_copy.SalePrice

#### read in original feature type dictionary (created in EDA)

In [8]:
with open('EDA_files/feat_type_dict.json', 'r') as f:
    feat_type_dict = json.load(f)

*back to [content outline](#0)*

<a name="2"></a>
# 2. Create custom scikit-learn Transformers
- [2.1 Feature Lists and Dictionaries Used for Setting Up Custom Transformers](#2.1)
- [2.2 Define Custom Transformers for Feature Engineering pipeline](#2.2)

<a name="2.1"></a>
## 2.1 Feature Lists and Dictionaries Used for Setting Up Custom Transformers
- some lists and dictionaries were created during the EDA process, saved as .json files, and are loaded here

In [9]:
###### NAReplaceHandler: list of features with specific NA meanings #################
## categorical features (in both train.csv and test.csv)
feats_na_replace_cat = ['Alley', 'BsmtQual', 'BsmtCond', 'BsmtExposure', 'BsmtFinType1', 
                       'BsmtFinType2', 'FireplaceQu', 'GarageType', 'GarageFinish', 'GarageQual', 
                       'GarageCond', 'PoolQC', 'Fence', 'MiscFeature']

## numerical features (features that only have NAs in test.csv) + 'MasVnrArea'
feats_na_replace_num = ['BsmtFullBath', 'BsmtHalfBath', 'GarageCars', 'GarageArea', 'TotalBsmtSF', 
                       'BsmtUnfSF', 'BsmtFinSF2', 'BsmtFinSF1', 'MasVnrArea']

##### NAImputer: list of features for imputation ###################################
feats_imput_mode = ['MasVnrType', 'MSZoning', 'Functional', 'Exterior1st',
                    'Exterior2nd', 'KitchenQual', 'Electrical', 'SaleType']

###### FeatGenerator: lists/dicts used for creating new features ##############################
# continuous features for generating nominal features
with open('EDA_files/continuous_to_cat_feat_lst.json', 'r') as f:
    cont_to_cat_feats = json.load(f)

# 3 new year features + TotalBath + new continuous features via interaction
new_feats_other = (['AgebySale', 'GarageAge', 'RemodAge'] 
                   + ['TotalBath'] 
                   + ['TotalLivArea', 'TotalFlrSF', 'TotalPorchSF'])

feat_gen_dict = {'AgebySale'   : ['YrSold', 'YearBuilt'], 
                 'GarageAge'   : ['YrSold', 'GarageYrBlt'], 
                 'RemodAge'    : ['YrSold', 'YearRemodAdd'], 
                 'TotalBath'   : ['BsmtFullBath', 'BsmtHalfBath', 'FullBath', 'HalfBath'], 
                 'TotalLivArea': ['GrLivArea', 'BsmtFinSF1', 'BsmtFinSF2'], 
                 'TotalFlrSF'  : ['TotalBsmtSF', '1stFlrSF', '2ndFlrSF'], 
                 'TotalPorchSF': ['OpenPorchSF', 'EnclosedPorch', '3SsnPorch', 'ScreenPorch']}

###### Log1pTransformer: list of features for log1p transformation #########################
with open('EDA_files/continuous_feat_for_log.json', 'r') as f:
    cont_feats_log_orig = json.load(f)

cont_feats_log = cont_feats_log_orig + ['TotalLivArea', 'TotalFlrSF', 'TotalPorchSF']

###### DiscreteBinEncoder: list of features for discrete range binning + ordinal encoding with pd.cut() ########
bin_encode_feats = ([feat for feat in feat_type_dict['discrete'] if feat != 'MoSold'] # original discrete features
                    + ['TotalBath'] # new discrete feature
                    + ['OverallQual', 'OverallCond']) # 2 ordinal features with dtype 'int'

###### OrdinalBinHandler: lists/dicts for ordinal feature range binning #####################
with open('EDA_files/cat_map_dict_bin.json', 'r') as f:
    cat_map_dict_bin = json.load(f)

ord_feats_bin = list(cat_map_dict_bin.keys())

with open('EDA_files/ord_bin_dict.json', 'r') as f:
    ord_bin_dict = json.load(f)

###### CustomOrdinalEncoder: ordinal encoding for ordinal features with dtype 'o' #############
ord_feats_encode = [feat for feat in feat_type_dict['ordinal'] if feat not in ['OverallQual', 'OverallCond']]

with open('EDA_files/cat_map_dict_full.json', 'r') as f:
    cat_map_dict_full = json.load(f)
    
###### NominalBinHandler: nominal feature category binning #########################
with open('EDA_files/merge_cat_dict.json', 'r') as f:
    merge_cat_dict = json.load(f)

###### CustomOneHotEncoder: non 0/1 nominal feature onehotencoding #####################
nom_feats_encode = [feat for feat in feat_type_dict['nominal'] if feat != 'CentralAir']

###### FeatureCleaner: drop some original features ####################
feats_drop = ['Id', 'Utilities', 'MoSold'] + feat_type_dict['year']

*back to [2. Create custom scikit-learn Transformers](#2)*

<a name="2.2"></a>
## 2.2 Define Custom Transformers for Feature Engineering pipeline

<a name="2.2.1"></a>
### data type transformation

In [10]:
###### data type transformation ############################
class DtypeTransformer(BaseEstimator, TransformerMixin):
    def __init__(self, typ_trans_feats = ['MSSubClass', 'CentralAir']):
        self.typ_trans_feats = typ_trans_feats
    
    def fit(self, X, y = None):
        return self
    
    def transform(self, X, y = None):
        dat = X.copy()
        for feat in self.typ_trans_feats:
            if feat in dat.columns:
                if feat == 'MSSubClass':
                    # nominal codes stored as int: cast to str so they are treated as categories
                    dat[feat] = dat[feat].astype(str)
                else: # feat == 'CentralAir'
                    # map 'Y'/'N' to 1/0 and keep the result numeric
                    dat[feat] = dat[feat].map({'Y': 1, 'N': 0})
            else:
                print("Feature '{}' is not in the data frame.\n"
                      "No processing with this feature by DtypeTransformer.".format(feat))
        
        return dat

<a name="2.2.2"></a>
### missing data replacement

In [11]:
###### missing data replace #################################
class NAReplaceHandler(BaseEstimator, TransformerMixin):
    def __init__(self, feats_na_replace_cat = feats_na_replace_cat, 
                 feats_na_replace_num = feats_na_replace_num):
        
        self.feats_na_replace_cat = feats_na_replace_cat
        self.feats_na_replace_num = feats_na_replace_num
    
    def fit(self, X, y = None):
        return self
    
    def transform(self, X, y = None):
        dat = X.copy()
        for feat in self.feats_na_replace_cat:
            if feat not in dat.columns:
                print("Feature '{}' is not in the data frame.\n"
                      "No processing with this feature by NAReplaceHandler.".format(feat))
            elif 'Bsmt' in feat:   ## features with 'Bsmt' in name
                dat[feat] = dat[feat].fillna('No Basement')
            elif 'Garage' in feat:   ## features with 'Garage' in name
                dat[feat] = dat[feat].fillna('No Garage')
            elif feat == 'Alley':
                dat[feat] = dat[feat].fillna('No alley access')
            elif feat == 'FireplaceQu':
                dat[feat] = dat[feat].fillna('No Fireplace')
            elif feat == 'PoolQC':
                dat[feat] = dat[feat].fillna('No Pool')
            elif feat == 'Fence':
                dat[feat] = dat[feat].fillna('No Fence')
            else:  ## feature == 'MiscFeature'
                dat[feat] = dat[feat].fillna('None')
        
        ## numerical NAs mean the feature is absent: fill with 0
        ## (mostly affects test.csv; 'MasVnrArea' also has NAs in train.csv)
        dat[self.feats_na_replace_num] = dat[self.feats_na_replace_num].fillna(0)
        return dat
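
The `if`/`elif` chain above can also be written as a lookup table, since `DataFrame.fillna` accepts a dict keyed by column name. The following is only a sketch of that alternative, not the notebook's implementation; `na_labels` and the toy frame are hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical lookup table: feature name -> label used when the value is missing
na_labels = {'Alley': 'No alley access', 'BsmtQual': 'No Basement',
             'GarageType': 'No Garage', 'PoolQC': 'No Pool'}

# Toy frame with a few meaningful NAs
toy = pd.DataFrame({'Alley': ['Grvl', np.nan],
                    'BsmtQual': [np.nan, 'Gd'],
                    'GarageType': ['Attchd', np.nan],
                    'PoolQC': ['Ex', np.nan]})

# Each column is filled with its own label; columns not in the dict are left alone
filled = toy.fillna(value=na_labels)
print(filled)
```

A dict keeps the feature-to-label mapping in one place, at the cost of losing the per-feature "not in the data frame" messages that the class prints.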


<a name="2.2.3"></a>
### missing data imputation

In [12]:
###### missing data imputation #################################
class NAImputer(BaseEstimator, TransformerMixin):
    def __init__(self, feats_imput_median = 'LotFrontage', median_group_feat = 'Neighborhood',
                 feats_imput_mode = feats_imput_mode, 
                 feats_imput_year = 'GarageYrBlt', year_imput_by = 'YearBuilt'):
        
        self.feats_imput_median = feats_imput_median
        self.median_group_feat = median_group_feat
        self.feats_imput_mode = feats_imput_mode
        self.feats_imput_year = feats_imput_year
        self.year_imput_by = year_imput_by
    
    def fit(self, X, y = None):
        dat = X.copy()
        if set([self.feats_imput_median, self.median_group_feat]).issubset(dat.columns):
            self.group_median_dict = dat.groupby(self.median_group_feat)[self.feats_imput_median].median().to_dict()
        else:
            print(('Feature {} or/and Feature {} are not in the data frame.\nNAImputer instance is not fully fitted.'
                   .format(self.feats_imput_median, self.median_group_feat)))
        if set(self.feats_imput_mode).issubset(dat.columns):
            self.modes_dict = dat[self.feats_imput_mode].mode().loc[0, :].to_dict()
        else:
            print('One or more features in `feats_imput_mode` are not in the data frame.\n' \
                  'NAImputer instance is not fully fitted.')
        return self
    
    def transform(self, X, y = None):
        check_is_fitted(self, ['group_median_dict', 'modes_dict'])
        dat = X.copy()
        if set([self.feats_imput_median, self.median_group_feat]).issubset(dat.columns):
            dat[self.feats_imput_median] = dat.apply(lambda row: self.group_median_dict[row[self.median_group_feat]] 
                                                     if np.isnan(row[self.feats_imput_median]) 
                                                     else row[self.feats_imput_median], axis = 1)
        else:
            print(('Feature {} or/and Feature {} are not in the data frame.\n' \
                   'Imputation by NAImputer is not fully completed.'
                   .format(self.feats_imput_median, self.median_group_feat)))
        
        if set(self.feats_imput_mode).issubset(dat.columns):
            dat[self.feats_imput_mode] = dat[self.feats_imput_mode].fillna(self.modes_dict)
        else:
            print('One or more features in `feats_imput_mode` are not in the data frame.\n' \
                  'Imputation by NAImputer is not fully completed.')
        
        if set([self.feats_imput_year, self.year_imput_by]).issubset(dat.columns):
            dat[self.feats_imput_year] = dat.apply(lambda row: row[self.year_imput_by] 
                                                   if np.isnan(row[self.feats_imput_year]) 
                                                   else row[self.feats_imput_year], axis = 1)
        else:
            print(('Feature {} or/and Feature {} are not in the data frame.\n' \
                   'Imputation by NAImputer is not fully completed.'
                   .format(self.feats_imput_year, self.year_imput_by)))
        return dat
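
The row-wise `apply` in `transform` can be slow on larger data, and the dictionary lookup raises a `KeyError` if a group value was not seen during `fit`. A vectorized sketch of the same group-median idea is shown below. The fallback to the overall training median is an assumption added for illustration and is not part of the notebook; the toy data is hypothetical.

```python
import numpy as np
import pandas as pd

# Toy training and test data (hypothetical values)
train = pd.DataFrame({'Neighborhood': ['A', 'A', 'B', 'B'],
                      'LotFrontage': [60.0, 80.0, 50.0, np.nan]})
test = pd.DataFrame({'Neighborhood': ['A', 'B', 'C'],
                     'LotFrontage': [np.nan, np.nan, np.nan]})

# "fit": statistics come from the training data only
group_median = train.groupby('Neighborhood')['LotFrontage'].median()
overall_median = train['LotFrontage'].median()  # assumed fallback for unseen groups

# "transform": map each row's group to its median, then fill only the gaps
per_row_median = test['Neighborhood'].map(group_median)
imputed = test['LotFrontage'].fillna(per_row_median).fillna(overall_median)
print(imputed.tolist())
```

Here neighborhood `'C'` never appears in the training data, so `map` yields NaN for it and the second `fillna` supplies the overall median.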


<a name="2.2.4"></a>
### create new features

In [13]:
###### create new features ######################################
class FeatGenerator(BaseEstimator, TransformerMixin):
    def __init__(self, cont_to_cat_feats = cont_to_cat_feats, 
                 new_feats_other = new_feats_other, feat_gen_dict = feat_gen_dict):
        
        self.cont_to_cat_feats = cont_to_cat_feats
        self.new_feats_other = new_feats_other
        self.feat_gen_dict = feat_gen_dict
    
    def fit(self, X, y = None):
        return self
    
    def transform(self, X, y = None):
        dat = X.copy()
        for feat in self.cont_to_cat_feats:
            if feat not in dat.columns:
                print("Feature {} is not in the data frame.\n"\
                      "New feature 'has_{}' was not generated.".format(feat, feat))
            else:
                new_feat = 'has_' + feat
                dat[new_feat] = (dat[feat] > 0).astype('int64')
        
        for feat in self.new_feats_other:
            if set(self.feat_gen_dict[feat]).issubset(dat.columns):
                if feat == 'AgebySale':
                    dat[feat] = dat['YrSold'] - dat['YearBuilt']
                elif feat == 'GarageAge':
                    dat[feat] = dat['YrSold'] - dat['GarageYrBlt']
                elif feat == 'RemodAge':
                    dat[feat] = dat['YrSold'] - dat['YearRemodAdd']
                elif feat == 'TotalBath':
                    dat[feat] = (dat['BsmtFullBath'] + 0.5 * dat['BsmtHalfBath'] 
                                 + dat['FullBath'] + 0.5 * dat['HalfBath'])
                elif feat == 'TotalLivArea':
                    dat[feat] = dat['GrLivArea'] + dat['BsmtFinSF1'] + dat['BsmtFinSF2']
                elif feat == 'TotalFlrSF':
                    dat[feat] = dat['TotalBsmtSF'] + dat['1stFlrSF'] + dat['2ndFlrSF']
                else: # feat == 'TotalPorchSF'
                    dat[feat] = (dat['OpenPorchSF'] + dat['EnclosedPorch'] 
                                 + dat['3SsnPorch'] + dat['ScreenPorch'])
                    
            else:
                print("One or more features for generating '{}' are not in the data frame.\n"\
                      "New feature '{}' was not generated.".format(feat, feat))
        return dat
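
As a small worked example of two of the generated features, the sketch below computes `TotalBath` (half baths weighted by 0.5) and one `has_` indicator on a hypothetical two-row frame:

```python
import pandas as pd

# Two made-up houses
toy = pd.DataFrame({'BsmtFullBath': [1, 0], 'BsmtHalfBath': [0, 1],
                    'FullBath': [2, 1], 'HalfBath': [1, 0],
                    'PoolArea': [0, 512]})

# Half baths count as 0.5 of a bath
toy['TotalBath'] = (toy['BsmtFullBath'] + 0.5 * toy['BsmtHalfBath']
                    + toy['FullBath'] + 0.5 * toy['HalfBath'])

# Presence flag: 1 when the area is positive, else 0
toy['has_PoolArea'] = (toy['PoolArea'] > 0).astype('int64')

print(toy[['TotalBath', 'has_PoolArea']])
```

The first house gets 1 + 0 + 2 + 0.5 = 3.5 baths and no pool flag; the second gets 0 + 0.5 + 1 + 0 = 1.5 baths and a pool flag of 1.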


<a name="2.2.5"></a>
### log1p transformation

In [14]:
###### log1p transformation for some continuous features ######################
class Log1pTransformer(BaseEstimator, TransformerMixin):
    def __init__(self, cont_feats_log = cont_feats_log):
        self.cont_feats_log = cont_feats_log
    
    def fit(self, X, y = None):
        return self
    
    def transform(self, X, y = None):
        dat = X.copy()
        if set(self.cont_feats_log).issubset(dat.columns):
            dat[self.cont_feats_log] = dat[self.cont_feats_log].apply(np.log1p)
        else:
            print('One or more features in `cont_feats_log` are not in the data frame.\n' \
                  'log1p transform by Log1pTransformer is not performed.')
        return dat
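
`np.log1p(x)` computes `log(1 + x)`, which is why it suits area features that are often exactly 0: the result at 0 is 0 rather than negative infinity. A short sketch with made-up values, including the inverse `np.expm1`:

```python
import numpy as np

# Made-up area values, including a zero
areas = np.array([0.0, 9.0, 99.0, 999.0])

logged = np.log1p(areas)     # log(1 + x); defined at x = 0
restored = np.expm1(logged)  # inverse transform: exp(y) - 1

print(logged)
print(restored)
```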


<a name="2.2.6"></a>
### range binning and ordinal encoding for some discrete and ordinal features

In [15]:
###### range binning + ordinal encoding for some discrete and ordinal features: pd.cut() ##########################
class DiscreteBinEncoder(BaseEstimator, TransformerMixin):
    def __init__(self, bin_encode_feats = bin_encode_feats):
        self.bin_encode_feats = bin_encode_feats
    
    def fit(self, X, y = None):
        return self
    
    def transform(self, X, y = None):
        dat = X.copy()
        for feat in self.bin_encode_feats:
            if feat in dat.columns:
                if feat in ['BsmtFullBath', 'BsmtHalfBath', 'HalfBath']:
                    dat[feat] = pd.cut(dat[feat], bins=[0, 1, 5], right=False, labels=False)
                elif feat in ['FullBath', 'KitchenAbvGr']:
                    dat[feat] = pd.cut(dat[feat], bins=[0, 2, 5], right=False, labels=False)
                elif feat in ['OverallQual', 'OverallCond']:
                    dat[feat] = pd.cut(dat[feat], bins=[0, 4, 5, 6, 7, 8, 9, 15], right=False, labels=False)
                elif feat == 'BedroomAbvGr':
                    dat[feat] = pd.cut(dat[feat], bins=[0, 3, 4, 10], right=False, labels=False)
                elif feat == 'TotRmsAbvGrd':
                    dat[feat] = pd.cut(dat[feat], bins=[0, 5, 6, 7, 8, 9, 20], right=False, labels=False)
                elif feat == 'Fireplaces':
                    dat[feat] = pd.cut(dat[feat], bins=[0, 1, 2, 6], right=False, labels=False)
                elif feat == 'GarageCars':
                    dat[feat] = pd.cut(dat[feat], bins=[0, 1, 2, 3, 8], right=False, labels=False)
                else: # feat == 'TotalBath':
                    dat[feat] = pd.cut(dat[feat], bins=[1, 1.5, 2, 2.5, 3, 3.5, 4, 10], right=False, labels=False)
            else:
                print("Feature {} is not in the data frame.\n" \
                      "No processing with this feature by DiscreteBinEncoder.".format(feat))
        return dat
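
With `right=False` the bins are closed on the left, and `labels=False` makes `pd.cut` return the integer position of the bin, which doubles as an ordinal code. The sketch below reuses two of the bin lists from the class on made-up values. It also shows that a value below the first edge becomes NaN; whether such values occur in the actual data is not checked here.

```python
import pandas as pd

garage_cars = pd.Series([0, 1, 2, 3, 4])
# Left-closed bins [0,1), [1,2), [2,3), [3,8): 3 and 4 cars share the last bin
codes = pd.cut(garage_cars, bins=[0, 1, 2, 3, 8], right=False, labels=False)

total_bath = pd.Series([0.5, 1.0, 2.5, 4.0])
# The first edge is 1, so 0.5 falls outside every bin and is returned as NaN
bath_codes = pd.cut(total_bath, bins=[1, 1.5, 2, 2.5, 3, 3.5, 4, 10],
                    right=False, labels=False)

print(codes.tolist())
print(bath_codes.tolist())
```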


<a name="2.2.7"></a>
### binning for ordinal features with dtype 'o'

In [16]:
###### binning for ordinal features with dtype 'o' #############################
class OrdinalBinHandler(BaseEstimator, TransformerMixin):
    def __init__(self, ord_feats_bin = ord_feats_bin, ord_bin_dict = ord_bin_dict):
        self.ord_feats_bin = ord_feats_bin
        self.ord_bin_dict = ord_bin_dict
    
    def fit(self, X, y = None):
        return self
    
    def transform(self, X, y = None):
        dat = X.copy()
        for feat in self.ord_feats_bin:
            if feat in dat.columns:
                dat[feat] = dat[feat].replace(self.ord_bin_dict[feat])
            else:
                print("Feature {} is not in the data frame.\n" \
                      "No binning with this feature by OrdinalBinHandler.".format(feat))
        return dat


<a name="2.2.8"></a>
### ordinal encoding for ordinal features with dtype 'o'

In [17]:
###### ordinal encoding with OrdinalEncoder() ##################################
class CustomOrdinalEncoder(BaseEstimator, TransformerMixin):
    def __init__(self, ord_feats_encode = ord_feats_encode, cat_map_dict_full = cat_map_dict_full):
        
        self.ord_feats_encode = ord_feats_encode
        self.cat_map_dict_full = cat_map_dict_full
    
    def fit(self, X, y = None):          
        return self
    
    def transform(self, X, y = None):
        dat = X.copy()
        feats_notfound = [feat for feat in self.ord_feats_encode if feat not in dat.columns]
        if len(feats_notfound) == len(self.ord_feats_encode):
            print("None of the ordinal features needed for encoding is in the data frame.\n"\
                  "No encoding performed by CustomOrdinalEncoder.")
            return dat
        
        if len(feats_notfound) > 0:
            print("Ordinal feature(s) {} is/are not in the data frame.\n" \
                  "No ordinal encoding with these features by CustomOrdinalEncoder.".format(feats_notfound))
            feat_lst = [feat for feat in self.ord_feats_encode if feat in dat.columns]
        else:
            feat_lst = self.ord_feats_encode
        
        category_lst = [self.cat_map_dict_full[feat] for feat in feat_lst]
        ord_enc = OrdinalEncoder(categories = category_lst)
        dat[feat_lst] = ord_enc.fit_transform(dat[feat_lst])
        
        return dat
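
Because the category order is passed in explicitly, the integer codes do not depend on the data the encoder sees, which is why calling `fit_transform` inside `transform` does not leak information here. A minimal sketch with hypothetical worst-to-best orderings (the real ones come from `cat_map_dict_full`):

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical orderings, one list per column, worst to best
category_lst = [['Po', 'Fa', 'TA', 'Gd', 'Ex'],
                ['Unf', 'RFn', 'Fin']]

toy = pd.DataFrame({'ExterQual': ['TA', 'Ex', 'Fa'],
                    'GarageFinish': ['Fin', 'Unf', 'RFn']})

enc = OrdinalEncoder(categories=category_lst)
encoded = enc.fit_transform(toy)  # each value becomes its position in the list

print(encoded)
```

With the default settings, a value that is missing from its list raises an error during fitting, so the lists need to cover every level that can appear after the earlier binning steps.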


<a name="2.2.9"></a>
### category binning for some nominal features

In [18]:
###### category binning for some nominal features: combine rare categories ################
class NominalBinHandler(BaseEstimator, TransformerMixin):
    def __init__(self, merge_cat_dict = merge_cat_dict):
        self.merge_cat_dict = merge_cat_dict
        self.nom_feats_bin = list(self.merge_cat_dict.keys())
    
    def fit(self, X, y = None):
        return self
    
    def transform(self, X, y = None):
        dat = X.copy()
        feats_notfound = [feat for feat in self.nom_feats_bin if feat not in dat.columns]
        if len(feats_notfound) == len(self.nom_feats_bin):
            print("None of the nominal features needed for binning is in the data frame.\n"\
                  "No binning performed by NominalBinHandler.")
            return dat
        
        if len(feats_notfound) > 0:
            print("Nominal feature(s) {} is/are not in the data frame.\n" \
                  "No binning with these features by NominalBinHandler.".format(feats_notfound))
            feat_lst = [feat for feat in self.nom_feats_bin if feat in dat.columns]
        else:
            feat_lst = self.nom_feats_bin
        
        dat[feat_lst] = (dat[feat_lst]
                         .apply(lambda col: np.where(col.isin(self.merge_cat_dict[col.name]), 'other', col)))
        return dat
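
The merging step replaces every category listed for a feature with the single label `'other'`. A sketch of the same idea on a toy frame follows; `merge_dict` is a hypothetical stand-in for `merge_cat_dict`, and `Series.where` is used in place of `np.where` so that each column stays a pandas Series:

```python
import pandas as pd

# Hypothetical rare categories to merge, per feature
merge_dict = {'RoofMatl': ['Tar&Grv', 'WdShngl'],
              'Heating': ['Grav', 'Wall']}

toy = pd.DataFrame({'RoofMatl': ['CompShg', 'Tar&Grv', 'WdShngl'],
                    'Heating': ['GasA', 'Grav', 'GasW']})

# Keep a value when it is NOT in the merge list, otherwise write 'other'
binned = toy.apply(lambda col: col.where(~col.isin(merge_dict[col.name]), 'other'))

print(binned)
```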

<a name="2.2.10"></a>
### onehot encoding for non 0/1 nominal features

In [19]:
###### onehot encoding for non 0/1 nominal features ##############################
class CustomOneHotEncoder(BaseEstimator, TransformerMixin):
    def __init__(self, nom_feats_encode = nom_feats_encode):
        self.nom_feats_encode = nom_feats_encode
    
    def fit(self, X, y = None):
        # `sparse` matches scikit-learn 1.1.3 used here; it was renamed to `sparse_output` in 1.2
        self.oh_enc = OneHotEncoder(sparse=False)
        self.oh_enc.fit(X[self.nom_feats_encode])
        self.ohe_fitted_cols = self.oh_enc.get_feature_names_out()
        return self
    
    def transform(self, X, y = None):
        dat = X.copy()
        if set(self.nom_feats_encode).issubset(dat.columns):
            dat_nom = pd.DataFrame(self.oh_enc.transform(dat[self.nom_feats_encode]), 
                                   columns = self.ohe_fitted_cols, index = dat.index)
            dat_rest = dat[[feat for feat in dat.columns if feat not in self.nom_feats_encode]]
            dat_full = pd.concat([dat_rest, dat_nom], axis = 1)
            return dat_full
        else:
            print('One or more features in `nom_feats_encode` are not in the data frame.\n' \
                  'Onehot encoding by CustomOneHotEncoder is not performed.')
            return dat


<a name="2.2.11"></a>
### drop some features

In [20]:
###### drop some original features #######################
class FeatureCleaner(BaseEstimator, TransformerMixin):
    def __init__(self, feats_drop = feats_drop):
        self.feats_drop = feats_drop
    
    def fit(self, X, y = None):
        return self
    
    def transform(self, X, y = None):
        dat = X.copy()
        for feat in self.feats_drop:
            if feat in dat.columns:
                dat.drop(feat, axis = 1, inplace = True)
            else:
                print("Feature '{}' is not in the data frame.\n" \
                      "No dropping with this feature by FeatureCleaner".format(feat))
        return dat               

<a name="2.2.12"></a>
### MinMax scaling

In [21]:
###### MinMax scaling for all non 0/1 numerical features #####################
class CustomMinMaxScaler(MinMaxScaler):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
    
    def transform(self, X, y = None):
        check_is_fitted(self)
        dat = X.copy()
        if list(self.feature_names_in_) == list(dat.columns):  # also safe when the column counts differ
            dat_trans = pd.DataFrame(super().transform(dat), columns = dat.columns, index = dat.index)
            return dat_trans
        else:
            print('One or more features of the data frame *DO NOT* match the fitted features.\n' \
                  'MinMax scaling by CustomMinMaxScaler is not performed.')
            return dat
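
Since the scaler learns the minimum and maximum from the training data only, transformed test values are not guaranteed to stay inside [0, 1]. A minimal sketch with made-up numbers:

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

train = pd.DataFrame({'LotArea': [1000.0, 2000.0, 3000.0]})
test = pd.DataFrame({'LotArea': [500.0, 2000.0, 4000.0]})

# min = 1000 and max = 3000 are learned from train only
scaler = MinMaxScaler().fit(train)

train_scaled = scaler.transform(train)  # spans exactly [0, 1]
test_scaled = scaler.transform(test)    # 500 and 4000 land outside [0, 1]

print(train_scaled.ravel())
print(test_scaled.ravel())
```

Values below the training minimum come out negative and values above the training maximum exceed 1, which is expected behavior rather than a bug.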

*back to [2. Create custom scikit-learn Transformers](#2)*     
*back to [content outline](#0)*

<a name="3"></a>
# 3. Feature Engineering Pipeline and Data Preprocessing

In [22]:
steps_feat_eng = [('dtype_transform', DtypeTransformer()),
                  ('na_replace_fill', NAReplaceHandler()), 
                  ('na_imputate', NAImputer()),
                  ('new_feats_generate', FeatGenerator()),
                  ('log1p_transform', Log1pTransformer()),
                  ('discrete_bin_encode', DiscreteBinEncoder()),
                  ('ordinal_bin', OrdinalBinHandler()), 
                  ('ordinal_encode', CustomOrdinalEncoder()), 
                  ('nominal_bin', NominalBinHandler()), 
                  ('nominal_encode', CustomOneHotEncoder()),
                  ('features_drop', FeatureCleaner()),
                  ('minmax_scale', CustomMinMaxScaler())]

pipeline_feat_eng = Pipeline(steps_feat_eng)

#### fit and transform train.csv data, only transform test.csv data

In [23]:
train_processed = pipeline_feat_eng.fit_transform(X_main)
test_processed = pipeline_feat_eng.transform(test_copy)
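
The fit/transform split above is what keeps test data from influencing learned statistics: `fit_transform` learns them from the training data, and `transform` only applies them. A stripped-down sketch of the same pattern is given below; `MedianFiller` and `Log1pAll` are hypothetical transformers written for this illustration, not classes from this notebook.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline

class MedianFiller(BaseEstimator, TransformerMixin):
    """Hypothetical stateful step: learns column medians in fit, fills NaNs in transform."""
    def fit(self, X, y=None):
        self.medians_ = X.median()
        return self

    def transform(self, X):
        return X.fillna(self.medians_)

class Log1pAll(BaseEstimator, TransformerMixin):
    """Hypothetical stateless step: applies log1p to every column."""
    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return np.log1p(X)

pipe = Pipeline([('fill', MedianFiller()), ('log', Log1pAll())])

train = pd.DataFrame({'LotArea': [1.0, 3.0, np.nan, 7.0]})
test = pd.DataFrame({'LotArea': [np.nan, 100.0]})

train_out = pipe.fit_transform(train)  # the median (3.0) is learned here
test_out = pipe.transform(test)        # the same median is reused, nothing is refit

print(test_out)
```

The missing test value is filled with the training median 3.0, not with anything computed from the test set.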

In [24]:
###### concat target variable y_main to train_processed #########################
train_processed = pd.concat([train_processed, y_main], axis = 1)

In [25]:
###### save the processed data to .csv files for model training ######################
# train_processed.to_csv('train_processed.csv',index=False)
# test_processed.to_csv('test_processed.csv',index=False)

In [26]:
train_processed

Unnamed: 0,LotFrontage,LotArea,LotShape,LandSlope,OverallQual,OverallCond,MasVnrArea,ExterQual,ExterCond,BsmtQual,BsmtCond,BsmtExposure,BsmtFinType1,BsmtFinSF1,BsmtFinType2,BsmtFinSF2,BsmtUnfSF,TotalBsmtSF,HeatingQC,CentralAir,Electrical,1stFlrSF,2ndFlrSF,LowQualFinSF,GrLivArea,BsmtFullBath,BsmtHalfBath,FullBath,HalfBath,BedroomAbvGr,KitchenAbvGr,KitchenQual,TotRmsAbvGrd,Functional,Fireplaces,FireplaceQu,GarageFinish,GarageCars,GarageArea,GarageQual,GarageCond,PavedDrive,WoodDeckSF,OpenPorchSF,EnclosedPorch,3SsnPorch,ScreenPorch,PoolArea,PoolQC,Fence,MiscVal,has_MasVnrArea,has_BsmtFinSF1,has_BsmtFinSF2,has_2ndFlrSF,has_LowQualFinSF,has_WoodDeckSF,has_OpenPorchSF,has_EnclosedPorch,has_3SsnPorch,has_ScreenPorch,has_PoolArea,has_MiscVal,has_TotalBsmtSF,has_GarageArea,AgebySale,GarageAge,RemodAge,TotalBath,TotalLivArea,TotalFlrSF,TotalPorchSF,MSSubClass_120,MSSubClass_160,MSSubClass_190,MSSubClass_20,MSSubClass_30,MSSubClass_50,MSSubClass_60,MSSubClass_70,MSSubClass_75,MSSubClass_80,MSSubClass_85,MSSubClass_90,MSSubClass_other,MSZoning_C (all),MSZoning_FV,MSZoning_RH,MSZoning_RL,MSZoning_RM,Street_Grvl,Street_Pave,Alley_Grvl,Alley_No alley access,Alley_Pave,LandContour_Bnk,LandContour_HLS,LandContour_Low,LandContour_Lvl,LotConfig_Corner,LotConfig_CulDSac,LotConfig_FR2,LotConfig_FR3,LotConfig_Inside,Neighborhood_Blmngtn,Neighborhood_BrDale,Neighborhood_BrkSide,Neighborhood_ClearCr,Neighborhood_CollgCr,Neighborhood_Crawfor,Neighborhood_Edwards,Neighborhood_Gilbert,Neighborhood_IDOTRR,Neighborhood_MeadowV,Neighborhood_Mitchel,Neighborhood_NAmes,Neighborhood_NWAmes,Neighborhood_NoRidge,Neighborhood_NridgHt,Neighborhood_OldTown,Neighborhood_SWISU,Neighborhood_Sawyer,Neighborhood_SawyerW,Neighborhood_Somerst,Neighborhood_StoneBr,Neighborhood_Timber,Neighborhood_other,Condition1_Artery,Condition1_Feedr,Condition1_Norm,Condition1_PosN,Condition1_RRAn,Condition1_other,Condition2_Norm,Condition2_other,BldgType_1Fam,BldgType_2fmCon,BldgType_Duplex,BldgType_Twnhs,BldgType_TwnhsE,HouseStyle_1.5Fin,HouseStyle_1Story,HouseStyle_2Story,HouseStyle_SFoyer,HouseStyle_SLvl,HouseStyle_other,RoofStyle_Gable,RoofStyle_Hip,RoofStyle_other,RoofMatl_CompShg,RoofMatl_other,Exterior1st_AsbShng,Exterior1st_BrkFace,Exterior1st_CemntBd,Exterior1st_HdBoard,Exterior1st_MetalSd,Exterior1st_Plywood,Exterior1st_Stucco,Exterior1st_VinylSd,Exterior1st_Wd Sdng,Exterior1st_WdShing,Exterior1st_other,Exterior2nd_AsbShng,Exterior2nd_BrkFace,Exterior2nd_CmentBd,Exterior2nd_HdBoard,Exterior2nd_MetalSd,Exterior2nd_Plywood,Exterior2nd_Stucco,Exterior2nd_VinylSd,Exterior2nd_Wd Sdng,Exterior2nd_Wd Shng,Exterior2nd_other,MasVnrType_BrkCmn,MasVnrType_BrkFace,MasVnrType_None,MasVnrType_Stone,Foundation_BrkTil,Foundation_CBlock,Foundation_PConc,Foundation_Slab,Foundation_other,Heating_GasA,Heating_GasW,Heating_other,GarageType_Attchd,GarageType_Basment,GarageType_BuiltIn,GarageType_Detchd,GarageType_No Garage,GarageType_other,MiscFeature_None,MiscFeature_Shed,MiscFeature_other,SaleType_COD,SaleType_New,SaleType_WD,SaleType_other,SaleCondition_Abnorml,SaleCondition_Family,SaleCondition_Normal,SaleCondition_Partial,SaleCondition_other,SalePrice
0,0.413268,0.366271,1.0,0.0,0.666667,0.333333,0.716038,0.666667,0.5,0.8,0.666667,0.25,1.000000,0.322669,0.166667,0.000000,0.064212,0.266999,1.000000,1.0,1.0,0.414559,0.469747,0.0,0.684506,1.0,0.0,1.0,1.0,0.5,0.0,0.666667,0.8,1.000000,0.0,0.0,0.666667,0.666667,0.394245,0.666667,0.666667,1.0,0.000000,0.654449,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,1.0,1.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.036765,0.036765,0.083333,0.833333,0.730270,0.689254,0.595085,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,208500
1,0.490307,0.391245,1.0,0.0,0.500000,0.833333,0.000000,0.333333,0.5,0.8,0.666667,1.00,0.833333,0.446984,0.166667,0.000000,0.121575,0.393637,1.000000,1.0,1.0,0.585716,0.000000,0.0,0.557071,0.0,1.0,1.0,0.0,0.5,0.0,0.333333,0.4,1.000000,0.5,0.6,0.666667,0.666667,0.330935,0.666667,0.666667,1.0,0.843935,0.000000,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.227941,0.227941,0.516667,0.500000,0.702330,0.683670,0.000000,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,181500
2,0.429990,0.422289,0.5,0.0,0.666667,0.333333,0.690361,0.666667,0.5,0.8,0.666667,0.50,1.000000,0.222121,0.166667,0.000000,0.185788,0.286962,1.000000,1.0,1.0,0.446346,0.476348,0.0,0.702749,1.0,0.0,1.0,1.0,0.5,0.0,0.666667,0.4,1.000000,0.5,0.6,0.666667,0.666667,0.437410,0.666667,0.666667,1.0,0.000000,0.596422,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,1.0,1.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.051471,0.051471,0.100000,0.833333,0.707570,0.707228,0.542321,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,223500
3,0.383633,0.390223,0.5,0.0,0.666667,0.333333,0.000000,0.333333,0.5,0.6,1.000000,0.25,0.833333,0.098720,0.166667,0.000000,0.231164,0.235808,0.666667,1.0,1.0,0.465569,0.415842,0.0,0.686220,1.0,0.0,0.0,0.0,0.5,0.0,0.666667,0.6,1.000000,0.5,0.8,0.333333,1.000000,0.461871,0.666667,0.666667,1.0,0.000000,0.568247,0.492754,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,1.0,0.0,1.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.669118,0.058824,0.600000,0.333333,0.647885,0.676764,0.826214,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,140000
4,0.508439,0.468694,0.5,0.0,0.833333,0.333333,0.794318,0.666667,0.5,0.8,0.666667,0.75,1.000000,0.299360,0.166667,0.000000,0.209760,0.357143,1.000000,1.0,1.0,0.542812,0.579208,0.0,0.789834,1.0,0.0,1.0,1.0,1.0,0.0,0.666667,1.0,1.000000,0.5,0.6,0.666667,1.000000,0.601439,0.666667,0.666667,1.0,0.779126,0.704481,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,1.0,1.0,0.0,1.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.058824,0.058824,0.133333,0.833333,0.791685,0.778757,0.640579,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,250000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1451,0.395769,0.353519,1.0,0.0,0.500000,0.333333,0.000000,0.333333,0.5,0.8,0.666667,0.25,0.166667,0.000000,0.166667,0.000000,0.407962,0.297255,1.000000,1.0,1.0,0.461883,0.381738,0.0,0.668758,0.0,0.0,1.0,1.0,0.5,0.0,0.333333,0.6,1.000000,0.5,0.6,0.666667,0.666667,0.330935,0.666667,0.666667,1.0,0.000000,0.588869,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.058824,0.058824,0.116667,0.500000,0.588748,0.693708,0.535454,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,175000
1452,0.512839,0.453205,1.0,0.0,0.500000,0.500000,0.648854,0.333333,0.5,0.8,0.666667,0.25,0.833333,0.361060,0.500000,0.698955,0.252140,0.480973,0.333333,1.0,1.0,0.804619,0.000000,0.0,0.765268,1.0,0.0,1.0,0.0,0.5,0.0,0.333333,0.6,0.666667,1.0,0.6,0.333333,0.666667,0.359712,0.666667,0.666667,1.0,0.867250,0.000000,0.000000,0.0,0.0,0.0,0.0,0.666667,0.000000,1.0,1.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.235294,0.235294,0.366667,0.666667,0.813433,0.805225,0.000000,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,210000
1453,0.418925,0.379525,1.0,0.0,0.666667,1.000000,0.000000,1.000000,1.0,0.6,1.000000,0.25,1.000000,0.125686,0.166667,0.000000,0.375428,0.359326,1.000000,1.0,1.0,0.559069,0.633663,0.0,0.816101,0.0,0.0,1.0,0.0,1.0,0.0,0.666667,1.0,1.000000,1.0,0.8,0.666667,0.333333,0.181295,0.666667,0.666667,1.0,0.000000,0.651870,0.000000,0.0,0.0,0.0,0.0,1.000000,0.810936,0.0,1.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,0.507353,0.507353,0.066667,0.333333,0.759507,0.793512,0.592740,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,266500
1454,0.429990,0.393616,1.0,0.0,0.333333,0.500000,0.000000,0.333333,0.5,0.6,0.666667,0.50,1.000000,0.022395,0.500000,0.950784,0.000000,0.336245,0.666667,1.0,0.5,0.516224,0.000000,0.0,0.490978,1.0,0.0,0.0,0.0,0.0,0.0,0.666667,0.2,1.000000,0.0,0.0,0.333333,0.333333,0.172662,0.666667,0.666667,1.0,0.874272,0.000000,0.202899,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,1.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.441176,0.441176,0.233333,0.333333,0.688213,0.630353,0.681635,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,142125
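
The leading `Unnamed: 0` column in the preview above is what pandas typically produces when a frame is written with `to_csv` (index included) and then read back without `index_col`. As a minimal sketch, assuming the processed data is round-tripped through CSV in this way (the toy frame and variable names below are illustrative and not taken from the notebook), passing `index_col=0` keeps the row index out of the feature columns:

```python
import io

import pandas as pd

# Toy frame standing in for the processed training data (illustrative only).
toy = pd.DataFrame({"LotArea": [0.366271, 0.391245],
                    "SalePrice": [208500, 181500]})

# Write to an in-memory CSV; the index becomes a first column with an empty header.
buf = io.StringIO()
toy.to_csv(buf)

# Reading back without index_col turns that column into "Unnamed: 0".
buf.seek(0)
with_extra = pd.read_csv(buf)

# Reading back with index_col=0 restores it as the row index instead.
buf.seek(0)
clean = pd.read_csv(buf, index_col=0)

print(list(with_extra.columns))
print(list(clean.columns))
```

If the extra column is left in place, any model fitted on all columns would treat the row number as a feature, so it may be worth reading with `index_col=0` or dropping the column before model building.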


*back to [content outline](#0)*