# Linear Regression Modelling with Elastic Net
Use 'quick & dirty' custom functions from my LinRegModel class to find an optimized baseline model. Build an evaluation pipeline to evaluate different treatments for the training data (using that baseline model). Save final pipeline / model for later feature importance evaluation (in notebook 4).

*Note: An issue that was not completely / elegantly solved: I use sklearns ElasticNetCV() class with integrated gridsearch for tuning of the baseline model. When I use it inside a pipeline I don't know how get the best params out of it. So I used ElasticNet() for the final model, but it did not work well with the params I had tuned for the baseline model ... Don't know why. I should have evaluated the final ElasticNet params with proper grid search in the pipeline, but was to lazy in the end to do that ... did some manual trial and error.*

**Data Sources**
- `data/raw/train.csv`: Training set from kaggle.

**Output**
- `feature_names`: List of column labels for preprocessed data.
- `models/full_pipe_final.pkl`: Best pipeline / model.

**Changes**
- 2019-03-22: Start notebook
- 2019-03-29: Finish notebook, save best model
- 2019-04-06: Big refactoring, OHE outside final Pipeline

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Import-libraries,-load-data" data-toc-modified-id="Import-libraries,-load-data-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Import libraries, load data</a></span></li><li><span><a href="#Go-quick-&amp;-dirty" data-toc-modified-id="Go-quick-&amp;-dirty-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Go quick &amp; dirty</a></span></li><li><span><a href="#General-data-pre-processing-(outside-of-sklearn-pipeline)" data-toc-modified-id="General-data-pre-processing-(outside-of-sklearn-pipeline)-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>General data pre-processing (outside of sklearn pipeline)</a></span><ul class="toc-item"><li><span><a href="#General-pre-processing" data-toc-modified-id="General-pre-processing-3.1"><span class="toc-item-num">3.1&nbsp;&nbsp;</span>General pre-processing</a></span></li></ul></li><li><span><a href="#Explore-different-feature-set-options" data-toc-modified-id="Explore-different-feature-set-options-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Explore different feature set options</a></span><ul class="toc-item"><li><span><a href="#Define-pipeline-with-CV-to-evaluate-different-options" data-toc-modified-id="Define-pipeline-with-CV-to-evaluate-different-options-4.1"><span class="toc-item-num">4.1&nbsp;&nbsp;</span>Define pipeline with CV to evaluate different options</a></span></li><li><span><a href="#Explore-feature-sets" data-toc-modified-id="Explore-feature-sets-4.2"><span class="toc-item-num">4.2&nbsp;&nbsp;</span>Explore feature sets</a></span><ul class="toc-item"><li><span><a href="#Standard-train-set-(with-NaN-and-all-Outliers)" data-toc-modified-id="Standard-train-set-(with-NaN-and-all-Outliers)-4.2.1"><span class="toc-item-num">4.2.1&nbsp;&nbsp;</span>Standard train set (with NaN and all Outliers)</a></span></li><li><span><a href="#Train-set-without-columns-containing-NaN" data-toc-modified-id="Train-set-without-columns-containing-NaN-4.2.2"><span class="toc-item-num">4.2.2&nbsp;&nbsp;</span>Train set without columns containing NaN</a></span></li><li><span><a href="#Full-train-set-with-outliers-removed-for-top-correlating-columns" data-toc-modified-id="Full-train-set-with-outliers-removed-for-top-correlating-columns-4.2.3"><span class="toc-item-num">4.2.3&nbsp;&nbsp;</span>Full train set with outliers removed for top correlating columns</a></span></li><li><span><a href="#Full-train-set-with-outliers-removed-and-multi-correlation-columns-removed" data-toc-modified-id="Full-train-set-with-outliers-removed-and-multi-correlation-columns-removed-4.2.4"><span class="toc-item-num">4.2.4&nbsp;&nbsp;</span>Full train set with outliers removed and multi-correlation columns removed</a></span></li></ul></li></ul></li><li><span><a href="#Final-tuning-&amp;-evaluation" data-toc-modified-id="Final-tuning-&amp;-evaluation-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Final tuning &amp; evaluation</a></span><ul class="toc-item"><li><span><a href="#Save-final-data-and-model" data-toc-modified-id="Save-final-data-and-model-5.1"><span class="toc-item-num">5.1&nbsp;&nbsp;</span>Save final data and model</a></span></li></ul></li><li><span><a href="#Apppendix:-Experiment-with-preprocessing-pipe-only" data-toc-modified-id="Apppendix:-Experiment-with-preprocessing-pipe-only-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Apppendix: Experiment with preprocessing pipe only</a></span></li></ul></div>

---

## Import libraries, load data

In [1]:
# Import libraries
import numpy as np
import pandas as pd
from tqdm import tqdm

from scipy import stats
from scipy.stats import norm, skew

from sklearn.linear_model import ElasticNetCV, ElasticNet
from sklearn.model_selection import train_test_split, learning_curve
from sklearn.metrics import r2_score, mean_squared_error, make_scorer
from sklearn.model_selection import StratifiedKFold, GridSearchCV
from sklearn.externals import joblib
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer

# My functions
import EDA_functions as EDA
import cleaning_functions as cleaning
from linRegModel_class import LinRegModel
import custom_transformers as transform

# Visualization
import matplotlib.pyplot as plt
import seaborn as sns #, sns.set_style('whitegrid')
color = 'rebeccapurple'
%matplotlib inline

# Display settings
from IPython.display import display
pd.options.display.max_columns = 100

In [2]:
# Load data
raw_data = pd.read_csv('data/raw/train.csv')

# Check shape
display(raw_data.shape)

(1460, 81)

In [3]:
# Load variables from notebook 1
%store -r cols_to_del
%store -r cols_to_log
%store -r outliers_to_del
%store -r top_corr_columns

## Go quick & dirty
Use my 'quick & dirty' function for a baseline model on unprocessed data.

In [4]:
# Initialize a scikit-learn model object of choice - here ElasticNetCV for some param tuning
model_simple = ElasticNetCV(alphas=[0.03, 0.05, 0.09], copy_X=True, cv=5, eps=0.001, 
                            fit_intercept=True, l1_ratio=[0.6, 0.9, 1.0], max_iter=3000, 
                            n_alphas=100, n_jobs=-1)

# Create an instance of the LinRegModel class by passing df, target variable and model object
elastic_net_simple = LinRegModel(raw_data, 'SalePrice', model_simple)

# Output instance
display(elastic_net_simple)

ElasticNetCV(alphas=[0.03, 0.05, 0.09], copy_X=True, cv=5, eps=0.001,
       fit_intercept=True, l1_ratio=[0.6, 0.9, 1.0], max_iter=3000,
       n_alphas=100, n_jobs=-1, normalize=False, positive=False,
       precompute='auto', random_state=None, selection='cyclic',
       tol=0.0001, verbose=0)

In [5]:
# Perform the modelling
elastic_net_simple.go_quickDirty()





In [6]:
# Output result
elastic_net_simple

ElasticNetCV(alphas=[0.03, 0.05, 0.09], copy_X=True, cv=5, eps=0.001,
       fit_intercept=True, l1_ratio=[0.6, 0.9, 1.0], max_iter=3000,
       n_alphas=100, n_jobs=-1, normalize=False, positive=False,
       precompute='auto', random_state=None, selection='cyclic',
       tol=0.0001, verbose=0)

RMSE on test data 33373.76, r2-score 0.80.

In [7]:
# Check best values
print(model_simple.alpha_)
print(model_simple.l1_ratio_)
print(model_simple.n_iter_)

0.09
0.9
3000


## General data pre-processing (outside of sklearn pipeline)
Pre-processing steps that take place before data is pipelined

### General pre-processing

In [8]:
# Disable warning
pd.set_option('mode.chained_assignment', None)

# Create and clean training set with variables from the EDA notebook
train_data = (raw_data
              .pipe(cleaning.change_dtypes, cols_to_category=raw_data.select_dtypes(object))
              .pipe(cleaning.delete_columns, cols_to_delete=cols_to_del)
              .pipe(cleaning.apply_log, cols_to_transform=cols_to_log)
             )

train_data.drop(outliers_to_del, inplace=True)
train_data.dropna(subset=['MasVnrArea', 'MasVnrType', 'Electrical'], inplace=True);

'MiscFeature successfully deleted'

'PoolQC successfully deleted'

'FireplaceQu successfully deleted'

'Alley successfully deleted'

'Id successfully deleted'

'Fence successfully deleted'

In [9]:
# check results
display(train_data.shape)

(1447, 75)

## Explore different feature set options

### Define pipeline with CV to evaluate different options

In [10]:
def evaluate_feature_sets(df, reg, scorer, cv=StratifiedKFold(3)):
    """Build a pipeline for evaluating different combinations of data, models
    and scorers with stratified crossevaluation. The pipeline performs 
    necessary transformations onf categorical and numeric features and 
    evaluates the imputation strategy or if numerical data is to scale or
    not. In this notebook my only changing variable will be the input data
    
    ARGUMENTS:
        df: dataframe, input data for modelling
        reg: sklearn model instance, a baseline model
        scorer: string or make_score() object (?) 
        
    RETURNS:
        grid_results: dict, best parameters for model
        best_score: float, highest score value - watch out if you have a loss
            function. Then you have to search for the minimal score value
    """
    
    # Split input features and target label
    X_train = df.drop('SalePrice', axis=1)
    y_train = df['SalePrice'].copy()
    
    # Define cat and num feature columns
    categorical_features = X_train.select_dtypes(include=['category']).columns
    numeric_features = X_train.select_dtypes(include=['float64', 'int64']).columns
    assert len(categorical_features) + len(numeric_features) == df.shape[1] - 1
    
    ## Assemble pipeline (define function)
    
    # level 1 - two separate pipes for cat and num features
    numeric_transformer = Pipeline(steps=[
        ('imputer_n', SimpleImputer(strategy='median')),
        ('scaler', StandardScaler()),
            ])

    categorical_transformer = Pipeline(steps=[
        ('imputer_c', SimpleImputer(strategy='constant', fill_value='missing')),
        ('ohe', OneHotEncoder(handle_unknown='ignore')),
            ])

    # level 2 - wrap the two level 1 pipes into a ColumnTransformer
    preprocessor = ColumnTransformer(
            transformers=[
                ('num', numeric_transformer, numeric_features),
                ('cat', categorical_transformer, categorical_features),
                         ])

    # level 3 - pipe it with a classifier
    full_pipe = Pipeline(steps=[
                       ('preprocessor', preprocessor),
                       ('reg', model_simple),
                               ]) 
    
    # Evaluate imputing strategy for missing num values and scaling
    parameters = {
        'preprocessor__num__imputer_n__strategy': ['mean', 'median'],
        'preprocessor__num__scaler' : [None, StandardScaler()]
                 }

    cv = GridSearchCV(full_pipe, param_grid=parameters, scoring=scorer, n_jobs=-1, iid=False,
                      cv=cv, error_score='raise', return_train_score=False, verbose=1)

    grid = cv.fit(X_train, y_train) 
    grid_results = grid.cv_results_

    # Here I have to go for the smallest score (CV expects utility function
    # and not cost function, see Hands-OnML p. 70)
    best_score = np.sqrt(np.min(grid_results['mean_test_score']))
    
    return grid_results, best_score

In [11]:
# Define input parameters
scorer = make_scorer(mean_squared_error)
reg = elastic_net_simple # 'optimized baseline model'
cv = 3

### Explore feature sets

#### Standard train set (with NaN and all Outliers)

In [12]:
# Run pipeline
grid_results, best_score = evaluate_feature_sets(train_data, reg=reg, scorer=scorer, cv=cv)

# Print best score
print(best_score)

Fitting 3 folds for each of 4 candidates, totalling 12 fits


[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done  12 out of  12 | elapsed:   12.9s finished


0.13817449507839052


In [13]:
display(pd.DataFrame(grid_results).nsmallest(1, 'mean_test_score'))

Unnamed: 0,mean_fit_time,std_fit_time,mean_score_time,std_score_time,param_preprocessor__num__imputer_n__strategy,param_preprocessor__num__scaler,params,split0_test_score,split1_test_score,split2_test_score,mean_test_score,std_test_score,rank_test_score
1,1.568139,0.436887,0.089018,0.01974,mean,"StandardScaler(copy=True, with_mean=True, with...",{'preprocessor__num__imputer_n__strategy': 'me...,0.018117,0.020913,0.018247,0.019092,0.001289,4


**Result:** Best score for imputation with mean and applied StandardScaler(). (The latter has an impact but imputation with 'mean' or 'median' leads to more or less the same result.)

#### Train set without columns containing NaN

In [14]:
# Create List of Columns containing NaN
nan_cols = []
for col in train_data.columns:
    if train_data[col].isnull().sum() > 0:
        nan_cols.append(col)

In [15]:
# Check results
nan_cols

['LotFrontage',
 'BsmtQual',
 'BsmtCond',
 'BsmtExposure',
 'BsmtFinType1',
 'BsmtFinType2',
 'GarageType',
 'GarageYrBlt',
 'GarageFinish',
 'GarageQual',
 'GarageCond']

In [16]:
# Create train set without missing values (drop nan_cols)
train_data_reduced = train_data.drop(nan_cols, axis=1)

assert train_data_reduced.isnull().sum().sum() == 0
assert train_data_reduced.shape[1] == train_data.shape[1] - len(nan_cols)

In [17]:
# Run pipeline
grid_results, best_score = evaluate_feature_sets(train_data_reduced, 
                                                 reg=reg, scorer=scorer, cv=cv)

# Print best score
print(best_score)

Fitting 3 folds for each of 4 candidates, totalling 12 fits


[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done  12 out of  12 | elapsed:    4.0s finished


0.13817450160297168


**Result:** Results are really, really close. Imputing with Mean scores slightly better than elimination of the columns.

#### Full train set with outliers removed for top correlating columns

In [18]:
# Remove Outliers for remaining top_corr_cols
top_corr_columns = set(train_data.columns).intersection(set(top_corr_columns))
train_data_outliers = cleaning.remove_outliers_IQR_method(train_data, top_corr_columns)

FullBath
Rows removed: 0

YearRemodAdd
Rows removed: 0

YearBuilt
Rows removed: 7

OverallQual
Rows removed: 1

TotalBsmtSF
Rows removed: 49

GarageArea
Rows removed: 20

SalePrice
Rows removed: 21

1stFlrSF
Rows removed: 1

GrLivArea
Rows removed: 1

TotRmsAbvGrd
Rows removed: 16

GarageCars
Rows removed: 2

GarageYrBlt
Rows removed: 1


Rows removed in total: 119



In [19]:
# Run pipeline
grid_results, best_score = evaluate_feature_sets(train_data_outliers, 
                                                 reg=reg, scorer=scorer, cv=cv)

# Print best score
print(best_score)

Fitting 3 folds for each of 4 candidates, totalling 12 fits


[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done  12 out of  12 | elapsed:    4.9s finished


0.12656302335354497


**Result:** Result on data with removed outliers leads to a better score.

In [20]:
# Check params
display(pd.DataFrame(grid_results).nsmallest(1, 'mean_test_score'))

Unnamed: 0,mean_fit_time,std_fit_time,mean_score_time,std_score_time,param_preprocessor__num__imputer_n__strategy,param_preprocessor__num__scaler,params,split0_test_score,split1_test_score,split2_test_score,mean_test_score,std_test_score,rank_test_score
3,1.45768,0.246447,0.036368,0.005242,median,"StandardScaler(copy=True, with_mean=True, with...",{'preprocessor__num__imputer_n__strategy': 'me...,0.016128,0.015421,0.016506,0.016018,0.00045,4


#### Full train set with outliers removed and multi-correlation columns removed

In [21]:
# Remove Outliers for remaining top_corr_cols in the reduced data set
# GarageYrBlt was a the top_corr_features that was dropped above
cols_multi = set(train_data_outliers.columns).intersection(set(['1stFloor', 'GarageArea', 'FirstFlSF']))
train_data_multi = cleaning.delete_columns(train_data_outliers,  cols_multi)

assert train_data_multi.shape[1] == train_data_outliers.shape[1] - len(cols_multi)

'GarageArea successfully deleted'

In [22]:
# Run pipeline
grid_results, best_score = evaluate_feature_sets(train_data_multi, 
                                                 reg=reg, scorer=scorer, cv=cv)

# Print best score
print(best_score)

Fitting 3 folds for each of 4 candidates, totalling 12 fits


[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done  12 out of  12 | elapsed:    4.4s finished


0.12658308430203355


**Result:** Result on data with removed multi_col(s) is slightly worse.

## Final tuning & evaluation

The full train_set with mean imputation, scaling and outlier removal scored the best results. So I will use this config. I will do the OHE outside the pipeline so I have no problems afterwards, when I want to inspect feature importance. I will also tune the regressor inside the pipeline this time.

In [23]:
# Split features and target variable
X_temp = train_data_outliers.drop(['SalePrice'], axis = 1)
y = train_data_outliers['SalePrice'].copy()
# One-Hot-Encode features and save column names
X = pd.get_dummies(X_temp, dummy_na=True)
feature_names = X.columns
# Save names for cat and num features separately (for pipeline)
num_features = list(X_temp.select_dtypes(include=['float64', 'int64']).columns)
cat_features = list(set(feature_names).difference(set(num_features)))
assert len(num_features) + len(cat_features) == len(feature_names)
                        
# Train-test split
X_train, X_test, y_train, y_test  = train_test_split(X, y, test_size = 0.2, random_state = 666)

In [24]:
# Check feature_names
print(feature_names[:5])
print("Number of one-hot-encoded features: ", len(feature_names))

Index(['MSSubClass', 'LotFrontage', 'LotArea', 'OverallQual', 'OverallCond'], dtype='object')
Number of one-hot-encoded features:  308


Note: Numeric features are passed before categorical ones.

In [25]:
# Assemble pipeline (define function)
def build_final_pipe(X_train, y_train, reg):
    """Build a pipeline for preprocessing and modelling.
    
    ARGUMENTS:
        X_train: training features (df or array)
        y_train: training labels (df or array)
        reg: classifier (sk-learn model object)
        
    RETURNS:
        full_pipe: pipeline object
    """
    
    # Define cat and num feature columns
    categorical_features = X_train[cat_features].columns
    numeric_features = X_train[num_features].columns
    assert len(categorical_features) + len(numeric_features) == X_train.shape[1]
    
    # level 1 - two separate pipes for cat and num features
    numeric_transformer = Pipeline(steps=[
        ('imputer_n', SimpleImputer(strategy='mean')),
        ('scaler', StandardScaler()),
            ])

    categorical_transformer = Pipeline(steps=[
        ('pass', transform.PassthroughTransformer()), # Simple Passtrough
            ])

    # level 2 - wrap the two level 1 pipes into a ColumnTransformer
    preprocessor = ColumnTransformer(
            transformers=[
                ('num', numeric_transformer, numeric_features),
                ('cat', categorical_transformer, categorical_features),
                         ])

    # level 3 - pipe it with a classifier
    full_pipe = Pipeline(steps=[
                       ('preprocessor', preprocessor),
                       ('reg', reg),
                               ]) 
    
    return preprocessor, full_pipe

In [26]:
# Define final model withouth CV, some parameters changed. 
# Among others I lowered l1_ratio to have more stability (trading in some precision)
elastic_net_final = ElasticNet(alpha=0.0009, l1_ratio=0.5, max_iter=3000,
                               fit_intercept=True, normalize=False, precompute=False, 
                                 tol=0.0001, copy_X=True, warm_start=False, 
                                 positive=False, random_state=666)

# Build, fit, predict
preprocessor_final, full_pipe_final = build_final_pipe(X_train, y_train, elastic_net_final) 
full_pipe_final.fit(X_train, y_train)
y_pred = full_pipe_final.predict(X_test)

In [27]:
print('Test r2 score: ', r2_score(y_test, y_pred))
test_mse = mean_squared_error(y_pred, y_test)
test_rmse = np.sqrt(test_mse)
print('Test RMSE: %.4f' % test_rmse)

Test r2 score:  0.9232344582903343
Test RMSE: 0.0911


In [28]:
# Print number of coeffs of fitted model
elastic_net_fitted = full_pipe_final.named_steps['reg']
print(elastic_net_fitted)
print(len(elastic_net_final.coef_))

ElasticNet(alpha=0.0009, copy_X=True, fit_intercept=True, l1_ratio=0.5,
      max_iter=3000, normalize=False, positive=False, precompute=False,
      random_state=666, selection='cyclic', tol=0.0001, warm_start=False)
308


### Save final data and model

In [29]:
joblib.dump(elastic_net_fitted, 'models/elastic_net_final.pkl')
%store feature_names

Stored 'feature_names' (Index)


---

## Apppendix: Experiment with preprocessing pipe only 

(**Note to myself**: instead of redefining a preprocessing_pipe I could simply access the preprocessor step from the full_pipe and then call fit_transform() on it, as done in OHE_SMOTENC pipe in the feature_engineering ref code.)

In [30]:
preprocessed_data = pd.DataFrame(preprocessor_final.fit_transform(X_train))

In [33]:
# Have a look
preprocessed_data.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,...,258,259,260,261,262,263,264,265,266,267,268,269,270,271,272,273,274,275,276,277,278,279,280,281,282,283,284,285,286,287,288,289,290,291,292,293,294,295,296,297,298,299,300,301,302,303,304,305,306,307
0,1.439241,-2.059319,-1.102444,0.695955,-0.559401,1.190503,1.130153,1.160692,0.773608,-0.36544,0.175423,0.80654,0.616268,-0.863385,-0.115326,-0.236465,1.10337,-0.23584,0.869888,-0.777852,-1.118721,-0.081748,-0.271242,0.621969,1.233594,0.350952,0.825997,1.032477,0.879823,-0.40153,-0.138279,-0.285155,-0.053222,-0.194815,-0.870215,0.91005,0.0,0.0,0.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,...,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0
1,1.439241,-0.9808,-0.470414,0.695955,-0.559401,1.12201,1.03206,1.043575,-1.505934,-0.36544,0.6448,0.931397,0.752522,-0.863385,-0.115326,-0.104225,-0.823898,-0.23584,0.869888,-0.777852,-1.118721,-0.081748,-0.271242,0.621969,1.107974,0.350952,0.040435,0.97817,0.749553,-0.40153,-0.138279,-0.285155,-0.053222,-0.194815,0.997546,-1.323708,0.0,0.0,0.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,...,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0
2,-1.105583,0.0,1.234888,-0.086166,-0.559401,-0.110866,-0.78267,1.290644,0.814614,-0.36544,0.402273,1.857072,1.762695,-0.863385,-0.115326,0.876194,1.10337,-0.23584,0.869888,-0.777852,0.205785,-0.081748,0.429632,0.621969,-0.399472,0.350952,0.381534,-0.968475,0.917033,2.628924,-0.138279,-0.285155,-0.053222,-0.194815,2.118203,-1.323708,0.0,0.0,0.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,...,1.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0
3,1.439241,-1.551026,-0.879152,1.478075,-0.559401,0.950778,0.737779,-0.840471,0.82307,-0.36544,0.105918,1.219037,1.405019,-0.863385,-0.115326,0.529053,1.10337,-0.23584,0.869888,-0.777852,-1.118721,-0.081748,-0.972117,0.621969,0.898606,0.350952,0.16964,1.075376,0.868523,-0.40153,-0.138279,-0.285155,-0.053222,-0.194815,1.371099,0.91005,0.0,0.0,0.0,1.0,1.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,...,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0
4,-1.105583,0.497312,-0.04942,-1.650407,0.382095,-0.521825,-1.37123,-0.840471,0.545121,-0.36544,0.335894,-0.342663,-0.637836,-0.863385,-0.115326,-1.453631,-0.823898,-0.23584,-1.011621,-0.777852,0.205785,-0.081748,-0.972117,-0.962038,-0.60884,-1.055503,-0.590081,-0.968475,-1.109654,-0.40153,-0.138279,-0.285155,-0.053222,-0.194815,0.250442,0.165464,0.0,0.0,0.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,...,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,1.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0


---