<a href="https://colab.research.google.com/github/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_08_2_keras_ensembles.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# T81-558: Applications of Deep Neural Networks
**Module 8: Kaggle Data Sets**
* Instructor: [Jeff Heaton](https://sites.wustl.edu/jeffheaton/), McKelvey School of Engineering, [Washington University in St. Louis](https://engineering.wustl.edu/Programs/Pages/default.aspx)
* For more information visit the [class website](https://sites.wustl.edu/jeffheaton/t81-558/).

# Module 8 Material

* Part 8.1: Introduction to Kaggle [[Video]](https://www.youtube.com/watch?v=v4lJBhdCuCU&list=PLjy4p-07OYzulelvJ5KVaT2pDlxivl_BN) [[Notebook]](https://github.com/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_08_1_kaggle_intro.ipynb)
* **Part 8.2: Building Ensembles with Scikit-Learn and Keras** [[Video]](https://www.youtube.com/watch?v=LQ-9ZRBLasw&list=PLjy4p-07OYzulelvJ5KVaT2pDlxivl_BN) [[Notebook]](https://github.com/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_08_2_keras_ensembles.ipynb)
* Part 8.3: How Should you Architect Your Keras Neural Network: Hyperparameters [[Video]](https://www.youtube.com/watch?v=1q9klwSoUQw&list=PLjy4p-07OYzulelvJ5KVaT2pDlxivl_BN) [[Notebook]](https://github.com/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_08_3_keras_hyperparameters.ipynb)
* Part 8.4: Bayesian Hyperparameter Optimization for Keras [[Video]](https://www.youtube.com/watch?v=sXdxyUCCm8s&list=PLjy4p-07OYzulelvJ5KVaT2pDlxivl_BN) [[Notebook]](https://github.com/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_08_4_bayesian_hyperparameter_opt.ipynb)
* Part 8.5: Current Semester's Kaggle [[Video]](https://www.youtube.com/watch?v=PHQt0aUasRg&list=PLjy4p-07OYzulelvJ5KVaT2pDlxivl_BN) [[Notebook]](https://github.com/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_08_5_kaggle_project.ipynb)


# Google CoLab Instructions

The following code ensures that Google CoLab is running the correct version of TensorFlow.
  Running the following code will map your GDrive to ```/content/drive```.

In [1]:
try:
    from google.colab import drive
    drive.mount('/content/drive', force_remount=True)
    COLAB = True
    print("Note: using Google CoLab")
    %tensorflow_version 2.x
except:
    print("Note: not using Google CoLab")
    COLAB = False

# Nicely formatted time string
def hms_string(sec_elapsed):
    h = int(sec_elapsed / (60 * 60))
    m = int((sec_elapsed % (60 * 60)) / 60)
    s = sec_elapsed % 60
    return "{}:{:>02}:{:>05.2f}".format(h, m, s)

Note: not using Google CoLab
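As a quick check of the `hms_string` helper defined above, the sketch below repeats the function so that it runs on its own and formats one duration. The value of 3661.5 seconds is only an illustrative input, not something taken from the course material.

```python
# Same helper as in the cell above, repeated so this snippet is self-contained
def hms_string(sec_elapsed):
    h = int(sec_elapsed / (60 * 60))
    m = int((sec_elapsed % (60 * 60)) / 60)
    s = sec_elapsed % 60
    return "{}:{:>02}:{:>05.2f}".format(h, m, s)

# 3661.5 seconds is 1 hour, 1 minute, and 1.5 seconds
elapsed_text = hms_string(3661.5)
print(elapsed_text)
```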


# Part 8.2: Building Ensembles with Scikit-Learn and Keras

### Evaluating Feature Importance

Feature importance tells us how important each feature (from the feature/input vector) is to the predictions of a neural network or other model. There are many different ways to evaluate the feature importance of neural networks. The following paper presents an excellent (and readable) overview of the various means of assessing the significance of neural network inputs/features.

* An accurate comparison of methods for quantifying variable importance in artificial neural networks using simulated data [[Cite:olden2004accurate]](http://depts.washington.edu/oldenlab/wordpress/wp-content/uploads/2013/03/EcologicalModelling_2004.pdf). *Ecological Modelling*, 178(3), 389-397.

In summary, the following methods are available to neural networks:

* Connection Weights Algorithm
* Partial Derivatives
* Input Perturbation
* Sensitivity Analysis
* Forward Stepwise Addition
* Improved Stepwise Selection 1
* Backward Stepwise Elimination
* Improved Stepwise Selection 2

For this chapter, we will use the **input perturbation feature ranking algorithm**. This algorithm will work with any regression or classification network. In the next section, I provide an implementation of the **input perturbation algorithm** for scikit-learn. The function defined below will work with any model that exposes a scikit-learn-style `predict` method (for classification, `predict` should return class probabilities, as Keras models do).

[Leo Breiman](https://en.wikipedia.org/wiki/Leo_Breiman) provided this algorithm in his seminal paper on random forests. [[Cite:breiman2001random]](https://www.stat.berkeley.edu/~breiman/randomforest2001.pdf)  Although he presented this algorithm in conjunction with random forests, it is model-independent and appropriate for any supervised learning model.  This algorithm, known as the **input perturbation algorithm**, works by evaluating a trained model's accuracy with each input individually shuffled from a data set. Shuffling an input causes it to become useless, effectively removing it from the model. More important inputs will produce a less accurate score when they are removed by shuffling them. This process makes sense because important features will contribute to the model's accuracy. I first presented the TensorFlow implementation of this algorithm in the following paper.

* Early stabilizing feature importance for TensorFlow deep neural networks[[Cite:heaton2017early]](https://www.heatonresearch.com/dload/phd/IJCNN%202017-v2-final.pdf)

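This algorithm will use log loss to evaluate a classification problem and a squared-error measure for regression. The implementation below computes the mean squared error (MSE); because RMSE is simply the square root of MSE, ranking by either gives the same feature order.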

In [2]:
import numpy as np
import pandas as pd  # needed for the result DataFrame built below
from sklearn import metrics

def perturbation_rank(model, x, y, names, regression):
    errors = []

    for i in range(x.shape[1]):
        hold = np.array(x[:, i]) # save for future roll back
        np.random.shuffle(x[:, i]) # in place shuffle of the feature column
        
        if regression:
            pred = model.predict(x)
            error = metrics.mean_squared_error(y, pred)
        else:
            pred = model.predict(x)
            error = metrics.log_loss(y, pred)
            
        errors.append(error)
        x[:, i] = hold # roll back
        
    max_error = np.max(errors)
    importance = [e/max_error for e in errors]

    data = {'name':names,'error':errors,'importance':importance}
    result = pd.DataFrame(data, columns = ['name','error','importance'])
    result.sort_values(by=['importance'], ascending=False, inplace=True)
    result.reset_index(inplace=True, drop=True)
    return result
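
To make the idea concrete, here is a minimal, self-contained sketch of the same shuffling technique on synthetic data. Everything in it is an assumption made for illustration: the data, the `LinearRegression` model, and the helper name `shuffled_column_errors` are not part of the course code. Unlike `perturbation_rank`, this sketch shuffles a copy of the data rather than shuffling in place and rolling back.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
n = 500
x_demo = rng.normal(size=(n, 3))
# Column 0 drives the target strongly, column 1 weakly, column 2 not at all
y_demo = 5.0 * x_demo[:, 0] + 1.0 * x_demo[:, 1] \
    + rng.normal(scale=0.1, size=n)

demo_model = LinearRegression().fit(x_demo, y_demo)

def shuffled_column_errors(model, x, y, rng):
    """Return the MSE obtained after shuffling each column in turn."""
    errors = []
    for i in range(x.shape[1]):
        x_perturbed = x.copy()  # leave the caller's data untouched
        x_perturbed[:, i] = rng.permutation(x_perturbed[:, i])
        errors.append(mean_squared_error(y, model.predict(x_perturbed)))
    return np.array(errors)

demo_errors = shuffled_column_errors(demo_model, x_demo, y_demo, rng)
demo_importance = demo_errors / demo_errors.max()  # largest error maps to 1.0
print(demo_importance)
```

Shuffling the strongly weighted column damages the predictions the most, so it receives an importance of 1.0, while the pure-noise column ends up near zero.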

## Classification and Input Perturbation Ranking

We now look at the code to perform perturbation ranking for a classification neural network.  The technique is slightly different for classification vs. regression, so the `perturbation_rank` function takes a `regression` flag.  The primary difference between classification and regression is how we evaluate the error of the neural network in each of these two network types.  For regression, we will use a squared-error calculation (RMSE or, as in the function above, its square, MSE), whereas we will use log loss for classification.
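
As a small illustration of the two error measures, the sketch below evaluates log loss on a few made-up class probabilities and MSE/RMSE on a few made-up regression predictions. All of the numbers are assumptions chosen only to keep the arithmetic easy to follow.

```python
import numpy as np
from sklearn import metrics

# Classification: true class indices and predicted class probabilities
y_true_cls = np.array([0, 1, 2])
y_prob = np.array([[0.8, 0.1, 0.1],
                   [0.2, 0.7, 0.1],
                   [0.1, 0.2, 0.7]])
# Log loss is the mean negative log of the probability given to the true class
cls_error = metrics.log_loss(y_true_cls, y_prob, labels=[0, 1, 2])

# Regression: true values and predictions
y_true_reg = np.array([20.0, 30.0, 25.0])
y_pred_reg = np.array([22.0, 27.0, 25.0])
mse = metrics.mean_squared_error(y_true_reg, y_pred_reg)
rmse = np.sqrt(mse)  # RMSE is the square root of MSE

print(f"log loss: {cls_error}, MSE: {mse}, RMSE: {rmse}")
```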

The code presented below creates a classification neural network that will predict the classic iris dataset.

In [3]:
# HIDE OUTPUT
import pandas as pd
import io
import requests
import numpy as np
from sklearn import metrics
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation
from tensorflow.keras.callbacks import EarlyStopping
from sklearn.model_selection import train_test_split

df = pd.read_csv(
    "https://data.heatonresearch.com/data/t81-558/iris.csv", 
    na_values=['NA', '?'])

# Convert to numpy - Classification
x = df[['sepal_l', 'sepal_w', 'petal_l', 'petal_w']].values
dummies = pd.get_dummies(df['species']) # Classification
species = dummies.columns
y = dummies.values

# Split into train/test
x_train, x_test, y_train, y_test = train_test_split(    
    x, y, test_size=0.25, random_state=42)

# Build neural network
model = Sequential()
model.add(Dense(50, input_dim=x.shape[1], activation='relu')) # Hidden 1
model.add(Dense(25, activation='relu')) # Hidden 2
model.add(Dense(y.shape[1],activation='softmax')) # Output
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.fit(x_train,y_train,verbose=2,epochs=100)

Epoch 1/100
4/4 - 0s - loss: 1.2890
Epoch 2/100
4/4 - 0s - loss: 1.1851
Epoch 3/100
4/4 - 0s - loss: 1.1268
Epoch 4/100
4/4 - 0s - loss: 1.0875
Epoch 5/100
4/4 - 0s - loss: 1.0533
Epoch 6/100
4/4 - 0s - loss: 1.0191
Epoch 7/100
4/4 - 0s - loss: 0.9897
Epoch 8/100
4/4 - 0s - loss: 0.9629
Epoch 9/100
4/4 - 0s - loss: 0.9339
Epoch 10/100
4/4 - 0s - loss: 0.9067
Epoch 11/100
4/4 - 0s - loss: 0.8772
Epoch 12/100
4/4 - 0s - loss: 0.8493
Epoch 13/100
4/4 - 0s - loss: 0.8214
Epoch 14/100
4/4 - 0s - loss: 0.7913
Epoch 15/100
4/4 - 0s - loss: 0.7643
Epoch 16/100
4/4 - 0s - loss: 0.7381
Epoch 17/100
4/4 - 0s - loss: 0.7095
Epoch 18/100
4/4 - 0s - loss: 0.6843
Epoch 19/100
4/4 - 0s - loss: 0.6605
Epoch 20/100
4/4 - 0s - loss: 0.6375
Epoch 21/100
4/4 - 0s - loss: 0.6159
Epoch 22/100
4/4 - 0s - loss: 0.5954
Epoch 23/100
4/4 - 0s - loss: 0.5766
Epoch 24/100
4/4 - 0s - loss: 0.5577
Epoch 25/100
4/4 - 0s - loss: 0.5429
Epoch 26/100
4/4 - 0s - loss: 0.5272
Epoch 27/100
4/4 - 0s - loss: 0.5116
Epoch 28/1

<tensorflow.python.keras.callbacks.History at 0x7f2d881afcd0>

Next, we evaluate the accuracy of the trained model.  Here we see that the neural network performs great, with an accuracy of 1.0.  We might fear overfitting with such high accuracy for a more complex dataset.  However, for this example, we are more interested in determining the importance of each column.

In [4]:
from sklearn.metrics import accuracy_score

pred = model.predict(x_test)
predict_classes = np.argmax(pred,axis=1)
expected_classes = np.argmax(y_test,axis=1)
correct = accuracy_score(expected_classes,predict_classes)
print(f"Accuracy: {correct}")

Accuracy: 1.0


We are now ready to call the input perturbation algorithm.  First, we extract the column names and remove the target column.  The target column is not ranked because it is the objective, not one of the inputs.

We can see the importance displayed in the following table.  The most important column always has an importance of 1.0, and lesser columns follow in descending order.  The least important column will have the lowest importance.

In [5]:
# Rank the features
from IPython.display import display, HTML

names = list(df.columns) # x+y column names
names.remove("species") # remove the target(y)
rank = perturbation_rank(model, x_test, y_test, names, False)
display(rank)

| | name | error | importance |
|---|---|---|---|
| 0 | petal_l | 2.164575 | 1.0 |
| 1 | petal_w | 0.638275 | 0.294873 |
| 2 | sepal_l | 0.287762 | 0.132941 |
| 3 | sepal_w | 0.088264 | 0.040777 |
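
The importance column above is each feature's error divided by the largest error, which is why the top feature always scores 1.0. The short sketch below reproduces that arithmetic from the error values shown in the table; the variable names are chosen here only for illustration.

```python
import numpy as np

# Error values copied from the table above (petal_l, petal_w, sepal_l, sepal_w)
iris_errors = np.array([2.164575, 0.638275, 0.287762, 0.088264])

# Same normalization as perturbation_rank: divide by the maximum error
iris_importance = iris_errors / iris_errors.max()
print(np.round(iris_importance, 6))
```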


## Regression and Input Perturbation Ranking

We now see how to use input perturbation ranking for a regression neural network.  We will use the MPG dataset as a demonstration.  The code below loads the MPG dataset and creates a regression neural network for this dataset.  The code trains the neural network and then generates predictions.

In [6]:
# HIDE OUTPUT
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation
from sklearn.model_selection import train_test_split
import pandas as pd
import io
import os
import requests
import numpy as np
from sklearn import metrics

save_path = "."

df = pd.read_csv(
    "https://data.heatonresearch.com/data/t81-558/auto-mpg.csv", 
    na_values=['NA', '?'])

cars = df['name']

# Handle missing value
df['horsepower'] = df['horsepower'].fillna(df['horsepower'].median())

# Pandas to Numpy
x = df[['cylinders', 'displacement', 'horsepower', 'weight',
       'acceleration', 'year', 'origin']].values
y = df['mpg'].values # regression

# Split into train/test
x_train, x_test, y_train, y_test = train_test_split(    
    x, y, test_size=0.25, random_state=42)

# Build the neural network
model = Sequential()
model.add(Dense(25, input_dim=x.shape[1], activation='relu')) # Hidden 1
model.add(Dense(10, activation='relu')) # Hidden 2
model.add(Dense(1)) # Output
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(x_train,y_train,verbose=2,epochs=100)

# Predict
pred = model.predict(x)

Epoch 1/100
10/10 - 0s - loss: 510750.8750
Epoch 2/100
10/10 - 0s - loss: 224474.1406
Epoch 3/100
10/10 - 0s - loss: 72071.7500
Epoch 4/100
10/10 - 0s - loss: 12176.7432
Epoch 5/100
10/10 - 0s - loss: 678.3466
Epoch 6/100
10/10 - 0s - loss: 1568.2787
Epoch 7/100
10/10 - 0s - loss: 1506.3440
Epoch 8/100
10/10 - 0s - loss: 665.5508
Epoch 9/100
10/10 - 0s - loss: 345.0493
Epoch 10/100
10/10 - 0s - loss: 338.1442
Epoch 11/100
10/10 - 0s - loss: 339.9945
Epoch 12/100
10/10 - 0s - loss: 326.8102
Epoch 13/100
10/10 - 0s - loss: 321.8234
Epoch 14/100
10/10 - 0s - loss: 321.2518
Epoch 15/100
10/10 - 0s - loss: 320.0280
Epoch 16/100
10/10 - 0s - loss: 318.0003
Epoch 17/100
10/10 - 0s - loss: 315.9073
Epoch 18/100
10/10 - 0s - loss: 314.4383
Epoch 19/100
10/10 - 0s - loss: 312.2990
Epoch 20/100
10/10 - 0s - loss: 310.6756
Epoch 21/100
10/10 - 0s - loss: 308.5789
Epoch 22/100
10/10 - 0s - loss: 308.0361
Epoch 23/100
10/10 - 0s - loss: 304.7319
Epoch 24/100
10/10 - 0s - loss: 302.9597
Epoch 25/100


Just as before, we extract the column names and discard the target.  We can now create a ranking of the importance of each of the input features.  The feature with a ranking of 1.0 is the most important.

In [7]:
# Rank the features
from IPython.display import display, HTML

names = list(df.columns) # x+y column names
names.remove("name")
names.remove("mpg") # remove the target(y)
rank = perturbation_rank(model, x_test, y_test, names, True)
display(rank)

| | name | error | importance |
|---|---|---|---|
| 0 | year | 141.345052 | 1.0 |
| 1 | weight | 136.310289 | 0.96438 |
| 2 | cylinders | 133.959119 | 0.947745 |
| 3 | origin | 133.687091 | 0.945821 |
| 4 | horsepower | 130.99431 | 0.92677 |
| 5 | acceleration | 128.784298 | 0.911134 |
| 6 | displacement | 83.875616 | 0.59341 |


## Biological Response with Neural Network

The following sections will demonstrate how to use feature importance ranking and ensembling with a more complex dataset. Ensembling is the process where you combine multiple models for greater accuracy. Kaggle competition winners frequently make use of ensembling for high-ranking solutions.

We will use the biological response dataset, a Kaggle dataset, where there is an unusually high number of columns. Because of the large number of columns, it is essential to use feature ranking to determine the importance of these columns. We begin by loading the dataset and preprocessing. This Kaggle dataset is a binary classification problem. You must predict if certain conditions will cause a biological response.

* [Predicting a Biological Response](https://www.kaggle.com/c/bioresponse)

In [8]:
import pandas as pd
import os
import numpy as np
from sklearn import metrics
from scipy.stats import zscore
from sklearn.model_selection import KFold
from IPython.display import HTML, display

URL = "https://data.heatonresearch.com/data/t81-558/kaggle/"

df_train = pd.read_csv(
    URL+"bio_train.csv", 
    na_values=['NA', '?'])

df_test = pd.read_csv(
    URL+"bio_test.csv", 
    na_values=['NA', '?'])

activity_classes = df_train['Activity']

A large number of columns is evident when we display the shape of the dataset.

In [9]:
print(df_train.shape)

(3751, 1777)


The following code constructs a classification neural network and trains it for the biological response dataset.  Once trained, the accuracy is measured.

In [10]:
import os
import pandas as pd
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation
from sklearn.model_selection import train_test_split
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np
import sklearn

# Encode feature vector
# Convert to numpy - Classification
x_columns = df_train.columns.drop('Activity')
x = df_train[x_columns].values
y = df_train['Activity'].values # Classification
x_submit = df_test[x_columns].values.astype(np.float32)


# Split into train/test
x_train, x_test, y_train, y_test = train_test_split(    
    x, y, test_size=0.25, random_state=42) 

print("Fitting/Training...")
model = Sequential()
model.add(Dense(25, input_dim=x.shape[1], activation='relu'))
model.add(Dense(10))
model.add(Dense(1,activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam')
monitor = EarlyStopping(monitor='val_loss', min_delta=1e-3, 
                        patience=5, verbose=1, mode='auto')
model.fit(x_train,y_train,validation_data=(x_test,y_test),
          callbacks=[monitor],verbose=0,epochs=1000)
print("Fitting done...")

# Predict
pred = model.predict(x_test).flatten()


# Clip so that min is never exactly 0, max never 1
pred = np.clip(pred,a_min=1e-6,a_max=(1-1e-6)) 
print("Validation logloss: {}".format(
    sklearn.metrics.log_loss(y_test,pred)))

# Evaluate success using accuracy
pred = pred>0.5 # If greater than 0.5 probability, then true
score = metrics.accuracy_score(y_test, pred)
print("Validation accuracy score: {}".format(score))

# Build real submit file
pred_submit = model.predict(x_submit)

# Clip so that min is never exactly 0, max never 1 (would be a NaN score)
pred_submit = np.clip(pred_submit,a_min=1e-6,a_max=(1-1e-6)) 
submit_df = pd.DataFrame({
    'MoleculeId': list(range(1, len(pred_submit) + 1)),
    'PredictedProbability': pred_submit.flatten()})
submit_df.to_csv("submit.csv", index=False)

Fitting/Training...
Epoch 00010: early stopping
Fitting done...
Validation logloss: 0.6309741323700163
Validation accuracy score: 0.7494669509594882
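
The clipping step in the cell above matters because log loss takes the logarithm of the predicted probability, and a prediction of exactly 0 or 1 for the wrong class makes the loss infinite. The sketch below shows the effect on three made-up predictions. The helper `binary_log_loss` is a hypothetical stand-in written for this illustration, not a library function.

```python
import numpy as np

def binary_log_loss(y_true, p):
    """Plain binary log loss with no clipping (illustrative helper)."""
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

y_true_demo = np.array([1, 0, 0])
raw_pred = np.array([0.0, 0.25, 1.0])  # two confidently wrong predictions

# Same clipping bounds as the cell above
clipped_pred = np.clip(raw_pred, a_min=1e-6, a_max=(1 - 1e-6))

# Silence NumPy's divide-by-zero warning for the unclipped case
with np.errstate(divide='ignore', invalid='ignore'):
    unclipped_loss = binary_log_loss(y_true_demo, raw_pred)
clipped_loss = binary_log_loss(y_true_demo, clipped_pred)

print(unclipped_loss, clipped_loss)
```

With clipping, the confidently wrong predictions are still penalized heavily, but the score stays finite.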


## What Features/Columns are Important
The following uses perturbation ranking to evaluate the neural network.

In [11]:
# Rank the features
from IPython.display import display, HTML

names = list(df_train.columns) # x+y column names
names.remove("Activity") # remove the target(y)
rank = perturbation_rank(model, x_test, y_test, names, False)
display(rank[0:10])

| | name | error | importance |
|---|---|---|---|
| 0 | D27 | 0.690416 | 1.0 |
| 1 | D1012 | 0.645148 | 0.934433 |
| 2 | D51 | 0.643758 | 0.932421 |
| 3 | D1100 | 0.642325 | 0.930344 |
| 4 | D1059 | 0.641669 | 0.929394 |
| 5 | D1128 | 0.640544 | 0.927764 |
| 6 | D1402 | 0.639555 | 0.926332 |
| 7 | D1218 | 0.639551 | 0.926326 |
| 8 | D1159 | 0.639223 | 0.925851 |
| 9 | D1192 | 0.638954 | 0.925462 |
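
One possible use of such a ranking, sketched below, is to keep only the columns whose importance clears a cutoff. The DataFrame `rank_demo` copies the first four rows of the table above, and the 0.932 threshold is an arbitrary assumption made for illustration, not a recommendation from the course.

```python
import pandas as pd

# First four rows of the ranking above, rebuilt by hand for this sketch
rank_demo = pd.DataFrame({
    'name': ['D27', 'D1012', 'D51', 'D1100'],
    'importance': [1.0, 0.934433, 0.932421, 0.930344]})

THRESHOLD = 0.932  # hypothetical cutoff
top_names = rank_demo.loc[
    rank_demo['importance'] >= THRESHOLD, 'name'].tolist()
print(top_names)
```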


## Neural Network Ensemble

A neural network ensemble combines neural network predictions with other models. The program determines the exact blend of these models by logistic regression: each model's out-of-fold predicted probabilities become one input column of a logistic regression that is fit against the true labels. The following code performs this blend for a classification.  **If you present the final predictions from the ensemble to Kaggle, you will see that the result is very accurate.**

In [12]:
# HIDE OUTPUT
import numpy as np
import os
import pandas as pd
import math
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import StratifiedKFold
from sklearn.ensemble import RandomForestClassifier 
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression

SHUFFLE = False
FOLDS = 10

def build_ann(input_size,classes,neurons):
    model = Sequential()
    model.add(Dense(neurons, input_dim=input_size, activation='relu'))
    model.add(Dense(1))
    model.add(Dense(classes,activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam')
    return model

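def mlogloss(y_test, preds):
    # Multi-class log loss: mean negative log of the probability
    # that each row assigns to its true class.
    epsilon = 1e-15
    total = 0  # named total to avoid shadowing the built-in sum()
    for row in zip(preds,y_test):
        x = row[0][row[1]]
        x = max(epsilon,x)
        x = min(1-epsilon,x)
        total+=math.log(x)
    return( (-1/len(preds))*total)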

def stretch(y):
    return (y - y.min()) / (y.max() - y.min()) # min-max scale


def blend_ensemble(x, y, x_submit):
    kf = StratifiedKFold(FOLDS)
    folds = list(kf.split(x,y))

    models = [
        KerasClassifier(build_fn=build_ann,neurons=20,
                    input_size=x.shape[1],classes=2),
        KNeighborsClassifier(n_neighbors=3),
        RandomForestClassifier(n_estimators=100, n_jobs=-1, 
                               criterion='gini'),
        RandomForestClassifier(n_estimators=100, n_jobs=-1, 
                               criterion='entropy'),
        ExtraTreesClassifier(n_estimators=100, n_jobs=-1, 
                             criterion='gini'),
        ExtraTreesClassifier(n_estimators=100, n_jobs=-1, 
                             criterion='entropy'),
        GradientBoostingClassifier(learning_rate=0.05, 
                subsample=0.5, max_depth=6, n_estimators=50)]

    dataset_blend_train = np.zeros((x.shape[0], len(models)))
    dataset_blend_test = np.zeros((x_submit.shape[0], len(models)))

    for j, model in enumerate(models):
        print("Model: {} : {}".format(j, model) )
        fold_sums = np.zeros((x_submit.shape[0], len(folds)))
        total_loss = 0
        for i, (train, test) in enumerate(folds):
            x_train = x[train]
            y_train = y[train]
            x_test = x[test]
            y_test = y[test]
            model.fit(x_train, y_train)
            pred = np.array(model.predict_proba(x_test))
            dataset_blend_train[test, j] = pred[:, 1]
            pred2 = np.array(model.predict_proba(x_submit))
            fold_sums[:, i] = pred2[:, 1]
            loss = mlogloss(y_test, pred)
            total_loss+=loss
            print("Fold #{}: loss={}".format(i,loss))
        print("{}: Mean loss={}".format(model.__class__.__name__,
                                        total_loss/len(folds)))
        dataset_blend_test[:, j] = fold_sums.mean(1)

    print()
    print("Blending models.")
    blend = LogisticRegression(solver='lbfgs')
    blend.fit(dataset_blend_train, y)
    return blend.predict_proba(dataset_blend_test)

if __name__ == '__main__':

    np.random.seed(42)  # seed to shuffle the train set

    print("Loading data...")
    URL = "https://data.heatonresearch.com/data/t81-558/kaggle/"

    df_train = pd.read_csv(
        URL+"bio_train.csv", 
        na_values=['NA', '?'])

    df_submit = pd.read_csv(
        URL+"bio_test.csv", 
        na_values=['NA', '?'])

    predictors = list(df_train.columns.values)
    predictors.remove('Activity')
    x = df_train[predictors].values
    y = df_train['Activity'].values  # NumPy array so fold indexing is positional
    x_submit = df_submit.values

    if SHUFFLE:
        idx = np.random.permutation(y.size)
        x = x[idx]
        y = y[idx]

    submit_data = blend_ensemble(x, y, x_submit)
    submit_data = stretch(submit_data)

    ####################
    # Build submit file
    ####################
    ids = [id+1 for id in range(submit_data.shape[0])]
    submit_df = pd.DataFrame({'MoleculeId': ids, 
                              'PredictedProbability': 
                              submit_data[:, 1]},
                             columns=['MoleculeId',
                            'PredictedProbability'])
    submit_df.to_csv("submit.csv", index=False)

Loading data...
Model: 0 : <tensorflow.python.keras.wrappers.scikit_learn.KerasClassifier object at 0x7f2d73be5790>
Fold #0: loss=0.527560151251493
Fold #1: loss=0.5182472816656988
Fold #2: loss=0.5308221397375007
Fold #3: loss=0.5321196101426188
Fold #4: loss=0.554880651476705
Fold #5: loss=0.6079964678134003
Fold #6: loss=0.5248630124504088
Fold #7: loss=0.5357807272014603
Fold #8: loss=0.5586331315921277
Fold #9: loss=0.5232840511410385
KerasClassifier: Mean loss=0.5414187224472451
Model: 1 : KNeighborsClassifier(n_neighbors=3)


Exception ignored on calling ctypes callback function: <function _ThreadpoolInfo._find_modules_with_dl_iterate_phdr.<locals>.match_module_callback at 0x7f2d84222820>
Traceback (most recent call last):
  File "/home/adeng/miniconda3/envs/tensorflow/lib/python3.8/site-packages/threadpoolctl.py", line 400, in match_module_callback
    self._make_module_from_path(filepath)
  File "/home/adeng/miniconda3/envs/tensorflow/lib/python3.8/site-packages/threadpoolctl.py", line 515, in _make_module_from_path
    module = module_class(filepath, prefix, user_api, internal_api)
  File "/home/adeng/miniconda3/envs/tensorflow/lib/python3.8/site-packages/threadpoolctl.py", line 606, in __init__
    self.version = self.get_version()
  File "/home/adeng/miniconda3/envs/tensorflow/lib/python3.8/site-packages/threadpoolctl.py", line 646, in get_version
    config = get_config().split()
AttributeError: 'NoneType' object has no attribute 'split'
Exception ignored on calling ctypes callback function: <function

Fold #0: loss=3.606678388314123


Fold #1: loss=2.2256421551487593
Fold #2: loss=3.6815437059542186
Fold #3: loss=2.416161292225968
Fold #4: loss=4.442472310149748
Fold #5: loss=4.321350530738247
Fold #6: loss=3.400455469543658
Fold #7: loss=3.1724147110842513

Fold #8: loss=2.117356283193681
Fold #9: loss=3.0532135963322586
KNeighborsClassifier: Mean loss=3.243728844268491
Model: 2 : RandomForestClassifier(n_jobs=-1)
Fold #0: loss=0.4657177982691548
Fold #1: loss=0.4346825805694879
Fold #2: loss=0.4593868993445528
Fold #3: loss=0.41674899522216713
Fold #4: loss=0.4851849131056564
Fold #5: loss=0.48473291073937
Fold #6: loss=0.41274608628217674
Fold #7: loss=0.47405291219252377
Fold #8: loss=0.44974230059938286
Fold #9: loss=0.46340159258241087
RandomForestClassifier: Mean loss=0.45463969889068834
Model: 3 : RandomForestClassifier(criterion='entropy', n_jobs=-1)
Fold #0: loss=0.4511847247326708
Fold #1: loss=0.42707704254926593
Fold #2: loss=0.5550335199035183
Fold #3: loss=0.42186970733328516
Fold #4: loss=0.4794331756190797
Fold #5: loss=0.4730559509802762
Fold #6: loss=0.41116235817215196
Fold #7: loss=0.46835919493314265
Fold #8: loss=0.4496144890690015
Fold #9: loss=0.4625902934553457
RandomForestClassifier: Mean loss=0.4599380456747738


In [13]:
submit_df.shape

(2501, 2)
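
The blending procedure used in `blend_ensemble` can be sketched on a much smaller problem. The example below is a simplified illustration under stated assumptions: the data comes from scikit-learn's `make_classification`, only two base models are used, and names such as `x_toy`, `x_new`, and `blender` are introduced here rather than taken from the notebook. Each base model's out-of-fold probabilities fill one column of the blend training set, and a logistic regression then learns how to weight those columns.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data: 300 labeled rows plus 20 rows to score
x_all, y_all = make_classification(
    n_samples=320, n_features=10, n_informative=5, random_state=0)
x_toy, y_toy, x_new = x_all[:300], y_all[:300], x_all[300:]

base_models = [
    KNeighborsClassifier(n_neighbors=3),
    RandomForestClassifier(n_estimators=25, random_state=0)]
folds = list(StratifiedKFold(n_splits=5).split(x_toy, y_toy))

blend_train = np.zeros((x_toy.shape[0], len(base_models)))
blend_new = np.zeros((x_new.shape[0], len(base_models)))

for j, base in enumerate(base_models):
    fold_preds = np.zeros((x_new.shape[0], len(folds)))
    for i, (train_idx, test_idx) in enumerate(folds):
        base.fit(x_toy[train_idx], y_toy[train_idx])
        # Out-of-fold probability of class 1 for the held-out rows
        blend_train[test_idx, j] = base.predict_proba(x_toy[test_idx])[:, 1]
        # Each fold's model also scores the new rows
        fold_preds[:, i] = base.predict_proba(x_new)[:, 1]
    # Average the per-fold predictions for the new rows
    blend_new[:, j] = fold_preds.mean(axis=1)

# The logistic regression learns how much weight each base model deserves
blender = LogisticRegression(solver='lbfgs')
blender.fit(blend_train, y_toy)
blended = blender.predict_proba(blend_new)[:, 1]
print(blended[:5])
```

Because every row of `blend_train` is predicted by a model that never saw that row during fitting, the blender is trained on honest estimates of each base model's behavior.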