# Evaluation of the Performance Prediction Model for Outlier Detection

We have now computed the metafeatures and the performance of three outlier detection algorithms for each dataset and stored them in table_computed.txt. Based on this table, we can train three regressors that predict the performance of isolation forest, local outlier factor, and one-class SVM on a given dataset.

In [84]:
import matplotlib.pyplot as plt  
import scipy
import numpy as np
import pandas as pd
import openml as oml
import pickle

In [21]:
plt.rcParams['figure.dpi']= 120
plt.rcParams['xtick.labelsize'] = 8
plt.rcParams['ytick.labelsize'] = 8 

## Load Computed Table

In [3]:
# read computed table
with open("table_computed.txt", "rb") as fp: # Unpickling
    datasets = pickle.load(fp)
df = pd.DataFrame(datasets)
cols = ['name', 'contamination(%)', 'metafeatures', 'isolation_forest_auroc', 'lof_auroc', 'ocsvm_auroc']
df = df[cols]
df

Unnamed: 0,name,contamination(%),metafeatures,isolation_forest_auroc,lof_auroc,ocsvm_auroc
0,lymph,4.1,"[0.7071230158730157, 0.8857142857142858, 0.757...",0.996479,0.909624,0.610329
1,glass,4.2,"[0.459828722002635, 0.6049174979609763, 0.7060...",0.536043,0.594038,0.506775
2,wdbc,37.26,"[0.9386796733212339, 0.9543805634776596, 0.915...",0.654167,0.612818,0.484052
3,speech,1.65,"[0.9631008911881127, 0.9763895974447406, 0.982...",0.499921,0.508256,0.516767
4,satellite_image,32.0,"[0.7878128466652179, 0.8296717347643972, 0.882...",0.712866,0.538993,0.493408
5,baseball,9.33,"[0.8382204418025314, 0.9268990960742729, 0.900...",0.757366,0.589728,0.510074
6,ecoli,2.68,"[0.7472514035325506, 0.8585529525902041, 0.808...",0.828746,0.657492,0.690112
7,phoneme,29.35,"[0.7605367194395449, 0.7588796499200512, 0.904...",0.549767,0.546197,0.526092
8,click_prediction_small,16.84,"[0.8099027765678555, 0.8315560261543755, 0.732...",0.51393,0.5041,0.496262
9,musk,15.41,"[0.8414119011459457, 0.9536296918760492, 0.720...",0.418187,0.49491,0.576733


In [51]:
# metafeatures: one row per dataset
X = np.asarray(df['metafeatures'].tolist())
X.shape

(32, 5)

In [27]:
# target: AUROC score of each of the 3 outlier detection algorithms
y_isolation_forest = df['isolation_forest_auroc'].values
y_lof = df['lof_auroc'].values
y_ocsvm = df['ocsvm_auroc'].values

## Plot Performance of 3 Outlier Detection Algorithms on Benchmark Datasets

In [25]:
import plotly.plotly as py  # plotly < 4; later versions moved this module to the chart_studio package
import plotly.graph_objs as go

trace1 = go.Bar(
    x = df['name'].tolist(),
    y = y_isolation_forest.tolist(),
    name = 'isolation forest'
)
trace2 = go.Bar(
    x = df['name'].tolist(),
    y = y_lof.tolist(),
    name = 'local outlier factor'
)
trace3 = go.Bar(
    x = df['name'].tolist(),
    y = y_ocsvm.tolist(),
    name = 'one class svm'
)

data = [trace1, trace2, trace3]
layout = go.Layout(
    barmode='group'
)

fig = go.Figure(data=data, layout=layout)
py.iplot(fig, filename='grouped-bar')

## Train Regressors

In [86]:
import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23
from sklearn.neighbors import KNeighborsRegressor

In [87]:
neigh_isolation_forest = KNeighborsRegressor(n_neighbors=3)
neigh_isolation_forest.fit(X, y_isolation_forest)

KNeighborsRegressor(algorithm='auto', leaf_size=30, metric='minkowski',
          metric_params=None, n_jobs=1, n_neighbors=3, p=2,
          weights='uniform')

In [88]:
# save trained regressor
joblib.dump(neigh_isolation_forest, 'isolation_forest_regressor.pkl') 

['isolation_forest_regressor.pkl']

In [91]:
neigh_lof = KNeighborsRegressor(n_neighbors=3)
neigh_lof.fit(X, y_lof)

KNeighborsRegressor(algorithm='auto', leaf_size=30, metric='minkowski',
          metric_params=None, n_jobs=1, n_neighbors=3, p=2,
          weights='uniform')

In [93]:
# save trained regressor
joblib.dump(neigh_lof, 'lof_regressor.pkl') 

['lof_regressor.pkl']

In [94]:
neigh_ocsvm = KNeighborsRegressor(n_neighbors=3)
neigh_ocsvm.fit(X, y_ocsvm)

KNeighborsRegressor(algorithm='auto', leaf_size=30, metric='minkowski',
          metric_params=None, n_jobs=1, n_neighbors=3, p=2,
          weights='uniform')

In [95]:
# save trained regressor
joblib.dump(neigh_ocsvm, 'ocsvm_regressor.pkl') 

['ocsvm_regressor.pkl']
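
The saved regressors can later be restored with `joblib.load`. The following is a minimal sketch of that round trip, not part of the original notebook: the data is synthetic, and the names `X_demo`, `y_demo`, and `demo_regressor.pkl` are placeholders chosen for illustration.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-ins for the metafeature matrix and one AUROC target vector
# (assumed shapes: 32 datasets, 5 metafeatures).
rng = np.random.default_rng(0)
X_demo = rng.random((32, 5))
y_demo = rng.random(32)

model = KNeighborsRegressor(n_neighbors=3).fit(X_demo, y_demo)

# Save to a temporary directory and load the model back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo_regressor.pkl")
    joblib.dump(model, path)
    restored = joblib.load(path)

# The restored model should reproduce the original predictions.
same_predictions = np.allclose(model.predict(X_demo[:3]), restored.predict(X_demo[:3]))
print(same_predictions)
```

Pickled scikit-learn models are generally only safe to load with the same scikit-learn version that wrote them, so it is worth recording that version next to the `.pkl` files.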

## Evaluation

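To evaluate our model, we divide the datasets into 8 sets. In each iteration, one set is held out for testing and the remaining sets are used for training. We then measure the fraction of test datasets for which our model correctly predicts the best of the 3 algorithms. The cells below show one such split, with the first 10 datasets held out for testing.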

In [184]:
def computed_score(ground_truth, predicted):
    correctPredictions = 0
    
    for i in range(0,len(ground_truth)):
        if ground_truth[i] == predicted[i]:
            correctPredictions += 1
        
    return correctPredictions / len(ground_truth)

In [185]:
# training
neigh_isolation_forest = KNeighborsRegressor(n_neighbors=3)
neigh_isolation_forest.fit(X[10:,:], y_isolation_forest[10:])
neigh_lof = KNeighborsRegressor(n_neighbors=3)
neigh_lof.fit(X[10:,:], y_lof[10:])
neigh_ocsvm = KNeighborsRegressor(n_neighbors=3)
neigh_ocsvm.fit(X[10:,:], y_ocsvm[10:])
# testing
y_isolation_forest_pred = neigh_isolation_forest.predict(X[:10,:])
print(y_isolation_forest_pred)
y_lof_pred = neigh_lof.predict(X[:10,:])
print(y_lof_pred)
y_ocsvm_pred = neigh_ocsvm.predict(X[:10,:])
print(y_ocsvm_pred)

[0.5822931  0.59654332 0.6604442  0.67592692 0.58921228 0.53650246
 0.5822931  0.57384798 0.54818792 0.56525482]
[0.49728681 0.56111049 0.45551697 0.46330227 0.48777044 0.50857368
 0.49728681 0.49061482 0.50139824 0.51998268]
[0.53238648 0.66523029 0.59093256 0.60978974 0.53381737 0.49136559
 0.53238648 0.5450056  0.50120339 0.49486207]


In [187]:
df_pred = pd.DataFrame([y_isolation_forest_pred, y_lof_pred, y_ocsvm_pred])

predicted = []
for i in range(10):
    predicted.append(df_pred[i].idxmax())
print(predicted)

[0, 2, 0, 0, 0, 0, 0, 0, 0, 0]


In [190]:
ground_truth = [0, 1, 0, 2, 0, 0, 0, 0, 0, 2] # 0: isolation, 1: lof, 2: ocsvm
print(ground_truth)
accuracy = computed_score(ground_truth, predicted)
print("The accuracy of model is {}".format(accuracy))

[0, 1, 0, 2, 0, 0, 0, 0, 0, 2]
The accuracy of model is 0.7
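
The cells above evaluate a single hold-out split with a hand-written ground truth. The full rotation over 8 sets described at the start of this section could be sketched as follows. This is an illustration under assumptions, not the code used to produce the result above: the helper `best_algorithm_accuracy` is hypothetical, the ground truth is derived with `argmax` over the measured AUROC columns instead of being typed in, and the demo data is random.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.neighbors import KNeighborsRegressor


def best_algorithm_accuracy(X, Y, n_splits=8, n_neighbors=3):
    """Fraction of held-out datasets whose best algorithm is predicted correctly.

    X: (n_datasets, n_metafeatures) array of metafeatures.
    Y: (n_datasets, n_algorithms) array of AUROC scores, one column per algorithm.
    """
    truth = Y.argmax(axis=1)  # index of the best algorithm for each dataset
    hits = 0
    for train_idx, test_idx in KFold(n_splits=n_splits).split(X):
        # Train one regressor per algorithm and predict its AUROC on the test fold.
        preds = np.column_stack([
            KNeighborsRegressor(n_neighbors=n_neighbors)
            .fit(X[train_idx], Y[train_idx, j])
            .predict(X[test_idx])
            for j in range(Y.shape[1])
        ])
        hits += int((preds.argmax(axis=1) == truth[test_idx]).sum())
    return hits / len(X)


# Random demo data with the same shape as the notebook's (32 datasets, 5 metafeatures).
rng = np.random.default_rng(0)
X_demo = rng.random((32, 5))
Y_demo = rng.random((32, 3))
accuracy_demo = best_algorithm_accuracy(X_demo, Y_demo)
print(accuracy_demo)
```

With the notebook's data, `Y` would be `np.column_stack([y_isolation_forest, y_lof, y_ocsvm])`, so that column indices 0, 1, 2 match the encoding used for `ground_truth`.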


## Further Explanation
If we evaluate the R² score of our regressors as below, the result is disappointing. One possible reason is that we do not have enough benchmark datasets to predict the performance of an outlier detection algorithm. Another is that the benchmark datasets are not diverse enough. It could also be that the landmark metafeatures alone are not informative enough for the outlier detection case. However, we do not really care whether our regressors predict the exact performance of an algorithm. We care more about whether our model can tell which algorithm performs better. That is why we chose the above approach to evaluate our models.

In [191]:
from sklearn.model_selection import cross_val_score
knn = KNeighborsRegressor(n_neighbors=3)
score_knn_isolation_forest = cross_val_score(knn, X, y_isolation_forest, scoring='r2', cv=5).mean()
print(score_knn_isolation_forest)

-2.1362006577851487
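
A negative value is possible because R² is defined as 1 - SS_res / SS_tot, which drops below zero whenever the regressor's squared error exceeds that of always predicting the mean of the test targets. The toy example below, with made-up numbers rather than data from the notebook, illustrates this: the predictions are reversed relative to the true values and yield an R² of about -3, while the mean baseline scores 0.

```python
import numpy as np
from sklearn.metrics import r2_score

# Made-up AUROC values for illustration only.
y_true = np.array([0.5, 0.6, 0.7])
y_pred = np.array([0.7, 0.6, 0.5])  # reversed relative to y_true

ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
r2_manual = 1 - ss_res / ss_tot

r2_sklearn = r2_score(y_true, y_pred)
r2_mean_baseline = r2_score(y_true, np.full_like(y_true, y_true.mean()))
print(r2_manual, r2_sklearn, r2_mean_baseline)
```

In small cross-validation folds whose targets vary little, SS_tot is small, so even modest prediction errors may push the score far below zero.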


In [83]:
# Example: predict the performance of each algorithm on dataset 18
mf = np.array([datasets[18]['metafeatures']])
print(neigh_isolation_forest.predict(mf))
print(neigh_lof.predict(mf))
print(neigh_ocsvm.predict(mf))

[0.47374036]
[0.45780546]
[0.51417277]
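
To turn three predicted scores like these into a recommendation, one can pick the algorithm with the highest predicted AUROC. A minimal sketch follows; the helper `recommend` and the `ALGORITHMS` list are hypothetical names, with the list ordered to match the 0/1/2 encoding used in the evaluation.

```python
import numpy as np

# Order matches the encoding 0: isolation forest, 1: lof, 2: ocsvm.
ALGORITHMS = ["isolation forest", "local outlier factor", "one class svm"]


def recommend(predicted_aurocs):
    """Return the name of the algorithm with the highest predicted AUROC."""
    return ALGORITHMS[int(np.argmax(predicted_aurocs))]


# Predicted AUROCs printed above for dataset 18.
choice = recommend([0.47374036, 0.45780546, 0.51417277])
print(choice)
```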


In [189]:
# Example: concatenate slices of a NumPy array
np.concatenate((X[2:4],X[5:8]), axis=0)

array([[0.93867967, 0.95438056, 0.91580676, 0.88240753, 0.80849322],
       [0.96310089, 0.9763896 , 0.98264106, 0.98345187, 0.98345187],
       [0.83822044, 0.9268991 , 0.90011543, 0.90675964, 0.91721394],
       [0.7472514 , 0.85855295, 0.80822916, 0.6323873 , 0.55386017],
       [0.76053672, 0.75887965, 0.90451414, 0.75388822, 0.70651428]])