# RQ4 - Can we measure input sensitivity?

To help researchers detect this issue, we propose a metric based on our experiments. This metric quantifies the level of input sensitivity of a software system for a given performance property.
We first need to choose a threshold $\alpha$, representing the maximal proportion of variability due to inputs that we can tolerate.
For instance, if we consider that a 5% loss of performance is acceptable, $\alpha$ is fixed at $0.05$.
Then, we define the Input Sensitivity score as follows:
$IS = \frac{1}{4} \times |C_{max} - C_{min}| + \frac{1}{2\alpha} \times \min\left(\frac{Q_{3}}{Q_{1}}-1, \alpha\right)$
where $C_{min}$ and $C_{max}$ are the minimal and maximal Spearman correlations, and $Q_{1}$ and $Q_{3}$ are respectively the first and third quartiles of the performance distribution.

The first term of the expression quantifies how input sensitivity changes the rankings of configurations ($RQ_{1}$ and $RQ_{2}$), and the second term quantifies its actual impact on performance ($RQ_{3}$).
Both terms vary from $0$ to $0.5$, so $IS$ ranges from $0$ (no input sensitivity) to $1$ (high input sensitivity).
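
The definition above can be sketched as a small Python function. This is a minimal illustration rather than the notebook's own code: the name `input_sensitivity`, its parameters, and the example values are assumptions chosen only to show the two extremes of the score. It also assumes $Q_{3} \geq Q_{1} > 0$.

```python
def input_sensitivity(c_min, c_max, q1, q3, alpha=0.1):
    """Sketch of the IS score: ranking term plus capped relative-spread term."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    # Ranking term: Spearman correlations lie in [-1, 1], so this is in [0, 0.5]
    ranking_term = abs(c_max - c_min) / 4
    # Impact term: Q3/Q1 - 1 capped at alpha, so this is in [0, 0.5]
    # (assumes q3 >= q1 > 0)
    impact_term = min(q3 / q1 - 1, alpha) / (2 * alpha)
    return ranking_term + impact_term


# Hypothetical values, chosen only to illustrate the two extremes
no_sensitivity = input_sensitivity(c_min=1.0, c_max=1.0, q1=2.0, q3=2.0)
high_sensitivity = input_sensitivity(c_min=-1.0, c_max=1.0, q1=1.0, q3=2.0)
print(no_sensitivity, high_sensitivity)
```

Identical rankings across inputs and no performance spread give the lower bound, while opposite rankings combined with a spread above $\alpha$ give the upper bound.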

In the evaluation, we compute $IS$ for each pair of system and performance property in our dataset, with $\alpha$ arbitrarily fixed at $0.1$ (10%).


#### First, we import some libraries

In [2]:
# for arrays
import numpy as np

# for dataframes
import pandas as pd

# plots
import matplotlib.pyplot as plt
# high-level plots
import seaborn as sns

# statistics
import scipy.stats as sc
# hierarchical clustering, clusters
from scipy.cluster.hierarchy import linkage, cut_tree, leaves_list
from scipy import stats
# statistical tests
from scipy.stats import mannwhitneyu

# machine learning library
# Principal Component Analysis - determine new axis for representing data
from sklearn.decomposition import PCA
# Random Forests -> vote between decision trees
# Gradient boosting -> instead of a vote, upgrade the same tree
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier, GradientBoostingClassifier
# Decision Tree
from sklearn.tree import DecisionTreeRegressor, plot_tree
# To add interactions in linear regressions models
from sklearn.preprocessing import PolynomialFeatures
# ElasticNet is a hybrid method between Ridge and Lasso
from sklearn.linear_model import LinearRegression, ElasticNet
# To separate the data into training and test
from sklearn.model_selection import train_test_split
# Simple clustering (iterative steps)
from sklearn.cluster import KMeans


# we use it to interact with the file system
import os
# compute time
from time import time

# no warning
import warnings
warnings.filterwarnings("ignore")

### Import data

In [10]:
name_systems = ["nodejs", "poppler", "xz", "x264", "gcc", "lingeling", "imagemagick", "sqlite"]

inputs_perf = dict()

inputs_perf["gcc"] = ["size", "ctime", "exec"]
inputs_perf["imagemagick"] = ["time", "size"]
inputs_perf["lingeling"] = ["conflicts", "cps", "reductions"]
inputs_perf["nodejs"] = ["ops"]
inputs_perf["poppler"] = ["size", "time"]
inputs_perf["sqlite"] = ["q"+str(i+1) for i in range(15)]
inputs_perf["x264"] = ["size", "kbs", "fps", "etime", "cpu"]
inputs_perf["xz"] = ["size", "time"]


q2 = dict()

q2["gcc","ctime"]=1.09
q2["gcc","exec"]=1.07
q2["gcc","size"]=1.01
q2["imagemagick","time"]=1.02
q2["imagemagick", "size"]=1.00
q2["lingeling","conflicts"]=1.05
q2["lingeling","cps"]=1.06
q2["lingeling","reductions"]=1.04
q2["nodejs","ops"]=1.08
q2["poppler","size"]=1.0
q2["poppler","time"]=1.15
q2["sqlite","q1"]=1.02
q2["sqlite","q10"]=1.02
q2["sqlite","q11"]=1.02
q2["sqlite","q12"]=1.04
q2["sqlite","q13"]=1.02
q2["sqlite","q14"]=1.03
q2["sqlite","q15"]=1.03
q2["sqlite","q2"]=1.03
q2["sqlite","q3"]=1.01
q2["sqlite","q4"]=1.03
q2["sqlite","q5"]=1.02
q2["sqlite","q6"]=1.03
q2["sqlite","q7"]=1.01
q2["sqlite","q8"]=1.03
q2["sqlite","q9"]=1.02
q2["x264","cpu"]=1.05
q2["x264","etime"]=1.03
q2["x264","fps"]=1.04
q2["x264","kbs"]=1.12
q2["x264","size"]=1.12
q2["xz","size"]=1.0
q2["xz","time"]=1.03

cmin = dict()

cmin["gcc","ctime"]=0.72
cmin["gcc","exec"]=-0.69
cmin["gcc","size"]=0.48
cmin["imagemagick","time"]=-0.24
cmin["imagemagick", "size"]=0.01
cmin["lingeling","conflicts"]=-0.90
cmin["lingeling","cps"]=-0.89
cmin["lingeling","reductions"]=-0.99
cmin["nodejs","ops"]=-0.87
cmin["poppler","size"]=-1.00
cmin["poppler","time"]=-0.94
cmin["sqlite","q1"]= -0.78
cmin["sqlite","q2"]=-0.58
cmin["sqlite","q3"]=-0.78
cmin["sqlite","q4"]=-0.77
cmin["sqlite","q5"]=-0.80
cmin["sqlite","q6"]=-0.80
cmin["sqlite","q7"]=-0.71
cmin["sqlite","q8"]=0.03
cmin["sqlite","q9"]=-0.67
cmin["sqlite","q10"]=0.00
cmin["sqlite","q11"]=-0.25
cmin["sqlite","q12"]=-0.74
cmin["sqlite","q13"]=-0.27
cmin["sqlite","q14"]=-0.81
cmin["sqlite","q15"]=-0.30
cmin["x264","cpu"]=-0.31
cmin["x264","etime"]=0.02
cmin["x264","fps"]=0.01
cmin["x264","kbs"]=-0.69
cmin["x264","size"]=-0.69
cmin["xz","size"]=0.14
cmin["xz","time"]=-0.03

cmax = dict()

cmax["gcc","ctime"]=0.97
cmax["gcc","exec"]=1.00
cmax["gcc","size"]=1.00
cmax["imagemagick","time"]=1.00
cmax["imagemagick", "size"]=1.00
cmax["lingeling","conflicts"]=0.92
cmax["lingeling","cps"]=0.93
cmax["lingeling","reductions"]=1.00
cmax["nodejs","ops"]=0.95
cmax["poppler","size"]=1.00
cmax["poppler","time"]=1.00
cmax["sqlite","q1"]=0.87
cmax["sqlite","q2"]=0.94
cmax["sqlite","q3"]=0.84
cmax["sqlite","q4"]=0.84
cmax["sqlite","q5"]=0.81
cmax["sqlite","q6"]=0.86
cmax["sqlite","q7"]=0.92
cmax["sqlite","q8"]=0.96
cmax["sqlite","q9"]=0.89
cmax["sqlite","q10"]=0.96
cmax["sqlite","q11"]=0.97
cmax["sqlite","q12"]=0.85
cmax["sqlite","q13"]=0.95
cmax["sqlite","q14"]=0.87
cmax["sqlite","q15"]=0.94
cmax["x264","cpu"]=1.00
cmax["x264","etime"]=1.00
cmax["x264","fps"]=1.00
cmax["x264","kbs"]=1.00
cmax["x264","size"]=1.0
cmax["xz","size"]=1.00
cmax["xz","time"]=0.97

# RQ4 results

In [11]:
alpha = 0.1
score = dict()
for ns in name_systems:
    for perf in inputs_perf[ns]:
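        # ranking term in [0, 0.5]: spread of the Spearman correlations
        ranking_term = np.abs(cmax[ns, perf] - cmin[ns, perf]) / 4
        # impact term in [0, 0.5]: q2 - 1, capped at alpha
        impact_term = min(q2[ns, perf] - 1, alpha) / (2 * alpha)
        score[ns, perf] = np.round(ranking_term + impact_term, 2)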

In [12]:
score

{('nodejs', 'ops'): 0.86,
 ('poppler', 'size'): 0.5,
 ('poppler', 'time'): 0.98,
 ('xz', 'size'): 0.22,
 ('xz', 'time'): 0.4,
 ('x264', 'size'): 0.92,
 ('x264', 'kbs'): 0.92,
 ('x264', 'fps'): 0.45,
 ('x264', 'etime'): 0.4,
 ('x264', 'cpu'): 0.58,
 ('gcc', 'size'): 0.18,
 ('gcc', 'ctime'): 0.51,
 ('gcc', 'exec'): 0.77,
 ('lingeling', 'conflicts'): 0.71,
 ('lingeling', 'cps'): 0.76,
 ('lingeling', 'reductions'): 0.7,
 ('imagemagick', 'time'): 0.41,
 ('imagemagick', 'size'): 0.25,
 ('sqlite', 'q1'): 0.51,
 ('sqlite', 'q2'): 0.53,
 ('sqlite', 'q3'): 0.46,
 ('sqlite', 'q4'): 0.55,
 ('sqlite', 'q5'): 0.5,
 ('sqlite', 'q6'): 0.57,
 ('sqlite', 'q7'): 0.46,
 ('sqlite', 'q8'): 0.38,
 ('sqlite', 'q9'): 0.49,
 ('sqlite', 'q10'): 0.34,
 ('sqlite', 'q11'): 0.41,
 ('sqlite', 'q12'): 0.6,
 ('sqlite', 'q13'): 0.41,
 ('sqlite', 'q14'): 0.57,
 ('sqlite', 'q15'): 0.46}
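
As a sanity check, the score for `('x264', 'cpu')` can be reproduced by hand from the values defined earlier in the notebook ($C_{min} = -0.31$, $C_{max} = 1.00$, `q2 - 1` $= 0.05$, $\alpha = 0.1$), following the same computation as the cell above:
$IS = \frac{1}{4} \times |1.00 - (-0.31)| + \frac{1}{2 \times 0.1} \times \min(0.05, 0.1) = 0.3275 + 0.25 = 0.5775 \approx 0.58$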

In [16]:
for ns in name_systems:
    for perf in inputs_perf[ns]:
        print(ns, perf, np.round(q2[ns, perf]-1,2))

nodejs ops 0.08
poppler size 0.0
poppler time 0.15
xz size 0.0
xz time 0.03
x264 size 0.12
x264 kbs 0.12
x264 fps 0.04
x264 etime 0.03
x264 cpu 0.05
gcc size 0.01
gcc ctime 0.09
gcc exec 0.07
lingeling conflicts 0.05
lingeling cps 0.06
lingeling reductions 0.04
imagemagick time 0.02
imagemagick size 0.0
sqlite q1 0.02
sqlite q2 0.03
sqlite q3 0.01
sqlite q4 0.03
sqlite q5 0.02
sqlite q6 0.03
sqlite q7 0.01
sqlite q8 0.03
sqlite q9 0.02
sqlite q10 0.02
sqlite q11 0.02
sqlite q12 0.04
sqlite q13 0.02
sqlite q14 0.03
sqlite q15 0.03
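
With $\alpha = 0.1$, the second term of $IS$ is capped at $0.5$ whenever the value printed above exceeds $0.1$. The sketch below lists the capped pairs. To stay self-contained it uses a hand-copied subset of the `q2` values; the names `q2_subset` and `capped` are introduced only for illustration.

```python
alpha = 0.1

# Subset of the q2 values defined earlier in the notebook, copied by hand
q2_subset = {
    ("poppler", "time"): 1.15,
    ("x264", "kbs"): 1.12,
    ("x264", "size"): 1.12,
    ("gcc", "ctime"): 1.09,
    ("nodejs", "ops"): 1.08,
}

# Pairs whose q2 - 1 exceeds alpha contribute the maximal 0.5 to IS
capped = sorted(key for key, value in q2_subset.items() if value - 1 > alpha)
print(capped)
```

In the full list above, only `poppler time`, `x264 size` and `x264 kbs` exceed $0.1$, so these are the only pairs for which the choice of $\alpha$ truncates the second term.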
