# Performance of Contract Calculation

Problem: contract calculation is implemented as a brute-force search rather than, for example, a heuristic.

Objective: 

- Quantify durations of contract calculation
- Validate that the brute-force search, not the min heap operations, drives performance

The data was created by the performance JS scripts.

In [149]:
import pandas as pd

In [160]:
#imports and simple cleaning
two_c_calc_df = pd.read_csv('./perf_2C_calc.csv')
two_c_calc_df.drop(two_c_calc_df.columns[-1], axis=1, inplace=True)
two_c_min_heap_df = pd.read_csv('./perf_2C_min_heap.csv')
two_c_min_heap_df.drop(two_c_min_heap_df.columns[-1], axis=1, inplace=True)
three_c_calc_df = pd.read_csv('./perf_3C_calc.csv')
three_c_calc_df.drop(three_c_calc_df.columns[-1], axis=1, inplace=True)
three_c_min_heap_df = pd.read_csv('./perf_3C_min_heap.csv')
three_c_min_heap_df.drop(three_c_min_heap_df.columns[-1], axis=1, inplace=True)
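
The four load-and-drop steps above follow one pattern, which can be sketched as a small helper. This is a minimal sketch, not the notebook's own code: `load_perf_csv` is a hypothetical name, and in the sample data only the `Func` column is taken from the notebook; the `Duration` column and the trailing comma are assumptions for illustration.

```python
import io
import pandas as pd

def load_perf_csv(source):
    """Read a performance CSV and drop its last column, as the cell above does."""
    df = pd.read_csv(source)
    # The last column is dropped unconditionally, mirroring the notebook cell.
    return df.drop(columns=df.columns[-1])

# Hypothetical sample: a trailing comma per row yields an empty last column.
sample = io.StringIO("Func,Duration,\ncalculation,1.5,\nu_scoring_func,0.2,\n")
demo_df = load_perf_csv(sample)
print(list(demo_df.columns))
```

With such a helper, each frame could be loaded in one line, for example `load_perf_csv('./perf_2C_calc.csv')`.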

The add operation is called multiple times per contract optimization. The number of calls per calculation can be derived from the row counts of the data frames:

In [167]:
no_of_add_oeprations_per_calc_2c = len(two_c_min_heap_df) /len(two_c_calc_df[two_c_calc_df['Func']=='calculation'])
no_of_add_oeprations_per_calc_3c = len(three_c_min_heap_df) /len(three_c_calc_df[three_c_calc_df['Func']=='calculation'])

print("Add is called "+ str(int(no_of_add_oeprations_per_calc_2c))+" times in 2C Analysis")
print("Add is called "+ str(int(no_of_add_oeprations_per_calc_3c))+" times in 3C Analysis")


Add is called 245 times in 2C Analysis
Add is called 900 times in 3C Analysis


These values are multiplied with the describe() statistics of a single add operation to estimate the total add time per calculation:

In [164]:
min_heap_concat = pd.concat([two_c_min_heap_df.describe()*no_of_add_oeprations_per_calc_2c, three_c_min_heap_df.describe()*no_of_add_oeprations_per_calc_3c],axis=1)
min_heap_concat.columns = ['2C_Min_Heap','3C_Min_Heap']
min_heap_concat = min_heap_concat.drop('count')
min_heap_concat

|  | 2C_Min_Heap | 3C_Min_Heap |
|---|---|---|
| mean | 0.064571 | 0.121394 |
| std | 0.132102 | 0.132609 |
| min | 0.020363 | 0.03688 |
| 25% | 0.040725 | 0.075611 |
| 50% | 0.051027 | 0.112492 |
| 75% | 0.102061 | 0.150284 |
| max | 7.390665 | 11.624426 |
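
To make the scaling step concrete, here is a minimal sketch on made-up numbers. The `Duration` column name, the four sample values and `adds_per_calc = 3` are all assumptions for illustration and do not come from the measured data.

```python
import pandas as pd

# Hypothetical durations of single add operations (values are made up).
add_df = pd.DataFrame({"Duration": [0.1, 0.2, 0.3, 0.4]})
adds_per_calc = 3  # hypothetical number of add calls per calculation

# Scale every describe() statistic by the number of calls, then drop 'count',
# which has no meaning after scaling.
scaled = add_df.describe() * adds_per_calc
scaled = scaled.drop("count")
print(scaled)
```

The scaled mean estimates the total add time per calculation. The scaled min, quartiles and max are extrapolations that assume every call in a calculation took that same value, so the scaled max in particular is likely an upper bound rather than an observed total.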


Concatenate the frames to compare the durations of the optimization and the add operation:

In [159]:
c1 = two_c_calc_df[two_c_calc_df['Func']=='u_scoring_func'].describe()
c2 = two_c_calc_df[two_c_calc_df['Func']=='s_scoring_func'].describe()
c3 = two_c_calc_df[two_c_calc_df['Func']=='calculation'].describe()


c4 = three_c_calc_df[three_c_calc_df['Func']=='u_scoring_func'].describe()
c5 = three_c_calc_df[three_c_calc_df['Func']=='s_scoring_func'].describe()
c6 = three_c_calc_df[three_c_calc_df['Func']=='calculation'].describe()

concat = pd.concat([ c3,c6],axis=1)
concat.columns = [ '2c_calculation', '3c_calculation']

concat = pd.concat([concat, min_heap_concat],axis=1)
concat = concat.drop(['count','mean', 'std'])
concat

|  | 2c_calculation | 3c_calculation | 2C_Min_Heap | 3C_Min_Heap |
|---|---|---|---|---|
| min | 0.450125 | 2.574792 | 0.020363 | 0.03688 |
| 25% | 1.012063 | 4.615615 | 0.040725 | 0.075611 |
| 50% | 11.280791 | 29.133458 | 0.051027 | 0.112492 |
| 75% | 14.75025 | 45.649677 | 0.102061 | 0.150284 |
| max | 22.073208 | 123.524042 | 7.390665 | 11.624426 |


Observations
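- 3C takes more time than 2C: the median is about 2.6 times higher, while the minimum, the 25% quartile and the maximum are roughly 4.6 to 5.7 times higher
- Min heap: quartile values are low (at most 0.15), but the maxima are high (7.39 for 2C, 11.62 for 3C). Relevant differences between 2C and 3C appear only in the maxima.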
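
The ratio between 3C and 2C can be computed directly from the table above. The following is a minimal sketch that re-enters the calculation columns by hand; `stats` and `ratio_3c_to_2c` are names introduced here for illustration.

```python
import pandas as pd

# Values copied from the 2c_calculation and 3c_calculation columns above.
stats = pd.DataFrame(
    {
        "2c_calculation": [0.450125, 1.012063, 11.280791, 14.75025, 22.073208],
        "3c_calculation": [2.574792, 4.615615, 29.133458, 45.649677, 123.524042],
    },
    index=["min", "25%", "50%", "75%", "max"],
)

# Element-wise ratio per statistic, rounded for readability.
ratio_3c_to_2c = (stats["3c_calculation"] / stats["2c_calculation"]).round(2)
print(ratio_3c_to_2c)
```

For these values the ratio is lowest at the median and highest at the minimum and maximum.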

Consequence
- Within contract calculation, only the brute-force combination and recursion are time consuming. Other parts, such as the calculation of preference functions or the min heap operations, can be ignored.
- It might be worth investigating alternatives to the brute-force approach.
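
How small the min heap contribution is can be checked against the table above. This is a rough sketch that divides the scaled min heap statistics by the calculation statistics row by row. Dividing one quantile by another is only an approximation, and the names `stats` and `share_pct` are introduced here for illustration.

```python
import pandas as pd

# Values copied from the combined table above.
stats = pd.DataFrame(
    {
        "2c_calculation": [0.450125, 1.012063, 11.280791, 14.75025, 22.073208],
        "3c_calculation": [2.574792, 4.615615, 29.133458, 45.649677, 123.524042],
        "2C_Min_Heap": [0.020363, 0.040725, 0.051027, 0.102061, 7.390665],
        "3C_Min_Heap": [0.03688, 0.075611, 0.112492, 0.150284, 11.624426],
    },
    index=["min", "25%", "50%", "75%", "max"],
)

# Share of the scaled add time in the calculation time, in percent.
share_pct = pd.DataFrame(
    {
        "2C": stats["2C_Min_Heap"] / stats["2c_calculation"] * 100,
        "3C": stats["3C_Min_Heap"] / stats["3c_calculation"] * 100,
    }
)
print(share_pct.round(2))
```

At the median, the scaled add time stays below 0.5% of the calculation time for both 2C and 3C. The share is larger in the max row, where the scaled value assumes every add call took the maximum observed time.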
