In [1]:
import pandas as pd
import numpy as np
from scipy.stats import wilcoxon


# Testing algorithms with theoretical distributions

The analysis now examines the algorithms. First, the original algorithm is run, and it is expected to reproduce the tabulated values. Then 50 replicates of each modified algorithm are executed on each of the proposed instances. Of these 50 replicates, the one with the lowest cost is taken as the solution.
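The best-of-replicates selection described above can be sketched as follows. This is only an illustration: `best_of_replicates`, the `solve(instance, rng)` signature, and `toy_solver` are hypothetical names, not the code used to produce the results in this notebook.

```python
import random


def best_of_replicates(solve, instance, n_replicates=50, seed=0):
    """Run a randomized solver n_replicates times and keep the cheapest solution.

    `solve(instance, rng)` is a hypothetical callable returning (cost, routes).
    """
    rng = random.Random(seed)
    best_cost, best_routes = float("inf"), None
    for _ in range(n_replicates):
        cost, routes = solve(instance, rng)
        if cost < best_cost:
            best_cost, best_routes = cost, routes
    return best_cost, best_routes


# Toy stand-in for a randomized algorithm: the cost is a noisy value around 100.
def toy_solver(instance, rng):
    return 100 + rng.uniform(-5, 5), []


cost, _ = best_of_replicates(toy_solver, instance=None, n_replicates=50)
print(f"best cost over 50 replicates: {cost:.2f}")
```

Passing a seeded generator into the solver keeps a whole batch of replicates reproducible, which helps when comparing several selection distributions on the same instance.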

Two families of algorithms are studied. The first is the savings algorithm modified to select candidates with a triangular distribution. The second is the savings algorithm modified to select candidates with a geometric distribution, tested with the following values: $\beta \in \{0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9\}$
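As a rough illustration of selection with a geometric distribution, the sketch below draws a position in a savings list sorted best first. It assumes the common inverse-transform form with a modulo wrap-around; the function name `pick_geometric` and the toy savings list are hypothetical, and the implementation behind the results in this notebook may differ.

```python
import math
import random


def pick_geometric(n_candidates, beta, rng):
    """Return an index into a candidate list sorted best first.

    The index follows a geometric distribution with parameter beta (0 < beta < 1),
    wrapped with a modulo so it always falls inside the list.
    """
    u = 1.0 - rng.random()  # uniform in (0, 1], avoids log(0)
    k = int(math.log(u) / math.log(1.0 - beta))
    return k % n_candidates


rng = random.Random(42)
# Hypothetical savings list: (node, node, saving), already sorted best first.
savings = [("a", "b", 9.0), ("c", "d", 7.5), ("a", "d", 4.0), ("b", "c", 1.2)]
for beta in (0.1, 0.5, 0.9):
    picks = [pick_geometric(len(savings), beta, rng) for _ in range(10_000)]
    share_top = picks.count(0) / len(picks)
    print(f"beta={beta}: top candidate chosen {share_top:.0%} of the time")
```

A larger $\beta$ concentrates the choice on the best candidate, so the behavior approaches the greedy original algorithm, while a smaller $\beta$ spreads the choice over more of the list.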

The results show the following.

 - The algorithm that uses the triangular distribution for candidate selection clearly performs poorly: on average its cost is more than twice that of the original algorithm. The likely cause is that it assigns too much probability to the worst candidates, so its behavior is close to random selection.

 - With 50 replicates, the remaining algorithms, with the exception of the geometric distribution with $\beta = 0.1$, improve on the results obtained by the CWS algorithm.
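To make the argument about the triangular distribution more concrete, the sketch below compares the mean selected position (0 is the best candidate) under three selection rules. It rests on assumptions: the list size `n = 100` is arbitrary, and the triangular weights are taken to decrease linearly from the best candidate to the worst, which may not match the exact parameterization used in the experiments.

```python
def expected_rank(weights):
    """Mean selected position (0 = best) under the given selection weights."""
    total = sum(weights)
    return sum(k * w for k, w in enumerate(weights)) / total


n = 100  # hypothetical size of the sorted savings list

# Assumed triangular form: weight decreases linearly from best to worst candidate.
triangular = [n - k for k in range(n)]
uniform = [1] * n
# Geometric weights truncated to the list length.
geometric = {beta: [beta * (1 - beta) ** k for k in range(n)] for beta in (0.1, 0.3, 0.9)}

print("uniform   ", expected_rank(uniform))     # equals (n - 1) / 2
print("triangular", expected_rank(triangular))  # equals (n - 1) / 3
for beta, weights in geometric.items():
    print(f"geometric beta={beta}", round(expected_rank(weights), 2))
```

Under these assumptions the triangular rule picks, on average, a candidate about a third of the way down the list, which is much closer to uniform random selection than any of the geometric settings.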


In [2]:
results = pd.read_csv('../reports/datasets/theoretical_distributions_results.csv', index_col=0)
# Divide every row by the cost of the original algorithm (first column),
# so values below 1 mean a lower cost than the original.
scaled = results.div(results.iloc[:, 0], axis=0)
scaled


instance,original,triangular n=50,"geometrical n=50, beta=0.1","geometrical n=50, beta=0.2","geometrical n=50, beta=0.3","geometrical n=50, beta=0.4","geometrical n=50, beta=0.5","geometrical n=50, beta=0.6","geometrical n=50, beta=0.7","geometrical n=50, beta=0.8","geometrical n=50, beta=0.9"
A-n32-k5,1.0,1.783751,0.980415,0.950001,0.979162,0.974573,0.95624,0.978886,0.984571,0.988352,0.988352
A-n38-k5,1.0,2.02755,1.023077,0.998711,0.997517,0.997517,0.997517,0.997517,0.997517,0.997517,0.997517
A-n45-k7,1.0,1.684831,0.986176,0.979999,0.972971,0.970807,0.972656,0.972656,0.985425,0.982682,0.985425
A-n55-k9,1.0,2.133346,1.030016,1.012785,1.001599,0.996858,0.999162,0.999151,0.999151,0.996858,1.0
A-n60-k9,1.0,1.950712,0.991299,0.971987,0.971109,0.96313,0.96232,0.963164,0.962857,0.963344,0.992676
A-n61-k9,1.0,2.084047,0.981979,0.956331,0.956297,0.953581,0.958692,0.958692,0.960598,0.960598,0.960598
A-n65-k9,1.0,2.242172,0.969192,0.969893,0.991796,0.966194,0.999707,1.0,1.0,1.0,1.0
A-n80-k10,1.0,2.159204,0.978919,0.973178,0.972804,0.97289,0.972597,0.972548,0.97233,0.991827,0.994138
B-n50-k7,1.0,2.591666,1.044628,1.012844,1.002367,0.998524,0.995739,0.994041,0.994324,0.994041,0.995138
B-n52-k7,1.0,2.660473,1.00373,0.995145,0.996252,0.996222,0.997468,0.996468,0.997468,0.997468,0.997468


In [3]:
scaled.mean()

original                      1.000000
triangular n=50               2.340245
geometrical n=50, beta=0.1    1.000250
geometrical n=50, beta=0.2    0.979388
geometrical n=50, beta=0.3    0.980792
geometrical n=50, beta=0.4    0.980609
geometrical n=50, beta=0.5    0.980551
geometrical n=50, beta=0.6    0.983170
geometrical n=50, beta=0.7    0.984716
geometrical n=50, beta=0.8    0.986259
geometrical n=50, beta=0.9    0.991018
dtype: float64

## Examining the significance of the results

Next, we test whether the difference between the two samples is significant using the one-sided Wilcoxon signed-rank test, which tells us whether there is enough evidence to state that the proposed algorithms achieve a lower cost than the original one.
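To show what the one-sided test measures, here is a minimal pure-Python sketch of the exact signed-rank p-value for paired samples. It assumes no zero differences and no tied absolute differences, and the cost values are made up for illustration. The function name `wilcoxon_less_exact` is hypothetical; the notebook itself relies on `scipy.stats.wilcoxon` with `alternative='less'`.

```python
from itertools import product


def wilcoxon_less_exact(x, y):
    """Exact one-sided signed-rank p-value for H1: x tends to be below y.

    Sketch only: assumes no zero differences and no tied absolute differences.
    """
    d = [a - b for a, b in zip(x, y)]
    # Rank the absolute differences from smallest (rank 1) to largest (rank n).
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0] * len(d)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    # W+ is the sum of the ranks that belong to positive differences.
    w_plus = sum(r for r, di in zip(ranks, d) if di > 0)
    # Under H0 every sign pattern is equally likely; count those with W+ <= observed.
    n = len(d)
    hits = sum(
        1
        for signs in product((0, 1), repeat=n)
        if sum(r for r, s in zip(range(1, n + 1), signs) if s) <= w_plus
    )
    return w_plus, hits / 2 ** n


# Made-up costs per instance, for illustration only.
original = [100.0, 120.0, 90.0, 110.0, 105.0, 95.0, 130.0, 115.0]
modified = [98.0, 115.2, 91.5, 107.1, 101.3, 94.4, 124.8, 112.9]

w_plus, p_value = wilcoxon_less_exact(modified, original)
print(f"W+ = {w_plus}, one-sided p-value = {p_value:.4f}")
```

A small p-value means that few sign patterns would give a rank sum as low as the observed one, which is evidence that the modified algorithm tends to produce lower costs. The enumeration grows as $2^n$, so this sketch is only practical for a handful of pairs.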


In [4]:
# One-sided paired test per algorithm: the alternative hypothesis is that
# its cost is lower than the cost of the original algorithm (first column).
wilcoxon_results = {}
for algorithm in results.columns[1:]:
    test = wilcoxon(x=results[algorithm], y=results.iloc[:, 0], alternative='less')
    wilcoxon_results[algorithm] = test.pvalue
pd.DataFrame(data=wilcoxon_results.values(),
             index=wilcoxon_results.keys(), columns=['p-value'])


algorithm,p-value
triangular n=50,0.9999997
"geometrical n=50, beta=0.1",0.3243282
"geometrical n=50, beta=0.2",0.0001046269
"geometrical n=50, beta=0.3",2.956246e-06
"geometrical n=50, beta=0.4",2.694395e-07
"geometrical n=50, beta=0.5",3.976442e-07
"geometrical n=50, beta=0.6",3.976442e-07
"geometrical n=50, beta=0.7",3.976442e-07
"geometrical n=50, beta=0.8",5.867514e-07
"geometrical n=50, beta=0.9",4.14905e-06


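In conclusion, with 50 replicates every geometric variant with $\beta \geq 0.2$ achieves a significantly lower cost than the CWS algorithm (all p-values below 0.001). The two exceptions are the triangular distribution (p ≈ 1) and the geometric distribution with $\beta = 0.1$ (p ≈ 0.32), for which there is no evidence of an improvement.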