In [2]:
import os
import pandas as pd

This notebook compares the ANN methods:

- method0: sorted interaction search
- method1: naive interaction search

It is referred to in the chapter "Test Cases & Benchmarking Methodology". The faster of the two (method1) was used in the final comparison with the other search methods.


## Read data and cleanup
Raw results files can be found in sph-nearest-neighbor/test/results/ 

In [14]:
statsfile_fr_ann1 = os.path.join(resultsdir, "stats_fr_ann_method1.csv")
statsfile_fr_ann0 = os.path.join(resultsdir, "stats_fr_ann_method0.csv")

Read the data, label the search method, convert the fill to an integer percentage, keep only the relevant columns, and combine everything into a single dataframe:

In [15]:
# both result sets are labelled "ANN"; the listmethod column tells method0 and method1 apart
df_ann0 = pd.read_csv(statsfile_fr_ann0)
df_ann0 = df_ann0.assign(method="ANN")
df_ann1 = pd.read_csv(statsfile_fr_ann1)
df_ann1 = df_ann1.assign(method="ANN")

# convert fill from a fraction to an integer percentage (astype(int) truncates)
df_ann0['fill'] = df_ann0['fill'].mul(100).astype(int)
df_ann1['fill'] = df_ann1['fill'].mul(100).astype(int)


# drop unused columns from dataframes
df_ann0.drop(columns=["time", "sizex", "sizey", "sizez"], inplace = True)
df_ann1.drop(columns=["time", "sizex", "sizey", "sizez"], inplace = True)

# combine to a single df
df = pd.concat([df_ann0, df_ann1])
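
The fill conversion above relies on `astype(int)`, which truncates rather than rounds. The sketch below illustrates the difference on hypothetical fill fractions (these values are not taken from the results files); whether it matters for the actual data depends on the raw fractions stored in the CSV files.

```python
import pandas as pd

# Hypothetical fill fractions, chosen only to illustrate the effect.
fill = pd.Series([0.02, 0.11, 0.29])

# astype(int) drops the fractional part: 0.29 * 100 is 28.999999999999996 in floating point.
truncated = fill.mul(100).astype(int)

# Rounding first maps each value to the nearest integer percentage.
rounded = fill.mul(100).round().astype(int)

print(truncated.tolist())  # [2, 11, 28]
print(rounded.tolist())    # [2, 11, 29]
```

Switching to `round()` would change the labels only for fractions that land just below an integer or have a fractional percentage, so the grouping keys in the tables of this notebook may or may not be affected.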

## Compare ANN method0 (sorted interaction search) and method1 (naive interaction search)

In [19]:
df[df['method'] == "ANN"].groupby(["fill","filltype","listmethod"])["tprocessing"].describe()

| fill | filltype | listmethod | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|---|---|
| 2 | corners | 0 | 5.0 | 63.17064 | 0.673018 | 62.2415 | 62.8041 | 63.3777 | 63.416 | 64.0139 |
| 2 | corners | 1 | 5.0 | 34.76754 | 0.756158 | 33.6308 | 34.4992 | 34.9827 | 35.0721 | 35.6529 |
| 11 | clusters2 | 0 | 5.0 | 791.7928 | 7.856341 | 786.832 | 786.863 | 787.639 | 792.408 | 805.222 |
| 11 | clusters2 | 1 | 5.0 | 452.388 | 16.671359 | 426.5 | 445.573 | 458.873 | 463.334 | 467.66 |
| 11 | clusters4 | 0 | 5.0 | 441.3558 | 3.963555 | 435.776 | 439.543 | 441.59 | 443.764 | 446.106 |
| 11 | clusters4 | 1 | 5.0 | 255.2726 | 9.156022 | 241.657 | 251.871 | 256.07 | 261.622 | 265.143 |
| 11 | clusters6 | 0 | 5.0 | 223.4602 | 1.27987 | 222.438 | 222.905 | 222.972 | 223.304 | 225.682 |
| 11 | clusters6 | 1 | 5.0 | 138.8876 | 5.465991 | 131.33 | 136.078 | 140.534 | 140.625 | 145.871 |
| 11 | corners | 0 | 5.0 | 1339.178 | 3.740217 | 1335.18 | 1335.78 | 1339.45 | 1341.52 | 1343.96 |
| 11 | corners | 1 | 5.0 | 883.224 | 285.427161 | 748.545 | 751.693 | 754.776 | 767.456 | 1393.65 |


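The mean processing time with method1 is roughly 55% to 66% of that of method0, while memory use is comparable: the mean memory of the two methods differs by less than 1% in every configuration. The large standard deviation for method1 at fill 11 with the corners filltype comes from a single slow run (max 1393.65 against a median of 754.776).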
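
As a rough check of the timing comparison, the sketch below recomputes the method1/method0 ratio per configuration from the mean and median (50%) `tprocessing` values reported by `describe()` above. The `stats` and `ratio` names are introduced here for illustration only; in the notebook the same figures could be derived directly from `df`.

```python
import pandas as pd

# (fill, filltype, listmethod, mean, median) copied from the tprocessing describe() output.
rows = [
    (2, "corners", 0, 63.17064, 63.3777),
    (2, "corners", 1, 34.76754, 34.9827),
    (11, "clusters2", 0, 791.7928, 787.639),
    (11, "clusters2", 1, 452.388, 458.873),
    (11, "clusters4", 0, 441.3558, 441.59),
    (11, "clusters4", 1, 255.2726, 256.07),
    (11, "clusters6", 0, 223.4602, 222.972),
    (11, "clusters6", 1, 138.8876, 140.534),
    (11, "corners", 0, 1339.178, 1339.45),
    (11, "corners", 1, 883.224, 754.776),
]
stats = pd.DataFrame(rows, columns=["fill", "filltype", "listmethod", "mean", "median"])

# One row per (fill, filltype); columns become (statistic, listmethod).
wide = stats.set_index(["fill", "filltype", "listmethod"]).unstack("listmethod")

# Ratio of method1 (listmethod 1) to method0 (listmethod 0); values below 1 mean method1 is faster.
ratio = pd.DataFrame({
    "mean_ratio": wide["mean"][1] / wide["mean"][0],
    "median_ratio": wide["median"][1] / wide["median"][0],
})
print(ratio.round(2))
```

The median is less sensitive than the mean to the one slow method1 run at fill 11 with the corners filltype, which is why the two ratios differ most for that configuration.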

In [20]:
df[df['method'] == "ANN"].groupby(["fill","filltype","listmethod"])["memory"].describe()

| fill | filltype | listmethod | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|---|---|
| 2 | corners | 0 | 5.0 | 16020.0 | 52.153619 | 15940.0 | 15996.0 | 16044.0 | 16052.0 | 16068.0 |
| 2 | corners | 1 | 5.0 | 15921.6 | 74.543947 | 15864.0 | 15868.0 | 15872.0 | 15984.0 | 16020.0 |
| 11 | clusters2 | 0 | 5.0 | 42455.2 | 77.68655 | 42332.0 | 42448.0 | 42456.0 | 42504.0 | 42536.0 |
| 11 | clusters2 | 1 | 5.0 | 42427.2 | 48.033322 | 42344.0 | 42428.0 | 42452.0 | 42452.0 | 42460.0 |
| 11 | clusters4 | 0 | 5.0 | 34302.4 | 40.159681 | 34232.0 | 34312.0 | 34312.0 | 34328.0 | 34328.0 |
| 11 | clusters4 | 1 | 5.0 | 34229.6 | 43.598165 | 34196.0 | 34204.0 | 34204.0 | 34244.0 | 34300.0 |
| 11 | clusters6 | 0 | 5.0 | 27532.8 | 96.556719 | 27400.0 | 27472.0 | 27564.0 | 27584.0 | 27644.0 |
| 11 | clusters6 | 1 | 5.0 | 27540.8 | 56.384395 | 27444.0 | 27544.0 | 27560.0 | 27568.0 | 27588.0 |
| 11 | corners | 0 | 5.0 | 51009.6 | 86.236883 | 50908.0 | 50972.0 | 50980.0 | 51056.0 | 51132.0 |
| 11 | corners | 1 | 5.0 | 51060.0 | 52.839379 | 51016.0 | 51028.0 | 51040.0 | 51068.0 | 51148.0 |
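
To put a number on how close the memory figures are, the following sketch computes the relative difference in mean memory between method1 and method0 for each configuration, using the means from the output above. The names `mem` and `rel_diff_pct` are illustrative; the memory unit is whatever the results files use and is not assumed here.

```python
import pandas as pd

# Mean memory per configuration, copied from the memory describe() output;
# column 0 is method0 (listmethod 0), column 1 is method1 (listmethod 1).
mem = pd.DataFrame(
    {
        0: [16020.0, 42455.2, 34302.4, 27532.8, 51009.6],
        1: [15921.6, 42427.2, 34229.6, 27540.8, 51060.0],
    },
    index=pd.MultiIndex.from_tuples(
        [(2, "corners"), (11, "clusters2"), (11, "clusters4"),
         (11, "clusters6"), (11, "corners")],
        names=["fill", "filltype"],
    ),
)
mem.columns.name = "listmethod"

# Relative difference of method1 with respect to method0, in percent.
rel_diff_pct = (mem[1] - mem[0]) / mem[0] * 100
print(rel_diff_pct.round(2))
```

With these means, the differences stay well below 1% in both directions, which is small compared with the timing gap between the two methods.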
