It appears that the `calc_acc` method marks a sample as correct if ANY of the labels tied at `most_count` is the correct label.
For `k=2` (I think the paper always used `k=2`?), any time there is a 1-1 tie (the top 2 neighbors have different labels), the sample is counted as correct if EITHER label is correct. A "fair" classifier would have to commit to a single prediction before scoring.
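To illustrate the difference, here is a minimal sketch. It is not the code from `experiments.py` or from the fork. The function names and the random tie-break are assumptions chosen for illustration, and a real classifier could break ties differently (for example, by nearest distance).

```python
import random
from collections import Counter


def tied_top_labels(neighbor_labels):
    """Return every label that shares the highest vote count among the k neighbors."""
    counts = Counter(neighbor_labels)
    most_count = max(counts.values())
    return sorted(label for label, c in counts.items() if c == most_count)


def optimistic_correct(neighbor_labels, true_label):
    """Scoring described above: correct if ANY tied top label matches the truth."""
    return true_label in tied_top_labels(neighbor_labels)


def fair_correct(neighbor_labels, true_label, rng):
    """Hypothetical 'fair' scoring: pick one label first, then compare.

    Ties are broken uniformly at random here; this is only one possible rule.
    """
    prediction = rng.choice(tied_top_labels(neighbor_labels))
    return prediction == true_label


if __name__ == "__main__":
    rng = random.Random(0)
    neighbors = ["sports", "politics"]  # k=2 with a 1-1 tie
    truth = "politics"

    print("optimistic:", optimistic_correct(neighbors, truth))
    trials = 10_000
    hits = sum(fair_correct(neighbors, truth, rng) for _ in range(trials))
    print("fair (random tie-break) hit rate:", hits / trials)
```

With a 1-1 tie where one of the two labels is right, the optimistic rule always scores the sample as correct, while the random tie-break scores it as correct only about half the time.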
This fork has my implementation of the kNN accuracy results for comparison. See the file docstring for notes:
https://github.com/kts/npc_gzip/blob/main/calc_acc.py
The code in question: `npc_gzip/experiments.py`, line 116 at commit `a469915`.