
Distance Metrics

nickcdryan edited this page Apr 10, 2018 · 5 revisions

The Nearist hardware supports a variety of distance metrics, including:

  • L1 or "Manhattan" distance
  • Hamming distance
  • Jaccard distance
  • Bit-wise AND
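For binary vectors, the last three metrics in this list can be sketched in a few lines of Python. This is an illustration only, not a description of the Nearist implementation: the function names are hypothetical, bit vectors are packed into Python integers, and the bit-wise AND score is assumed here to be the count of bits set in both vectors, which the page does not specify.

```python
def popcount(value: int) -> int:
    """Count the bits set to 1 in a non-negative integer."""
    return bin(value).count("1")


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return popcount(a ^ b)


def jaccard_distance(a: int, b: int) -> float:
    """1 - |A and B| / |A or B|, treating set bits as set members."""
    union = popcount(a | b)
    if union == 0:
        return 0.0  # convention chosen here: two empty sets are identical
    return 1.0 - popcount(a & b) / union


def and_score(a: int, b: int) -> int:
    """Assumed definition: number of bits set in both a and b.

    Note that this is a similarity (larger means closer), not a distance.
    """
    return popcount(a & b)


a = 0b10110010
b = 0b10010110
print(hamming_distance(a, b))  # 2 differing bits
print(and_score(a, b))         # 3 shared bits
print(jaccard_distance(a, b))  # 1 - 3/5
```

All three reduce to a bit-wise operation followed by a population count, with Jaccard adding one division.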

Nearist hardware supports the L1 distance as an alternative to L2 distance and Cosine similarity. Because L1 distance only requires addition and subtraction, and not multiplication, it can be implemented with fewer transistors in hardware. This allows Nearist to include more distance calculators per VCU, and thereby increases the overall performance of the system.
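The difference in arithmetic cost can be seen by writing the three measures side by side. The following is a minimal software sketch with hypothetical function names, not the hardware implementation: L1 uses only subtraction, absolute value, and addition, while L2 and cosine similarity both need one or more multiplications per dimension.

```python
import math


def l1_distance(x, y):
    """Manhattan distance: subtraction, absolute value, and addition only."""
    return sum(abs(a - b) for a, b in zip(x, y))


def l2_distance(x, y):
    """Euclidean distance: one multiplication per dimension, plus a square root."""
    return math.sqrt(sum((a - b) * (a - b) for a, b in zip(x, y)))


def cosine_similarity(x, y):
    """Cosine similarity: multiplications for the dot product and both norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


x = [1, 5, 2]
y = [4, 1, 2]
print(l1_distance(x, y))  # 3 + 4 + 0 = 7
print(l2_distance(x, y))  # sqrt(9 + 16 + 0) = 5.0
print(cosine_similarity(x, y))
```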

In our experiments on a number of datasets, we observed little or no difference in accuracy when using L1 distance in place of L2 or Cosine. For example, in our MNIST experiment (here and here) we saw the following results:

Dataset   L1       L2       Cosine
MNIST     98.51%   98.51%   98.42%

See here for a more detailed discussion of the MNIST results.

This page discusses the difference between L1 and L2 in detail.

Academic Findings

A number of academic publications have investigated the relative performance of different distance metrics with the k-nearest neighbor (KNN) algorithm. For example, Chomboon et al. (2015) compared eleven distance metrics across eight data distributions and found that L1 (also known as "cityblock") consistently had one of the highest classification accuracies.

Prasath et al. (2017) performed an extensive review of the literature and ran their own analysis comparing XX distance metrics across XX real-world datasets, with and without noise. In their meta-analysis of previous studies, four out of eight publications cited L1 as the best or one of the best distance metrics. The remaining publications that included L1 in their analysis found L1 to be a top performer.
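Comparisons of this kind amount to running the same KNN classifier with the distance function swapped out. A minimal sketch of that setup follows; the function names and the six training points are invented for illustration and are not taken from either study or from the Nearist experiments.

```python
from collections import Counter
import math


def l1(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


def l2(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def knn_predict(train, query, k, metric):
    """Return the majority label among the k training points nearest to query.

    train is a list of (vector, label) pairs; metric is any function that
    takes two vectors and returns a distance.
    """
    nearest = sorted(train, key=lambda item: metric(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data, made up for this example: two well-separated clusters.
train = [
    ((0.0, 0.0), "a"),
    ((1.0, 0.0), "a"),
    ((0.0, 1.0), "a"),
    ((5.0, 5.0), "b"),
    ((6.0, 5.0), "b"),
    ((5.0, 6.0), "b"),
]

query = (1.0, 1.0)
for name, metric in (("L1", l1), ("L2", l2)):
    print(name, knn_predict(train, query, k=3, metric=metric))
```

On data this cleanly separated both metrics return the same label; the studies cited above measure how often that agreement holds on real datasets.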

Chomboon, K., Pasapichi, C., Pongsakorn, T., Kerdprasop, K., & Kerdprasop, N. (2015). An empirical study of distance metrics for k-nearest neighbor algorithm. In The 3rd International Conference on Industrial Application Engineering 2015 (pp. 280–285).

Prasath, V.B., Alfeilat, H.A., Lasassmeh, O., & Hassanat, A.B. (2017). Distance and Similarity Measures Effect on the Performance of K-Nearest Neighbor Classifier - A Review. CoRR, abs/1708.04321.
