-
-
Notifications
You must be signed in to change notification settings - Fork 26.2k
Open
Labels
ModerateAnything that requires some knowledge of conventions and best practicesAnything that requires some knowledge of conventions and best practicesNeeds BenchmarksA tag for the issues and PRs which require some benchmarksA tag for the issues and PRs which require some benchmarksPerformancemodule:metrics
Description
Describe the workflow you want to enable:
KNN usage with Mahalanobis can become rather slow (several seconds per test datapoint) when the feature space is large (1500 features). Even if the training set is small (100s of images)
Describe your proposed solution:
Mahalanobis distance computes d = (x-y)T VI (x-y) for each x in the training set.
If we develop we get d = xT.VI.x + yT.VI.y - xT.VI.y - yT.VI.x
Now the first term could be computed at init_time for all x in training set, as well as xT.VI and VI.x.
This way, at test time:
- yT.VI.y could be computed only once. It is not necessary to compute it for all x.
- for each x in training set, only two dot products are left to compute the distances.
The issue with this is now we cannot have fully parallel (in terms of training set elements) code of distance computation in the KNN implementation.
Describe alternatives you've considered, if relevant
Similar idea can be found here: https://www.ece.ucsb.edu/scl/html/database/RelevanceFeedbackIndexing_ICIP2008.pdf
Metadata
Metadata
Assignees
Labels
ModerateAnything that requires some knowledge of conventions and best practicesAnything that requires some knowledge of conventions and best practicesNeeds BenchmarksA tag for the issues and PRs which require some benchmarksA tag for the issues and PRs which require some benchmarksPerformancemodule:metrics