Description
Currently we have 3 different similarity functions:
- hamming_similarity
- cosine_similarity
- dot_similarity
With the future introduction of complex hypervectors we will likely add a fourth one if we follow the current design. I think, however, that we should provide only one similarity function that changes its behavior based on the dtype of the input tensors. It would also be nice if it handled batched operations, i.e., with input shapes (*, d) and (n, d) the output shape should be (*, n), holding the similarity score of each input sample against each of the n other vectors.
In order to unify the output domain we can stick to the [-1, +1] range that the cosine similarity and the complex variant of cosine similarity produce where 0 means orthogonal, +1 the same, and -1 the exact opposite. We can simply scale the hamming_similarity to fall in this domain.
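The rescaling can be sketched as follows. This is only an illustration, assuming boolean hypervectors and a Hamming similarity measured as the number of matching positions; `scaled_hamming_similarity` is a hypothetical name, not an existing torchhd function.

```python
import torch


def scaled_hamming_similarity(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: map the fraction of matching elements to [-1, +1]."""
    d = a.size(-1)
    # Count the positions where both hypervectors agree.
    matches = (a == b).sum(dim=-1).to(torch.float)
    # matches / d lies in [0, 1]; stretch and shift it to [-1, +1].
    return 2.0 * matches / d - 1.0


a = torch.tensor([True, False, True, True])
b = torch.tensor([True, False, False, False])  # agrees with a in 2 of 4 positions

same = scaled_hamming_similarity(a, a)       # every position matches
opposite = scaled_hamming_similarity(a, ~a)  # no position matches
half = scaled_hamming_similarity(a, b)       # half the positions match
```

With this mapping, identical vectors score +1, exact complements score -1, and vectors that agree in half their positions score 0, which lines up with the interpretation of the cosine similarity.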
The dot_similarity function will then be removed from the library, but the dot product remains available as part of PyTorch and can therefore still be used in specific instances.
API design
import torchhd

x = torchhd.random_hv(10, 10000)  # 10 random hypervectors of dimension 10000
torchhd.functional.similarity(x, x)  # aliased as torchhd.similarity(x, x)
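One possible shape for such a function is sketched below. It is not the library's implementation: the dtype branches (boolean, complex, real), the use of the real part of the normalized Hermitian inner product for complex inputs, and the argument names `input` and `others` are all assumptions made for illustration. It uses only plain PyTorch calls.

```python
import torch
import torch.nn.functional as F


def similarity(input: torch.Tensor, others: torch.Tensor) -> torch.Tensor:
    """Sketch: input of shape (*, d) and others of shape (n, d) give (*, n)."""
    if input.dtype == torch.bool:
        d = input.size(-1)
        # Broadcast (*, 1, d) against (n, d), then count matches per pair.
        matches = (input.unsqueeze(-2) == others).sum(dim=-1).to(torch.float)
        return 2.0 * matches / d - 1.0  # Hamming similarity scaled to [-1, +1]

    if input.is_complex():
        # Assumed complex variant: real part of the normalized Hermitian inner product.
        dot = input @ others.conj().transpose(-2, -1)
        norms = torch.linalg.vector_norm(input, dim=-1, keepdim=True)
        norms = norms * torch.linalg.vector_norm(others, dim=-1)
        return dot.real / norms

    # Real-valued hypervectors: cosine similarity via normalized dot products.
    return F.normalize(input, dim=-1) @ F.normalize(others, dim=-1).transpose(-2, -1)


x = torch.randn(10, 1000)                       # 10 real hypervectors
batch = torch.randn(2, 3, 1000)                 # arbitrary leading batch dims
b = torch.rand(5, 1000) > 0.5                   # 5 boolean hypervectors
c = torch.randn(4, 1000, dtype=torch.cfloat)    # 4 complex hypervectors

print(similarity(x, x).shape)      # pairwise scores among the 10 vectors
print(similarity(batch, x).shape)  # leading dims of the input are preserved
```

Dispatching on the dtype keeps a single entry point for the user, and every branch returns scores in the same [-1, +1] range, so downstream code does not need to know which hypervector type it is handling.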