When determining Bloom filter similarity, the score is computed in C for all NxM combinations, but only the single best match for each entity is returned. This limits the next step, which solves pairwise matches for the entire network at once.
Currently the structure is:
- The index in `filters1`
- The similarity score (between 0 and 1) of the best match
- The original index in entity A
- The original index in entity B
- The index in `filters2` of the best match
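For reference, the five fields above could be written down as a named tuple. This is only an illustrative sketch: the class name `BestMatch` and all field names are hypothetical and do not come from the existing code.

```python
from typing import NamedTuple


class BestMatch(NamedTuple):
    """One best-match record per entity in A (all names are hypothetical)."""

    index_filters1: int    # the index in filters1
    score: float           # similarity score between 0 and 1 of the best match
    original_index_a: int  # the original index in entity A
    original_index_b: int  # the original index in entity B
    index_filters2: int    # the index in filters2 of the best match


# Example record with made-up values.
example = BestMatch(0, 0.92, 17, 42, 3)
```

Because only one such record is kept per entity in A, the other M-1 scores computed for that entity are not available to later steps.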
Instead of computing a tuple for each entity in entity A, we could explore the memory/accuracy trade-off of computing a similarity matrix that records the n-gram similarity score between every pair.
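A minimal pure-Python sketch of the similarity matrix idea follows. It assumes each Bloom filter is stored as a Python `int` used as a bit set, and it assumes the Dice coefficient as the similarity score; neither is confirmed by this issue, and the function names are hypothetical. The real computation is done in C, so this only illustrates the shape of the data.

```python
def popcount(x: int) -> int:
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def dice(a: int, b: int) -> float:
    """Dice coefficient 2|A & B| / (|A| + |B|); assumed score, 0.0 if both are empty."""
    total = popcount(a) + popcount(b)
    return 2.0 * popcount(a & b) / total if total else 0.0


def similarity_matrix(filters1, filters2):
    """Return an N x M list of lists holding the score for every pair."""
    return [[dice(a, b) for b in filters2] for a in filters1]


def best_matches(matrix):
    """Collapse the matrix to one (row index, score, column index) per row."""
    result = []
    for i, row in enumerate(matrix):
        j = max(range(len(row)), key=row.__getitem__)
        result.append((i, row[j], j))
    return result


filters1 = [0b1100, 0b0011]
filters2 = [0b1100, 0b0110, 0b0011]
matrix = similarity_matrix(filters1, filters2)
best = best_matches(matrix)
```

The full matrix keeps every score, so a network-wide solver can consider second-best candidates, at the cost of N x M stored values. With 8-byte floats, 100,000 by 100,000 entities would need about 80 GB, which is the memory side of the trade-off.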
Issue by tho802
Monday Apr 11, 2016 at 16:48 GMT
Originally opened as https://github.csiro.au/magic/AnonymousLinking/issues/1