Following up on #17 and #30, where the performance of the identity merging algorithm was evaluated on 22 different open source stacks: we noticed particularly bad performance on 2 organizations, IBM and Intel, with ~60% precision and recall.
This needs to be investigated because nearly all other organizations are above 90% precision and recall, and we should be able to promise an acceptable score (at least 90%) on all organizations.
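For reference, here is a minimal sketch of one way precision and recall can be computed for an identity merging result. It assumes a pairwise definition (a pair of identities counts as positive when both land in the same merged cluster), which may differ from the evaluation actually used in #17 and #30. The function names and the sample identities are hypothetical.

```python
from itertools import combinations


def merged_pairs(clusters):
    """Return the set of unordered identity pairs that share a cluster."""
    pairs = set()
    for cluster in clusters:
        # Sort so each unordered pair has a single canonical representation.
        for a, b in combinations(sorted(cluster), 2):
            pairs.add((a, b))
    return pairs


def pairwise_precision_recall(predicted, truth):
    """Compare predicted clusters against ground-truth clusters, pair by pair."""
    pred = merged_pairs(predicted)
    gold = merged_pairs(truth)
    true_positives = len(pred & gold)
    # Convention assumed here: an empty set of pairs scores 1.0.
    precision = true_positives / len(pred) if pred else 1.0
    recall = true_positives / len(gold) if gold else 1.0
    return precision, recall


# Hypothetical example: the algorithm wrongly merges b1 with a1/a2
# and fails to merge b1 with b2.
predicted = [{"a1", "a2", "b1"}, {"b2"}]
truth = [{"a1", "a2"}, {"b1", "b2"}]
precision, recall = pairwise_precision_recall(predicted, truth)
```

In this toy case one wrong merge adds two false positive pairs, which illustrates how a single over-merged identity can pull precision down quickly on a large graph.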
It turns out the identity graphs of Intel and IBM were pretty big: 80k and 11k edges respectively. Reducing the proportion of popular names decreased the number of false positives and false negatives, as popular identities tend to be the ones with problems. By increasing the popularity threshold from 5 to 100, we improved our precision and recall from ~62% to 94% for both organizations.