individuals scores are squared
fgregg committed Feb 11, 2015
1 parent 301cf8a commit 594e5a0
Showing 2 changed files with 2 additions and 2 deletions.
2 changes: 1 addition & 1 deletion dedupe/clustering.py
@@ -169,7 +169,7 @@ def cluster(dupes, threshold=.5, max_components=30000):
     return clustering.values()
 
 def confidences(items, distances) :
-    scores = numpy.sum(distances[items, :][:, items], 0)
+    scores = numpy.sum(distances[items, :][:, items]**2, 0)
     scores /= len(items) - 1
     scores = 1 - scores
     return scores
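
The effect of this one-line change can be sketched on a toy distance matrix. This is a minimal illustration, not the library's code path: the `squared` flag and the sample matrix are assumptions added only to compare the behavior before and after the commit, and the sketch assumes distances lie in [0, 1] with a zero diagonal.

```python
import numpy


def confidences(items, distances, squared=True):
    """Per-record confidence within one cluster.

    `squared` is a hypothetical switch, not part of dedupe; it exists
    here only to compare the pre-commit (False) and post-commit (True)
    computations side by side.
    """
    # Pairwise distances restricted to the members of the cluster.
    sub = distances[items, :][:, items]
    if squared:
        sub = sub ** 2
    # Column sums; the zero diagonal contributes nothing.
    scores = numpy.sum(sub, 0)
    # Average over the other len(items) - 1 members of the cluster.
    scores /= len(items) - 1
    return 1 - scores


# Assumed sample data: four records, the first three form one cluster.
distances = numpy.array([
    [0.0, 0.2, 0.4, 0.9],
    [0.2, 0.0, 0.6, 0.9],
    [0.4, 0.6, 0.0, 0.9],
    [0.9, 0.9, 0.9, 0.0],
])
items = [0, 1, 2]

before = confidences(items, distances, squared=False)  # approx. [0.7, 0.6, 0.5]
after = confidences(items, distances, squared=True)    # approx. [0.9, 0.8, 0.74]
print(before, after)
```

Under these assumptions, squaring shrinks every distance below 1, so all confidences rise, and small distances shrink proportionally more than large ones. Note that a cluster with a single member would divide by zero in either version.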
2 changes: 1 addition & 1 deletion setup.py
@@ -31,7 +31,7 @@
 setup(
     name='dedupe',
     url='https://github.com/datamade/dedupe',
-    version='0.7.7.0.0',
+    version='0.7.7.0.1',
     description='A python library for accurate and scaleable data deduplication and entity-resolution',
     packages=['dedupe', 'dedupe.distance', 'dedupe.variables'],
     ext_modules=[Extension('dedupe.cpredicates', ['src/cpredicates.c'])],