Merge pull request #1133 from dedupeio/fix_levenshtein_search_dep
Fix levenshtein search dep
fgregg committed Jan 17, 2023
2 parents 0b0b324 + 4293efb commit 543b644
Showing 4 changed files with 13 additions and 26 deletions.
23 changes: 5 additions & 18 deletions .github/workflows/pythonpackage.yml
@@ -49,12 +49,11 @@ jobs:
run: pip install -r requirements.txt
- name: pytest
run: pytest
- - name: Submit to coveralls
- run: coveralls --service=github
- env:
- COVERALLS_PARALLEL: true
- GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- COVERALLS_FLAG_NAME: job-${{ matrix.os}}-${{ matrix.python-version }}
+ - name: Code Coverage
+ uses: codecov/codecov-action@v3
+ with:
+ flags: job-${{ matrix.os}}-${{ matrix.python-version }}
+ verbose: true # optional (default = false)
- name: Integration tests
# Do everything twice: The first time is training and generates settings,
# the second time it tests using a static settings file.
@@ -66,18 +65,6 @@ jobs:
python benchmarks/benchmarks/canonical_matching.py
python benchmarks/benchmarks/canonical_gazetteer.py
python benchmarks/benchmarks/canonical_gazetteer.py
- coveralls_finish:
- name: Indicate completion to coveralls.io
- needs: test
- runs-on: ubuntu-latest
- container: python:3-slim
- steps:
- - name: Finished
- run: |
- pip3 install --upgrade coveralls
- coveralls --service=github --finish
- env:
- GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
settings_file_persists:
runs-on: ubuntu-latest
steps:
2 changes: 1 addition & 1 deletion README.md
@@ -1,6 +1,6 @@
# Dedupe Python Library

- [![Tests Passing](https://github.com/dedupeio/dedupe/workflows/tests/badge.svg)](https://github.com/dedupeio/dedupe/actions?query=workflow%3Atests)[![Coverage Status](https://coveralls.io/repos/github/dedupeio/dedupe/badge.svg)](https://coveralls.io/github/dedupeio/dedupe)
+ [![Tests Passing](https://github.com/dedupeio/dedupe/workflows/tests/badge.svg)](https://github.com/dedupeio/dedupe/actions?query=workflow%3Atests)[![codecov](https://codecov.io/gh/dedupeio/dedupe/branch/master/graph/badge.svg?token=aauKUrTEgh)](https://codecov.io/gh/dedupeio/dedupe)

_dedupe is a python library that uses machine learning to perform fuzzy matching, deduplication and entity resolution quickly on structured data._

8 changes: 4 additions & 4 deletions dedupe/core.py
@@ -94,21 +94,21 @@ def fieldDistance(self, record_pairs: RecordPairs) -> None:
if not mask.any():
return
scores = scores[mask]
- record_ids = numpy.array(record_ids)[mask]
+ record_id_array = numpy.array(record_ids)[mask]

with self.offset.get_lock():
fp: Scores
fp = numpy.memmap(
self.score_file_path,
dtype=self.dtype,
offset=self.offset.value,
- shape=(len(record_ids),),
+ shape=(len(record_id_array),),
)
- fp["pairs"] = record_ids
+ fp["pairs"] = record_id_array
fp["score"] = scores
fp.flush()

- self.offset.value += len(record_ids) * self.dtype.itemsize
+ self.offset.value += len(record_id_array) * self.dtype.itemsize


def scoreDuplicates(
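
The hunk above writes each batch of scores into a memory-mapped file at a shared byte offset, then advances that offset by the number of bytes written. A minimal, self-contained sketch of that pattern follows. It is not the dedupe implementation: the `SCORE_DTYPE` layout, the `append_scores` helper, the pre-sized scratch file, and the sample ids and scores are all assumptions made for illustration.

```python
import multiprocessing
import os
import tempfile

import numpy

# Assumed record layout (not taken from the diff): two int64 ids and one float32 score.
SCORE_DTYPE = numpy.dtype([("pairs", "i8", 2), ("score", "f4")])


def append_scores(path, offset, record_ids, scores):
    """Hypothetical helper: write one batch at the shared offset, then advance it."""
    record_id_array = numpy.array(record_ids)
    # Hold the lock so no other writer can claim the same byte range.
    with offset.get_lock():
        fp = numpy.memmap(
            path,
            dtype=SCORE_DTYPE,
            mode="r+",
            offset=offset.value,
            shape=(len(record_id_array),),
        )
        fp["pairs"] = record_id_array
        fp["score"] = scores
        fp.flush()
        # The next batch starts right after the bytes just written.
        offset.value += len(record_id_array) * SCORE_DTYPE.itemsize


# Illustrative data only.
batches = [
    ([(1, 2), (3, 4)], [0.5, 0.25]),
    ([(5, 6)], [0.75]),
]
total_records = sum(len(ids) for ids, _ in batches)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "scores.bin")
    # Pre-size the file so every write lands inside it (a simplification for this sketch).
    with open(path, "wb") as f:
        f.write(b"\0" * (total_records * SCORE_DTYPE.itemsize))

    offset = multiprocessing.Value("Q", 0)  # shared unsigned 64-bit byte counter
    for ids, batch_scores in batches:
        append_scores(path, offset, ids, batch_scores)

    # Read everything back as one structured array.
    result = numpy.fromfile(path, dtype=SCORE_DTYPE)
```

Keeping the converted array under its own name (`record_id_array`) rather than rebinding `record_ids` mirrors the rename in the diff: the input sequence and the NumPy array stay distinct objects with distinct types.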
6 changes: 3 additions & 3 deletions pyproject.toml
@@ -1,7 +1,7 @@
[project]
name = "dedupe"
description = "A python library for accurate and scaleable data deduplication and entity-resolution"
- version = "2.0.20"
+ version = "2.0.21"
readme = "README.md"
requires-python = ">=3.7"
license = {file = "LICENSE"}
@@ -36,7 +36,7 @@ dependencies = [
"haversine>=0.4.1",
"BTrees>=4.1.4",
"zope.index",
- "Levenshtein_search==1.4.5",
+ "dedupe_Levenshtein_search",
"typing_extensions",
]

@@ -68,7 +68,7 @@ check_untyped_defs = true

[tool.pytest.ini_options]
minversion = "7.1"
- addopts = "--cov dedupe --cov-report html"
+ addopts = "--cov dedupe --cov-report xml"
testpaths = ["tests", "dedupe"]

[tool.isort]
