TypeError during training should yield more helpful error message #565

Closed
rlvoyer opened this issue May 18, 2017 · 3 comments

Comments

rlvoyer commented May 18, 2017

My hunch is that this is a result of not having any positive examples in my training set (will confirm), but either way this should produce a more helpful error message:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-49-9a80de5295c7> in <module>()
----> 1 deduper.train()

/Users/robertvoyer/anaconda3/envs/entity-search/lib/python3.6/site-packages/dedupe/api.py in train(self, recall, index_predicates)
    669         self.classifier.fit(self.data_model.distances(examples), y)
    670
--> 671         self._trainBlocker(recall, index_predicates)
    672
    673     def _trainBlocker(self, recall, index_predicates):  # pragma: no cover

/Users/robertvoyer/anaconda3/envs/entity-search/lib/python3.6/site-packages/dedupe/api.py in _trainBlocker(self, recall, index_predicates)
    680
    681         self.predicates = block_learner.learn(matches,
--> 682                                               recall)
    683
    684         self.blocker = blocking.Blocker(self.predicates)

/Users/robertvoyer/anaconda3/envs/entity-search/lib/python3.6/site-packages/dedupe/training.py in learn(self, matches, recall)
     31         comparison_count = self.comparisons(self.total_cover, compound_length)
     32
---> 33         coverable_dupes = set.union(*viewvalues(dupe_cover))
     34         uncoverable_dupes = [pair for i, pair in enumerate(matches)
     35                              if i not in coverable_dupes]

TypeError: descriptor 'union' of 'set' object needs an argument

rlvoyer commented May 22, 2017

Confirmed: the error went away after adding some positive examples to my training set.
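
For anyone who lands here, the failure can be reproduced without dedupe at all. This is a minimal sketch, assuming `dupe_cover` ends up as an empty dict when the training set has no positive examples (the name comes from the traceback above; the empty-dict assumption is inferred from the error, and `union_of_covers` is a hypothetical helper, not part of dedupe):

```python
# Hypothetical stand-in for dedupe's dupe_cover when there are no matches.
dupe_cover = {}


def union_of_covers(cover):
    """Mirror the failing call: unpack every value into the unbound set.union."""
    return set.union(*cover.values())


try:
    union_of_covers(dupe_cover)
    raised = False
except TypeError:
    # With nothing to unpack, set.union receives no arguments at all,
    # not even the set it should be called on.
    raised = True

print("TypeError raised:", raised)
```

With at least one value in the dict the same call works, which is consistent with the error disappearing once positive examples were added.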


alexing commented Jun 2, 2019

Is this getting fixed? It's pretty annoying

lokhande-vishnu commented

You can fix it by making a change like this:

    if len(viewvalues(dupe_cover)) != 0:
        coverable_dupes = set.union(*viewvalues(dupe_cover))
    else:
        coverable_dupes = set()
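
A slightly shorter variant of the same idea, plus the kind of up-front check the issue title asks for. This is only a sketch under assumptions: `coverable_indices` and `check_has_matches` are hypothetical names, the error wording is made up, and this is not necessarily the change that was eventually committed to dedupe.

```python
def coverable_indices(dupe_cover):
    """Union of all cover sets; returns an empty set when dupe_cover is empty.

    Calling union on an empty set instance avoids the unbound set.union
    call that fails when there is nothing to unpack.
    """
    return set().union(*dupe_cover.values())


def check_has_matches(matches):
    """Fail early with a readable message instead of a bare TypeError."""
    if not matches:
        raise ValueError(
            "No positive (match) examples in the training data; "
            "label at least one pair as a match before training."
        )


print(coverable_indices({}))                            # empty cover
print(coverable_indices({"p1": {0, 1}, "p2": {1, 2}}))  # non-empty cover
```

The first helper makes the empty case harmless, while the second one tells the user what is actually wrong with the training set.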

@fgregg fgregg closed this as completed in 8cc41b0 Jan 20, 2022
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 8, 2022