TypeError during training should yield more helpful error message #565

Open
rlvoyer opened this Issue May 18, 2017 · 1 comment

rlvoyer commented May 18, 2017

My hunch is that this is a result of not having any positive examples in my training set (will confirm), but either way it should produce a more helpful error message:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-49-9a80de5295c7> in <module>()
----> 1 deduper.train()

/Users/robertvoyer/anaconda3/envs/entity-search/lib/python3.6/site-packages/dedupe/api.py in train(self, recall, index_predicates)
    669         self.classifier.fit(self.data_model.distances(examples), y)
    670
--> 671         self._trainBlocker(recall, index_predicates)
    672
    673     def _trainBlocker(self, recall, index_predicates):  # pragma: no cover

/Users/robertvoyer/anaconda3/envs/entity-search/lib/python3.6/site-packages/dedupe/api.py in _trainBlocker(self, recall, index_predicates)
    680
    681         self.predicates = block_learner.learn(matches,
--> 682                                               recall)
    683
    684         self.blocker = blocking.Blocker(self.predicates)

/Users/robertvoyer/anaconda3/envs/entity-search/lib/python3.6/site-packages/dedupe/training.py in learn(self, matches, recall)
     31         comparison_count = self.comparisons(self.total_cover, compound_length)
     32
---> 33         coverable_dupes = set.union(*viewvalues(dupe_cover))
     34         uncoverable_dupes = [pair for i, pair in enumerate(matches)
     35                              if i not in coverable_dupes]

TypeError: descriptor 'union' of 'set' object needs an argument

rlvoyer commented May 22, 2017

Confirmed: the error went away after adding some positive examples to my training set.
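
For reference, the bare `TypeError` comes from calling `set.union` with no arguments, which is what `set.union(*viewvalues(dupe_cover))` does when `dupe_cover` is empty. Below is a minimal sketch of a friendlier check. The helper name `coverable_indices` and the wording of the message are hypothetical, not part of dedupe's API, and a plain dict stands in for `dupe_cover`:

```python
def coverable_indices(dupe_cover):
    """Return the union of all match indices covered by any predicate.

    Hypothetical helper, not dedupe's actual code: it raises a descriptive
    ValueError when there is nothing to union, instead of letting
    set.union() fail with a bare TypeError.
    """
    covers = list(dupe_cover.values())
    if not covers:
        raise ValueError(
            "No blocking predicate covers any labeled duplicate pair. "
            "Does the training data contain any positive (match) examples?"
        )
    return set.union(*covers)


# Reproduce the original failure: unpacking an empty collection
# calls set.union() with no arguments at all.
try:
    set.union(*{}.values())
except TypeError as exc:
    original_error = exc

# The guarded version fails with an actionable message instead.
try:
    coverable_indices({})
except ValueError as exc:
    helpful_error = exc

# With at least one cover, the behavior matches the plain union.
covered = coverable_indices({"pred_a": {0, 1}, "pred_b": {1, 2}})
```

An alternative would be `set().union(*covers)`, which returns an empty set rather than raising, but that would likely hide the missing positive examples instead of reporting them.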