Fix bug in Ngram splitting logic
Rather than returning the TemporarySpan, along with its splits, Snorkel
was returning the TemporarySpan twice, and only the 2nd split. Hiromu
Hota fixed this bug in Fonduer in [1]. This commit fixes it for Snorkel.

[1] HazyResearch/fonduer#108

Co-authored-by: Hiromu Hota <hiromu.hota@hal.hitachi.com>
lukehsiao and Hiromu Hota committed Aug 20, 2018
1 parent 687cb62 commit a06c782
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion snorkel/candidates.py
@@ -172,7 +172,7 @@ def apply(self, context):
                     ts1 = TemporarySpan(char_start=start, char_end=start + m.start(1) - 1, sentence=context)
                     if ts1 not in seen:
                         seen.add(ts1)
-                        yield ts
+                        yield ts1
                     ts2 = TemporarySpan(char_start=start + m.end(1), char_end=end, sentence=context)
                     if ts2 not in seen:
                         seen.add(ts2)
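
For illustration, the effect of the one-line change can be reproduced with a minimal, self-contained sketch. This is not Snorkel's code: `Span` is a hypothetical stand-in for `TemporarySpan`, `split_spans` is a hypothetical helper that mirrors only the lines shown in the hunk, and the default `split_rgx` and the `buggy` flag are assumptions added for the demonstration.

```python
import re
from collections import namedtuple

# Hypothetical stand-in for Snorkel's TemporarySpan (hashable, compared by value).
Span = namedtuple("Span", ["char_start", "char_end"])


def split_spans(text, start, end, split_rgx=r"([-/])", buggy=False):
    """Yield the full span, then the parts on either side of the first split match."""
    seen = set()
    ts = Span(start, end)
    seen.add(ts)
    yield ts

    # Look for a split character inside the span (group 1 is the separator).
    m = re.search(split_rgx, text[start:end + 1])
    if m is None:
        return
    ts1 = Span(start, start + m.start(1) - 1)
    if ts1 not in seen:
        seen.add(ts1)
        # Before the fix, the full span `ts` was yielded here instead of `ts1`.
        yield ts if buggy else ts1
    ts2 = Span(start + m.end(1), end)
    if ts2 not in seen:
        seen.add(ts2)
        yield ts2


# Old behaviour: the full span appears twice and the first split is missing.
print(list(split_spans("foo-bar", 0, 6, buggy=True)))
# Fixed behaviour: the full span, then both splits.
print(list(split_spans("foo-bar", 0, 6)))
```

With `buggy=True` the generator emits the span for "foo-bar" twice followed by the span for "bar"; with the fix it emits "foo-bar", "foo", and "bar" once each, which matches the behaviour described in the commit message.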