Skip TemporarySpan if it is an empty string #112

Merged: 1 commit, Aug 21, 2018


HiromuHota
Contributor

When the text is "BC548BG-", Ngram.apply currently yields TemporarySpans for "BC548BG-", "BC548BG", and "" (an empty string).
This patch skips any TemporarySpan whose text is an empty string.
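
A minimal sketch of the behavior described above, not Fonduer's actual code. The function name `split_spans`, the split tokens, and the `(char_start, char_end, text)` tuples with an inclusive `char_end` are all assumptions made for illustration. It yields the full text plus the pieces on either side of each split token, and skips any piece that is empty:

```python
import re
from typing import Iterator, Sequence, Tuple

# Hypothetical span representation: (char_start, char_end, text), char_end inclusive.
Span = Tuple[int, int, str]


def split_spans(text: str, split_tokens: Sequence[str] = ("-", "/")) -> Iterator[Span]:
    """Yield the whole text, then the pieces around each split token.

    Empty pieces are skipped, so a text that ends (or starts) with a
    split token never produces a span with invalid offsets.
    """
    if not text:
        return
    yield 0, len(text) - 1, text

    pattern = "|".join(re.escape(tok) for tok in split_tokens)
    for match in re.finditer(pattern, text):
        left = text[: match.start()]
        right = text[match.end():]
        for start, piece in ((0, left), (match.end(), right)):
            if not piece:
                continue  # the fix: do not yield an empty span
            yield start, start + len(piece) - 1, piece


print(list(split_spans("BC548BG-")))
# → [(0, 7, 'BC548BG-'), (0, 6, 'BC548BG')]
```

Without the `if not piece` check, the trailing "-" would produce a third span with empty text and a `char_end` smaller than its `char_start`.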

Contributor

lukehsiao left a comment

LGTM

lukehsiao added the "bug" label (Something isn't working) on Aug 21, 2018
lukehsiao added this to the v0.3.0 milestone on Aug 21, 2018
lukehsiao added a commit to snorkel-team/snorkel that referenced this pull request Aug 21, 2018
A bug occurs if the text of the span ends in one of the split tokens.
For example, "BC546-" will try to yield "BC546-", "BC546", and an empty
span with invalid char_start and char_end. This stops it from yielding
the empty span.

See HazyResearch/fonduer#112.

Co-authored-by: Hiromu Hota <hiromu.hota@hal.hitachi.com>
lukehsiao self-assigned this on Aug 21, 2018
lukehsiao merged commit aadee4c into HazyResearch:master on Aug 21, 2018
HiromuHota deleted the fix/skipemptytempspan branch on August 21, 2018 22:50