get_col_ngrams and get_cell_ngrams return inconsistent result when a mention is not tabular #471

Closed
HiromuHota opened this issue Jun 30, 2020 · 0 comments · Fixed by #504
Labels
clean-up Cleaning up the code or refactoring
Milestone

Comments

@HiromuHota
Contributor

HiromuHota commented Jun 30, 2020

Description of the bug

get_col_ngrams and get_cell_ngrams from fonduer.utils.data_model_utils.tabular return inconsistent results when a mention is not tabular.

Given a mention where mention.get_span() == "Sample" and mention.is_tabular() == False, get_col_ngrams(mention) returns [] while get_cell_ngrams(mention) returns ["markdown"].

To Reproduce

See #470

Expected behavior

There could be four approaches:

  1. Return [""]
  2. Return [] (like get_col_ngrams)
  3. Return ["markdown"] (like get_cell_ngrams)
  4. Raise a ValueError
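
To make the trade-off concrete, here is a minimal sketch of approaches 2 and 4. It does not use Fonduer itself: FakeMention, cell_ngrams_empty, and cell_ngrams_strict are hypothetical names invented for illustration, and the real mention classes and tabular helpers may behave differently.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class FakeMention:
    """Hypothetical stand-in for a mention; NOT the real Fonduer class."""

    span: str
    # Words of the enclosing table cell; None means the mention is not in a table.
    cell_words: Optional[List[str]] = None

    def is_tabular(self) -> bool:
        return self.cell_words is not None


def cell_ngrams_empty(mention: FakeMention) -> Iterator[str]:
    """Approach 2: yield nothing when the mention is not tabular."""
    if not mention.is_tabular():
        return  # the generator ends immediately, so list(...) == []
    for word in mention.cell_words:
        yield word.lower()


def cell_ngrams_strict(mention: FakeMention) -> List[str]:
    """Approach 4: refuse non-tabular mentions with a ValueError."""
    if not mention.is_tabular():
        raise ValueError(f"Mention {mention.span!r} is not tabular")
    return [word.lower() for word in mention.cell_words]


non_tabular = FakeMention(span="Sample")
in_table = FakeMention(span="Sample", cell_words=["Sample", "Value"])

empty_result = list(cell_ngrams_empty(non_tabular))
table_result = list(cell_ngrams_empty(in_table))
```

With approach 2, callers that loop over the result need no special handling for non-tabular mentions. Approach 4 surfaces misuse early, but every caller must check is_tabular() first or catch the exception.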

Error Logs/Screenshots

Not a bug, but inconsistent return values among tabular util functions.

Environment (please complete the following information)

  • Fonduer Version: 0.8.2

Additional context

N/A

HiromuHota pushed a commit to HiromuHota/fonduer that referenced this issue Sep 8, 2020
@lukehsiao lukehsiao added this to the v0.8.3 milestone Sep 11, 2020
@lukehsiao lukehsiao added the clean-up Cleaning up the code or refactoring label Sep 11, 2020
lukehsiao pushed a commit that referenced this issue Sep 11, 2020
…ghbor_cell_ngrams yield nothing if mention is not tabular (#504)

Fixes #471.