Skip to content
This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Better multi-word predicates in Open IE predictors #1759

Merged
merged 33 commits into from
Sep 17, 2018

Conversation

gabrielStanovsky
Copy link
Contributor

Consolidate multiword predicate (e.g., "decided to run"), in which case we don't need to return the embedded predicate ("run").

Copy link
Contributor

@joelgrus joelgrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mostly stylistic things, plus one logic question

"""
Return the word indices of a predicate in BIO tags.
"""
return [ind for (ind, tag)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you don't have to use parens here

another nit: you could probably fit this on one line (and if you can't, the in is a weird place to split it)

"""
# Get predicate word indices from both predictions
pred_ind1, pred_ind2 = map(get_predicate_indices,
[tags1, tags2])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an unusual use of map and makes it less obvious what you're doing. just do

pred_ind1 = get_predicate_indices(tags1)
pred_ind2 = get_predicate_indices(tags2)

it's the same number of lines and much clearer

"""
# Allow tags1 to add elements to tags2
return [tag2 if (tag2 != 'O')\
else tag1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put this else on the previous line, it will make the code clearer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the parens aren't really necessary here either

the embedded predicate ("run").
"""
pred_dict: Dict[str, List[str]] = {}
merged_outputs = list(map(join_mwp, outputs))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I am not a fan of list(map, just use a list comprehension instead

return " ".join([sent_tokens[pred_id].text
for pred_id in get_predicate_indices(tags)])

def check_predicates_subsumed(tags1: List[str], tags2: List[str]) -> bool:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the check in the name, it makes it read awkward when you use it later. I'd call it something like predicates_are_subsumed


# Else - check if this predicate if subsumed by another predicate
for pred2_text, tags2 in zip(predicate_texts, merged_outputs):
if (tags1 != tags2) and check_predicates_subsumed(tags1, tags2):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see, if it were called predicates_are_subsumed, this line would read much clearer

also nit: you don't need parens here

for (tag1, tag2) in zip(tags1, tags2)]

def consolidate_predictions(outputs: List[List[str]], sent_tokens: List[Token]) -> Dict[str, List[str]]:
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to have (say) predicate_a < predicate_b < predicate_c ?
it seems like your logic wouldn't work in that case

@gabrielStanovsky
Copy link
Contributor Author

@joelgrus I addressed your style comments, and changed the predicate merging logic to be able to merge more than 2 predicates (also added a test for that).

PTAL, thanks!

# Consolidate
pred_dict = consolidate_predictions(predictions, sent_tokens)

# Check that only "decided to join" is left
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

pred_ind2 = get_predicate_indices(tags2)

# Return if pred_ind1 is contained in pred_ind2
return any(set.intersection(set(pred_ind1), set(pred_ind2)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this returns true if they intersect, is that ok?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's ok, maybe just the docstring and comment are wrong?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, intersection is what I was going for. The docstrings were wrong.

def merge_overlapping_predictions(tags1: List[str], tags2: List[str]) -> List[str]:
"""
Merge two predictions into one. Assumes the predicate in tags1 are contained in
the predicate of tags2.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is "contained in" correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same, leftover docstring. I fixed that.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants