Better multi-word predicates in Open IE predictors #1759

gabrielStanovsky · 2018-09-12T17:43:28Z

Consolidate multi-word predicates (e.g., "decided to run"). When a predicate is consolidated this way, we don't need to return the embedded predicate ("run") separately.

joelgrus

mostly stylistic things, plus one logic question

joelgrus · 2018-09-12T18:28:52Z

allennlp/predictors/open_information_extraction.py

+    """
+    Return the word indices of a predicate in BIO tags.
+    """
+    return [ind for (ind, tag)


nit: you don't have to use parens here

another nit: you could probably fit this on one line (and if you can't, the `in` is a weird place to split it)
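Taken together, the two nits point at something like the sketch below. The quoted hunk cuts off before the filter condition, so the `tag.endswith("-V")` test is an assumption, as is the BIO tag format (`B-V`, `I-V`, `B-ARG0`, `O`).

```python
from typing import List


def get_predicate_indices(tags: List[str]) -> List[int]:
    """Return the word indices of a predicate in BIO tags."""
    # No parentheses around the loop variables, and the comprehension fits on one line.
    # Assumption: predicate words carry tags ending in "-V" (e.g. "B-V", "I-V").
    return [ind for ind, tag in enumerate(tags) if tag.endswith("-V")]


tags = ["B-ARG0", "B-V", "I-V", "I-V", "O"]
print(get_predicate_indices(tags))  # [1, 2, 3]
```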

joelgrus · 2018-09-12T18:31:23Z

allennlp/predictors/open_information_extraction.py

+    """
+    # Get predicate word indices from both predictions
+    pred_ind1, pred_ind2 = map(get_predicate_indices,
+                               [tags1, tags2])


this is an unusual use of map and makes it less obvious what you're doing. just do

pred_ind1 = get_predicate_indices(tags1)
pred_ind2 = get_predicate_indices(tags2)

it's the same number of lines and much clearer

joelgrus · 2018-09-12T18:31:58Z

allennlp/predictors/open_information_extraction.py

+    """
+    # Allow tags1 to add elements to tags2
+    return [tag2 if (tag2 != 'O')\
+            else tag1


put this else on the previous line, it will make the code clearer

nit: the parens aren't really necessary here either
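With both suggestions applied, the expression might look like the sketch below. The helper name `merge_tags` and the example tags are hypothetical; only the conditional expression and the `zip(tags1, tags2)` loop come from the quoted hunks.

```python
from typing import List


def merge_tags(tags1: List[str], tags2: List[str]) -> List[str]:
    """Hypothetical helper: keep tags2 wherever it is labeled, else fall back to tags1."""
    # The whole conditional sits on one line, with no parentheses or backslash.
    return [tag2 if tag2 != "O" else tag1 for tag1, tag2 in zip(tags1, tags2)]


tags1 = ["B-ARG0", "B-V", "O", "O"]
tags2 = ["O", "O", "B-V", "I-V"]
print(merge_tags(tags1, tags2))  # ['B-ARG0', 'B-V', 'B-V', 'I-V']
```

Note that a plain position-wise merge like this can leave two `B-V` tags in a row, so the BIO prefixes may need cleaning up afterwards.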

joelgrus · 2018-09-12T18:33:17Z

allennlp/predictors/open_information_extraction.py

+    the embedded predicate ("run").
+    """
+    pred_dict: Dict[str, List[str]] = {}
+    merged_outputs = list(map(join_mwp, outputs))


nit: I am not a fan of `list(map(...))`, just use a list comprehension instead
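For comparison, here is a small sketch of the two spellings side by side. The body of `join_mwp` is not shown in this review, so the version below is only a stand-in that copies its input; the point is that both forms produce the same list.

```python
from typing import List


def join_mwp(tags: List[str]) -> List[str]:
    """Stand-in for the PR's join_mwp; the real behavior is not shown in this review."""
    return list(tags)


outputs = [["B-V", "O"], ["O", "B-V"]]

# The form used in the hunk.
merged_with_map = list(map(join_mwp, outputs))

# The suggested form: same result, and the per-item call is spelled out.
merged_with_comprehension = [join_mwp(output) for output in outputs]

print(merged_with_map == merged_with_comprehension)  # True
```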

joelgrus · 2018-09-12T18:35:30Z

allennlp/predictors/open_information_extraction.py

+    return " ".join([sent_tokens[pred_id].text
+                     for pred_id in get_predicate_indices(tags)])
+
+def check_predicates_subsumed(tags1: List[str], tags2: List[str]) -> bool:


I don't like the `check` in the name; it reads awkwardly when you use it later. I'd call it something like `predicates_are_subsumed`

joelgrus · 2018-09-12T18:36:01Z

allennlp/predictors/open_information_extraction.py

+
+        # Else - check if this predicate is subsumed by another predicate
+        for pred2_text, tags2 in zip(predicate_texts, merged_outputs):
+            if (tags1 != tags2) and check_predicates_subsumed(tags1, tags2):


see, if it were called predicates_are_subsumed, this line would read much clearer

also nit: you don't need parens here

joelgrus · 2018-09-12T18:42:29Z

allennlp/predictors/open_information_extraction.py

+            for (tag1, tag2) in zip(tags1, tags2)]
+
+def consolidate_predictions(outputs: List[List[str]], sent_tokens: List[Token]) -> Dict[str, List[str]]:
+    """


is it possible to have (say) predicate_a < predicate_b < predicate_c ?
it seems like your logic wouldn't work in that case
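To make the concern concrete, here is a minimal sketch of one way to handle chains of predicates: keep merging any two predictions whose predicates overlap until nothing changes. This is not the PR's implementation (which is not shown in this review). All helper names are hypothetical, and predicate words are assumed to carry tags ending in `-V`.

```python
from typing import List, Set


def predicate_indices(tags: List[str]) -> Set[int]:
    # Assumption: predicate words carry tags ending in "-V".
    return {ind for ind, tag in enumerate(tags) if tag.endswith("-V")}


def merge(tags1: List[str], tags2: List[str]) -> List[str]:
    # Keep tags2 wherever it is labeled, else fall back to tags1.
    return [tag2 if tag2 != "O" else tag1 for tag1, tag2 in zip(tags1, tags2)]


def merge_until_stable(predictions: List[List[str]]) -> List[List[str]]:
    """Repeatedly merge overlapping predictions until no two overlap."""
    merged = [list(tags) for tags in predictions]
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if predicate_indices(merged[i]) & predicate_indices(merged[j]):
                    merged[i] = merge(merged[i], merged[j])
                    del merged[j]
                    changed = True
                    break
            if changed:
                break  # restart the scan, since the list was modified
    return merged


# "John decided to try to run": run < try to run < decided to try to run
pred_a = ["O", "O", "O", "O", "O", "B-V"]
pred_b = ["O", "O", "O", "B-V", "I-V", "I-V"]
pred_c = ["B-ARG0", "B-V", "I-V", "I-V", "I-V", "I-V"]

result = merge_until_stable([pred_a, pred_b, pred_c])
print(len(result))  # 1
```

A single pairwise pass would merge `pred_a` into `pred_b` but could still return `pred_b` and `pred_c` separately; rescanning after every merge is what collapses the whole chain.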

gabrielStanovsky · 2018-09-12T22:58:27Z

@joelgrus I addressed your style comments, and changed the predicate merging logic to be able to merge more than 2 predicates (also added a test for that).

PTAL, thanks!

joelgrus · 2018-09-12T23:19:59Z

allennlp/tests/predictors/open_information_extraction_test.py

+        # Consolidate
+        pred_dict = consolidate_predictions(predictions, sent_tokens)
+
+        # Check that only "decided to join" is left


fix comment

joelgrus · 2018-09-14T17:56:27Z

allennlp/predictors/open_information_extraction.py

+    pred_ind2 = get_predicate_indices(tags2)
+
+    # Return if pred_ind1 is contained in pred_ind2
+    return any(set.intersection(set(pred_ind1), set(pred_ind2)))


this returns true if they intersect, is that ok?

I think it's ok, maybe just the docstring and comment are wrong?

Yes, intersection is what I was going for. The docstrings were wrong.
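The difference between the two readings can be sketched as follows. The helper names are hypothetical and predicate words are assumed to carry tags ending in `-V`; this is an illustration, not the code that was merged.

```python
from typing import List, Set


def predicate_indices(tags: List[str]) -> Set[int]:
    # Assumption: predicate words carry tags ending in "-V".
    return {ind for ind, tag in enumerate(tags) if tag.endswith("-V")}


def predicates_overlap(tags1: List[str], tags2: List[str]) -> bool:
    """True if the two predicates share at least one word index."""
    return bool(predicate_indices(tags1) & predicate_indices(tags2))


def predicate_is_contained(tags1: List[str], tags2: List[str]) -> bool:
    """True if every predicate index of tags1 also appears in tags2."""
    return predicate_indices(tags1) <= predicate_indices(tags2)


# "John decided to run"
decided_to = ["B-ARG0", "B-V", "I-V", "O"]
to_run = ["B-ARG0", "O", "B-V", "I-V"]
run = ["O", "O", "O", "B-V"]

print(predicates_overlap(decided_to, to_run))      # True
print(predicate_is_contained(decided_to, to_run))  # False
print(predicate_is_contained(run, to_run))         # True
```

The sketch uses `bool(...)` on the intersection rather than `any(...)`. One detail worth checking in the quoted line: `any()` over a set of integers tests the truthiness of its elements, so an intersection of `{0}` (predicates that share only the first word) would evaluate to `False`.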

joelgrus · 2018-09-14T17:58:06Z

allennlp/predictors/open_information_extraction.py

+def merge_overlapping_predictions(tags1: List[str], tags2: List[str]) -> List[str]:
+    """
+    Merge two predictions into one. Assumes the predicate in tags1 are contained in
+    the predicate of tags2.


is "contained in" correct?

Same, leftover docstring. I fixed that.

Gabi Stanovsky and others added 25 commits September 6, 2018 18:26

adding Open IE

40e1598

adding Open IE

31abdd8

adding Open IE

f98448b

fixing typo

9f1e1be

Using SRL model instead of a duplicated OIE model, adding an `ignore_span_metric` parameter to the SRL model to accommodate Open IE

ba66d12

adding conversion script from open ie extractions to conll format

1cc5d90

minor fixes

6181eb2

style fixes

4022d2a

fixing naming convention

c55587d

fixing naming convention

4b9144f

returning empty dictionary in get_metrics when ignoring span loss

dba7ab2

minor

45a7905

minor

bb34e6a

adding Open IE predictor to docs

bfbea10

fixing comments for docs

7a0a60d

Merge branch 'master' into master

327fc87

Merge branch 'master' of https://github.com/allenai/allennlp

4aaf692

merging overlapping predicates

5990b0c

merging overlapping predicates

0c406b2

Merge branch 'master' of https://github.com/allenai/allennlp

a74c58b

Merge branch 'master' of https://github.com/allenai/allennlp

7fe40d1

Merge branch 'master' into better_mwp

a35f36a

merging overlapping predicates

e3eef72

adding typing

8040091

Merge branch 'master' into better_mwp

af3cc40

gabrielStanovsky assigned joelgrus Sep 12, 2018

joelgrus reviewed Sep 12, 2018

Gabi Stanovsky added 3 commits September 12, 2018 11:57

addressing style comments

46cb29c

fixing multiple predicates bug

c6eeee6

minor

0e5d653

Gabi Stanovsky added 2 commits September 12, 2018 16:16

minor

35f8bb7

fixing bug in merge and sanitizing labels

a5e90a8

joelgrus reviewed Sep 14, 2018

Gabi Stanovsky and others added 2 commits September 14, 2018 13:41

fixing documentation

84e6c75

Merge branch 'master' into better_mwp

47ff2b8

joelgrus approved these changes Sep 17, 2018

Merge branch 'master' into better_mwp

7c55c17

gabrielStanovsky merged commit ae72f79 into allenai:master Sep 17, 2018

gabrielStanovsky deleted the better_mwp branch September 17, 2018 23:07

gabrielStanovsky mentioned this pull request Sep 17, 2018

Changing commit for improved Open IE (multi-word predicates) allenai/allennlp-demo#61

Merged
