Poor triple extractor performance (OpenIE) #17

filip-cermak · 2022-11-07T19:41:23Z

I followed the README and successfully ran the OpenIE16 benchmark. I then modified the OIE_2016.json file to point to my directory, which has a test.txt file containing just one line: "Julia owns two cats and one dog". The output, however, is really poor: the expected triples [Julia, owns, two cats] and [Julia, owns, one dog] have low scores, and there are many other ill-formed triples, some with even higher scores (complete output below).

Is this the normal behavior of the model? Why is the performance so poor? Is there a systematic issue with how I am doing this?

['$input_txt:$ Julia owns two cats and one dog',
 {'deduplicated:': {'Julia [SEP] owns two [SEP] One Dog': [2,
    0.15572701767086983,
    [[0, 5], [24, 31]],
    8,
    0],
   'Julia [SEP] owns two cats [SEP] One Dog': [5,
    0.38142501655966043,
    [[0, 5], [24, 31]],
    21,
    0],
   'Julia [SEP] cats [SEP] One Dog': [2,
    0.09584893216378987,
    [[0, 5], [24, 31]],
    6,
    0],
   'Two Cats [SEP] cats [SEP] One Dog': [2,
    0.09491016250103712,
    [[11, 19], [24, 31]],
    6,
    0],
   'Julia [SEP] owns two cats and [SEP] One Dog': [6,
    0.4070159122347832,
    [[0, 5], [24, 31]],
    26,
    0],
   'Julia [SEP] cats and [SEP] Two Cats': [2,
    0.14055378548800945,
    [[0, 5], [11, 19]],
    9,
    0],
   'Julia [SEP] two cats [SEP] One Dog': [1,
    0.06196947582066059,
    [[0, 5], [24, 31]],
    4,
    0],
   'Julia [SEP] one [SEP] Two Cats': [1,
    0.06188515014946461,
    [[0, 5], [11, 19]],
    4,
    0],
   'Julia [SEP] owns [SEP] Two Cats': [6,
    0.2982198027893901,
    [[0, 5], [11, 19]],
    20,
    0],
   'Julia [SEP] owns [SEP] One Dog': [4,
    0.17877793312072754,
    [[0, 5], [24, 31]],
    12,
    0],
   'Julia [SEP] two [SEP] One Dog': [3,
    0.1447404371574521,
    [[0, 5], [24, 31]],
    10,
    0],
   'Julia [SEP] and one [SEP] Two Cats': [5,
    0.32297115167602897,
    [[0, 5], [11, 19]],
    23,
    0],
   'Two Cats [SEP] owns [SEP] One Dog': [8,
    0.44091942673549056,
    [[11, 19], [24, 31]],
    32,
    0],
   'Julia [SEP] and [SEP] One Dog': [1,
    0.04122000187635422,
    [[0, 5], [24, 31]],
    3,
    0],
   'Julia [SEP] cats and one [SEP] Two Cats': [2,
    0.10967723815701902,
    [[0, 5], [11, 19]],
    8,
    0],
   'Two Cats [SEP] one [SEP] One Dog': [2,
    0.08130411058664322,
    [[11, 19], [24, 31]],
    6,
    0],
   'Two Cats [SEP] owns two [SEP] One Dog': [8,
    0.4609282175078988,
    [[11, 19], [24, 31]],
    36,
    0],
   'Julia [SEP] and [SEP] Two Cats': [3,
    0.14550211280584335,
    [[0, 5], [11, 19]],
    12,
    0],
   'Two Cats [SEP] two [SEP] One Dog': [4,
    0.19086267473176122,
    [[11, 19], [24, 31]],
    16,
    0],
   'Two Cats [SEP] and [SEP] One Dog': [6,
    0.1381131475791335,
    [[11, 19], [24, 31]],
    21,
    0]}}]


atul1234anand · 2022-11-17T16:26:39Z

Hey. How did you go about providing your test file? Was it by adding the file path to the gold parameter in OIE_2016.json? It seems to require some parameters in addition to the sentence.

jesseLiu2000 · 2023-02-18T21:02:33Z

Hi! I think you could try more sentences when you test it, because so few examples can give a biased picture. Then you can use the evaluation metrics mentioned in the paper to check whether the results are really poor. Thanks!

filip-cermak · 2023-03-02T14:56:20Z

So the real problem was that I was looking at the raw triples before the ranking algorithm was applied. Even with the ranked triples, though, some second-guessing is involved: the output is ranked by the contrastive distance, so only the top (or bottom) triples should be taken seriously.
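For anyone who wants to do this kind of filtering on a dump like the one in the first post, here is a minimal sketch. It is not code from the repo. It assumes that each key is a "head [SEP] relation [SEP] tail" string and that the second element of each value is the number to rank on; whether that number is the contrastive distance is not confirmed in this thread. The function name `top_triples` and its parameters are made up for illustration. If a lower value is better for your score, pass `descending=False`.

```python
from typing import Dict, List, Tuple


def top_triples(
    deduplicated: Dict[str, list],
    k: int = 3,
    score_index: int = 1,   # assumption: the score sits at position 1 of each value
    descending: bool = True,  # assumption: higher is better; flip for a distance
) -> List[Tuple[Tuple[str, ...], float]]:
    """Return the k best-ranked entries as ((head, relation, tail), score)."""
    ranked = sorted(
        deduplicated.items(),
        key=lambda item: item[1][score_index],
        reverse=descending,
    )
    result = []
    for key, value in ranked[:k]:
        # Split the key on the [SEP] marker and trim surrounding spaces.
        parts = tuple(part.strip() for part in key.split("[SEP]"))
        result.append((parts, value[score_index]))
    return result


# Three entries copied from the output in the first post.
example = {
    "Julia [SEP] owns [SEP] Two Cats": [6, 0.2982198027893901, [[0, 5], [11, 19]], 20, 0],
    "Julia [SEP] owns [SEP] One Dog": [4, 0.17877793312072754, [[0, 5], [24, 31]], 12, 0],
    "Julia [SEP] and [SEP] One Dog": [1, 0.04122000187635422, [[0, 5], [24, 31]], 3, 0],
}

for triple, score in top_triples(example, k=2):
    print(triple, round(score, 3))
```

On these three entries, the two kept triples are the (Julia, owns, Two Cats) and (Julia, owns, One Dog) ones, and the (Julia, and, One Dog) one is dropped.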

zhanwenchen · 2023-08-11T23:35:24Z

@filip-cermak But how do you do that? Do you need to run your own data through some sort of dictionary?

filip-cermak mentioned this issue Nov 7, 2022

Running inference on sentences #14

Closed

jesseLiu2000 closed this as completed Feb 18, 2023

zhanwenchen mentioned this issue Aug 14, 2023

ValueError When Running OIE on New Sentence #18

Open
