Demo results does not match when the model is loaded locally #3418

snijesh · 2019-11-02T05:42:12Z

I have used the SRL model (and other models as well). The output shown in the demo seems more accurate than the results I get locally. What is the difference between the models loaded for the demo and the ones loaded locally? Or are there extra files I need to add to get better predictions?

schmmd · 2019-11-05T17:57:17Z

While the demo models are hosted on Google Cloud for performance reasons (it's faster and cheaper to download from GCS for the running demo), they are identical to the ones on S3:

~ curl https://s3-us-west-2.amazonaws.com/allennlp/models/bert-base-srl-2019.06.17.tar.gz > a
~ curl https://storage.googleapis.com/allennlp-public-models/bert-base-srl-2019.06.17.tar.gz > b
~ diff a b

I don't know why you are seeing different performance. Can you give specific examples of the differences you're seeing?
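For large archives, comparing digests can be more convenient than diff. Here is a minimal sketch, assuming the two downloads were saved as `a` and `b` as in the commands above; the helper names `sha256_of` and `same_content` are made up for illustration.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True when both files have byte-identical content (equal digests)."""
    return sha256_of(path_a) == sha256_of(path_b)
```

Calling `same_content("a", "b")` after the two curl commands should return True if the archives really are the same file.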

matt-gardner · 2019-11-08T03:03:50Z

See here: https://github.com/allenai/allennlp/blob/9a6962f00d2b0d30b81900b4e9764ddc3433f400/tutorials/how_to/elmo.md#notes-on-statefulness-and-non-determinism. There are several other issues in the repo with more discussion on this; you can probably find them by searching for links to the note I linked above.

schmmd · 2019-11-08T04:39:53Z

@matt-gardner does this model use ELMo? I gathered from the name that it didn't.

matt-gardner · 2019-11-08T19:37:50Z

Ah, sorry, you're right. Though it's not clear which models @snijesh was using in each case. @snijesh, if you still have questions, feel free to post again. I'll leave this closed until we hear from you, though.

HaritzPuerto · 2020-02-12T05:09:13Z

Hello, I have just encountered this problem. Given the sentence:

In 2011 the circulation of the magazine was 1,310,696 copies.

While the demo returns this beautiful result:
was: [ARGM-TMP: In 2011] [ARG1: the circulation of the magazine] [V: was] [ARG2: 1,310,696 copies] .

The Python API returns:
{'verbs': [], 'words': ['In', '2011', 'the', 'circulation', 'of', 'the', 'magazine', 'was', '1,310,696', 'copies', '.']}

I see the same mismatch with other sentences. I am loading the model with Predictor.from_path("https://s3-us-west-2.amazonaws.com/allennlp/models/bert-base-srl-2019.06.17.tar.gz"). Since this does not look like an ELMo model, I do not know what is causing the mismatch.

Thank you for your work! allennlp is really useful :) @matt-gardner

matt-gardner · 2020-02-12T15:11:21Z

This is almost certainly due to a mismatch in spacy models. We use spacy to detect verbs, and different versions of the spacy models detect verbs differently, especially for words like "was". In the demo, which uses an older version of spacy, "was" gets detected as a verb, so a prediction is made for it. In newer versions of spacy, I believe "was" in this sentence gets tagged as AUX, so no prediction is made.
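To make the effect concrete, here is a minimal sketch of predicate selection by coarse POS tag. It is not the predictor's actual code: the function name `find_predicates` is hypothetical, and the tag lists are hand-written stand-ins for what older and newer spacy models are described as producing, not real tagger output.

```python
# Hand-written (word, coarse POS tag) pairs for illustration only. These are
# assumptions standing in for tagger output, not results produced by spacy.
OLDER_STYLE_TAGS = [
    ("In", "ADP"), ("2011", "NUM"), ("the", "DET"), ("circulation", "NOUN"),
    ("of", "ADP"), ("the", "DET"), ("magazine", "NOUN"), ("was", "VERB"),
    ("1,310,696", "NUM"), ("copies", "NOUN"), (".", "PUNCT"),
]

# Same sentence, but with "was" tagged AUX, as described for newer models.
NEWER_STYLE_TAGS = [
    (word, "AUX" if word == "was" else tag) for word, tag in OLDER_STYLE_TAGS
]


def find_predicates(tagged_tokens, predicate_tags=("VERB",)):
    """Return the indices of tokens whose tag is treated as a predicate."""
    return [
        index
        for index, (_, tag) in enumerate(tagged_tokens)
        if tag in predicate_tags
    ]


print(find_predicates(OLDER_STYLE_TAGS))                   # "was" is selected
print(find_predicates(NEWER_STYLE_TAGS))                   # nothing is selected
print(find_predicates(NEWER_STYLE_TAGS, ("VERB", "AUX")))  # "was" is selected again
```

With only VERB accepted, the newer-style tags yield no predicates, which matches the empty 'verbs' list reported above. Accepting AUX as well is one possible way to recover "was", shown here purely as an illustration.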

HaritzPuerto · 2020-02-13T04:53:48Z

You are right. I have just downgraded spacy to 2.1.4, and now the behaviour is the same as in the demo. Thank you.
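For anyone checking their own environment, here is a small sketch that reports the installed version of a package using only the standard library (Python 3.8+). The helper names are made up for illustration. The pin 2.1.4 is the version mentioned in this thread; the spacy model package has its own version, which may also matter.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(package_name, expected_version):
    """True only when the package is installed at exactly the expected version."""
    return installed_version(package_name) == expected_version


# Report the spacy version (prints None if spacy is not installed).
print("spacy:", installed_version("spacy"))
print("matches 2.1.4:", matches_pin("spacy", "2.1.4"))
```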

Parth27 · 2020-04-09T15:11:05Z

Hello,
I am facing a similar issue with the semantic role label predictor.
For a sentence like 'Please take a few minutes to review our 2001 goals on Enrons intranet' part of my output is this:
{'verbs': [{'verb': 'take', 'description': '[ARGM-DIS: Please] [V: take] [ARG1: a few minutes] [ARGM-PRP: to review our 2001 goals on Enrons intranet]'

As you can see, it classifies 'to review' as Purpose. But on the demo, it correctly says that this is not a Purpose.
This is the output of demo:
take: [ARGM-DIS: Please] [V: take] [ARG1: a few minutes] [ARG0: to review our 2001 goals on Enrons intranet]

I have tried both newer and older spacy versions, specifically 2.1.4, 2.1.9, and 2.2.4.
@matt-gardner, please get back to me as soon as possible; I really need this to work.

matt-gardner closed this as completed Nov 8, 2019

cooelf mentioned this issue Jun 27, 2020

Allennlp预测SRL结果不一致 (AllenNLP SRL prediction results are inconsistent) cooelf/SemBERT#12

Closed

matt-gardner mentioned this issue Jun 29, 2020

SRL predictor misses Auxiliary verb #4388

Closed
