Online version finds the right antecedents, but actual version does not #16

kleinias · 2018-02-19T17:12:03Z

The online version is working fine on the following text:
"I know that Barbara and Sandy are here. I see Barbara watching TV. I hear Sandy breathing."

https://huggingface.co/coref/?text=I%20know%20that%20Barbara%20and%20Sandy%20are%20here.%20I%20see%20Barbara%20watching%20TV.%20I%20hear%20Sandy%20breathing.

But the actual version doesn't find enough, just Barbara. Here is what I ran, followed by the output:

from neuralcoref import Coref

coref = Coref()
clusters = coref.one_shot_coref(utterances=u"I know that Barbara and Sandy are here. I see Barbara watching TV. I hear Sandy breathing.")
print(clusters)
print(coref.get_most_representative())
mentions = coref.get_mentions()
print(mentions)

Loading spacy model

Info about model en_core_web_sm

lang               en             
pipeline           ['tagger', 'parser', 'ner']
accuracy           {'token_acc': 99.8698372794, 'ents_p': 84.9664503965, 'ents_r': 85.6312524451, 'uas': 91.7237657538, 'tags_acc': 97.0403350292, 'ents_f': 85.2975560875, 'las': 89.800872413}
name               core_web_sm    
license            CC BY-SA 3.0   
author             Explosion AI   
url                https://explosion.ai
vectors            {'keys': 0, 'width': 0, 'vectors': 0}
sources            ['OntoNotes 5', 'Common Crawl']
version            2.0.0          
spacy_version      >=2.0.0a18     
parent_package     spacy          
speed              {'gpu': None, 'nwords': 291344, 'cpu': 5122.3040471407}
email              contact@explosion.ai
description        English multi-task CNN trained on OntoNotes, with GloVe vectors trained on Common Crawl. Assigns word vectors, context-specific token vectors, POS tags, dependency parse and named entities.
source             /usr/local/lib/python3.6/dist-packages/en_core_web_sm

loading model from /usr/local/lib/python3.6/dist-packages/neuralcoref/weights/
{3: [3, 0]}
{}
[Barbara, Barbara and Sandy, Sandy, Barbara, TV, Sandy, Sandy breathing]

bea-alex · 2018-03-15T12:30:15Z

I'd be interested in resolving this issue too. I got the same result as the previous commenter and here are the underlying scores:

{u'pair_scores': {0: {}, 1: {0: -1.8137825597308108}, 2: {0: -1.738390801732288, 1: -1.6511597972712726}, 3: {0: 6.5473994047601911, 1: -0.57869067045464151, 2: -1.6598056098030169}, 4: {0: -1.8103805461400377, 1: -1.5256500224140488, 2: -1.5399936662599227, 3: -1.6966305608918302}, 5: {0: -2.2999057893775179, 1: -1.7149788666508408, 2: 0.68513795195160965, 3: -2.0966374729906301, 4: -1.8540071211764726}, 6: {0: -1.9504528157206593, 1: -1.8210641784028945, 2: -1.8300293314203429, 3: -1.8767248759882404, 4: -1.6296482123249305, 5: -1.9901970079817037}}, u'single_scores': {0: None, 1: 1.5870258214636452, 2: 1.6899656734067761, 3: 1.5896249109319895, 4: 1.8004470287030618, 5: 1.5748515581318938, 6: 1.6261232857954271}}
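
One way to read these scores, as a minimal sketch. It assumes the indices follow the order of the mentions list printed in the first post, and that a mention is linked to its highest-scoring antecedent only when that pair score beats the mention's own single score. Both are assumptions about how the scores are used, not something confirmed in this thread. The values are rounded copies of the numbers above, and `best_antecedent` is a hypothetical helper name.

```python
# Mentions in the order printed by coref.get_mentions() in the first post.
mentions = ["Barbara", "Barbara and Sandy", "Sandy", "Barbara",
            "TV", "Sandy", "Sandy breathing"]

# Rounded copies of the scores for the two repeated names (mentions 3 and 5).
pair_scores = {
    3: {0: 6.547, 1: -0.579, 2: -1.660},
    5: {0: -2.300, 1: -1.715, 2: 0.685, 3: -2.097, 4: -1.854},
}
single_scores = {3: 1.590, 5: 1.575}


def best_antecedent(mention_idx):
    """Return the best antecedent index, or None if the single score wins.

    Assumed decision rule: link only if the top pair score is higher than
    the mention's single (no-antecedent) score.
    """
    candidates = pair_scores[mention_idx]
    best = max(candidates, key=candidates.get)
    if candidates[best] > single_scores[mention_idx]:
        return best
    return None


for idx in (3, 5):
    antecedent = best_antecedent(idx)
    if antecedent is None:
        print(f"mention {idx} ({mentions[idx]}): no antecedent")
    else:
        print(f"mention {idx} ({mentions[idx]}): linked to mention "
              f"{antecedent} ({mentions[antecedent]})")
```

Under those assumptions the second "Barbara" links to the first (6.547 > 1.590), while the second "Sandy" stays unlinked because its best pair score (0.685, with the first "Sandy") is below its single score (1.575).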

thomwolf · 2018-05-15T09:03:37Z

Hi @bea-alex, @kleinias, I tried the new version and the results still differ between our production setup (online demo) and the open-sourced version.
My guess is that this is related to a difference in the spaCy model. In our production setup we selected a large spaCy 1 model with a higher parsing accuracy.
I will investigate further and keep you updated.

thomwolf · 2018-05-16T08:41:41Z

OK, more investigation indicates it is indeed an issue with the accuracy of the spaCy model you use.

Parsing the sentence "I hear Sandy breathing." with the default spaCy 2 en_core_web_sm model incorrectly labels Sandy as an ADJ.

The simplest solution is to use the larger model en_core_web_lg, which labels this example correctly.
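
To check the tagging difference on your own install, a small sketch along these lines may help. It assumes both en_core_web_sm and en_core_web_lg have been downloaded; `pos_tags` and `compare_models` are hypothetical helper names, and the tags you get may differ between model versions.

```python
def pos_tags(nlp, text):
    """Return (token text, coarse POS tag) pairs for `text`.

    `nlp` is any loaded spaCy pipeline (or any callable returning tokens
    that expose `.text` and `.pos_`).
    """
    return [(token.text, token.pos_) for token in nlp(text)]


def compare_models(text, model_names=("en_core_web_sm", "en_core_web_lg")):
    """Tag `text` with each named spaCy model and return the tags per model."""
    import spacy  # imported lazily; only needed when models are actually loaded

    return {name: pos_tags(spacy.load(name), text) for name in model_names}
```

Calling `compare_models("I hear Sandy breathing.")` returns one tag list per model, so the tag assigned to "Sandy" can be compared side by side.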

Currently the spaCy model is hard-coded to en_core_web_sm. You can work around this by passing an nlp object to the coref constructor. In the next version, which I will release in a few days, I will make it easier to use any spaCy model.
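
A minimal sketch of that workaround. It assumes the class is exposed as `neuralcoref.Coref` and that its constructor accepts the pipeline through a keyword named `nlp`; the keyword name is an assumption, so check the signature in your installed version. `build_coref` is a hypothetical helper name.

```python
def build_coref(model_name="en_core_web_lg"):
    """Build a coref object backed by the given spaCy model instead of the default."""
    import spacy
    from neuralcoref import Coref

    nlp = spacy.load(model_name)  # the model must already be downloaded
    # Assumption: the constructor takes the pipeline as the `nlp` keyword.
    return Coref(nlp=nlp)
```

With this in place, `coref = build_coref()` followed by the same `one_shot_coref` call as in the first post would run on the larger model.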

thomwolf · 2018-06-19T12:30:19Z

We are now on release v3.0 so I am closing this old issue.
Feel free to open it again (or a new one) if you are experiencing some issues with the new release.

boehm-e · 2018-10-04T11:09:42Z

Hi,

I am using the en_core_web_lg model but am still having the issue with this text:

Once upon a time a lion lived in a forest. One day after a heavy meal it was sleeping. After a while, a mouse came and it started to play. Suddenly the lion got up with anger and looked for those who disturbed it's nice sleep. Then it saw a small mouse standing trembling with fear. The lion jumped on it. The mouse begged the lion to forgive it. The lion felt pity and left. The mouse ran away. On another day, the lion was caught in a net.

"One day after a heavy meal IT was sleeping" is replaced by
"One day after a heavy meal ONE DAY AFTER A HEAVY MEAL was sleeping"

and it is replaced correctly in the web version.

Do you have any idea on how to solve this issue?

Thank you very much for open-sourcing this project!

https://huggingface.co/coref/my_story

thomwolf closed this as completed May 16, 2018

thomwolf reopened this May 16, 2018

thomwolf mentioned this issue May 16, 2018

result not matched as demo #30

Closed

thomwolf closed this as completed Jun 19, 2018

theSage21 mentioned this issue Jul 2, 2018

No clusters being formed. But rest api shows it #69

Closed
