Inconsistent parse result for sentences with double quotes between models #1716

Closed
mehmetilker opened this issue Dec 12, 2017 · 2 comments
Labels
lang / en English language data and models perf / accuracy Performance: accuracy

Comments

mehmetilker commented Dec 12, 2017

With the large model (en_core_web_lg), the parse around the opening quote comes out as "prep", but with the small model it comes out as "punct" (which I think is right).

Example sentence: "Throughout our first week of basic training..."

en_core_web_sm
" punct Throughout
Throughout ROOT Throughout
....

en_core_web_lg
" punct .
Throughout -- prep -- "

I have just found another case, in a sentence that contains a number, where the small and large models give different parses.
In the sentence "You know, I can get that for $1 across the street.":
With the small model I get the "get - prep - across" dependency.
But with the large model I get "1 - prep - across".

Another one: One of our own's on the inside.
small model > One - prep - on
Large model > on - prep - 's

In all of these cases the small model's parse looks better. If it were the other way around I would not be surprised, but this seems wrong, unless I have misunderstood something.
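
For anyone who wants to reproduce the comparison, here is a minimal sketch that parses one sentence with two models and lists the tokens whose dependency label or head differ. The helper names (`parse_arcs`, `diff_arcs`, `compare`) are made up for illustration and are not part of spaCy. It assumes both models are installed and that they tokenize the sentence identically, so arcs can be compared position by position.

```python
from typing import List, Tuple

# One arc per token: (token text, dependency label, head text)
Arc = Tuple[str, str, str]


def parse_arcs(model_name: str, text: str) -> List[Arc]:
    """Parse `text` with the named spaCy model and return one arc per token."""
    # Imported lazily so diff_arcs() can be used without spaCy installed.
    import spacy

    nlp = spacy.load(model_name)
    return [(tok.text, tok.dep_, tok.head.text) for tok in nlp(text)]


def diff_arcs(a: List[Arc], b: List[Arc]) -> List[Tuple[int, Arc, Arc]]:
    """Return (index, arc_a, arc_b) for each position where the parses disagree.

    Assumption: both parses come from the same tokenization, so index i
    refers to the same token in `a` and `b`.
    """
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


def compare(text: str, model_a: str, model_b: str) -> None:
    """Print every token whose arc differs between the two models."""
    arcs_a = parse_arcs(model_a, text)
    arcs_b = parse_arcs(model_b, text)
    for i, x, y in diff_arcs(arcs_a, arcs_b):
        print(i, model_a, x, "|", model_b, y)
```

Calling `compare('"Throughout our first week of basic training..."', "en_core_web_sm", "en_core_web_lg")` should print the disagreeing tokens; the exact output depends on the installed model versions.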

Your Environment

  • spaCy version: 2.0.5
  • Platform: Windows-10-10.0.16299-SP0
  • Python version: 3.6.3
  • Models: en, en_core_web_lg
@ines ines added the lang / en English language data and models label Mar 27, 2018
@ines ines added perf / accuracy Performance: accuracy and removed performance labels Aug 15, 2018
ines (Member) commented Dec 14, 2018

Merging this with #3052. We've now added a master thread for incorrect predictions and related reports – see the issue for more details.

@ines ines closed this as completed Dec 14, 2018
lock bot commented Jan 13, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Jan 13, 2019