Potential bugs with the NER model #1713

Closed
Zhenshan-Jin opened this issue Dec 11, 2017 · 2 comments
Labels
feat / ner Feature: Named Entity Recognizer lang / en English language data and models perf / accuracy Performance: accuracy

Zhenshan-Jin commented Dec 11, 2017

  1. Problem with dependency parsing around parentheses:
    The parsing result differs depending on how the input is formatted:
Kentucky Fried Chicken (KFC) is one of the most famous fried chicken restaurant.

[image: parsing result for the sentence above]

Kentucky Fried Chicken(KFC) is one of the most famous fried chicken restaurant.

[image: parsing result for the sentence above]
We can see that removing the space between "Kentucky Fried Chicken" and "(KFC)" changes the parsing result, which, in my opinion, is not right: the text seems to be split like "Kentucky Fried Chicken(" and "KFC)". I guess it's probably a problem with the training data, but I'm not sure.
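
For anyone who wants to reproduce this without the screenshots, here is a minimal sketch of how the two parses could be printed for comparison. The helper name `parse_rows` and the constants are made up for illustration; the helper only relies on the `text`, `dep_` and `head` attributes of spaCy tokens, and `main()` assumes the `en_core_web_md` model is installed. No output is shown because it depends on the model version.

```python
WITH_SPACE = "Kentucky Fried Chicken (KFC) is one of the most famous fried chicken restaurant."
WITHOUT_SPACE = "Kentucky Fried Chicken(KFC) is one of the most famous fried chicken restaurant."


def parse_rows(nlp, text):
    """Return (token text, dependency label, head text) for every token.

    `nlp` is any callable that turns a string into an iterable of tokens
    exposing `text`, `dep_` and `head`, such as a loaded spaCy pipeline.
    """
    doc = nlp(text)
    return [(tok.text, tok.dep_, tok.head.text) for tok in doc]


def main():
    # Imported here so parse_rows() can be used with any pipeline object.
    import spacy

    # Assumption: the en_core_web_md model is installed locally.
    nlp = spacy.load("en_core_web_md")
    for text in (WITH_SPACE, WITHOUT_SPACE):
        print(text)
        for row in parse_rows(nlp, text):
            print("   ", row)

# Call main() to print both parses one after the other.
```

Comparing the first column of the two printouts should show whether the difference already starts at tokenization or only in the dependency labels.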

  2. Problem with the stability of the self-trained NER model:
    I can't really show the model results here, sorry about that. The problem is that the model is unstable: adding some irrelevant words to the beginning or the end of a paragraph completely changes the entities recognized by the self-trained model. Also, the entities recognized from a whole paragraph differ from those recognized sentence by sentence, which should be the case based on the transition model described by Honnibal. I'm wondering whether this issue has come up before.
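
Since I can't share the model, here is a rough sketch of the kind of check I mean, so others can try it on their own pipelines. All function names are hypothetical; the code only assumes that `nlp(text)` returns an object with an `ents` attribute whose items have `text` and `label_`, as spaCy `Doc` objects do.

```python
def entity_set(nlp, text):
    """Collect (entity text, label) pairs predicted for `text`."""
    doc = nlp(text)
    return {(ent.text, ent.label_) for ent in doc.ents}


def paragraph_vs_sentences(nlp, sentences):
    """Compare entities from the joined paragraph with per-sentence entities.

    Returns two sets: entities found only when the whole paragraph is
    processed at once, and entities found only sentence by sentence.
    """
    paragraph = " ".join(sentences)
    from_paragraph = entity_set(nlp, paragraph)
    from_sentences = set()
    for sentence in sentences:
        from_sentences |= entity_set(nlp, sentence)
    return from_paragraph - from_sentences, from_sentences - from_paragraph


def padding_changes(nlp, text, prefix="", suffix=""):
    """Return entities that appear or disappear when padding is added.

    Entities found inside the padding itself also show up in the result.
    """
    plain = entity_set(nlp, text)
    padded = entity_set(nlp, prefix + text + suffix)
    return plain ^ padded
```

With a real pipeline, `nlp` would be the object returned by `spacy.load(...)`. Empty results from both helpers would mean the predictions are stable for that input.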

Thanks!

Info about spaCy
spaCy version: 2.0.3
Python version: 3.6
pre-trained Models: en_core_web_md

@Zhenshan-Jin Zhenshan-Jin changed the title Potential bug with the NER model Potential bugs with the NER model Dec 11, 2017
@ines ines added lang / en English language data and models feat / ner Feature: Named Entity Recognizer labels Apr 29, 2018
@ines ines added perf / accuracy Performance: accuracy and removed performance labels Aug 15, 2018
ines (Member) commented Dec 14, 2018

Merging this with #3052. We've now added a master thread for incorrect predictions and related reports – see the issue for more details.

@ines ines closed this as completed Dec 14, 2018
lock bot commented Jan 13, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Jan 13, 2019