noun chunks are not consistently extracted #1818

Closed
pengyu opened this issue Jan 9, 2018 · 2 comments
Labels: feat / parser (Feature: Dependency Parser), feat / tagger (Feature: Part-of-speech tagger), lang / en (English language data and models), perf / accuracy (Performance: accuracy)

Comments


pengyu commented Jan 9, 2018

The following example shows that "Oxtr(-/-) mice" is sometimes extracted as a noun chunk and sometimes not. How can I make the results consistent? Thanks.

$ cat main2.py 
#!/usr/bin/env python
# vim: set noexpandtab tabstop=2 shiftwidth=2 softtabstop=-1 fileencoding=utf-8:

import spacy
nlp = spacy.load('en', disable=['tokenizer', 'ner', 'textcat'])
# 'tagger' and 'parser' cannot be disabled: noun_chunks needs both.

doc = nlp(u'Male Oxtr(-/-) mice failed to maintain their body temperatures during exposure to a cold environment.')
print([x for x in doc.noun_chunks])
doc = nlp(u'Oxtr(-/-) mice also showed decreased neuronal activation in the thermoregulatory hypothalamic region during cold exposure.')
print([x for x in doc.noun_chunks])
$ ./main2.py 
[Male Oxtr(-/-) mice, their body temperatures, exposure, a cold environment]
[mice, decreased neuronal activation, the thermoregulatory hypothalamic region, cold exposure]
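
A minimal sketch for investigating the difference between the two results. It lists each token's coarse part-of-speech tag, dependency label, and head, using the `pos_`, `dep_`, and `head` token attributes. The helper names `parse_rows` and `print_parse` are hypothetical and not part of spaCy.

```python
def parse_rows(doc):
    """Return (text, pos, dep, head text) for every token in a parsed doc."""
    return [(tok.text, tok.pos_, tok.dep_, tok.head.text) for tok in doc]


def print_parse(doc):
    """Print one aligned row per token so unexpected tags or arcs stand out."""
    for text, pos, dep, head in parse_rows(doc):
        print("{:<14} {:<6} {:<10} {}".format(text, pos, dep, head))


# Usage with spaCy (not executed here; needs the English model from the script above):
#   import spacy
#   nlp = spacy.load('en')
#   print_parse(nlp(u'Oxtr(-/-) mice also showed decreased neuronal activation.'))
```

Comparing the rows for "Oxtr(-/-)" and "mice" in both sentences should show whether the tokenization, the tag, or the attachment differs between them.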
ines added the lang / en, feat / tagger, feat / parser, and perf / accuracy labels and removed the performance label on Aug 15, 2018
Member

ines commented Dec 14, 2018

The noun chunks depend on the part-of-speech tags and dependency parse, so this issue likely comes down to incorrect predictions made by the tagger or parser.
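
To illustrate that dependency, here is a simplified pure-Python sketch of a noun-chunk rule: a token with a noun-like tag and a noun-phrase dependency label yields a chunk running from the leftmost token of its subtree to the token itself. This is an approximation under stated assumptions, not spaCy's actual implementation (it ignores conjunctions and overlapping chunks), and the `Token` tuples, label sets, and both annotations below are invented for illustration rather than taken from real model output.

```python
from collections import namedtuple

# Hypothetical stand-in for a parsed token; head is the index of the parent,
# and the root token points at itself.
Token = namedtuple("Token", "text pos dep head")

NOUN_POS = {"NOUN", "PROPN", "PRON"}
NP_DEPS = {"nsubj", "nsubjpass", "dobj", "pobj", "attr", "appos", "ROOT"}


def left_edge(tokens, i):
    """Index of the leftmost token in the subtree rooted at token i."""
    for j in range(i):
        k = j
        # Walk up from j towards the root; stop early if we pass through i.
        while k != i and tokens[k].head != k:
            k = tokens[k].head
        if k == i:
            return j
    return i


def noun_chunks(tokens):
    """Collect chunks headed by noun-like tokens with a noun-phrase label."""
    chunks = []
    for i, tok in enumerate(tokens):
        if tok.pos in NOUN_POS and tok.dep in NP_DEPS:
            start = left_edge(tokens, i)
            chunks.append(" ".join(t.text for t in tokens[start:i + 1]))
    return chunks


# Invented annotation A: "Oxtr(-/-)" is a compound modifier of "mice".
parse_a = [
    Token("Oxtr(-/-)", "PROPN", "compound", 1),
    Token("mice", "NOUN", "nsubj", 2),
    Token("showed", "VERB", "ROOT", 2),
    Token("activation", "NOUN", "dobj", 2),
]

# Invented annotation B: "Oxtr(-/-)" is tagged X and attached to the verb.
parse_b = [
    Token("Oxtr(-/-)", "X", "dep", 2),
    Token("mice", "NOUN", "nsubj", 2),
    Token("showed", "VERB", "ROOT", 2),
    Token("activation", "NOUN", "dobj", 2),
]

print(noun_chunks(parse_a))
print(noun_chunks(parse_b))
```

Under annotation A the first chunk includes "Oxtr(-/-)"; under annotation B it shrinks to "mice", even though the text is identical. A single different tag or arc is enough to change the extracted chunk.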

I'm merging this with #3052. We've now added a master thread for incorrect predictions and related reports – see the issue for more details.

ines closed this as completed on Dec 14, 2018

lock bot commented Jan 13, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked this as resolved and limited conversation to collaborators on Jan 13, 2019