noun chunks are missed when there are () #1840

Closed
pengyu opened this issue Jan 15, 2018 · 2 comments
Labels: lang / en (English language data and models), perf / accuracy (Performance: accuracy)

Comments

pengyu commented Jan 15, 2018

The following example shows that the noun phrase "phospholipase C (PLC) δ1" is not extracted as a single noun chunk: it is split into "phospholipase C (PLC" and "δ1". This usually happens when the text contains parentheses. Can this bug be fixed systematically?

$ cat main2.py 
#!/usr/bin/env python
# vim: set noexpandtab tabstop=2 shiftwidth=2 softtabstop=-1 fileencoding=utf-8:

import spacy
# 'tokenizer' is not a pipeline component, so it is not listed under disable.
nlp = spacy.load('en', disable=['ner', 'textcat'])
# doc.noun_chunks needs the 'tagger' and 'parser', so those stay enabled.

doc = nlp(u'We previously revealed that the expression of phospholipase C (PLC) δ1, one of the most basal PLCs, is down-regulated in colon adenocarcinoma, and that the KRAS signaling pathway suppresses PLCδ1 expression.')
print(list(doc.noun_chunks))
$ ./main2.py 
[We, the expression, phospholipase C (PLC, δ1, the most basal PLCs, colon adenocarcinoma, the KRAS, PLCδ1 expression]
ines added the performance and lang / en (English language data and models) labels Jan 15, 2018
ines added the perf / accuracy (Performance: accuracy) label and removed the performance label Aug 15, 2018
ines (Member) commented Dec 14, 2018

The noun chunks depend on the part-of-speech tags and dependency parse, so this issue likely comes down to incorrect predictions made by the tagger or parser.
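
To illustrate why the parse matters, here is a toy chunker in plain Python. It is only a sketch, not spaCy's implementation: it assumes a chunk runs from the left edge of a noun's subtree to the noun itself, and the `Tok` type, the label sets, and both sets of annotations are hand-written and hypothetical rather than real model output.

```python
from typing import List, NamedTuple


class Tok(NamedTuple):
    text: str
    pos: str   # coarse part-of-speech tag
    dep: str   # dependency label
    head: int  # index of the syntactic head (the root points to itself)


# Simplified, assumed label sets; the real rules are more detailed.
NP_POS = {"NOUN", "PROPN", "PRON"}
NP_DEPS = {"nsubj", "dobj", "pobj", "appos", "attr", "ROOT"}


def left_edge(tokens: List[Tok], i: int) -> int:
    """Smallest index among token i and all of its descendants."""
    edge = i
    for j in range(len(tokens)):
        k = j
        # Walk up the tree from j until we hit token i or the root.
        while k != i and tokens[k].head != k:
            k = tokens[k].head
        if k == i:
            edge = min(edge, j)
    return edge


def noun_chunks(tokens: List[Tok]) -> List[str]:
    chunks, prev_end = [], -1
    for i, tok in enumerate(tokens):
        if tok.pos in NP_POS and tok.dep in NP_DEPS:
            start = left_edge(tokens, i)
            if start <= prev_end:  # skip chunks that overlap an earlier one
                continue
            chunks.append(" ".join(t.text for t in tokens[start:i + 1]))
            prev_end = i
    return chunks


def parse(rows):
    return [Tok(*row) for row in rows]


# Hypothetical parse in which "δ1" heads the whole phrase.
GOOD_PARSE = parse([
    ("the", "DET", "det", 1), ("expression", "NOUN", "ROOT", 1),
    ("of", "ADP", "prep", 1), ("phospholipase", "NOUN", "compound", 8),
    ("C", "PROPN", "compound", 8), ("(", "PUNCT", "punct", 6),
    ("PLC", "PROPN", "nmod", 8), (")", "PUNCT", "punct", 6),
    ("δ1", "NOUN", "pobj", 2),
])

# Hypothetical parse in which "PLC" is taken as the head instead.
BAD_PARSE = parse([
    ("the", "DET", "det", 1), ("expression", "NOUN", "ROOT", 1),
    ("of", "ADP", "prep", 1), ("phospholipase", "NOUN", "compound", 6),
    ("C", "PROPN", "compound", 6), ("(", "PUNCT", "punct", 6),
    ("PLC", "PROPN", "pobj", 2), (")", "PUNCT", "punct", 6),
    ("δ1", "NOUN", "appos", 6),
])

print(noun_chunks(GOOD_PARSE))  # ['the expression', 'phospholipase C ( PLC ) δ1']
print(noun_chunks(BAD_PARSE))   # ['the expression', 'phospholipase C ( PLC', 'δ1']
```

With the second set of annotations the chunk ends at "PLC", which mirrors the split reported above; the chunking rule is the same in both runs, only the predicted heads and labels differ.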

I'm merging this with #3052. We've now added a master thread for incorrect predictions and related reports – see the issue for more details.
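
Until the predictions improve, one possible workaround (not something proposed in this thread, just a sketch) is to strip short parenthesized abbreviations from the text before parsing it. The helper name `strip_parenthetical_abbreviations` and the regular expression are assumptions made for illustration; note that this changes the text, so character offsets no longer match the original string.

```python
import re

# Optional whitespace followed by a short parenthesized token such as " (PLC)".
# The 2-10 character limit is an arbitrary, assumed cutoff for "abbreviation".
PAREN_ABBREV = re.compile(r"\s*\(([A-Za-z0-9-]{2,10})\)")


def strip_parenthetical_abbreviations(text):
    """Return the text without parenthesized abbreviations, plus the abbreviations found."""
    abbreviations = PAREN_ABBREV.findall(text)
    cleaned = PAREN_ABBREV.sub("", text)
    return cleaned, abbreviations


text = "the expression of phospholipase C (PLC) δ1, one of the most basal PLCs"
cleaned, abbrevs = strip_parenthetical_abbreviations(text)
print(cleaned)  # the expression of phospholipase C δ1, one of the most basal PLCs
print(abbrevs)  # ['PLC']
```

The cleaned string can then be passed to `nlp(...)` as usual. Whether the resulting parse is actually better still depends on the tagger and parser, so this is worth checking on a few sentences first.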

ines closed this as completed Dec 14, 2018
lock bot commented Jan 13, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked as resolved and limited conversation to collaborators Jan 13, 2019