Wrongly picking negations in negspacy? #14

Raghu17s · 2020-04-30T09:17:32Z

Hi,

I am trying to find negations in a sentence using negspacy. However, the first negated entity ("no headache") is printed as False when it should be True, while the second one ("fever") is picked up correctly. Should I tune any parameters to get the first negation right?

Here is my code.

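import spacy
from negspacy.negation import Negex

nlp = spacy.load("en_core_sci_md")
negex = Negex(nlp, language = "en_clinical")
nlp.add_pipe(negex, last=True)  # assumed spaCy 2.x-style registration; not in the original snippet
doc = nlp('I am having Hypertension with no headache and fever')
for ent in doc.ents:
    print(ent.text, ent._.negex)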

Output:

Hypertension False
no headache False
fever True

jenojp · 2020-04-30T12:06:07Z

Hi @Raghu17s - I've run across this issue too, especially with scispacy as the language model. The NER is making the entity span "no headache" instead of just "headache". Check out the example here: https://github.com/jenojp/negspacy#negations-in-noun-chunks

Basically just modify this line and it should work as you expect.

negex = Negex(nlp, language = "en_clinical", chunk_prefix=["no"])

Raghu17s · 2020-04-30T12:51:51Z

Thank you.
Should I then list every negation word (not, but, etc.) in chunk_prefix, or is "no" the only one causing problems?

I have also noticed something strange: if I add a comma between "no" and "headache" ("no, headache" instead of "no headache"), the entity is picked up as negated. But I can't do that manually every time.

jenojp · 2020-04-30T14:21:41Z

I see. That makes sense: because of the comma, the scispaCy NER model no longer treats "no headache" as a single entity.

Yes, you should add the other words you're having this problem with, based on what the NER produces. Be careful not to be too greedy, though: in the biomedical domain, many entities can legitimately start with a negation-like word (e.g., non hodgkin's lymphoma).
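To make that trade-off concrete, here is a minimal pure-Python sketch of a first-token prefix check. It only models the idea behind chunk_prefix and is not negspacy's code; the function name `starts_with_prefix`, the token lists, and the two prefix lists are hypothetical and made up for illustration.

```python
def starts_with_prefix(entity_tokens, prefixes):
    """Return True if the entity's first token equals one of the prefixes (case-insensitive)."""
    if not entity_tokens:
        return False
    first = entity_tokens[0].lower()
    return any(first == p.lower() for p in prefixes)


# Entities as an NER model might emit them, already split into tokens (made-up examples).
no_headache = ["no", "headache"]
lymphoma = ["non", "hodgkin's", "lymphoma"]

conservative = ["no"]          # only the prefix that actually caused trouble
greedy = ["no", "non", "not"]  # a broader list, assumed for illustration

print(starts_with_prefix(no_headache, conservative))  # True
print(starts_with_prefix(lymphoma, conservative))     # False
print(starts_with_prefix(lymphoma, greedy))           # True: a disease name wrongly flagged as negated
```

The broader list catches more NER quirks, but it also turns a disease name into a negated finding, which is why the prefix list is best grown one observed problem at a time.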

stefano-marchesin · 2020-11-01T21:13:22Z

Hi @jenojp,
what about composite terms instead of single words? For example, I believe 'free of' is not detected by negex because in the code (starting at line 307 of negation.py):

if self.chunk_prefix:
    if any(
        c.text.lower() == doc[e.start].text.lower()
        for c in self.chunk_prefix
    ):
        e._.set(self.extension_name, True)

the check is only on the first word (i.e., doc[e.start]) and therefore 'free of' is not appropriately handled.

jenojp · 2020-11-02T20:00:46Z

@stefano-marchesin, that's a good catch. I hadn't had any composite terms pop up in my use cases but I can definitely see that being an issue. Could you share an example entity that this is happening on? I'm assuming you're using scispacy?

stefano-marchesin · 2020-11-02T20:15:47Z

Hi @jenojp! Yes, I am using scispacy, and within negex the language param is set to "en_clinical". As an example, consider the following: "resection margins free of dysplasia"

In this case, free of dysplasia is considered as a single entity and Negex fails to catch "free of" as chunk_prefix. As a workaround I did the following (starting at line 307 of negation.py):

if self.chunk_prefix:
    if any(
        c.text.lower() == doc[e.start:e.start+len(c)].text.lower()
        for c in self.chunk_prefix
    ):
        e._.set(self.extension_name, True)

which solved the problem for me.
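The difference between the two checks can be reproduced without spaCy. The sketch below is a simplified pure-Python model, not the negspacy source: tokens are plain strings, each prefix is a list of tokens, and the function names `first_token_match` and `span_match` are hypothetical.

```python
def first_token_match(tokens, ent_start, prefixes):
    """Original behaviour: compare each whole prefix with only the first token of the entity."""
    first = tokens[ent_start].lower()
    return any(" ".join(p).lower() == first for p in prefixes)


def span_match(tokens, ent_start, prefixes):
    """Workaround: compare each prefix with as many leading entity tokens as the prefix has."""
    for p in prefixes:
        window = tokens[ent_start:ent_start + len(p)]
        if [t.lower() for t in window] == [t.lower() for t in p]:
            return True
    return False


tokens = ["resection", "margins", "free", "of", "dysplasia"]
ent_start = 2                        # the entity "free of dysplasia" starts at token index 2
prefixes = [["no"], ["free", "of"]]  # each prefix is a list of tokens

print(first_token_match(tokens, ent_start, prefixes))  # False: "free of" never equals "free"
print(span_match(tokens, ent_start, prefixes))         # True
```

Single-word prefixes behave the same under both checks, so the span comparison only changes the outcome for compound prefixes.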

stefano-marchesin · 2020-11-02T20:24:42Z

On a side note,

I've also noticed that for the mention from my previous comment, "resection margins free of dysplasia", negex also flags "resection margins" as negated. I believe this is caused by the "free" pattern in following_negations. Is there any way to disable this behavior? It creates inconsistencies.

jenojp · 2020-11-03T14:31:53Z

Hey, can you give me a de-identified block of test text to work with? I want to try to recreate what you're seeing.

jenojp · 2020-11-03T18:51:08Z

Also, you're able to add and remove patterns on the fly in 0.1.8. So you can remove "free" simply by doing the following:

from negspacy.negation import Negex
import spacy

nlp = spacy.load("en_core_sci_sm")
negex = Negex(nlp)
negex.remove_patterns(following_negations=["free"])
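To see why dropping "free" changes the result for "resection margins", here is a deliberately simplified pure-Python model of a following-negation rule. It is an assumption-laden sketch, not negspacy's implementation: the function `negated_by_following`, the token offsets, and the sample trigger sets are all hypothetical.

```python
def negated_by_following(tokens, entities, following_terms):
    """Flag every entity that ends at or before a following-negation trigger.

    entities: list of (start, end) token offsets, end exclusive.
    Returns a dict mapping entity text to True/False.
    """
    triggers = [i for i, t in enumerate(tokens) if t.lower() in following_terms]
    result = {}
    for start, end in entities:
        text = " ".join(tokens[start:end])
        result[text] = any(i >= end for i in triggers)
    return result


tokens = ["resection", "margins", "free", "of", "dysplasia"]
entities = [(0, 2), (2, 5)]  # "resection margins", "free of dysplasia"

with_free = negated_by_following(tokens, entities, {"free", "unlikely"})
without_free = negated_by_following(tokens, entities, {"unlikely"})

print(with_free)     # {'resection margins': True, 'free of dysplasia': False}
print(without_free)  # {'resection margins': False, 'free of dysplasia': False}
```

In this model "free of dysplasia" stays False in both runs because the trigger sits inside the entity rather than after it; marking that entity as negated is the job of the chunk prefix check discussed earlier in the thread.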

jenojp · 2020-11-18T15:26:25Z

Closed with release 0.1.9

jenojp closed this as completed Apr 30, 2020

jenojp reopened this Nov 2, 2020

jenojp added a commit that referenced this issue Nov 17, 2020

Merge pull request #26 from jenojp/develop

5122767

Issue #14 (compound chunk prefixes) and remove patterns bug fix

jenojp closed this as completed Nov 18, 2020
