
# Overview
In this notebook, I use negex to detect negated entities. NegEx was developed in the early 2000s, but it remains a popular approach to negation detection.

Here, we combine negex (through the negspacy package) with Stanza, an NLP package, inside a spaCy pipeline. Stanza identifies the medical entities, and negex tells us whether each entity is negated.
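
To make the idea concrete, here is a minimal, library-free sketch of the trigger-based logic behind NegEx-style negation detection. It is a toy under stated assumptions, not the negspacy implementation: the term lists `PRECEDING` and `TERMINATION` and the function `is_negated` are hypothetical names made up for illustration, and only single-word triggers that appear before the entity are handled. Real term sets are much larger and include further categories such as pseudo-negations and following negations.

```python
import re

# Toy term lists (assumption: real NegEx/negspacy term sets are far larger).
PRECEDING = {"no", "not", "denies", "without"}   # triggers that negate what follows
TERMINATION = {"but", "however", "although"}     # words that end a negation scope


def is_negated(sentence, entity):
    """Return True if a preceding trigger is still 'open' when the entity starts.

    Words before the entity are scanned left to right: a trigger opens a
    negation scope, and a termination word closes it again.
    """
    text = sentence.lower()
    start = text.find(entity.lower())
    if start == -1:
        raise ValueError(f"entity {entity!r} not found in sentence")

    negated = False
    for word in re.findall(r"[a-z]+", text[:start]):
        if word in PRECEDING:
            negated = True
        elif word in TERMINATION:
            negated = False
    return negated


sentence = "Patient had a headache, but no fever"
for entity in ("headache", "fever"):
    print(entity, is_negated(sentence, entity))
```

In the pipeline built below, this kind of decision is made by negspacy on the entities that Stanza finds, so none of this logic has to be written by hand.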

# Installation

In [None]:
!pip install spacy
!pip install negspacy
!pip install spacy_stanza #This package wraps the Stanza (formerly StanfordNLP) library, so you can use Stanford's models in a spaCy pipeline.

In [2]:
import spacy # to build an NLP pipeline
import stanza # for named entity recognition
# spacy_stanza wraps Stanza so that its models can be used in a spaCy pipeline.
import spacy_stanza
import sys
from negspacy.negation import Negex
from negspacy.termsets import termset # to customize negation terms
import pandas as pd

# Set up NLP pipeline

In [3]:
# download and initialize a mimic pipeline with an i2b2 NER model
stanza.download('en', package='mimic', processors={'ner': 'i2b2'})
nlp = spacy_stanza.load_pipeline('en', package='mimic', processors={'ner': 'i2b2'})

Downloading https://raw.githubusercontent.com/stanfordnlp/stanza-resources/main/resources_1.4.0.json:   0%|   …

2022-08-10 08:52:29 INFO: Downloading these customized packages for language: en (English)...
| Processor       | Package |
-----------------------------
| tokenize        | mimic   |
| pos             | mimic   |
| lemma           | mimic   |
| depparse        | mimic   |
| ner             | i2b2    |
| pretrain        | mimic   |
| forward_charlm  | mimic   |
| backward_charlm | mimic   |



2022-08-10 08:52:37 INFO: Finished downloading models and saved to /root/stanza_resources.


2022-08-10 08:52:38 INFO: Loading these models for language: en (English):
| Processor | Package |
-----------------------
| tokenize  | mimic   |
| pos       | mimic   |
| lemma     | mimic   |
| depparse  | mimic   |
| ner       | i2b2    |

2022-08-10 08:52:38 INFO: Use device: cpu
2022-08-10 08:52:38 INFO: Loading: tokenize
2022-08-10 08:52:38 INFO: Loading: pos
2022-08-10 08:52:38 INFO: Loading: lemma
2022-08-10 08:52:38 INFO: Loading: depparse
2022-08-10 08:52:38 INFO: Loading: ner
2022-08-10 08:52:39 INFO: Done loading processors!


# Add customized terms to the default list of terms

In [4]:
ts = termset("en_clinical")
# customize the term list by adding more negation terms
ts.add_patterns({
    'preceding_negations': ['abstain from', 'other than', 'except for', 'except',
                            'with the exception of', 'excluding', 'lack of',
                            'contraindication', 'contraindicated', 'interfere with',
                            'prohibit', 'prohibits'],
    'following_negations': ['negative', 'is allowed', 'impossible', 'exclusionary']
})

# Let negex know what entities we are extracting

In [5]:
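# pass the customized term set explicitly, along with the entity types to check
nlp.add_pipe("negex", config={"neg_termset": ts.get_patterns(),
                              "ent_types": ["PROBLEM", "TEST", "TREATMENT"]})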

<negspacy.negation.Negex at 0x7eff77d4ea10>

# Examples
In the output below, `True` means the entity is negated and `False` means it is not.

In [6]:
doc = nlp('Patient had a headache, but no fever')

for e in doc.ents:
    print(e.text, e._.negex)

a headache False
fever True


In [7]:
doc = nlp('No history of diabetes')

for e in doc.ents:
    print(e.text, e._.negex)

diabetes True


In [8]:
doc = nlp('Patients should abstain from painkillers like NSAIDs and allergy medications for 24 hours')

for e in doc.ents:
    print(e.text, e._.negex)

painkillers True
NSAIDs True
allergy medications True


In [9]:
doc = nlp('Women with pregnancy should not take hormonal birth control')

for e in doc.ents:
    print(e.text, e._.negex)

pregnancy False
hormonal birth control True
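
The examples above print one entity per line. When many sentences are processed, it can be handier to collect the results into rows. The sketch below shows one possible way to do that; `negation_rows` and `fake_ent` are hypothetical helpers, not part of spaCy or negspacy. So that the block runs without downloading any model, it uses a small stand-in object in place of a real `doc`; the entity labels in the stand-in are illustrative only and are not taken from the model's output.

```python
from types import SimpleNamespace


def negation_rows(doc):
    """Return one dict per entity with its text, label, and negation flag."""
    return [
        {"entity": ent.text, "label": ent.label_, "negated": ent._.negex}
        for ent in doc.ents
    ]


def fake_ent(text, label, negated):
    """Stand-in for a spaCy entity span (assumption: only the attributes used above)."""
    return SimpleNamespace(text=text, label_=label, _=SimpleNamespace(negex=negated))


# Stand-in for nlp('Patient had a headache, but no fever'); labels are illustrative.
fake_doc = SimpleNamespace(ents=[
    fake_ent("a headache", "PROBLEM", False),
    fake_ent("fever", "PROBLEM", True),
])

rows = negation_rows(fake_doc)
for row in rows:
    print(row)
```

With the real pipeline, `negation_rows(nlp(text))` should work the same way, and since pandas is already imported in this notebook, `pd.DataFrame(rows)` would turn the rows into a table.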


# References
Negation
* https://github.com/jenojp/negspacy
* https://medium.com/@MansiKukreja/clinical-text-negation-handling-using-negspacy-and-scispacy-233ce69ab2ac
* https://towardsdatascience.com/clinical-notes-the-negative-story-e1140dd275c7
* https://www.youtube.com/watch?v=IiD3YZkkCmE&t=2210s (see 36:41 of the video)

Negex: how to add and delete custom negation terms
* https://pypi.org/project/negspacy/
