# Overview
In this notebook, we'll look beyond our target concepts and search for semantic modifiers using the **ConText** algorithm. We'll then see how to detect which section of a clinical document a concept appears in.

In [None]:
import spacy
import medspacy

from IPython.display import YouTubeVideo

In [None]:
nlp = medspacy.load(enable=["sentencizer", "target_matcher", "context", "sectionizer"])

In [None]:
nlp.pipe_names

First, let's set up our model to extract some of the concepts we saw in the previous notebook:

In [None]:
from medspacy.ner import TargetRule

In [None]:
target_matcher = nlp.get_pipe("target_matcher")

In [None]:
# Add rules from previous notebook for target extraction
target_rules = [
    TargetRule("pneumonia", "PROBLEM"),
    TargetRule("afib", "PROBLEM"),
    TargetRule("CHF", "PROBLEM"),
    TargetRule("Breast Cancer", "PROBLEM"),
    TargetRule("Alzheimer's", "PROBLEM"),
    TargetRule("metformin", "TREATMENT"),
    TargetRule("CKD", "PROBLEM", pattern=r"ckd stage ([1-5]|one|two|three|four|five)"),
    
    TargetRule("Type II Diabetes Mellitus", "PROBLEM", 
              pattern=r"type (2|ii|two) (diabetes|dm)( mellitus)?"),
]

In [None]:
target_matcher.add(target_rules)
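
To see what the two regular-expression rules above accept, here is a minimal sketch that runs the same patterns with Python's `re` module, outside of medSpaCy. It assumes the matcher applies regex patterns case-insensitively, which the mixed-case example texts later in this notebook rely on; the `re.IGNORECASE` flag makes that assumption explicit.

```python
import re

# The two regular expressions from the rules above.
# Assumption: patterns are matched ignoring case, so we compile with re.IGNORECASE.
ckd_pattern = re.compile(r"ckd stage ([1-5]|one|two|three|four|five)", re.IGNORECASE)
dm_pattern = re.compile(r"type (2|ii|two) (diabetes|dm)( mellitus)?", re.IGNORECASE)

examples = [
    "CKD Stage 3, now CKD stage five.",
    "Patient presents for management of Type II Diabetes Mellitus",
    "continue metformin for type 2 dm",
]

# Print every span each pattern finds in each example text.
for text in examples:
    for pattern in (ckd_pattern, dm_pattern):
        for match in pattern.finditer(text):
            print(repr(match.group(0)), "found in", repr(text))
```

Note that a bare "CKD" without a stage would not match the first pattern, so it would need its own rule.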

# III. Contextual analysis
Clinical text often contains mentions of concepts which the patient did not actually experience. For example:

- "There is *no evidence of* **pneumonia**"
- "*Mother* with **breast cancer**"
- "Patient presents for *r/o* **COVID-19**"

In all of these instances, we need to use the contextual clues around the entity to assert attributes like negation, experiencer, and uncertainty.

In [None]:
YouTubeVideo("UEm7H8cfz80", start=1747, end=1885, rel=0)

## ConText Algorithm

One method for this is the [ConText algorithm](https://www.sciencedirect.com/science/article/pii/S1532046409000744). ConText links target entities like problems with semantic modifiers like those shown above. 

ConText's algorithm is extremely simple. Once you have named entities identified in a sentence, you can run ConText to determine whether any of them are not affirmed for the patient at the time the note was written. Here is an example for identifying a negated named entity:

1. Mark all ConText terms from your dictionary in the sentence
2. All named entities between the negation term and the end of the sentence are changed from "affirmed" to "negated"
3. If a termination term follows the negation term, the scope stops there: only the named entities between the negation term and the termination term are changed from "affirmed" to "negated".
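
The three steps above can be sketched in plain Python. This is a simplified illustration, not the medSpaCy implementation: the term lists and the `find_term` and `assert_negation` helpers are hypothetical, only the first negation term in the sentence is considered, and the scope only runs forward.

```python
import re

# Hypothetical mini-dictionaries for illustration; real rule sets are much larger.
NEGATION_TERMS = ["no evidence of", "denies", "no"]
TERMINATION_TERMS = ["but", "however"]


def find_term(sentence, terms, start=0):
    """Return the (start, end) span of the earliest whole-word term at or after `start`, or None."""
    best = None
    for term in terms:
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        match = pattern.search(sentence, start)
        if match and (best is None or match.start() < best[0]):
            best = match.span()
    return best


def assert_negation(sentence, entities):
    """Label each entity string 'affirmed' or 'negated' using a forward negation scope."""
    status = {entity: "affirmed" for entity in entities}
    negation = find_term(sentence, NEGATION_TERMS)  # step 1: mark ConText terms
    if negation is None:
        return status
    scope_start = negation[1]
    termination = find_term(sentence, TERMINATION_TERMS, scope_start)
    # Step 3: the scope ends at a termination term, otherwise at the end of the sentence.
    scope_end = termination[0] if termination else len(sentence)
    for entity in entities:
        position = sentence.lower().find(entity.lower())
        # Step 2: entities that start inside the scope become negated.
        if position != -1 and scope_start <= position < scope_end:
            status[entity] = "negated"
    return status


print(assert_negation("No evidence of pneumonia but there is CHF.", ["pneumonia", "CHF"]))
# {'pneumonia': 'negated', 'CHF': 'affirmed'}
```

Removing "but" from `TERMINATION_TERMS` would extend the scope to the end of the sentence, so "CHF" would be negated as well.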


Here is an example of how it works when "but" is included as a termination term in your dictionary:

<img alt="an example visualization of ConText" src="../slides/ConText-negation-example.png" width="500">

You can decide what types of modifiers you want ConText to address. If I create a modifier for experiencer, and include the word "mother" in my dictionary, then the experiencer is no longer the patient, which is the default assumption, but the person represented by the term "mother": 

<img alt="an example visualization of ConText" src="../slides/ConText-experiencer-example.png" width="500">
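
The experiencer idea can be sketched the same way. In this hypothetical example, the `FAMILY_TERMS` dictionary and the `experiencer` function are made up for illustration: the experiencer defaults to the patient and changes only when a term from the dictionary appears in the sentence.

```python
import re

# Hypothetical experiencer dictionary: term -> category.
FAMILY_TERMS = {"mother": "FAMILY", "father": "FAMILY", "sister": "FAMILY", "brother": "FAMILY"}


def experiencer(sentence):
    """Return 'PATIENT' unless the sentence contains a family term from the dictionary."""
    for word in re.findall(r"[a-z']+", sentence.lower()):
        if word in FAMILY_TERMS:
            return FAMILY_TERMS[word]
    return "PATIENT"


for sentence in [
    "Mother with breast cancer",
    "Patient presents with pneumonia",
    "Her grandma was recently diagnosed with Alzheimer's",
]:
    print(sentence, "->", experiencer(sentence))
```

The last sentence comes back as `PATIENT` because "grandma" is not in the dictionary: a dictionary-based approach only recognizes the terms it has been given.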

The medSpaCy implementation of ConText is [cycontext](https://github.com/medspacy/cycontext).

Here we'll show the basic usage of ConText. When instantiating ConText, we can start from the default rules and then add more as needed. See the [cycontext](https://github.com/medspacy/cycontext) repository for more detailed examples and tutorials.

In [None]:
from medspacy.ner import TargetRule
from medspacy.context import ConTextRule  # the rule class used later in this notebook
from medspacy.visualization import visualize_ent, visualize_dep

In [None]:
doc = nlp("There is no evidence of pneumonia.")

We can visualize the targets and modifiers using two functions from `medspacy.visualization`. `visualize_ent` highlights the spans of both target and modifier concepts. `visualize_dep` draws arrows between concepts to show which targets each modifier applies to.

In [None]:
visualize_ent(doc)

In [None]:
visualize_dep(doc)

In [None]:
texts = [
    "Patient presents for management of Type II Diabetes Mellitus",
    "No evidence of pneumonia",
    "Past medical history significant for afib, CHF, and CKD Stage 3, now CKD stage five.",
    "Mother with breast cancer",
    "continue metformin for type 2 dm",
    "Her grandma was recently diagnosed with Alzheimer's"
]

In [None]:
docs = list(nlp.pipe(texts))

### TODO
The widget below will allow you to scroll through the docs and visualize each of them using the two functions described above. Select **"both"** as the display type so that you can see both highlighted entities ("ent") and the relations between them ("dep").

In [None]:
from medspacy.visualization import MedspaCyVisualizerWidget

In [None]:
w = MedspaCyVisualizerWidget(docs)

#### Note
If the output above won't display, you may need to do some extra configuration to get widgets to show up in notebooks. First, try running these commands in your terminal:

```bash
pip install ipywidgets
jupyter nbextension enable --py widgetsnbextension
```

Then restart your kernel and try again.

If that doesn't work, uncomment the code below and manually visualize each doc one at a time:

In [None]:
# idx = 0
# visualize_dep(docs[idx])
# visualize_ent(docs[idx])

## Adding rules to ConText
MedSpaCy comes with default rules for matching targets and modifiers. But you'll often find new examples which aren't included in the default rules. Let's see now how to add a rule.

In the sentence **"Her grandma was recently diagnosed with Alzheimer's"**, medSpaCy fails to recognize that **"grandma"** is a **"FAMILY"** modifier. 

In [None]:
text = "Her grandma was recently diagnosed with Alzheimer's"
doc = nlp(text)

In [None]:
visualize_ent(doc)

We can add this rule by creating a `ConTextRule` and adding it to the `context` component.

In [None]:
from medspacy.context import ConTextRule

In [None]:
context = nlp.get_pipe("context")

This class takes the same arguments as `TargetRule`: `literal` and `category`, with an optional `pattern`:

In [None]:
new_context_rules = [
    ConTextRule(literal="grandma", category="FAMILY", pattern=None),
]

In [None]:
context.add(new_context_rules)

In [None]:
doc = nlp(text)

In [None]:
visualize_ent(doc)
visualize_dep(doc)

### TODO
Create a `ConTextRule` that defines a negation modifier for the phrase **"is not evident"**. Then add the rule with `context.add()` and process the text below. Make sure that **"is not evident"** modifies **"Pneumonia"**.

The list below shows the possible values for the `category` argument:
- `"FAMILY"`
- `"HISTORICAL"`
- `"HYPOTHETICAL"`
- `"NEGATED_EXISTENCE"`
- `"POSSIBLE_EXISTENCE"`

In [None]:
new_context_rules = [
    ConTextRule(literal=____, category=____),
]

In [None]:
context.add(new_context_rules)

In [None]:
doc = nlp("Pneumonia is not evident.")

In [None]:
visualize_dep(doc)

# IV. Section detection
Another important aspect of understanding what is being described in text is the structure of the document. Here is a discussion about determining whether a report describes cervical lymphadenopathy:

In [None]:
YouTubeVideo("UEm7H8cfz80", start=1205, end=1303, rel=0)


Clinical notes often follow a certain structure. One example is the [SOAP note](https://www.globalpremeds.com/blog/2015/01/02/understanding-soap-format-for-clinical-rounds/). Different parts of a note carry different significance. For example, a condition listed in the **Past Medical History** or **Problem List** is likely a historical condition that may not be relevant to the current visit, whereas the **Assessment/Plan** will contain more up-to-date diagnoses.

MedSpaCy detects sections through the `sectionizer` component. We can then visualize the section headers using `visualize_ent`.

In [None]:
sectionizer = nlp.get_pipe("sectionizer")

In [None]:
text = """Past Medical History:
1. Type II DM
2. Afib
3. CKD Stage 3

Family History:
1. Breast Cancer


Reason for this examination: Possible pneumonia.

IMPRESSION:
No evidence of pneumonia.

Assessment/Plan:
Continue metformin for type 2 dm.
"""

In [None]:
doc = nlp(text)

In [None]:
visualize_ent(doc)

We can see all of the section categories in the doc by calling `doc._.section_categories`. We can also see which section an entity occurred in using `ent._.section_category`:

In [None]:
print(doc._.section_categories)

In [None]:
for ent in doc.ents:
    print(ent, "-->", ent._.section_category)
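
The idea behind assigning a section to each entity can be sketched without medSpaCy: find the header spans, then give each concept the category of the closest header that precedes it. The `SECTION_RULES` dictionary, its category names, and both helper functions below are hypothetical and only illustrate the logic.

```python
import re

# Hypothetical header rules: literal header text -> normalized category.
SECTION_RULES = {
    "Past Medical History:": "past_medical_history",
    "Family History:": "family_history",
    "IMPRESSION:": "impression",
    "Assessment/Plan:": "assessment_plan",
}


def find_sections(text):
    """Return (start_offset, category) for every header found, sorted by offset."""
    found = []
    for literal, category in SECTION_RULES.items():
        for match in re.finditer(re.escape(literal), text, flags=re.IGNORECASE):
            found.append((match.start(), category))
    return sorted(found)


def section_of(offset, sections):
    """Return the category of the last header starting at or before `offset`, or None."""
    current = None
    for start, category in sections:
        if start <= offset:
            current = category
    return current


note = (
    "Past Medical History:\n1. Afib\n\n"
    "Family History:\n1. Breast Cancer\n\n"
    "IMPRESSION:\nNo evidence of pneumonia.\n"
)
sections = find_sections(note)
for concept in ["Afib", "Breast Cancer", "pneumonia"]:
    print(concept, "-->", section_of(note.find(concept), sections))
# Afib --> past_medical_history
# Breast Cancer --> family_history
# pneumonia --> impression
```

A concept that appears before the first header gets `None`, since no section has started yet.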

## Adding rules to the sectionizer
Just like with `context` and `target_matcher`, you'll want to add new section titles to the `sectionizer` component. We can do this with the `SectionRule` class, which, just like the other rules we created, takes `literal`, `category`, and optional `pattern` arguments.

We then add these patterns using `sectionizer.add()`.

For example, we can see below that medSpaCy fails to recognize **"Previous Medical History"** to be equivalent to **"Past Medical History"**.

In [None]:
text = """Previous Medical History:
Pneumonia in 2012
"""

In [None]:
doc = nlp(text)

In [None]:
visualize_ent(doc)

Let's add a rule here to match it.

In [None]:
from medspacy.section_detection import SectionRule

In [None]:
sectionizer = nlp.get_pipe("sectionizer")

In [None]:
rule = SectionRule("Previous Medical History:", "past_medical_history")

In [None]:
sectionizer.add([rule])

In [None]:
doc = nlp(text)
visualize_ent(doc)
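
Instead of adding one literal per spelling, a single regular expression can cover several variants of a header. The sketch below uses only Python's `re` module; the `pmh_pattern` name and the expression itself are hypothetical, and whether such a string can be passed directly as the `pattern` argument of `SectionRule` should be checked against your installed medSpaCy version.

```python
import re

# Hypothetical pattern covering both spellings of the header, with an optional colon.
pmh_pattern = re.compile(r"(past|previous) medical history:?", re.IGNORECASE)

headers = [
    "Past Medical History:",
    "Previous Medical History:",
    "PREVIOUS MEDICAL HISTORY",
    "Family History:",
]

# Report which headers the pattern would normalize to the same category.
for header in headers:
    matched = pmh_pattern.fullmatch(header) is not None
    print(f"{header!r}: {'past_medical_history' if matched else 'no match'}")
```

The trade-off is readability: a list of literals is easier to review, while a pattern keeps the rule set short as variants accumulate.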

Here are the first 10 rules in the `sectionizer`, which include the defaults that ship with medSpaCy:

In [None]:
sectionizer.rules[:10]

### TODO
Look at the note below. Identify any section titles which aren't highlighted by medSpaCy. Then add rules to the sectionizer to match them. For the `category` argument, you can either choose one of the standardized categories used by the default rules shown above or choose your own.

In [None]:
text = """\
Active Medical Issues:
- Hip Pain
- Hypertension
- CHF

Medical Decision Making:
Patient understands benefits and risks of surgery.

Instructions for Home Care:
Dress wound twice a day.
"""

In [None]:
new_section_patterns = [
    SectionRule(literal=___, category=___),
    # ...
]

In [None]:
sectionizer.____(____)

In [None]:
doc = nlp(text)
visualize_ent(doc)

# What if we used machine learning rather than a rules-based system?

In [None]:
YouTubeVideo("UEm7H8cfz80", start=2600, end=2708, rel=0)

# Next Steps
So far, we've used medSpaCy to write rules for extracting concepts in the text and for identifying other attributes like negation and an entity's section in the note. Next we'll see how to use **machine learning** as an alternative method for concept extraction.

[nlp-04-machine-learning-ner.ipynb](nlp-04-machine-learning-ner.ipynb)