In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import os, sys
sys.path.append("..")

# Radiology Reports
## Overview
While working on a diagnosis, the provider may order procedures such as chest imaging to confirm a suspected diagnosis of pneumonia. The resulting chest imaging report contains the radiologist's findings and interpretations. These may include explicit assertions of pneumonia or confirmation of pneumonia-related findings such as opacity or infiltrate.

The diagram below walks through the logic used for classifying radiology reports. While the emergency classifier considered only the clinical concept of **"Pneumonia"**, the radiology classifier considers other concepts and makes some distinctions between them.
![Radiology classification](./Radiology.png)

- First, the system looks for what we call **"Tier 1"** evidence, which is uniquely interpreted as referring to pneumonia. The classes in this set are **"Pneumonia"** and **"Consolidation"**. If either of these concepts is explicitly asserted (positively or negatively), that assertion is used to make the document classification.
- In the absence of Tier 1 evidence, the NLP next looks for **"Tier 2"** evidence, which could be interpreted as pneumonia but needs some additional context. The Tier 2 concepts are **"Infiltrate"** and **"Opacity"**. If there are positive assertions of either of these, the system next checks whether these assertions indicate pneumonia.
- A number of other diagnoses, such as atelectasis or pulmonary fibrosis, may have similar radiographic findings to pneumonia. The NLP will attempt to rule out any mentions of radiographic findings which are specifically linked to these conditions.
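
The tiered logic above can be sketched in plain Python. This is a simplified illustration, not the `medspacy_pna` implementation: the `Mention` class, its `has_alternate_dx` flag, and the `classify_mentions` function are hypothetical names. It also assumes that mentions have already been filtered to explicit assertions, that a single positive Tier 1 mention outweighs negated ones, and it returns `None` when no evidence is found because the text above does not say what happens in that case.

```python
from dataclasses import dataclass
from typing import List, Optional

TIER_1_CLASSES = {"PNEUMONIA", "CONSOLIDATION"}
TIER_2_CLASSES = {"INFILTRATE", "OPACITY"}


@dataclass
class Mention:
    """Hypothetical stand-in for an extracted entity."""
    label: str
    is_negated: bool = False
    # Assumption: True when the finding is attributed to another
    # diagnosis such as atelectasis or fibrosis.
    has_alternate_dx: bool = False


def classify_mentions(mentions: List[Mention]) -> Optional[str]:
    # Tier 1: explicit assertions of pneumonia or consolidation decide the result.
    tier_1 = [m for m in mentions if m.label in TIER_1_CLASSES]
    if tier_1:
        # Assumption: any positive Tier 1 mention wins over negated ones.
        return "POS" if any(not m.is_negated for m in tier_1) else "NEG"

    # Tier 2: infiltrate or opacity, which need additional context.
    tier_2 = [m for m in mentions if m.label in TIER_2_CLASSES]
    if not tier_2:
        return None  # no evidence; behavior not specified in the text above

    # A positive Tier 2 finding counts unless it is explained by another diagnosis.
    positive = [m for m in tier_2 if not m.is_negated]
    if any(not m.has_alternate_dx for m in positive):
        return "POS"
    return "NEG"


print(classify_mentions([Mention("PNEUMONIA")]))
print(classify_mentions([Mention("OPACITY", is_negated=True)]))
print(classify_mentions([Mention("INFILTRATE", has_alternate_dx=True)]))
```

The three calls print `POS`, `NEG`, and `NEG` respectively. The real classifier also takes attributes such as uncertainty and section into account, which this sketch leaves out.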



In [3]:
from medspacy_pna import build_nlp
from medspacy_pna.display import create_html
from medspacy.visualization import visualize_ent
from medspacy_pna.document_classification.radiology_document_classifier import TIER_2_CLASSES, ALTERNATE_DIAGNOSES
from IPython.display import HTML

In [4]:
%%capture
nlp_rad = build_nlp("radiology")

In [5]:
clf = nlp_rad.get_pipe("pneumonia_radiologydocumentclassifier")

In [6]:
# Concepts which can be considered "Positive"
print(clf.target_classes)

{'CONSOLIDATION', 'OPACITY', 'PNEUMONIA', 'INFILTRATE'}


In [7]:
# Tier 2 concepts, and the alternate diagnoses they may be linked to
print(TIER_2_CLASSES)
print(ALTERNATE_DIAGNOSES)

{'OPACITY', 'INFILTRATE'}
{'ATELECTASIS', 'INTERSTITIAL_LUNG_DISEASE', 'FIBROSIS', 'PULMONARY_EDEMA'}


In [8]:
nlp_rad.pipe_names

['tok2vec',
 'tagger',
 'parser',
 'medspacy_concept_tagger',
 'medspacy_target_matcher',
 'medspacy_context',
 'medspacy_sectionizer',
 'medspacy_postprocessor',
 'pneumonia_radiologydocumentclassifier']

## Example 1: Positive Pneumonia

In [9]:
text = """
Impression: Findings are suggestive of pneumonia.
"""
doc = nlp_rad(text)



In [10]:
HTML(create_html(doc, "radiology", document_classification=True))

In [11]:
for ent in doc.ents:
    print(ent, ent.label_, ent._.section_category)

pneumonia PNEUMONIA impression


In [12]:
print(doc._.document_classification)

POS


## Example 2: Positive radiographic findings

In [13]:
text = """
Findings: Some ground glass opacities are present.
"""
doc = nlp_rad(text)



In [14]:
HTML(create_html(doc, "radiology", document_classification=True))

In [15]:
for ent in doc.ents:
    print(ent, ent.label_, ent._.section_category)

opacities OPACITY impression


In [16]:
print(doc._.document_classification)

POS


## Example 3: Negative radiographic findings

In [17]:
text = """
Indication: Rule out pneumonia

Findings: No opacities or infiltrate.
"""
doc = nlp_rad(text)



In [18]:
HTML(create_html(doc, "radiology", document_classification=True))

In [19]:
for ent in doc.ents:
    print(ent, ent.label_, ent._.section_category)

pneumonia PNEUMONIA indication
opacities OPACITY impression
infiltrate INFILTRATE impression


In [20]:
print(doc._.document_classification)

NEG


## Example 4: Alternate diagnosis

In [21]:
text = """
Infiltrate is present and likely represents atelectasis.
"""
doc = nlp_rad(text)



In [22]:
# In this example, "atelectasis" isn't highlighted since only target concepts 
# are highlighted in the provider view
HTML(create_html(doc, "radiology", document_classification=True))

In [23]:
for token in doc:
    print(token, token._.concept_tag)


 
Infiltrate INFILTRATE
is 
present 
and 
likely 
represents 
atelectasis ATELECTASIS
. 

 


In [24]:
for ent in doc.ents:
    print(ent, ent.label_, ent._.section_category, ent._.is_negated)

Infiltrate INFILTRATE None False
atelectasis ATELECTASIS None False


## Alternate schemas
The classification schema described here was the highest-performing one in the validation study. However, other schemas are possible and may perform better for a particular use case.

In [25]:
text = """
Indication: rule out pneumonia.

Impression: There is some opacity present. It could represent atelectasis.
"""

In [26]:
doc = nlp_rad(text, disable=["pneumonia_radiologydocumentclassifier"])



In [27]:
clf = nlp_rad.get_pipe("pneumonia_radiologydocumentclassifier")

The default schema, **"linked"**, looks only within the same sentence as a finding to link it to an alternate diagnosis. So in the example above, although atelectasis is mentioned as a possible diagnosis, it won't be linked to the opacity because it's in a different sentence.

In [28]:
clf.classify_document(doc, classification_schema="linked")

'POS'

The **"full"** schema looks in the entire document for alternate diagnoses. That works better in cases like this, but causes false negatives when another diagnosis is mentioned but isn't actually related to the finding. We prioritized recall over precision, so we favored the "linked" schema.

In [29]:
clf.classify_document(doc, classification_schema="full")

'NEG'

Finally, the **"attributes"** schema does not consider alternate diagnoses and makes a classification based purely on the target concepts and their attributes (negation, uncertainty, etc.).

In [30]:
clf.classify_document(doc, classification_schema="attributes")

'POS'
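
The difference between the three schemas comes down to where the classifier looks for an alternate diagnosis. The sketch below illustrates that idea only and is not how `classify_document` is implemented: `classify_finding` is a hypothetical helper, the document is reduced to a list of sentences that each hold concept labels, and negation and uncertainty are ignored.

```python
from typing import List

ALTERNATE_DIAGNOSES = {
    "ATELECTASIS",
    "FIBROSIS",
    "INTERSTITIAL_LUNG_DISEASE",
    "PULMONARY_EDEMA",
}


def classify_finding(sentences: List[List[str]], finding_idx: int,
                     schema: str = "linked") -> str:
    """Classify one positive Tier 2 finding located in sentences[finding_idx]."""
    if schema == "attributes":
        scope: List[str] = []              # alternate diagnoses are never consulted
    elif schema == "linked":
        scope = sentences[finding_idx]     # same sentence as the finding only
    elif schema == "full":
        scope = [label for sent in sentences for label in sent]  # whole document
    else:
        raise ValueError(f"Unknown schema: {schema!r}")

    explained = any(label in ALTERNATE_DIAGNOSES for label in scope)
    return "NEG" if explained else "POS"


# Mirrors the impression above: the opacity and the atelectasis
# are in two separate sentences.
sentences = [["OPACITY"], ["ATELECTASIS"]]
results = {
    schema: classify_finding(sentences, 0, schema)
    for schema in ("linked", "full", "attributes")
}
print(results)
```

This prints `{'linked': 'POS', 'full': 'NEG', 'attributes': 'POS'}`, matching the three results above. Moving the atelectasis label into the same sentence as the opacity would make the "linked" result `NEG` as well.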