# Clinical NLP

This example uses spaCy with [scispaCy](https://allenai.github.io/scispacy/) models for domain-specific entity extraction, and scispaCy's entity linker to normalize the extracted entities against the MeSH vocabulary.

In [24]:
import spacy
from scispacy.linking import EntityLinker

In [25]:
nlp = spacy.load('en_core_sci_sm')
linker = EntityLinker(name='mesh', k=5)
nlp.add_pipe(linker)
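
The `nlp.add_pipe(linker)` call above follows the spaCy 2.x convention of passing a component object. On spaCy 3.x, components are added by their registered factory name instead. The following is a minimal sketch, assuming spaCy 3.x and a scispaCy release that registers the `scispacy_linker` factory. The function name `build_mesh_pipeline` is only illustrative, and the imports sit inside it so that defining the function does not require the models to be installed.

```python
def build_mesh_pipeline(model_name="en_core_sci_sm", k=5):
    """Load a scispaCy model and attach the MeSH entity linker (spaCy 3.x style)."""
    import spacy
    # Importing the module registers the "scispacy_linker" factory with spaCy.
    from scispacy.linking import EntityLinker  # noqa: F401

    nlp = spacy.load(model_name)
    # In spaCy 3.x, add_pipe takes the factory name plus a config dict.
    nlp.add_pipe("scispacy_linker", config={"linker_name": "mesh", "k": k})
    # Retrieve the component instance to reach its knowledge base (linker.kb).
    linker = nlp.get_pipe("scispacy_linker")
    return nlp, linker
```

Under those assumptions, `nlp, linker = build_mesh_pipeline()` would stand in for the cell above. The first call typically downloads the MeSH index, which can take a while.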



In [26]:
text = "The patient underwent a CT scan in April. It did not reveal any abnormalities."

In [27]:
doc = nlp(text)



## Linguistic Analysis

Boundary detection / sentence splitting

In [5]:
for s in doc.sents:
    print(s)

The patient underwent a CT scan in April.
It did not reveal any abnormalities.


In [6]:
sentence = list(doc.sents)[0]

Tokenization

In [7]:
for token in sentence:
    print(token)

The
patient
underwent
a
CT
scan
in
April
.


Part-of-speech tagging

In [8]:
for token in sentence:
    print(token, token.pos_)

The DET
patient NOUN
underwent VERB
a DET
CT NOUN
scan NOUN
in ADP
April PROPN
. PUNCT


Noun chunking

In [9]:
for chunk in sentence.noun_chunks:
    print(chunk)

The patient
a CT scan


Dependency parsing

In [10]:
from spacy import displacy

In [11]:
displacy.render(sentence, style="dep", jupyter=True, options={'distance' : 100})

## Information Extraction

Entity extraction

In [12]:
for e in sentence.ents:
    print('Entity:', e)

Entity: patient
Entity: CT scan
Entity: April


Entity normalization / linking

In [13]:
from IPython.display import display_markdown

In [14]:
for e in sentence.ents:
    display_markdown(f'__Entity: {e}__', raw=True)
    for entity_id, prob in e._.kb_ents:
        mesh_term = linker.kb.cui_to_entity[entity_id]
        print('Probability:', prob)
        print(mesh_term)

__Entity: patient__

Probability: 0.8386321067810059
CUI: D019727, Name: Proxy
Definition: A person authorized to decide or act for another person, for example, a person having durable power of attorney.
TUI(s): 
Aliases: (total: 2): 
	 Patient Agent, Proxy
Probability: 0.7973070740699768
CUI: D010361, Name: Patients
Definition: Individuals participating in the health care system for the purpose of receiving therapeutic, diagnostic, or preventive procedures.
TUI(s): 
Aliases: (total: 2): 
	 Patients, Clients
Probability: 0.7851049304008484
CUI: D005791, Name: Patient Care
Definition: Care rendered by non-professionals.
TUI(s): 
Aliases: (total: 2): 
	 Informal care, Patient Care
Probability: 0.7439239025115967
CUI: D000070659, Name: Patient Comfort
Definition: Patient care intended to prevent or relieve suffering in conditions that ensure optimal quality living.
TUI(s): 
Aliases: (total: 2): 
	 Comfort Care, Patient Comfort
Probability: 0.7175934910774231
CUI: D064406, Name: Patient Harm
Definition: A meas

__Entity: CT scan__

Probability: 0.8230447769165039
CUI: D000072098, Name: Single Photon Emission Computed Tomography Computed Tomography
Definition: An imaging technique using a device which combines TOMOGRAPHY, EMISSION-COMPUTED, SINGLE-PHOTON and TOMOGRAPHY, X-RAY COMPUTED in the same session.
TUI(s): 
Aliases: (total: 5): 
	 CT SPECT Scan, Single Photon Emission Computed Tomography Computed Tomography, CT SPECT, SPECT CT Scan, SPECT CT
Probability: 0.8186503648757935
CUI: D000072078, Name: Positron Emission Tomography Computed Tomography
Definition: An imaging technique that combines a POSITRON-EMISSION TOMOGRAPHY (PET) scanner and a CT X RAY scanner. This establishes a precise anatomic localization in the same session.
TUI(s): 
Aliases: (total: 7): 
	 PET-CT Scan, PET-CT, CT PET Scan, Positron Emission Tomography Computed Tomography, PET CT Scan, Positron Emission Tomography-Computed Tomography, CT PET
Probability: 0.7265672087669373
CUI: D056973, Name: Four-Dimensional Computed Tomography
Definition

__Entity: April__

Probability: 0.8357817530632019
CUI: D053300, Name: Tumor Necrosis Factor Ligand Superfamily Member 13
Definition: A member of tumor necrosis factor superfamily found on MACROPHAGES; DENDRITIC CELLS and T-LYMPHOCYTES. It occurs as transmembrane protein that can be cleaved to release a secreted form that specifically binds to TRANSMEMBRANE ACTIVATOR AND CAML INTERACTOR PROTEIN; and B CELL MATURATION ANTIGEN.
TUI(s): 
Aliases: (total: 5): 
	 APRIL Protein, TALL-2 Protein, A Proliferation Inducing Ligand Protein, Tumor Necrosis Factor Ligand Superfamily Member 13, TNF- and APOL-Related Leukocyte Expressed Ligand 2 Protein
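
The output above lists every candidate the linker returns, and the top-scoring one is not always the right concept ("patient" ranks Proxy above Patients, and the month "April" links to the APRIL protein). A common first step is to keep only the best candidates above a score cutoff. The following is a minimal sketch in plain Python, assuming `e._.kb_ents` is a list of `(MeSH ID, score)` pairs as unpacked in the loop above. The helper name `top_candidates` and the cutoff values are illustrative, not part of scispaCy.

```python
from typing import List, Tuple

# (MeSH ID, score): the shape assumed for the items in e._.kb_ents
Candidate = Tuple[str, float]


def top_candidates(kb_ents: List[Candidate],
                   min_score: float = 0.8,
                   limit: int = 1) -> List[Candidate]:
    """Keep candidates scoring at least min_score, best first, at most `limit`."""
    kept = [c for c in kb_ents if c[1] >= min_score]
    kept.sort(key=lambda c: c[1], reverse=True)
    return kept[:limit]


# Scores copied (rounded) from the output above for the entity "patient".
patient = [
    ("D019727", 0.8386),     # Proxy
    ("D010361", 0.7973),     # Patients
    ("D005791", 0.7851),     # Patient Care
    ("D000070659", 0.7439),  # Patient Comfort
    ("D064406", 0.7176),     # Patient Harm
]

print(top_candidates(patient))
print(top_candidates(patient, min_score=0.75, limit=3))
```

A score cutoff alone does not repair the "April" case, because the wrong concept scores above 0.83. Filtering of that kind would need extra context, such as the surrounding words or the semantic type of the concept.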


# Gene Named Entity Recognition

In [28]:
text = """Dual MAPK pathway inhibition with BRAF and MEK inhibitors in BRAF(V600E)-mutant NSCLC 
might improve efficacy over BRAF inhibitor monotherapy based on observations in BRAF(V600)-mutant melanoma"""

Specialized model for biological entities

In [30]:
bionlp = spacy.load('en_ner_bionlp13cg_md')

In [36]:
# displacy.serve() starts a web server and accepts no `jupyter` argument;
# displacy.render() draws the entities inline in the notebook.
displacy.render(bionlp(text), style='ent', jupyter=True)