![JohnSnowLabs](https://nlp.johnsnowlabs.com/assets/images/logo.png)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/Certification_Trainings/Healthcare/11.Pretrained_Clinical_Pipelines.ipynb)

# 11. Pretrained_Clinical_Pipelines

In [None]:
import os

jsl_secret = os.getenv('SECRET')

import sparknlp
sparknlp_version = sparknlp.version()
import sparknlp_jsl
jsl_version = sparknlp_jsl.version()

# Avoid echoing the license secret into the notebook output
print("SECRET is set:", jsl_secret is not None)

In [None]:
import json
import os
from pyspark.ml import Pipeline
from pyspark.sql import SparkSession

from sparknlp.annotator import *
from sparknlp_jsl.annotator import *
from sparknlp.base import *
import sparknlp_jsl
import sparknlp

params = {"spark.driver.memory":"16G",
"spark.kryoserializer.buffer.max":"2000M",
"spark.driver.maxResultSize":"2000M"}

spark = sparknlp_jsl.start(jsl_secret,params=params)

print ("Spark NLP Version :", sparknlp.version())
print ("Spark NLP_JSL Version :", sparknlp_jsl.version())

Spark NLP Version : 3.1.2
Spark NLP_JSL Version : 3.1.2



<b>If you want to work with Spark 2.3:</b>
```
import os

# Install java
! apt-get update -qq
! apt-get install -y openjdk-8-jdk-headless -qq > /dev/null

!wget -q https://archive.apache.org/dist/spark/spark-2.3.0/spark-2.3.0-bin-hadoop2.7.tgz

!tar xf spark-2.3.0-bin-hadoop2.7.tgz
!pip install -q findspark

os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-8-openjdk-amd64"
os.environ["PATH"] = os.environ["JAVA_HOME"] + "/bin:" + os.environ["PATH"]
os.environ["SPARK_HOME"] = "/content/spark-2.3.0-bin-hadoop2.7"
! java -version

import findspark
findspark.init()
from pyspark.sql import SparkSession

! pip install --ignore-installed -q spark-nlp==2.7.5
import sparknlp

spark = sparknlp.start(spark23=True)
```

## Pretrained Pipelines

To save you from building a pipeline from scratch, Spark NLP also provides pretrained pipelines that are already fitted with specific annotators and transformers for various use cases.

Here is the list of clinical pre-trained pipelines: 

> These clinical pipelines are trained with `embeddings_healthcare_100d`, so accuracy might be 1-2% lower than with `embeddings_clinical`, which is 200d.

**1.   explain_clinical_doc_carp** :

> a pipeline with `ner_clinical`, `assertion_dl`, `re_clinical` and `ner_posology`. It will extract clinical and medication entities, assign assertion status and find relationships between clinical entities.

**2.   explain_clinical_doc_era** :

> a pipeline with `ner_clinical_events`, `assertion_dl` and `re_temporal_events_clinical`. It will extract clinical entities, assign assertion status and find temporal relationships between clinical entities.

**3.   recognize_entities_posology** :

> a pipeline with `ner_posology`. It will only extract medication entities.


*Since the 3rd pipeline is already a subset of the 1st pipeline (which includes `ner_posology`), it is not covered separately in this notebook.*

**4.   explain_clinical_doc_ade** :

> a pipeline for `Adverse Drug Events (ADE)` with `ner_ade_biobert`, `assertiondl_biobert`, `classifierdl_ade_conversational_biobert` and `re_ade_biobert`. It will classify the document, extract `ADE` and `DRUG` entities, assign assertion status to `ADE` entities, and relate them with `DRUG` entities, then assign ADE status to a text (`True` means ADE, `False` means not related to ADE).

**letter codes in the naming conventions:**

> c : ner_clinical

> e : ner_clinical_events

> r : relation extraction

> p : ner_posology

> a : assertion

> ade : adverse drug events
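
The suffix of an `explain_clinical_doc_*` name can be read letter by letter with the codes above. A minimal sketch, assuming the suffix is either `ade` or a string of the single-letter codes; `LETTER_CODES` and `decode_suffix` are hypothetical names for illustration and are not part of Spark NLP:

```python
# Hypothetical helper: decode the suffix of a pipeline name using the
# letter codes listed above. Not part of Spark NLP.
LETTER_CODES = {
    "c": "ner_clinical",
    "e": "ner_clinical_events",
    "r": "relation extraction",
    "p": "ner_posology",
    "a": "assertion",
}

def decode_suffix(pipeline_name):
    """Return the components implied by the last underscore-separated part."""
    suffix = pipeline_name.rsplit("_", 1)[-1]
    if suffix == "ade":  # multi-letter code, checked before the per-letter lookup
        return ["adverse drug events"]
    return [LETTER_CODES[letter] for letter in suffix]

print(decode_suffix("explain_clinical_doc_carp"))
print(decode_suffix("explain_clinical_doc_era"))
```

The order of the returned components follows the order of the letters in the name, not the order of the stages inside the pipeline.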

**Relation Extraction types:**

`re_clinical` >> TrIP (improved), TrWP (worsened), TrCP (caused problem), TrAP (administered), TrNAP (avoided), TeRP (revealed problem), TeCP (investigate problem), PIP (problems related)

`re_temporal_events_clinical` >> `AFTER`, `BEFORE`, `OVERLAP`
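
When reading relation output it can help to translate the short labels back into words. The sketch below simply stores the `re_clinical` descriptions listed above in a dictionary; `RE_CLINICAL_LABELS` and `describe_relation` are hypothetical names, and unknown labels (for example the temporal ones) are returned unchanged:

```python
# Descriptions copied from the label list above. The dictionary is only an
# illustration and is not shipped with Spark NLP.
RE_CLINICAL_LABELS = {
    "TrIP": "improved",
    "TrWP": "worsened",
    "TrCP": "caused problem",
    "TrAP": "administered",
    "TrNAP": "avoided",
    "TeRP": "revealed problem",
    "TeCP": "investigate problem",
    "PIP": "problems related",
}

def describe_relation(label):
    """Return a readable description, falling back to the raw label."""
    return RE_CLINICAL_LABELS.get(label, label)

print(describe_relation("TrAP"))     # administered
print(describe_relation("OVERLAP"))  # OVERLAP (not an re_clinical label)
```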

**5.  icd10cm_snomed_mapping:**

> a pipeline that converts ICD10CM codes to SNOMED codes. Just feed it comma- or whitespace-delimited ICD10CM codes and it will return the corresponding SNOMED codes as a list.

**6.  snomed_icd10cm_mapping:**

> a pipeline that converts SNOMED codes to ICD10CM codes. Just feed it comma- or whitespace-delimited SNOMED codes and it will return the corresponding ICD10CM codes as a list.


## 1.explain_clinical_doc_carp 

A pipeline with `ner_clinical`, `assertion_dl`, `re_clinical` and `ner_posology`. It will extract clinical and medication entities, assign assertion status and find relationships between clinical entities.

In [None]:
from sparknlp.pretrained import PretrainedPipeline

In [None]:
pipeline = PretrainedPipeline('explain_clinical_doc_carp', 'en', 'clinical/models')

explain_clinical_doc_carp download started this may take some time.
Approx size to download 1.6 GB
[OK!]


In [None]:
pipeline.model.stages

[DocumentAssembler_9619f8fd837c,
 SentenceDetector_c0b14c755033,
 REGEX_TOKENIZER_352efbad7483,
 POS_6f55785005bf,
 dependency_d5a8da6c9093,
 WORD_EMBEDDINGS_MODEL_9004b1d00302,
 MedicalNerModel_cd5ce67b529f,
 NerConverter_89fec9d64d2a,
 MedicalNerModel_4a303d875127,
 NerConverter_50c49f50f3ec,
 ASSERTION_DL_25881ab6309e,
 RelationExtractionModel_9c255241fec3]

In [None]:
# Load pretrained pipeline from local disk:

# >> pipeline_local = PretrainedPipeline.from_disk('/root/cache_pretrained/explain_clinical_doc_carp_en_2.5.5_2.4_1597841630062')

In [None]:
text ="""A 28-year-old female with a history of gestational diabetes mellitus, used to take metformin 1000 mg two times a day, presented with a one-week history of polyuria , polydipsia , poor appetite , and vomiting .
She was seen by the endocrinology service and discharged on 40 units of insulin glargine at night, 12 units of insulin lispro with meals.
"""

annotations = pipeline.annotate(text)

annotations.keys()


dict_keys(['sentences', 'clinical_ner_tags', 'document', 'clinical_ner_chunks', 'assertion', 'clinical_relations', 'posology_ner_tags', 'tokens', 'posology_ner_chunks', 'embeddings', 'pos_tags', 'dependencies'])

In [None]:
import pandas as pd

rows = list(zip(annotations['tokens'], annotations['clinical_ner_tags'], annotations['posology_ner_tags'], annotations['pos_tags'], annotations['dependencies']))

df = pd.DataFrame(rows, columns = ['tokens','clinical_ner_tags','posology_ner_tags','POS_tags','dependencies'])

df.head(20)

Unnamed: 0,tokens,clinical_ner_tags,posology_ner_tags,POS_tags,dependencies
0,A,O,O,DD,female
1,28-year-old,O,O,NN,female
2,female,O,O,NN,ROOT
3,with,O,O,II,history
4,a,O,O,DD,history
5,history,O,O,NN,female
6,of,O,O,II,history
7,gestational,B-PROBLEM,O,JJ,of
8,diabetes,I-PROBLEM,O,NN,mellitus
9,mellitus,I-PROBLEM,O,NN,gestational
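
The `*_ner_tags` columns use the B-/I-/O scheme, and the `*_ner_chunks` keys hold the merged entities. To make the connection visible, here is a plain-Python sketch that merges token-level tags into chunks. `bio_to_chunks` is a hypothetical helper written for illustration; it is not the converter used inside the pipeline, and it joins tokens with single spaces, which is an assumption:

```python
def bio_to_chunks(tokens, tags):
    """Merge B-/I- tagged tokens into (chunk_text, label) pairs."""
    chunks = []
    current_tokens, current_label = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts: flush the previous one, if any.
            if current_tokens:
                chunks.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # "O" tag (or a stray I- tag): close whatever is open.
            if current_tokens:
                chunks.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [], None
    if current_tokens:
        chunks.append((" ".join(current_tokens), current_label))
    return chunks

tokens = ["history", "of", "gestational", "diabetes", "mellitus"]
tags = ["O", "O", "B-PROBLEM", "I-PROBLEM", "I-PROBLEM"]
print(bio_to_chunks(tokens, tags))
```

With the real output, `annotations['tokens']` and `annotations['clinical_ner_tags']` could be passed in the same way.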


In [None]:
text = 'Patient has a headache for the last 2 weeks and appears anxious when she walks fast. No alopecia noted. She denies pain'

result = pipeline.fullAnnotate(text)[0]

chunks=[]
entities=[]
status=[]

for n,m in zip(result['clinical_ner_chunks'],result['assertion']):
    
    chunks.append(n.result)
    entities.append(n.metadata['entity']) 
    status.append(m.result)
        
df = pd.DataFrame({'chunks':chunks, 'entities':entities, 'assertion':status})

df

Unnamed: 0,chunks,entities,assertion
0,a headache,PROBLEM,present
1,anxious,PROBLEM,conditional
2,alopecia,PROBLEM,absent
3,pain,PROBLEM,absent
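
The chunk/assertion loop is repeated several times in this notebook, so it can be wrapped in a small function. The sketch below assumes, as the `zip` in the cell above does, that the two lists are aligned one-to-one. `chunk_assertion_rows` is a hypothetical helper, and `SimpleNamespace` objects stand in for Spark NLP annotations so that the example runs without a Spark session:

```python
from types import SimpleNamespace

def chunk_assertion_rows(ner_chunks, assertions):
    """Pair each NER chunk with its assertion status as a list of dicts."""
    if len(ner_chunks) != len(assertions):
        raise ValueError("chunk and assertion lists are not aligned")
    return [
        {"chunks": c.result, "entities": c.metadata["entity"], "assertion": a.result}
        for c, a in zip(ner_chunks, assertions)
    ]

# Stand-in objects that mimic the attributes used above (result, metadata).
demo_chunks = [
    SimpleNamespace(result="a headache", metadata={"entity": "PROBLEM"}),
    SimpleNamespace(result="alopecia", metadata={"entity": "PROBLEM"}),
]
demo_assertions = [SimpleNamespace(result="present"), SimpleNamespace(result="absent")]

rows = chunk_assertion_rows(demo_chunks, demo_assertions)
absent = [row["chunks"] for row in rows if row["assertion"] == "absent"]
print(absent)  # ['alopecia']
```

Passing `rows` to `pd.DataFrame` gives the same three columns as the table above.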


In [None]:
text = """
The patient was prescribed 1 unit of Advil for 5 days after meals. The patient was also 
given 1 unit of Metformin daily.
He was seen by the endocrinology service and she was discharged on 40 units of insulin glargine at night , 
12 units of insulin lispro with meals , and metformin 1000 mg two times a day.
"""

result = pipeline.fullAnnotate(text)[0]

chunks=[]
entities=[]
begins=[]
ends=[]

for n in result['posology_ner_chunks']:
    
    chunks.append(n.result)
    begins.append(n.begin)
    ends.append(n.end)
    entities.append(n.metadata['entity']) 
        
df = pd.DataFrame({'chunks':chunks, 'begin':begins, 'end':ends, 'entities':entities})

df

Unnamed: 0,chunks,begin,end,entities
0,1 unit,28,33,DOSAGE
1,Advil,38,42,DRUG
2,for 5 days,44,53,DURATION
3,1 unit,96,101,DOSAGE
4,Metformin,106,114,DRUG
5,daily,116,120,FREQUENCY
6,40 units,190,197,DOSAGE
7,insulin glargine,202,217,DRUG
8,at night,219,226,FREQUENCY
9,12 units,231,238,DOSAGE
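
The `begin` and `end` values in the table appear to be character offsets into the input text with an inclusive `end`. The check below uses the first sentence of the text (including the leading newline of the triple-quoted string) and the first three rows of the table; `span_text` is a hypothetical helper:

```python
# First sentence of the input above, including the leading newline.
text = "\nThe patient was prescribed 1 unit of Advil for 5 days after meals."

# (chunk, begin, end) copied from the first three rows of the table.
spans = [("1 unit", 28, 33), ("Advil", 38, 42), ("for 5 days", 44, 53)]

def span_text(text, begin, end):
    """Slice a chunk out of the text; `end` is treated as inclusive."""
    return text[begin:end + 1]

for chunk, begin, end in spans:
    assert span_text(text, begin, end) == chunk

print("offsets match")
```

Because `end` is inclusive, a Python slice needs `end + 1`.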


## **2.   explain_clinical_doc_era** 

> a pipeline with `ner_clinical_events`, `assertion_dl` and `re_temporal_events_clinical`. It will extract clinical entities, assign assertion status and find temporal relationships between clinical entities.



In [None]:
era_pipeline = PretrainedPipeline('explain_clinical_doc_era', 'en', 'clinical/models')

explain_clinical_doc_era download started this may take some time.
Approx size to download 1.6 GB
[OK!]


In [None]:
era_pipeline.model.stages

[DocumentAssembler_81ef1f17c7c1,
 SentenceDetector_0b67d45c215f,
 REGEX_TOKENIZER_4d38514cc549,
 POS_6f55785005bf,
 dependency_d5a8da6c9093,
 WORD_EMBEDDINGS_MODEL_9004b1d00302,
 MedicalNerModel_7cb29c8c904c,
 NerConverter_dc8e863a00ea,
 RelationExtractionModel_14b00157fc1a,
 ASSERTION_DL_25881ab6309e]

In [None]:
text ="""She is admitted to The John Hopkins Hospital 2 days ago with a history of gestational diabetes mellitus diagnosed. She denied pain and any headache.
She was seen by the endocrinology service and she was discharged on 03/02/2018 on 40 units of insulin glargine, 
12 units of insulin lispro, and metformin 1000 mg two times a day. She had close follow-up with endocrinology post discharge. 
"""

result = era_pipeline.fullAnnotate(text)[0]


In [None]:
result.keys()

dict_keys(['sentences', 'clinical_ner_tags', 'document', 'clinical_ner_chunks', 'assertion', 'clinical_relations', 'tokens', 'embeddings', 'pos_tags', 'dependencies'])

In [None]:
import pandas as pd

chunks=[]
entities=[]
begins=[]
ends=[]

for n in result['clinical_ner_chunks']:
    
    chunks.append(n.result)
    begins.append(n.begin)
    ends.append(n.end)
    entities.append(n.metadata['entity']) 
        
df = pd.DataFrame({'chunks':chunks, 'begin':begins, 'end':ends, 'entities':entities})

df

Unnamed: 0,chunks,begin,end,entities
0,admitted,7,14,OCCURRENCE
1,The John Hopkins Hospital,19,43,CLINICAL_DEPT
2,2 days ago,45,54,DATE
3,gestational diabetes mellitus,74,102,PROBLEM
4,denied,119,124,EVIDENTIAL
5,pain,126,129,PROBLEM
6,any headache,135,146,PROBLEM
7,the endocrinology service,165,189,CLINICAL_DEPT
8,discharged,203,212,OCCURRENCE
9,03/02/2018,217,226,DATE


In [None]:
chunks=[]
entities=[]
status=[]

for n,m in zip(result['clinical_ner_chunks'],result['assertion']):
    
    chunks.append(n.result)
    entities.append(n.metadata['entity']) 
    status.append(m.result)
        
df = pd.DataFrame({'chunks':chunks, 'entities':entities, 'assertion':status})

df

Unnamed: 0,chunks,entities,assertion
0,admitted,OCCURRENCE,present
1,The John Hopkins Hospital,CLINICAL_DEPT,present
2,2 days ago,DATE,present
3,gestational diabetes mellitus,PROBLEM,present
4,denied,EVIDENTIAL,absent
5,pain,PROBLEM,absent
6,any headache,PROBLEM,absent
7,the endocrinology service,CLINICAL_DEPT,present
8,discharged,OCCURRENCE,present
9,03/02/2018,DATE,present


In [None]:
import pandas as pd

def get_relations_df (results, col='relations'):
  rel_pairs=[]
  for rel in results[0][col]:
      rel_pairs.append((
          rel.result, 
          rel.metadata['entity1'], 
          rel.metadata['entity1_begin'],
          rel.metadata['entity1_end'],
          rel.metadata['chunk1'], 
          rel.metadata['entity2'],
          rel.metadata['entity2_begin'],
          rel.metadata['entity2_end'],
          rel.metadata['chunk2'], 
          rel.metadata['confidence']
      ))

  rel_df = pd.DataFrame(rel_pairs, columns=['relation','entity1','entity1_begin','entity1_end','chunk1','entity2','entity2_begin','entity2_end','chunk2', 'confidence'])

  rel_df.confidence = rel_df.confidence.astype(float)
  
  return rel_df

In [None]:
annotations = era_pipeline.fullAnnotate(text)

rel_df = get_relations_df (annotations, 'clinical_relations')

rel_df

Unnamed: 0,relation,entity1,entity1_begin,entity1_end,chunk1,entity2,entity2_begin,entity2_end,chunk2,confidence
0,AFTER,OCCURRENCE,7,14,admitted,CLINICAL_DEPT,19,43,The John Hopkins Hospital,0.963836
1,BEFORE,OCCURRENCE,7,14,admitted,DATE,45,54,2 days ago,0.587098
2,BEFORE,OCCURRENCE,7,14,admitted,PROBLEM,74,102,gestational diabetes mellitus,0.999991
3,OVERLAP,CLINICAL_DEPT,19,43,The John Hopkins Hospital,DATE,45,54,2 days ago,0.996056
4,BEFORE,CLINICAL_DEPT,19,43,The John Hopkins Hospital,PROBLEM,74,102,gestational diabetes mellitus,0.995216
5,OVERLAP,DATE,45,54,2 days ago,PROBLEM,74,102,gestational diabetes mellitus,0.996954
6,BEFORE,EVIDENTIAL,119,124,denied,PROBLEM,126,129,pain,1.0
7,BEFORE,EVIDENTIAL,119,124,denied,PROBLEM,135,146,any headache,1.0
8,OVERLAP,PROBLEM,126,129,pain,PROBLEM,135,146,any headache,1.0
9,BEFORE,CLINICAL_DEPT,165,189,the endocrinology service,OCCURRENCE,203,212,discharged,0.825623
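
Low-confidence relations, such as row 1 above, can be dropped with a threshold. The sketch below works on plain dictionaries so that it runs without Spark; `filter_relations` and the `0.9` default are assumptions chosen for illustration, and the confidence is converted from a string because the annotation metadata stores it as text:

```python
def filter_relations(relations, min_confidence=0.9):
    """Keep (chunk1, relation, chunk2, confidence) for confident relations."""
    kept = []
    for rel in relations:
        confidence = float(rel["confidence"])  # metadata values are strings
        if confidence >= min_confidence:
            kept.append((rel["chunk1"], rel["relation"], rel["chunk2"], confidence))
    return kept

# Two relations copied from the output above, reduced to the fields used here.
demo_relations = [
    {"relation": "AFTER", "chunk1": "admitted",
     "chunk2": "The John Hopkins Hospital", "confidence": "0.9638356"},
    {"relation": "BEFORE", "chunk1": "admitted",
     "chunk2": "2 days ago", "confidence": "0.5870984"},
]

print(filter_relations(demo_relations))
```

On the DataFrame returned by `get_relations_df`, the equivalent filter is `rel_df[rel_df.confidence >= 0.9]`.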


In [None]:
annotations[0]['clinical_relations']

[Annotation(category, 7, 43, AFTER, {'chunk2': 'The John Hopkins Hospital', 'confidence': '0.9638356', 'entity2_end': '43', 'chunk1': 'admitted', 'entity2_begin': '19', 'entity1': 'OCCURRENCE', 'entity1_begin': '7', 'entity1_end': '14', 'entity2': 'CLINICAL_DEPT'}),
 Annotation(category, 7, 54, BEFORE, {'chunk2': '2 days ago', 'confidence': '0.5870984', 'entity2_end': '54', 'chunk1': 'admitted', 'entity2_begin': '45', 'entity1': 'OCCURRENCE', 'entity1_begin': '7', 'entity1_end': '14', 'entity2': 'DATE'}),
 Annotation(category, 7, 102, BEFORE, {'chunk2': 'gestational diabetes mellitus', 'confidence': '0.9999908', 'entity2_end': '102', 'chunk1': 'admitted', 'entity2_begin': '74', 'entity1': 'OCCURRENCE', 'entity1_begin': '7', 'entity1_end': '14', 'entity2': 'PROBLEM'}),
 Annotation(category, 19, 54, OVERLAP, {'chunk2': '2 days ago', 'confidence': '0.9960561', 'entity2_end': '54', 'chunk1': 'The John Hopkins Hospital', 'entity2_begin': '45', 'entity1': 'CLINICAL_DEPT', 'entity1_begin': '1

## 3.explain_clinical_doc_ade 

A pipeline for `Adverse Drug Events (ADE)` with `ner_ade_healthcare` and `classifierdl_ade_biobert`. It extracts `ADE` and `DRUG` clinical entities and then assigns an ADE status to the text (`True` means ADE, `False` means not related to ADE). It also extracts relations between `DRUG` and `ADE` entities (`1` means the adverse event and the drug are related, `0` means they are not).

In [None]:
ade_pipeline = PretrainedPipeline('explain_clinical_doc_ade', 'en', 'clinical/models')

explain_clinical_doc_ade download started this may take some time.
Approx size to download 462.3 MB
[OK!]


In [None]:
result = ade_pipeline.fullAnnotate("The main adverse effects of Leflunomide consist of diarrhea, nausea, liver enzyme elevation, hypertension, alopecia, and allergic skin reactions.")

result[0].keys()

dict_keys(['bert_sentence_embeddings', 'bert_embeddings', 'document', 'ner_chunks_ade_assertion', 'ner_tags_ade', 'relations_ade_drug', 'ner_chunks_ade', 'assertion_ade', 'tokens', 'class', 'pos_tags', 'dependencies'])

In [None]:
result[0]['class'][0].metadata

{'sentence': '0', 'False': '0.0033159005', 'True': '0.99668413'}
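
The metadata stores one probability per class as a string. A small sketch for turning it into numbers and picking the most likely label, assuming every key other than `sentence` is a class label; `class_probabilities` and `top_label` are hypothetical helpers:

```python
def class_probabilities(metadata):
    """Convert the classifier metadata into a {label: probability} dict."""
    return {
        label: float(score)
        for label, score in metadata.items()
        if label != "sentence"  # assumption: the only non-label key
    }

def top_label(metadata):
    """Return the label with the highest probability and that probability."""
    probabilities = class_probabilities(metadata)
    label = max(probabilities, key=probabilities.get)
    return label, probabilities[label]

# Metadata copied from the output above.
metadata = {'sentence': '0', 'False': '0.0033159005', 'True': '0.99668413'}
print(top_label(metadata))  # ('True', 0.99668413)
```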

In [None]:
text = "Jaw,neck, low back and hip pains. Numbness in legs and arms. Took about a month for the same symptoms to begin with Vytorin. The pravachol started the pains again in about 3 months. I stopped taking all statin drungs. Still hurting after 2 months of stopping. Be careful taking this drug."

import pandas as pd

chunks = []
entities = []
begin =[]
end = []

print ('sentence:', text)
print()

result = ade_pipeline.fullAnnotate(text)

print ('ADE status:', result[0]['class'][0].result)

print ('prediction probability>> True : ', result[0]['class'][0].metadata['True'], \
        'False: ', result[0]['class'][0].metadata['False'])

for n in result[0]['ner_chunks_ade']:

    begin.append(n.begin)
    end.append(n.end)
    chunks.append(n.result)
    entities.append(n.metadata['entity']) 

df = pd.DataFrame({'chunks':chunks, 'entities':entities,
                'begin': begin, 'end': end})

df


sentence: Jaw,neck, low back and hip pains. Numbness in legs and arms. Took about a month for the same symptoms to begin with Vytorin. The pravachol started the pains again in about 3 months. I stopped taking all statin drungs. Still hurting after 2 months of stopping. Be careful taking this drug.

ADE status: True
prediction probability>> True :  0.99863094 False:  0.0013689825


Unnamed: 0,chunks,entities,begin,end
0,"Jaw,neck, low back and hip pains",ADE,0,31
1,Numbness,ADE,34,41
2,Vytorin,DRUG,116,122
3,pravachol,DRUG,129,137
4,pains,ADE,151,155


#### with AssertionDL

In [None]:
import pandas as pd

text = """I have an allergic reaction to vancomycin. 
My skin has be itchy, sore throat/burning/itchy, and numbness in tongue and gums. 
I would not recommend this drug to anyone, especially since I have never had such an adverse reaction to any other medication."""

print (text)

light_result = ade_pipeline.fullAnnotate(text)[0]

chunks=[]
entities=[]
status=[]

for n,m in zip(light_result['ner_chunks_ade_assertion'],light_result['assertion_ade']):
    
    chunks.append(n.result)
    entities.append(n.metadata['entity']) 
    status.append(m.result)
        
df = pd.DataFrame({'chunks':chunks, 'entities':entities, 'assertion':status})

df

I have an allergic reaction to vancomycin. 
My skin has be itchy, sore throat/burning/itchy, and numbness in tongue and gums. 
I would not recommend this drug to anyone, especially since I have never had such an adverse reaction to any other medication.


Unnamed: 0,chunks,entities,assertion
0,allergic reaction,ADE,present
1,itchy,ADE,present
2,sore throat/burning/itchy,ADE,present
3,numbness in tongue and gums,ADE,present


#### with Relation Extraction

In [None]:
import pandas as pd

text = """I have Rhuematoid Arthritis for 35 yrs and have been on many arthritis meds. 
I currently am on Relefen for inflamation, Prednisone 5mg, every other day and Enbrel injections once a week. 
I have no problems from these drugs. Eight months ago, another doctor put me on Lipitor 10mg daily because my chol was 240. 
Over a period of 6 months, it went down to 159, which was great, BUT I started having terrible aching pain in my arms about that time which was radiating down my arms from my shoulder to my hands.
"""
 
print (text)

results = ade_pipeline.fullAnnotate(text)

rel_pairs=[]

for rel in results[0]["relations_ade_drug"]:
    rel_pairs.append((
        rel.result, 
        rel.metadata['entity1'], 
        rel.metadata['entity1_begin'],
        rel.metadata['entity1_end'],
        rel.metadata['chunk1'], 
        rel.metadata['entity2'],
        rel.metadata['entity2_begin'],
        rel.metadata['entity2_end'],
        rel.metadata['chunk2'], 
        rel.metadata['confidence']
    ))

rel_df = pd.DataFrame(rel_pairs, columns=['relation','entity1','entity1_begin','entity1_end','chunk1','entity2','entity2_begin','entity2_end','chunk2', 'confidence'])
rel_df

I have Rhuematoid Arthritis for 35 yrs and have been on many arthritis meds. 
I currently am on Relefen for inflamation, Prednisone 5mg, every other day and Enbrel injections once a week. 
I have no problems from these drugs. Eight months ago, another doctor put me on Lipitor 10mg daily because my chol was 240. 
Over a period of 6 months, it went down to 159, which was great, BUT I started having terrible aching pain in my arms about that time which was radiating down my arms from my shoulder to my hands.



Unnamed: 0,relation,entity1,entity1_begin,entity1_end,chunk1,entity2,entity2_begin,entity2_end,chunk2,confidence
0,0,DRUG,96,102,Relefen,ADE,409,430,aching pain in my arms,1.0
1,0,DRUG,121,130,Prednisone,ADE,409,430,aching pain in my arms,0.9999989
2,0,DRUG,157,173,Enbrel injections,ADE,409,430,aching pain in my arms,0.9999994
3,1,DRUG,269,275,Lipitor,ADE,409,430,aching pain in my arms,0.9999975
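
To keep only the drug and adverse event pairs that the model marks as related, filter on the relation label. The sketch uses plain dictionaries copied from the table above; `related_pairs` is a hypothetical helper, and comparing against the string `"1"` assumes the label is stored as text:

```python
def related_pairs(relations):
    """Return (drug, adverse_event) pairs whose relation label is 1."""
    return [
        (rel["chunk1"], rel["chunk2"])
        for rel in relations
        if str(rel["relation"]) == "1"
    ]

# Rows copied from the table above, reduced to the fields used here.
demo_relations = [
    {"relation": "0", "chunk1": "Relefen", "chunk2": "aching pain in my arms"},
    {"relation": "0", "chunk1": "Prednisone", "chunk2": "aching pain in my arms"},
    {"relation": "0", "chunk1": "Enbrel injections", "chunk2": "aching pain in my arms"},
    {"relation": "1", "chunk1": "Lipitor", "chunk2": "aching pain in my arms"},
]

print(related_pairs(demo_relations))  # [('Lipitor', 'aching pain in my arms')]
```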


## 4.Clinical Deidentification

This pipeline can be used to deidentify PHI (protected health information) in medical texts. It returns both a masked and an obfuscated version of the text. The pipeline can mask and obfuscate `AGE`, `CONTACT`, `DATE`, `ID`, `LOCATION`, `NAME`, `PROFESSION`, `CITY`, `COUNTRY`, `DOCTOR`, `HOSPITAL`, `IDNUM`, `MEDICALRECORD`, `ORGANIZATION`, `PATIENT`, `PHONE`, `STREET`, `USERNAME`, `ZIP`, `ACCOUNT`, `LICENSE`, `VIN`, `SSN`, `DLN`, `PLATE`, `IPADDR` entities.

In [None]:
deid_pipeline = PretrainedPipeline("clinical_deidentification", "en", "clinical/models")

clinical_deidentification download started this may take some time.
Approx size to download 1.6 GB
[OK!]


In [None]:
deid_res = deid_pipeline.annotate("Record date : 2093-01-13, David Hale, M.D. IP: 203.120.223.13. The driver's license no:A334455B. the SSN:324598674 and e-mail: hale@gmail.com. Name : Hendrickson, Ora MR. 25 years-old # 719435 Date : 01/13/93. Signed by Oliveira Sander, . Record date : 2079-11-09, Patient's VIN : 1HGBH41JXMN109286.")

In [None]:
deid_res.keys()

dict_keys(['masked', 'obfuscated', 'ner_chunk', 'sentence'])

In [None]:
pd.set_option("display.max_colwidth", 100)

df = pd.DataFrame(list(zip(deid_res['sentence'], deid_res['masked'], deid_res['obfuscated'])),
                  columns = ['Sentence','Masked', 'Obfuscated'])
df

Unnamed: 0,Sentence,Masked,Obfuscated
0,"Record date : 2093-01-13, David Hale, M.D.","Record date : <DATE>, <DOCTOR>, M.D.","Record date : 2093-01-17, Dr Armin Gums, M.D."
1,IP: 203.120.223.13.,IP: <IPADDR>.,IP: 003.003.003.003.
2,The driver's license no:A334455B.,The driver's license <DLN>.,The driver's license S99921801.
3,the SSN:324598674 and e-mail: hale@gmail.com.,the <SSN> and e-mail: <EMAIL>.,the 999-36-5441 and e-mail: Coy@google.com.
4,"Name : Hendrickson, Ora MR. 25 years-old # 719435 Date : 01/13/93.",Name : <PATIENT> MR. <AGE> years-old # <MEDICALRECORD> Date : <DATE>.,Name : Katheran Furry MR. 5 years-old # Y8290315 Date : 03-18-1986.
5,"Signed by Oliveira Sander, .","Signed by <DOCTOR>, .","Signed by Dr Pansy Perking, ."
6,"Record date : 2079-11-09, Patient's VIN : 1HGBH41JXMN109286.","Record date : <DATE>, Patient's VIN : <VIN>.","Record date : 2079-12-22, Patient's VIN : 5eeee44ffff555666."
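
Masking replaces each detected span with its entity label, while obfuscation replaces it with a fake value of the same kind. The toy function below only illustrates the masking idea on hand-picked spans; it is not how the pipeline is implemented. `mask_spans` is a hypothetical helper, the spans are assumed not to overlap, and the `end` offsets are treated as inclusive:

```python
def mask_spans(text, spans):
    """Replace each (begin, end, label) span with <label>; end is inclusive."""
    masked = text
    # Work from right to left so earlier offsets stay valid after each edit.
    for begin, end, label in sorted(spans, reverse=True):
        masked = masked[:begin] + "<" + label + ">" + masked[end + 1:]
    return masked

sentence = "Record date : 2093-01-13, David Hale, M.D."
# Hand-picked spans for illustration: the date and the doctor's name.
spans = [(14, 23, "DATE"), (26, 35, "DOCTOR")]

print(mask_spans(sentence, spans))  # Record date : <DATE>, <DOCTOR>, M.D.
```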


## 5.ICD10CM to Snomed Code

This pretrained pipeline maps ICD10CM codes to SNOMED codes without using any text data. Just feed it comma- or whitespace-delimited ICD10CM codes and it will return the corresponding SNOMED codes as a list. For the time being, it supports 132K SNOMED codes and will be augmented and enriched in upcoming releases.

In [None]:
icd_snomed_pipeline = PretrainedPipeline("icd10cm_snomed_mapping", "en", "clinical/models")

icd10cm_snomed_mapping download started this may take some time.
Approx size to download 514.5 KB
[OK!]


In [None]:
icd_snomed_pipeline.model.stages

[DocumentAssembler_effe917bc86b,
 REGEX_TOKENIZER_a2e7a20a20d4,
 LEMMATIZER_0ca0f7005a90,
 Finisher_07470acb09e3]

In [None]:
icd_snomed_pipeline.annotate('M89.50 I288 H16269')

{'icd10cm': ['M89.50', 'I288', 'H16269'],
 'snomed': ['733187009', '449433008', '51264003']}
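
The two lists in the result are parallel, so they can be zipped into a lookup table. The sketch below also shows one way to pre-split a comma- or whitespace-delimited string of codes; `split_codes` and `pair_codes` are hypothetical helpers written for illustration and do not reproduce the tokenizer inside the pipeline:

```python
import re

def split_codes(raw):
    """Split a comma- or whitespace-delimited string into individual codes."""
    return [code for code in re.split(r"[,\s]+", raw.strip()) if code]

def pair_codes(result, source_key="icd10cm", target_key="snomed"):
    """Build a {source_code: target_code} dict from the parallel lists."""
    return dict(zip(result[source_key], result[target_key]))

# Result copied from the output above.
result = {'icd10cm': ['M89.50', 'I288', 'H16269'],
          'snomed': ['733187009', '449433008', '51264003']}

print(split_codes("M89.50, I288 H16269"))
print(pair_codes(result)["I288"])  # 449433008
```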

|**ICD10CM** | **Details** | 
| ---------- | -----------:|
| M89.50 |  Osteolysis, unspecified site |
| I288 | Other diseases of pulmonary vessels |
| H16269 | Vernal keratoconjunctivitis, with limbar and corneal involvement, unspecified eye |

| **SNOMED** | **Details** |
| ---------- | -----------:|
| 733187009 | Osteolysis following surgical procedure on skeletal system |
| 449433008 | Diffuse stenosis of left pulmonary artery |
| 51264003 | Limbal AND/OR corneal involvement in vernal conjunctivitis |

## 6.Snomed to ICD10CM Code
This pretrained pipeline maps SNOMED codes to ICD10CM codes without using any text data. Just feed it comma- or whitespace-delimited SNOMED codes and it will return the corresponding candidate ICD10CM codes as a list (multiple ICD10CM codes for each SNOMED code). For the time being, it supports 132K SNOMED codes and 30K ICD10CM codes and will be augmented and enriched in upcoming releases.

In [None]:
snomed_icd_pipeline = PretrainedPipeline("snomed_icd10cm_mapping","en","clinical/models")

snomed_icd10cm_mapping download started this may take some time.
Approx size to download 1.8 MB
[OK!]


In [None]:
snomed_icd_pipeline.model.stages

[DocumentAssembler_136f968cb1ef,
 REGEX_TOKENIZER_ecc8d3a8dbc9,
 LEMMATIZER_e9ae88d69d05,
 Finisher_790dd28aacd1]

In [None]:
snomed_icd_pipeline.annotate('733187009 449433008 51264003')

{'icd10cm': ['M89.59, M89.50, M96.89',
  'Q25.6, I28.8',
  'H10.45, H10.1, H16.269'],
 'snomed': ['733187009', '449433008', '51264003']}
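
Each entry in `icd10cm` is a single string holding several comma-separated candidates. A short sketch that splits them into one list per SNOMED code, assuming the two lists are aligned as in the output above; `candidates_by_code` is a hypothetical helper:

```python
def candidates_by_code(result):
    """Map each SNOMED code to its list of candidate ICD10CM codes."""
    return {
        snomed: [code.strip() for code in candidates.split(",") if code.strip()]
        for snomed, candidates in zip(result["snomed"], result["icd10cm"])
    }

# Result copied from the output above.
result = {'icd10cm': ['M89.59, M89.50, M96.89',
                      'Q25.6, I28.8',
                      'H10.45, H10.1, H16.269'],
          'snomed': ['733187009', '449433008', '51264003']}

mapping = candidates_by_code(result)
print(mapping["449433008"])  # ['Q25.6', 'I28.8']
```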

| **SNOMED** | **Details** |
| ------ | ------:|
| 733187009| Osteolysis following surgical procedure on skeletal system |
| 449433008 | Diffuse stenosis of left pulmonary artery |
| 51264003 | Limbal AND/OR corneal involvement in vernal conjunctivitis|

| **ICD10CM** | **Details** |  
| ---------- | ---------:|
| M89.59 | Osteolysis, multiple sites |  
| M89.50 | Osteolysis, unspecified site |
| M96.89 | Other intraoperative and postprocedural complications and disorders of the musculoskeletal system | 
| Q25.6 | Stenosis of pulmonary artery |    
| I28.8 | Other diseases of pulmonary vessels |
| H10.45 | Other chronic allergic conjunctivitis |
| H10.1 | Acute atopic conjunctivitis | 
| H16.269 | Vernal keratoconjunctivitis, with limbar and corneal involvement, unspecified eye |

You can also find the following healthcare code mapping pretrained pipelines in the [Healthcare_Codes_Mapping](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/Certification_Trainings/Healthcare/11.1.Healthcare_Code_Mapping.ipynb) notebook:

- ICD10CM to UMLS  
- Snomed to UMLS 
- RxNorm to UMLS
- RxNorm to MeSH
- MeSH to UMLS