

![JohnSnowLabs](https://nlp.johnsnowlabs.com/assets/images/logo.png)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/streamlit_notebooks/healthcare/ER_RXNORM.ipynb)




To run this notebook yourself, you will need to upload your license keys. Run the cell below to do that. Alternatively, open the file explorer on the left side of the screen and upload `license_keys.json` to the folder that opens.
Otherwise, you can look at the example outputs at the bottom of the notebook.
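
If the keys were uploaded through the file explorer rather than the upload widget, they can be read straight from disk. The following is a minimal sketch, assuming the file is named `license_keys.json` and sits in the working directory. The helper name `load_license_keys` is hypothetical and not part of Spark NLP.

```python
import json
from pathlib import Path


def load_license_keys(path="license_keys.json"):
    """Read the license key file and return its contents as a dict.

    Illustrative helper only; this is not a Spark NLP function.
    """
    key_file = Path(path)
    if not key_file.is_file():
        raise FileNotFoundError(
            f"{key_file} not found; upload it via the file explorer first"
        )
    with key_file.open() as f:
        return json.load(f)
```

With this approach, `license_keys = load_license_keys()` would stand in for the `files.upload()` step in the first setup cell.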



## 1. Colab Setup

Import license keys

In [1]:
import os
import json

from google.colab import files

license_keys = files.upload()

with open(list(license_keys.keys())[0]) as f:
    license_keys = json.load(f)

sparknlp_version = license_keys["PUBLIC_VERSION"]
jsl_version = license_keys["JSL_VERSION"]

print('SparkNLP Version:', sparknlp_version)
print('SparkNLP-JSL Version:', jsl_version)

Saving License Keys 3.0.2.json to License Keys 3.0.2.json
SparkNLP Version: 3.0.2
SparkNLP-JSL Version: 3.0.2


Install dependencies

In [2]:
%%capture
for k,v in license_keys.items(): 
    %set_env $k=$v

!wget https://raw.githubusercontent.com/JohnSnowLabs/spark-nlp-workshop/master/jsl_colab_setup.sh
!bash jsl_colab_setup.sh

# Install Spark NLP Display for visualization
!pip install --ignore-installed spark-nlp-display

Import dependencies into Python

In [3]:
import pandas as pd
from pyspark.ml import Pipeline
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

import sparknlp
from sparknlp.annotator import *
from sparknlp_jsl.annotator import *
from sparknlp.base import *
import sparknlp_jsl

Start the Spark session

In [4]:
spark = sparknlp_jsl.start(license_keys['SECRET'])

# manually configure the session
# params = {"spark.driver.memory" : "16G",
#           "spark.kryoserializer.buffer.max" : "2000M",
#           "spark.driver.maxResultSize" : "2000M"}

# spark = sparknlp_jsl.start(license_keys['SECRET'],params=params)

# **ICD10 to SNOMED Code Mapping**

Import Pretrained Pipeline

In [5]:
from sparknlp.pretrained import PretrainedPipeline 
pipeline = PretrainedPipeline("icd10cm_snomed_mapping","en","clinical/models")
result = pipeline.annotate('M89.50 I288 H16269')

icd10cm_snomed_mapping download started this may take some time.
Approx size to download 514.5 KB
[OK!]


In [6]:
pipeline.model.stages

[DocumentAssembler_effe917bc86b,
 REGEX_TOKENIZER_a2e7a20a20d4,
 LEMMATIZER_0ca0f7005a90,
 Finisher_07470acb09e3]

Visualize

In [7]:
result

{'icd10cm': ['M89.50', 'I288', 'H16269'],
 'snomed': ['733187009', '449433008', '51264003']}
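
The two lists in `result` appear to be aligned by position, so they can be zipped into a lookup table. The following is a minimal sketch that uses the output above as a literal instead of calling the pipeline. The names `example_result` and `icd10cm_to_snomed` are illustrative.

```python
# Output of pipeline.annotate('M89.50 I288 H16269'), copied from the cell above.
example_result = {
    "icd10cm": ["M89.50", "I288", "H16269"],
    "snomed": ["733187009", "449433008", "51264003"],
}

# Pair each ICD-10-CM code with the SNOMED code at the same position.
icd10cm_to_snomed = dict(zip(example_result["icd10cm"], example_result["snomed"]))

print(icd10cm_to_snomed["M89.50"])  # 733187009
```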

In [None]:
import pandas as pd
result_table = pd.DataFrame(result, columns=["icd10cm", "snomed"])

In [None]:
print(result_table)

  icd10cm     snomed
0  M89.50  733187009
1    I288  449433008
2  H16269   51264003


# **SNOMED to ICD10 Code Mapping**

Import Pretrained Pipeline

In [14]:
from sparknlp.pretrained import PretrainedPipeline 
pipeline = PretrainedPipeline("snomed_icd10cm_mapping","en","clinical/models")
result = pipeline.annotate('721617001 733187009 109006')

snomed_icd10cm_mapping download started this may take some time.
Approx size to download 1.8 MB
[OK!]


In [15]:
pipeline.model.stages

[DocumentAssembler_136f968cb1ef,
 REGEX_TOKENIZER_ecc8d3a8dbc9,
 LEMMATIZER_e9ae88d69d05,
 Finisher_790dd28aacd1]

Visualize

In [16]:
result

{'icd10cm': ['K22.70, C15.5',
  'M89.59, M89.50, M96.89',
  'F41.9, F40.10, F94.8, F93.0, F40.8, F93.8'],
 'snomed': ['721617001', '733187009', '109006']}
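
In this direction each SNOMED code comes back with several candidate ICD-10-CM codes packed into one comma-separated string. The following is a minimal sketch of splitting them into individual codes, using the output above as a literal. The names `example_result`, `snomed_to_icd10cm`, and `pairs` are illustrative.

```python
# Output of pipeline.annotate('721617001 733187009 109006'), copied from above.
example_result = {
    "icd10cm": [
        "K22.70, C15.5",
        "M89.59, M89.50, M96.89",
        "F41.9, F40.10, F94.8, F93.0, F40.8, F93.8",
    ],
    "snomed": ["721617001", "733187009", "109006"],
}

# Split each comma-separated string into a list of individual ICD-10-CM codes.
snomed_to_icd10cm = {
    snomed: [code.strip() for code in codes.split(",")]
    for snomed, codes in zip(example_result["snomed"], example_result["icd10cm"])
}

# Flatten to one (snomed, icd10cm) pair per candidate code.
pairs = [(s, c) for s, codes in snomed_to_icd10cm.items() for c in codes]

print(snomed_to_icd10cm["733187009"])  # ['M89.59', 'M89.50', 'M96.89']
print(len(pairs))  # 11
```

Note that `M89.50`, which mapped to `733187009` in the ICD10 to SNOMED section, is among the candidates returned for `733187009` here.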

In [11]:
import pandas as pd
result_table = pd.DataFrame(result, columns=["snomed", "icd10cm"])

In [12]:
print(result_table)

      snomed                                    icd10cm
0  721617001                              K22.70, C15.5
1  733187009                     M89.59, M89.50, M96.89
2     109006  F41.9, F40.10, F94.8, F93.0, F40.8, F93.8
