![JohnSnowLabs](https://nlp.johnsnowlabs.com/assets/images/logo.png)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/streamlit_notebooks/healthcare/ER_ICD10_CM.ipynb)

## **Detect Problem with ICD-10**

To run this notebook yourself, you need to upload your license keys. Run the cell below to do that; it prompts for the license file and stores it as `spark_jsl.json`. Alternatively, open the file explorer on the left side of the screen and upload the license file, named `spark_jsl.json`, to the folder that opens. Otherwise, you can look at the example outputs included in the notebook.

## **Colab Setup**

In [None]:
import json, os
from google.colab import files

if 'spark_jsl.json' not in os.listdir():
  license_keys = files.upload()
  os.rename(list(license_keys.keys())[0], 'spark_jsl.json')

with open('spark_jsl.json') as f:
    license_keys = json.load(f)

# Defining license key-value pairs as local variables
locals().update(license_keys)
os.environ.update(license_keys)
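The later cells read `PUBLIC_VERSION`, `JSL_VERSION`, and `SECRET` from the license file, so it can help to check for them before installing anything. The sketch below is one way to do that; the helper name `missing_license_keys` is hypothetical and the sample values are placeholders, not a real license.

```python
# Keys that the installation and session-start cells read from the license file.
REQUIRED_KEYS = ("PUBLIC_VERSION", "JSL_VERSION", "SECRET")


def missing_license_keys(license_keys, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in the license dict."""
    return [key for key in required if not license_keys.get(key)]


# Placeholder values for illustration only.
example = {"PUBLIC_VERSION": "4.2.8", "SECRET": "xxxx"}
print(missing_license_keys(example))
```

In the notebook, calling `missing_license_keys(license_keys)` right after loading the JSON would show which entries, if any, still need to be supplied.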

In [None]:
# Installing pyspark and spark-nlp
! pip install --upgrade -q pyspark==3.1.2 spark-nlp==$PUBLIC_VERSION

# Installing Spark NLP Healthcare
! pip install --upgrade -q spark-nlp-jsl==$JSL_VERSION  --extra-index-url https://pypi.johnsnowlabs.com/$SECRET

# Installing Spark NLP Display Library for visualization
! pip install -q spark-nlp-display

In [3]:
import sparknlp
import sparknlp_jsl

from sparknlp.base import *
from sparknlp.annotator import *
from sparknlp_jsl.annotator import *

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.ml import Pipeline,PipelineModel
from pyspark.sql.types import StringType, IntegerType

import pandas as pd
pd.set_option('display.max_colwidth', 200)

import warnings
warnings.filterwarnings('ignore')

params = {"spark.driver.memory":"16G", 
          "spark.kryoserializer.buffer.max":"2000M", 
          "spark.driver.maxResultSize":"2000M"} 

spark = sparknlp_jsl.start(license_keys['SECRET'],params=params)

print("Spark NLP Version :", sparknlp.version())
print("Spark NLP_JSL Version :", sparknlp_jsl.version())

spark

Spark NLP Version : 4.2.8
Spark NLP_JSL Version : 4.2.8


# **🔎 About the models**

📌 **sbiobertresolve_icd10cm** --> *This model maps clinical findings in healthcare records to their corresponding ICD-10-CM codes using Entity Resolvers.*

*   Predicted Entities => **Problem**

📌 **sbiobertresolve_icd10cm_augmented** --> *This model maps clinical findings to ICD-10-CM codes using Entity Resolvers. It is augmented with synonyms, which gives it a vocabulary four times richer than that of the non-augmented version.*

*   Predicted Entities => **Problem**

📌 **sbiobertresolve_icd10cm_augmented_billable_hcc** --> *This model maps clinical findings to ICD-10-CM codes. It is augmented with synonyms, which gives it a vocabulary four times richer than that of the non-augmented version. It returns 7-digit codes from ICD-10-CM.*

*   Predicted Entities => **Problem**

📌 **sbiobertresolve_icd10cm_generalised** --> *This model maps clinical findings to ICD-10-CM codes. It predicts 3-character ICD-10-CM codes, which, according to the ICD-10-CM code structure, represent the general type of the injury or disease.*

*   Predicted Entities => **Problem**

📌 **sbiobertresolve_icd10cm_slim_billable_hcc** --> *This model maps clinical findings to ICD-10-CM codes. It returns the official resolution text within brackets.*

*   Predicted Entities => **Problem**

📌 **sbiobertresolve_icd10cm_slim_normalized** --> *In this model, synonyms having low cosine similarity to unnormalized terms are dropped, making the model slim. It also returns the official resolution text within brackets inside the metadata.*

*   Predicted Entities => **Problem**
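The six models differ mainly in the code format they return and in whether they carry HCC information. The lookup table below is a small sketch that records which models this notebook reads with `hcc=True` in `get_codes_from_df`; the names `HCC_BY_MODEL` and `hcc_flag` are hypothetical helpers, not part of Spark NLP.

```python
# Whether this notebook reads HCC auxiliary labels for each resolver model.
HCC_BY_MODEL = {
    "sbiobertresolve_icd10cm": False,
    "sbiobertresolve_icd10cm_augmented": False,
    "sbiobertresolve_icd10cm_augmented_billable_hcc": True,
    "sbiobertresolve_icd10cm_generalised": False,
    "sbiobertresolve_icd10cm_slim_billable_hcc": True,
    "sbiobertresolve_icd10cm_slim_normalized": False,
}


def hcc_flag(model_name):
    """Return the hcc argument this notebook uses for the given model."""
    return HCC_BY_MODEL[model_name]


for name in HCC_BY_MODEL:
    print(f"{name}: hcc={hcc_flag(name)}")
```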



# **🔎 Define Spark NLP pipeline**

In [4]:
documentAssembler = DocumentAssembler()\
      .setInputCol("text")\
      .setOutputCol("document")

sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl_healthcare","en","clinical/models")\
    .setInputCols(["document"])\
    .setOutputCol("sentence")

tokenizer = Tokenizer()\
      .setInputCols(["sentence"])\
      .setOutputCol("token")

word_embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")\
      .setInputCols(["sentence", "token"])\
      .setOutputCol("embeddings")

clinical_ner = MedicalNerModel.pretrained("ner_clinical", "en", "clinical/models") \
      .setInputCols(["sentence", "token", "embeddings"]) \
      .setOutputCol("ner")

ner_converter = NerConverter() \
      .setInputCols(["sentence", "token", "ner"]) \
      .setOutputCol("ner_chunk")\
      .setWhiteList(['PROBLEM'])

c2doc = Chunk2Doc()\
      .setInputCols("ner_chunk")\
      .setOutputCol("ner_chunk_doc") 

sbert_embedder = BertSentenceEmbeddings\
      .pretrained("sbiobert_base_cased_mli",'en','clinical/models')\
      .setInputCols(["ner_chunk_doc"])\
      .setOutputCol("sbert_embeddings")

def pipeline(model_name):
  """Build a LightPipeline whose last stage is the given ICD-10-CM resolver model."""
  icd_resolver = SentenceEntityResolverModel.pretrained(model_name, "en", "clinical/models") \
        .setInputCols(["sbert_embeddings"]) \
        .setOutputCol("icd_code")\
        .setDistanceFunction("EUCLIDEAN")

  resolver_pipeline = Pipeline(
      stages = [
          documentAssembler,
          sentenceDetector,
          tokenizer,
          word_embeddings,
          clinical_ner,
          ner_converter,
          c2doc,
          sbert_embedder,
          icd_resolver
          ])

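  # No stage needs training data, so fitting on an empty DataFrame only assembles the PipelineModel
  data_ner = spark.createDataFrame([[""]]).toDF("text")
  models = resolver_pipeline.fit(data_ner)

  # LightPipeline runs the fitted stages locally, which suits small inputs
  light_model = LightPipeline(models)
  return light_model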

sentence_detector_dl_healthcare download started this may take some time.
Approximate size to download 367.3 KB
[OK!]
embeddings_clinical download started this may take some time.
Approximate size to download 1.6 GB
[OK!]
ner_clinical download started this may take some time.
[OK!]
sbiobert_base_cased_mli download started this may take some time.
Approximate size to download 384.3 MB
[OK!]
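The sections that follow call `pipeline(model_name)` twice per model, once for `transform` and once for `fullAnnotate`, and each call loads the resolver again. One possible way to avoid the repeated work is to cache the builder by model name. The sketch below shows the idea with a stand-in builder so that it runs without Spark; `cache_by_model_name` and `fake_pipeline` are hypothetical names.

```python
from functools import lru_cache


def cache_by_model_name(build_fn):
    """Wrap a one-argument builder so each model name is built only once."""
    return lru_cache(maxsize=None)(build_fn)


# Stand-in for the notebook's pipeline(model_name); it only records calls.
calls = []


def fake_pipeline(model_name):
    calls.append(model_name)
    return f"LightPipeline<{model_name}>"


cached_pipeline = cache_by_model_name(fake_pipeline)
first = cached_pipeline("sbiobertresolve_icd10cm")
second = cached_pipeline("sbiobertresolve_icd10cm")
print(first is second, len(calls))
```

In the notebook this would become `cached_pipeline = cache_by_model_name(pipeline)`. Note that every cached pipeline stays in driver memory, which is an assumption worth checking against the memory available in the session.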


In [5]:
# Returns the resolution results as a pandas DataFrame, one row per NER chunk

def get_codes_from_df(result_df, chunk, output_col, hcc= False):
    
    
    if hcc:
        
        df = result_df.select(F.explode(F.arrays_zip(result_df[chunk].result, 
                                                     result_df[chunk].metadata, 
                                                     result_df[output_col].result, 
                                                     result_df[output_col].metadata)).alias("cols")) \
                      .select(F.expr("cols['1']['sentence']").alias("sent_id"),
                              F.expr("cols['0']").alias("ner_chunk"),
                              F.expr("cols['1']['entity']").alias("entity"), 
                              F.expr("cols['2']").alias("icd10_code"),
                              F.expr("cols['3']['all_k_results']").alias("all_codes"),
                              F.expr("cols['3']['all_k_resolutions']").alias("resolutions"),
                              F.expr("cols['3']['all_k_aux_labels']").alias("hcc_list")).toPandas()



        codes = []
        resolutions = []
        hcc_all = []

        # hcc_labels is used as the loop variable so the hcc argument is not shadowed
        for code, resolution, hcc_labels in zip(df['all_codes'], df['resolutions'], df['hcc_list']):

            codes.append(code.split(':::'))
            resolutions.append(resolution.split(':::'))
            hcc_all.append(hcc_labels.split(":::"))

        df['all_codes'] = codes  
        df['resolutions'] = resolutions
        df['hcc_list'] = hcc_all
        
    else:
                       
        df = result_df.select(F.explode(F.arrays_zip(result_df[chunk].result, 
                                                           result_df[chunk].metadata, 
                                                           result_df[output_col].result, 
                                                           result_df[output_col].metadata)).alias("cols")) \
                                     .select(F.expr("cols['1']['sentence']").alias("sent_id"),
                                             F.expr("cols['0']").alias("ner_chunk"),
                                             F.expr("cols['1']['entity']").alias("entity"), 
                                             F.expr("cols['2']").alias(f"{output_col}"),
                                             F.expr("cols['3']['all_k_results']").alias("all_codes"),
                                             F.expr("cols['3']['all_k_resolutions']").alias("resolutions")).toPandas()



        codes = []
        resolutions = []

        for code, resolution in zip(df['all_codes'], df['resolutions']):

            codes.append(code.split(':::'))
            resolutions.append(resolution.split(':::'))

        df['all_codes'] = codes  
        df['resolutions'] = resolutions
        
    
    return df
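The function above relies on the resolver metadata storing all candidate codes and resolutions as single strings joined by `:::`. The sketch below isolates that splitting step on a plain dictionary so it can be tried without Spark; `split_k_results` is a hypothetical helper and the metadata values are an abridged illustration shaped like one annotation.

```python
def split_k_results(metadata, keys=("all_k_results", "all_k_resolutions"), sep=":::"):
    """Split the ':::'-joined candidate strings of one resolver annotation into lists."""
    return {key: metadata[key].split(sep) for key in keys if key in metadata}


# Illustrative metadata for the chunk "cough" (abridged).
metadata = {
    "all_k_results": "R05:::A37:::R05.3:::R05.1",
    "all_k_resolutions": (
        "cough [Cough]:::whooping cough [Whooping cough]"
        ":::chronic cough [Chronic cough]:::acute cough [Acute cough]"
    ),
}

parsed = split_k_results(metadata)
for code, resolution in zip(parsed["all_k_results"], parsed["all_k_resolutions"]):
    print(code, "->", resolution)
```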

# **🔎 "sbiobertresolve_icd10cm" model**

In [6]:
sample_text = """The patient is a 41-year-old Vietnamese female with a nonproductive cough that started last week. She has had right-sided chest pain radiating to her back with fever starting yesterday. She has a history of pericarditis and pericardectomy in May 2006 and developed cough with right-sided chest pain, and went to an urgent care center. Chest x-ray revealed right-sided pleural effusion."""


clinical_note_df = spark.createDataFrame([[sample_text]]).toDF("text")
icd_sdf = pipeline("sbiobertresolve_icd10cm").transform(clinical_note_df)

sbiobertresolve_icd10cm download started this may take some time.
[OK!]


In [7]:
res_pd = get_codes_from_df(icd_sdf, 'ner_chunk', 'icd_code', hcc=False)

In [8]:
res_pd.head(10)

,sent_id,ner_chunk,entity,icd_code,all_codes,resolutions
0,0,a nonproductive cough,PROBLEM,R0600,"[R0600, M79639, M79609, J949, L299, S27339S, S27439S, M2550, R479, M25519, J989, R069, R609, R52, J9690, R452, R251, M79606, M79643, M25539, F4000, M79603, M25529, J189, M79669]","[Dyspnea, unspecified, Pain in unspecified forearm, Pain in unspecified limb, Pleural condition, unspecified, Pruritus, unspecified, Laceration of lung, unspecified, sequela, Laceration of bronchu..."
1,1,right-sided chest pain,PROBLEM,R1011,"[R1011, M79621, M79604, M79631, M25511, M79601, M79661, M79651, R071, R1031, M25551, I83811, G90511, M25531, M79641, H5711, M25521, R072, M25541, R0789, M79644, R10811, R079, R1084, G90521]","[Right upper quadrant pain, Pain in right upper arm, Pain in right leg, Pain in right forearm, Pain in right shoulder, Pain in right arm, Pain in right lower leg, Pain in right thigh, Chest pain o..."
2,1,fever,PROBLEM,A790,"[A790, A78, M041, A689, R502, R509, A921, A779, R5082, A680, R5083, R51, R10817, R601, R05, R10827, R5081, R290, R152, R100, R1084, R600, L290, A981, A962]","[Trench fever, Q fever, Periodic fever syndromes, Relapsing fever, unspecified, Drug induced fever, Fever, unspecified, O'nyong-nyong fever, Spotted fever, unspecified, Postprocedural fever, Louse..."
3,2,pericarditis,PROBLEM,I301,"[I301, B3323, I309, I32, I010, I310, I311, I300, A7481, A3953, I092, J9851, K650, I319, J36, I314, K651, I400, L03213, K653, S280XXS, I409, Q240, I41, S2763XS]","[Infective pericarditis, Viral pericarditis, Acute pericarditis, unspecified, Pericarditis in diseases classified elsewhere, Acute rheumatic pericarditis, Chronic adhesive pericarditis, Chronic co..."
4,2,cough,PROBLEM,R05,"[R05, R062, R0601, J384, R067, R51, R0600, R070, G4483, R0683, R290, R071, R110, R64, R1114, R0602, R4582, R0981, R093, R0781, R12, R490, R0603, J940, R29810]","[Cough, Wheezing, Orthopnea, Edema of larynx, Sneezing, Headache, Dyspnea, unspecified, Pain in throat, Primary cough headache, Snoring, Tetany, Chest pain on breathing, Nausea, Cachexia, Bilious ..."
5,2,right-sided chest pain,PROBLEM,R1011,"[R1011, M79621, M79604, M79631, M25511, M79601, M79661, M79651, R071, R1031, M25551, I83811, G90511, M25531, M79641, H5711, M25521, R072, M25541, R0789, M79644, R10811, R079, R1084, G90521]","[Right upper quadrant pain, Pain in right upper arm, Pain in right leg, Pain in right forearm, Pain in right shoulder, Pain in right arm, Pain in right lower leg, Pain in right thigh, Chest pain o..."
6,3,right-sided pleural effusion,PROBLEM,M25411,"[M25411, J810, M25421, H02841, H2141, M25111, H05221, H11421, M25461, R2231, Q340, J910, M25431, H7411, M25451, J84114, H21561, I50811, H02842, J471, J9692, R093, J219, J432, M25141]","[Effusion, right shoulder, Acute pulmonary edema, Effusion, right elbow, Edema of right upper eyelid, Pupillary membranes, right eye, Fistula, right shoulder, Edema of right orbit, Conjunctival ed..."


In [9]:
from sparknlp_display import EntityResolverVisualizer

sample_text = """The patient is a 41-year-old Vietnamese female with a nonproductive cough that started last week. She has had right-sided chest pain radiating to her back with fever starting yesterday. She has a history of pericarditis and pericardectomy in May 2006 and developed cough with right-sided chest pain, and went to an urgent care center. Chest x-ray revealed right-sided pleural effusion."""

light_result = pipeline("sbiobertresolve_icd10cm").fullAnnotate(sample_text)

er_vis = EntityResolverVisualizer()

er_vis.display(light_result[0],
               label_col='ner_chunk',
               resolution_col = 'icd_code',
               document_col='document'
               )

sbiobertresolve_icd10cm download started this may take some time.
[OK!]


# **🔎 "sbiobertresolve_icd10cm_augmented" model**

In [10]:
sample_text = """The patient is a 41-year-old Vietnamese female with a nonproductive cough that started last week. She has had right-sided chest pain radiating to her back with fever starting yesterday. She has a history of pericarditis and pericardectomy in May 2006 and developed cough with right-sided chest pain, and went to an urgent care center. Chest x-ray revealed right-sided pleural effusion."""


clinical_note_df = spark.createDataFrame([[sample_text]]).toDF("text")
icd_sdf = pipeline("sbiobertresolve_icd10cm_augmented").transform(clinical_note_df)

sbiobertresolve_icd10cm_augmented download started this may take some time.
[OK!]


In [11]:
res_pd = get_codes_from_df(icd_sdf, 'ner_chunk', 'icd_code', hcc=False)

In [12]:
res_pd.head(10)

,sent_id,ner_chunk,entity,icd_code,all_codes,resolutions
0,0,a nonproductive cough,PROBLEM,R05,"[R05, R059, R292, A379, J988, R098, R0989, R0600, H508, J989, R478, R0689, R4789]","[non-productive cough (finding), cough, unspecified, cough impulse of mass absent (finding), whooping cough, unspecified species, o/e - nonspecific respiratory lesion, blowing nose ineffectual (fi..."
1,1,right-sided chest pain,PROBLEM,R0789,"[R0789, R072, R079, M796, R52, I209, M7960, R1011, M79621, M7962, M79604, R078, R101, R0781]","[right sided chest pain (finding), retrosternal chest pain, acute chest pain, chronic pain of right upper limb (finding), localised chest pain, ischaemic chest pain, chronic pain of right upper li..."
2,1,fever,PROBLEM,P819,"[P819, R508, R509, B338, A938, F681, A689, P818, A230]","[fever, intermittent fever, fever symptoms, ossa fever, piry fever, artificial fever, recurrent fever, swinging fever, undulant fever]"
3,2,pericarditis,PROBLEM,I319,"[I319, I301, I308, I318, B3323, A188, I425, I30, I309, I310, I241, T465X, I313, S2780]","[pericarditis, infectious pericarditis, adhesive pericarditis, chronic pericarditis, viral pericarditis, constrictive pericarditis, obliterative pericarditis, acute pericarditis, acute pericarditi..."
4,2,cough,PROBLEM,R05,"[R05, B948, A37, R053, R051, R292]","[cough, persistent cough, whooping cough, chronic cough, acute cough, cough reflex impaired]"
5,2,right-sided chest pain,PROBLEM,R0789,"[R0789, R072, R079, M796, R52, I209, M7960, R1011, M79621, M7962, M79604, R078, R101, R0781]","[right sided chest pain (finding), retrosternal chest pain, acute chest pain, chronic pain of right upper limb (finding), localised chest pain, ischaemic chest pain, chronic pain of right upper li..."
6,3,right-sided pleural effusion,PROBLEM,J90,"[J90, P2889, M25411, J91, J869, J918, R849, Q338, I309, M25421, M2541, R848, R600]","[pleural effusion, bilateral pleural effusion, effusion, right shoulder, haemorrhagic pleural effusion, pleurisy with effusion, secondary pleural effusion, pleural fluid examination abnormal, bilo..."


In [13]:
sample_text = """The patient is a 41-year-old Vietnamese female with a nonproductive cough that started last week. She has had right-sided chest pain radiating to her back with fever starting yesterday. She has a history of pericarditis and pericardectomy in May 2006 and developed cough with right-sided chest pain, and went to an urgent care center. Chest x-ray revealed right-sided pleural effusion."""

light_result = pipeline("sbiobertresolve_icd10cm_augmented").fullAnnotate(sample_text)

er_vis = EntityResolverVisualizer()

er_vis.display(light_result[0],
               label_col='ner_chunk',
               resolution_col = 'icd_code',
               document_col='document'
               )

sbiobertresolve_icd10cm_augmented download started this may take some time.
[OK!]


# **🔎 "sbiobertresolve_icd10cm_augmented_billable_hcc" model**

In [14]:
sample_text = """The patient is a 41-year-old Vietnamese female with a nonproductive cough that started last week. She has had right-sided chest pain radiating to her back with fever starting yesterday. She has a history of pericarditis and pericardectomy in May 2006 and developed cough with right-sided chest pain, and went to an urgent care center. Chest x-ray revealed right-sided pleural effusion."""

clinical_note_df = spark.createDataFrame([[sample_text]]).toDF("text")
icd_sdf = pipeline("sbiobertresolve_icd10cm_augmented_billable_hcc").transform(clinical_note_df)

sbiobertresolve_icd10cm_augmented_billable_hcc download started this may take some time.
[OK!]


In [15]:
res_pd = get_codes_from_df(icd_sdf, 'ner_chunk', 'icd_code', hcc=True)

In [16]:
res_pd.head(10)

,sent_id,ner_chunk,entity,icd10_code,all_codes,resolutions,hcc_list
0,0,a nonproductive cough,PROBLEM,R05,"[R05, R292, A379, J988, R098, R0989, R0600, H508, J989, R478, R0689, R4789, H9319]","[non-productive cough (finding), cough impulse of mass absent (finding), whooping cough, unspecified species, o/e - nonspecific respiratory lesion, blowing nose ineffectual (finding), blowing nose...","[1||0||0, 1||0||0, 0||0||0, 1||0||0, 0||0||0, 1||0||0, 1||0||0, 0||0||0, 1||0||0, 0||0||0, 1||0||0, 1||0||0, 1||0||0]"
1,1,right-sided chest pain,PROBLEM,R0789,"[R0789, R072, R079, M796, R52, I209, M7960, R1011, M79621, M7962, M79604, R078, R101, R0781]","[right sided chest pain (finding), retrosternal chest pain, acute chest pain, chronic pain of right upper limb (finding), localised chest pain, ischaemic chest pain, chronic pain of right upper li...","[1||0||0, 1||0||0, 1||0||0, 0||0||0, 1||0||0, 1||1||88, 0||0||0, 1||0||0, 0||0||0, 0||0||0, 0||0||0, 0||0||0, 0||0||0, 1||0||0]"
2,1,fever,PROBLEM,P819,"[P819, R508, R509, B338, A938, F681, A689, P818, A230]","[fever, intermittent fever, fever symptoms, ossa fever, piry fever, artificial fever, recurrent fever, swinging fever, undulant fever]","[1||0||0, 0||0||0, 1||0||0, 1||0||0, 1||0||0, 0||0||0, 1||0||0, 1||0||0, 1||0||0]"
3,2,pericarditis,PROBLEM,I319,"[I319, I301, I308, I318, B3323, A188, I425, I30, I309, I310, I241, T465X, I313, S2780]","[pericarditis, infectious pericarditis, adhesive pericarditis, chronic pericarditis, viral pericarditis, constrictive pericarditis, obliterative pericarditis, acute pericarditis, acute pericarditi...","[1||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 0||0||0, 1||1||85, 0||0||0, 1||0||0, 1||0||0, 1||1||87, 0||1||59, 1||0||0, 0||0||0]"
4,2,cough,PROBLEM,R05,"[R05, B948, A37, R053, R292, R093]","[cough, persistent cough, whooping cough, chronic cough, cough reflex impaired, productive cough (finding)]","[1||0||0, 1||0||0, 0||0||0, 1||0||0, 1||0||0, 1||0||0]"
5,2,right-sided chest pain,PROBLEM,R0789,"[R0789, R072, R079, M796, R52, I209, M7960, R1011, M79621, M7962, M79604, R078, R101, R0781]","[right sided chest pain (finding), retrosternal chest pain, acute chest pain, chronic pain of right upper limb (finding), localised chest pain, ischaemic chest pain, chronic pain of right upper li...","[1||0||0, 1||0||0, 1||0||0, 0||0||0, 1||0||0, 1||1||88, 0||0||0, 1||0||0, 0||0||0, 0||0||0, 0||0||0, 0||0||0, 0||0||0, 1||0||0]"
6,3,right-sided pleural effusion,PROBLEM,J90,"[J90, P2889, M25411, J91, J869, J918, R849, Q338, I309, M25421, M2541, R848, R600]","[pleural effusion, bilateral pleural effusion, effusion, right shoulder, haemorrhagic pleural effusion, pleurisy with effusion, secondary pleural effusion, pleural fluid examination abnormal, bilo...","[1||0||0, 1||0||0, 0||0||0, 0||0||0, 1||1||115, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 0||0||0, 0||0||0, 1||0||0, 1||0||0]"
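Each entry in `hcc_list` packs three values separated by `||`, for example `1||1||88`. The sketch below splits such an entry into named fields. Reading the fields as billable flag, HCC status, and HCC code is an assumption based on the model's published description, not something this notebook states; `HccLabel` and `parse_hcc_label` are hypothetical names.

```python
from typing import NamedTuple


class HccLabel(NamedTuple):
    billable: bool    # assumed meaning of the first field
    hcc_status: bool  # assumed meaning of the second field
    hcc_code: int     # assumed meaning of the third field


def parse_hcc_label(label):
    """Split one 'a||b||c' auxiliary label into its three fields."""
    billable, status, code = label.split("||")
    return HccLabel(billable == "1", status == "1", int(code))


# Labels taken from the hcc_list column in the table above.
for raw in ["1||0||0", "1||1||88", "0||1||59"]:
    print(raw, "->", parse_hcc_label(raw))
```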


In [17]:
from sparknlp_display import EntityResolverVisualizer

sample_text = """The patient is a 41-year-old Vietnamese female with a nonproductive cough that started last week. She has had right-sided chest pain radiating to her back with fever starting yesterday. She has a history of pericarditis and pericardectomy in May 2006 and developed cough with right-sided chest pain, and went to an urgent care center. Chest x-ray revealed right-sided pleural effusion."""

light_result = pipeline("sbiobertresolve_icd10cm_augmented_billable_hcc").fullAnnotate(sample_text)

er_vis = EntityResolverVisualizer()

er_vis.display(light_result[0],
               label_col='ner_chunk',
               resolution_col = 'icd_code',
               document_col='document'
               )

sbiobertresolve_icd10cm_augmented_billable_hcc download started this may take some time.
[OK!]


# **🔎 "sbiobertresolve_icd10cm_generalised" model**

In [18]:
sample_text = """The patient is a 41-year-old Vietnamese female with a nonproductive cough that started last week. She has had right-sided chest pain radiating to her back with fever starting yesterday. She has a history of pericarditis and pericardectomy in May 2006 and developed cough with right-sided chest pain, and went to an urgent care center. Chest x-ray revealed right-sided pleural effusion."""


clinical_note_df = spark.createDataFrame([[sample_text]]).toDF("text")
icd_sdf = pipeline("sbiobertresolve_icd10cm_generalised").transform(clinical_note_df)

sbiobertresolve_icd10cm_generalised download started this may take some time.
[OK!]


In [19]:
res_pd = get_codes_from_df(icd_sdf, 'ner_chunk', 'icd_code', hcc=False)

In [20]:
res_pd.head(10)

,sent_id,ner_chunk,entity,icd_code,all_codes,resolutions
0,0,a nonproductive cough,PROBLEM,R05,"[R05, A37, R09, R06, J98, Y56, R47, H93, J96, Z53, R07, J94, R51, M25, R45, R68, M79]","[unproductive cough, whooping cough, unspecified species, blowing nose ineffectual, dyspnea, unspecified, respiratory disorder, unspecified, nonoxinol adverse reaction (disorder), difficulty produ..."
1,1,right-sided chest pain,PROBLEM,R07,"[R07, M79, G89, I99, M25]","[right sided chest pain, right leg pain, chronic right arm pain, right arm ischemic limb pain, right shoulder pain]"
2,1,fever,PROBLEM,R50,"[R50, A68, A78, A77, J17, A79, B23]","[fever, recurrent fever, acute q fever, boutonneuse fever, query fever, wolhynian fever, fever associated with aids (disorder)]"
3,2,pericarditis,PROBLEM,I31,"[I31, I30, B33, T46, I32, I09]","[pericarditis, infectious pericarditis, viral pericarditis, drug-induced pericarditis, parasitic pericarditis, rheumatic pericarditis]"
4,2,cough,PROBLEM,R05,"[R05, A37, R09]","[cough, whooping cough, respiratory tract congestion and cough (disorder)]"
5,2,right-sided chest pain,PROBLEM,R07,"[R07, M79, G89, I99, M25]","[right sided chest pain, right leg pain, chronic right arm pain, right arm ischemic limb pain, right shoulder pain]"
6,3,right-sided pleural effusion,PROBLEM,P28,"[P28, M25, H74, J91, I30, J94, J92, S20, R09, R22]","[bilateral pleural effusion, effusion, right shoulder, right middle ear effusion, secondary pleural effusion, acute pericardial effusion, thickening of pleura, thickening of pleura, right chest wa..."
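The generalised model returns 3-character codes, which correspond to the category part of a full ICD-10-CM code. As a rough sketch, a full code from one of the other models can be reduced to its category by dropping any dot and keeping the first three characters; `icd10cm_category` is a hypothetical helper.

```python
def icd10cm_category(code):
    """Return the 3-character category of an ICD-10-CM code, with or without a dot."""
    return code.replace(".", "")[:3].upper()


for code in ["R0789", "R07.89", "I319", "J90"]:
    print(code, "->", icd10cm_category(code))
```

This truncation is not a substitute for the generalised model: in the outputs above, the augmented model resolves "right-sided pleural effusion" to J90, while the generalised model returns P28 for the same chunk.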


In [21]:
sample_text = """The patient is a 41-year-old Vietnamese female with a nonproductive cough that started last week. She has had right-sided chest pain radiating to her back with fever starting yesterday. She has a history of pericarditis and pericardectomy in May 2006 and developed cough with right-sided chest pain, and went to an urgent care center. Chest x-ray revealed right-sided pleural effusion."""

light_result = pipeline("sbiobertresolve_icd10cm_generalised").fullAnnotate(sample_text)

er_vis = EntityResolverVisualizer()

er_vis.display(light_result[0],
               label_col='ner_chunk',
               resolution_col = 'icd_code',
               document_col='document'
               )

sbiobertresolve_icd10cm_generalised download started this may take some time.
[OK!]


# **🔎 "sbiobertresolve_icd10cm_slim_billable_hcc" model**

In [22]:
sample_text = """The patient is a 41-year-old Vietnamese female with a nonproductive cough that started last week. She has had right-sided chest pain radiating to her back with fever starting yesterday. She has a history of pericarditis and pericardectomy in May 2006 and developed cough with right-sided chest pain, and went to an urgent care center. Chest x-ray revealed right-sided pleural effusion."""


clinical_note_df = spark.createDataFrame([[sample_text]]).toDF("text")
icd_sdf = pipeline("sbiobertresolve_icd10cm_slim_billable_hcc").transform(clinical_note_df)

sbiobertresolve_icd10cm_slim_billable_hcc download started this may take some time.
[OK!]


In [23]:
res_pd = get_codes_from_df(icd_sdf, 'ner_chunk', 'icd_code', hcc=True)

In [24]:
res_pd.head(10)

,sent_id,ner_chunk,entity,icd10_code,all_codes,resolutions,hcc_list
0,0,a nonproductive cough,PROBLEM,R05,"[R05, R05.9, A37.9, R09.89, R06.00, J98.9, R47.89, H93.19, J96.9, Z53.9, R07.82, R07.89, R06.9, R47.9, J94.9, R51.9, M25.419, R45.89, R68.89, M79.60, H93.1]","[unproductive cough [Cough], cough, unspecified [Cough, unspecified], whooping cough, unspecified species [Whooping cough, unspecified species], blowing nose ineffectual [Other specified symptoms ...","[0||0||0, 1||0||0, 0||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 0||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 0||0||0, 0||0||0]"
1,1,right-sided chest pain,PROBLEM,R07.89,"[R07.89, R07.2, R07.9, M79.604, M79.621, M79.601, G89.29, I99.8, M25.511, R10.11, M79.631]","[right sided chest pain (finding) [Other chest pain], retrosternal chest pain [Precordial pain], acute chest pain [Chest pain, unspecified], right leg pain [Pain in right leg], right upper arm pai...","[1||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0]"
2,1,fever,PROBLEM,R50.9,"[R50.9, R50.8, A68.9, A68, A78, A77.1, R50.2, A77.9, A79.0]","[fever [Fever, unspecified], intermittent fever [Other specified fever], recurrent fever [Relapsing fever, unspecified], relapsing fevers [Relapsing fevers], acute q fever [Q fever], boutonneuse f...","[1||0||0, 0||0||0, 1||0||0, 0||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0]"
3,2,pericarditis,PROBLEM,I31.9,"[I31.9, I30.1, I31.0, B33.23, I31.1, I30, I30.9, T46.5X, I31.3, I09.2, I30.8]","[pericarditis [Disease of pericardium, unspecified], infectious pericarditis [Infective pericarditis], adhesive pericarditis [Chronic adhesive pericarditis], viral pericarditis [Viral pericarditis...","[1||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 0||0||0, 1||0||0, 0||0||0, 1||0||0, 1||0||0, 1||0||0]"
4,2,cough,PROBLEM,R05,"[R05, A37, R05.3, R05.1]","[cough [Cough], whooping cough [Whooping cough], chronic cough [Chronic cough], acute cough [Acute cough]]","[0||0||0, 0||0||0, 1||0||0, 1||0||0]"
5,2,right-sided chest pain,PROBLEM,R07.89,"[R07.89, R07.2, R07.9, M79.604, M79.621, M79.601, G89.29, I99.8, M25.511, R10.11, M79.631]","[right sided chest pain (finding) [Other chest pain], retrosternal chest pain [Precordial pain], acute chest pain [Chest pain, unspecified], right leg pain [Pain in right leg], right upper arm pai...","[1||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0]"
6,3,right-sided pleural effusion,PROBLEM,P28.89,"[P28.89, M25.41, M25.411, H74.8X1, J91.8, I30.9, M25.42, M25.421, M25.431, M25.43, M25.441, J94.1, J92.9, M25.46, M25.461, S20.321A, R09.89, R22.31]","[bilateral pleural effusion [Other specified respiratory conditions of newborn], effusion, right shoulder [Effusion, shoulder], effusion, right shoulder [Effusion, right shoulder], right middle ea...","[1||0||0, 0||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0, 0||0||0, 1||0||0, 1||0||0, 0||0||0, 1||0||0, 1||0||0, 1||0||0, 0||0||0, 1||0||0, 1||0||0, 1||0||0, 1||0||0]"


In [25]:
sample_text = """The patient is a 41-year-old Vietnamese female with a nonproductive cough that started last week. She has had right-sided chest pain radiating to her back with fever starting yesterday. She has a history of pericarditis and pericardectomy in May 2006 and developed cough with right-sided chest pain, and went to an urgent care center. Chest x-ray revealed right-sided pleural effusion."""

light_result = pipeline("sbiobertresolve_icd10cm_slim_billable_hcc").fullAnnotate(sample_text)

er_vis = EntityResolverVisualizer()

er_vis.display(light_result[0],
               label_col='ner_chunk',
               resolution_col = 'icd_code',
               document_col='document'
               )

sbiobertresolve_icd10cm_slim_billable_hcc download started this may take some time.
[OK!]


# **🔎 "sbiobertresolve_icd10cm_slim_normalized" model**

In [26]:
sample_text = """The patient is a 41-year-old Vietnamese female with a nonproductive cough that started last week. She has had right-sided chest pain radiating to her back with fever starting yesterday. She has a history of pericarditis and pericardectomy in May 2006 and developed cough with right-sided chest pain, and went to an urgent care center. Chest x-ray revealed right-sided pleural effusion."""


clinical_note_df = spark.createDataFrame([[sample_text]]).toDF("text")
icd_sdf = pipeline("sbiobertresolve_icd10cm_slim_normalized").transform(clinical_note_df)

sbiobertresolve_icd10cm_slim_normalized download started this may take some time.
[OK!]


In [27]:
res_pd = get_codes_from_df(icd_sdf, 'ner_chunk', 'icd_code', hcc=False)

In [28]:
res_pd.head(10)

,sent_id,ner_chunk,entity,icd_code,all_codes,resolutions
0,0,a nonproductive cough,PROBLEM,R05,"[R05, R05.9, A37.9, R09.89, R06.00, J98.9, R47.89, H93.19, J96.9, Z53.9, R07.82, R07.89, R06.9, R47.9, J94.9, R51.9, M25.419, R45.89, R68.89, M79.60, H93.1]","[unproductive cough [Cough], cough, unspecified [Cough, unspecified], whooping cough, unspecified species [Whooping cough, unspecified species], blowing nose ineffectual [Other specified symptoms ..."
1,1,right-sided chest pain,PROBLEM,R07.89,"[R07.89, R07.2, R07.9, M79.604, M79.621, M79.601, G89.29, I99.8, M25.511, R10.11, M79.631]","[right sided chest pain (finding) [Other chest pain], retrosternal chest pain [Precordial pain], acute chest pain [Chest pain, unspecified], right leg pain [Pain in right leg], right upper arm pai..."
2,1,fever,PROBLEM,R50.9,"[R50.9, R50.8, A68.9, A68, A78, A77.1, R50.2, A77.9, A79.0]","[fever [Fever, unspecified], intermittent fever [Other specified fever], recurrent fever [Relapsing fever, unspecified], relapsing fevers [Relapsing fevers], acute q fever [Q fever], boutonneuse f..."
3,2,pericarditis,PROBLEM,I31.9,"[I31.9, I30.1, I31.0, B33.23, I31.1, I30.9, I30, T46.5X, I31.3, I09.2, I30.8]","[pericarditis [Disease of pericardium, unspecified], infectious pericarditis [Infective pericarditis], adhesive pericarditis [Chronic adhesive pericarditis], viral pericarditis [Viral pericarditis..."
4,2,cough,PROBLEM,R05,"[R05, A37, R05.3, R05.1]","[cough [Cough], whooping cough [Whooping cough], chronic cough [Chronic cough], acute cough [Acute cough]]"
5,2,right-sided chest pain,PROBLEM,R07.89,"[R07.89, R07.2, R07.9, M79.604, M79.621, M79.601, G89.29, I99.8, M25.511, R10.11, M79.631]","[right sided chest pain (finding) [Other chest pain], retrosternal chest pain [Precordial pain], acute chest pain [Chest pain, unspecified], right leg pain [Pain in right leg], right upper arm pai..."
6,3,right-sided pleural effusion,PROBLEM,P28.89,"[P28.89, M25.41, M25.411, H74.8X1, J91.8, I30.9, M25.421, M25.42, M25.431, M25.43, M25.441, J94.1, J92.9, M25.461, M25.46, S20.321A, R09.89, M79.89]","[bilateral pleural effusion [Other specified respiratory conditions of newborn], effusion, right shoulder [Effusion, shoulder], effusion, right shoulder [Effusion, right shoulder], right middle ea..."
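With all six result tables available, the top-ranked codes can be compared across models. The sketch below copies the top code for three chunks from the tables in this notebook and groups the models by the code they returned. Dots are stripped first because the slim models print dotted codes (for example `R50.9`) while the others do not; `normalize` and `models_by_code` are hypothetical helpers.

```python
# Top-ranked codes copied from the result tables in this notebook.
TOP_CODES = {
    "sbiobertresolve_icd10cm":
        {"fever": "A790", "pericarditis": "I301", "right-sided pleural effusion": "M25411"},
    "sbiobertresolve_icd10cm_augmented":
        {"fever": "P819", "pericarditis": "I319", "right-sided pleural effusion": "J90"},
    "sbiobertresolve_icd10cm_augmented_billable_hcc":
        {"fever": "P819", "pericarditis": "I319", "right-sided pleural effusion": "J90"},
    "sbiobertresolve_icd10cm_generalised":
        {"fever": "R50", "pericarditis": "I31", "right-sided pleural effusion": "P28"},
    "sbiobertresolve_icd10cm_slim_billable_hcc":
        {"fever": "R50.9", "pericarditis": "I31.9", "right-sided pleural effusion": "P28.89"},
    "sbiobertresolve_icd10cm_slim_normalized":
        {"fever": "R50.9", "pericarditis": "I31.9", "right-sided pleural effusion": "P28.89"},
}


def normalize(code):
    """Drop the dot so dotted and undotted codes compare equal."""
    return code.replace(".", "")


def models_by_code(top_codes):
    """Group model names by chunk and by the normalized top code they returned."""
    grouped = {}
    for model, chunks in top_codes.items():
        for chunk, code in chunks.items():
            grouped.setdefault(chunk, {}).setdefault(normalize(code), []).append(model)
    return grouped


grouped = models_by_code(TOP_CODES)
for chunk, by_code in grouped.items():
    print(chunk)
    for code, models in by_code.items():
        print(f"  {code}: {len(models)} model(s)")
```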


In [29]:
sample_text = """The patient is a 41-year-old Vietnamese female with a nonproductive cough that started last week. She has had right-sided chest pain radiating to her back with fever starting yesterday. She has a history of pericarditis and pericardectomy in May 2006 and developed cough with right-sided chest pain, and went to an urgent care center. Chest x-ray revealed right-sided pleural effusion."""

light_result = pipeline("sbiobertresolve_icd10cm_slim_normalized").fullAnnotate(sample_text)

er_vis = EntityResolverVisualizer()

er_vis.display(light_result[0],
               label_col='ner_chunk',
               resolution_col = 'icd_code',
               document_col='document'
               )

sbiobertresolve_icd10cm_slim_normalized download started this may take some time.
[OK!]
