## Tree-Based Small Language Model Medical Coding Architecture

The approach is to leverage small language model(s) to traverse a hierarchy of ICD-9 codes, asking many small questions along the way, in order to classify a Note Event from the MIMIC-III dataset.

For example, the diagram below represents a small portion of the ICD-9 code tree: a subset of the 'Infectious and Parasitic Diseases' chapter. [View a full JSON representation of the taxonomy here.](./icd9_full.json) For this implementation, ICD-9 code levels are broken down into Chapters, Blocks, and Categories. To extend this implementation, the tree can be broken down further into Sub-categories, Extension I, and Extension II codes.
  
The coding algorithm recursively walks the tree, starting at the top level and continuing down any branch(es) selected by the mini model until the final codes are returned.

0        Infectious and Parasitic Diseases  
├── 00   Intestinal infectious diseases  
│   ├── 001 Cholera  
│   ├── 002 Typhoid and paratyphoid fevers  
│   ├── 003 Salmonella  
│   ├── 004 Shigellosis  
│   ├── 005 Other poisoning (bacterial)  
│   ├── 006 Amebiasis  
│   ├── 007 Other protozoal intestinal diseases  
│   ├── 008 Intestinal infections due to other organisms  
│   └── 009 Ill-defined intestinal infections  
├── 01   Tuberculosis  
│   ├── 010 Primary tuberculous infection  
│   ├── 011 Pulmonary tuberculosis  
│   ├── 012 Other respiratory tuberculosis  
│   ├── 013 Tuberculosis of meninges and central nervous system  
│   ├── 014 Tuberculosis of intestines, peritoneum, and mesenteric glands  
│   ├── 015 Tuberculosis of bones and joints  
│   ├── 016 Tuberculosis of genitourinary system  
│   ├── 017 Tuberculosis of other organs  
│   ├── 018 Miliary tuberculosis  
│   └── 019 Respiratory tuberculosis unspecified  
...  
├── 09   Rickettsioses and other arthropod-borne diseases  
│   ├── ...   
│   ├── 087 Relapsing fever  
│   └── 088 Other arthropod-borne diseases   
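
The recursive walk described above can be sketched without a model. This is a minimal illustration, assuming the tree is a plain dict and that a word-matching stub stands in for the language model. `TREE`, `choose`, and `walk` are hypothetical names used only here; they are not part of `src.tree`.

```python
from typing import Dict, List, Tuple

# Hypothetical miniature of the taxonomy: code -> (description, child codes).
TREE: Dict[str, Tuple[str, List[str]]] = {
    "0": ("Infectious and Parasitic Diseases", ["00", "01"]),
    "00": ("Intestinal infectious diseases", ["001", "003"]),
    "01": ("Tuberculosis", ["011", "015"]),
    "001": ("Cholera", []),
    "003": ("Salmonella", []),
    "011": ("Pulmonary tuberculosis", []),
    "015": ("Tuberculosis of bones and joints", []),
}

STOPWORDS = {"of", "and"}


def choose(note: str, options: List[str]) -> List[str]:
    """Stand-in for the model: keep options whose description words all occur in the note."""
    note_words = set(note.lower().split())
    picked = []
    for code in options:
        desc_words = set(TREE[code][0].lower().split()) - STOPWORDS
        if desc_words and desc_words <= note_words:
            picked.append(code)
    return picked


def walk(code: str, note: str) -> List[str]:
    """Recursively descend from `code`, returning the leaf codes reached."""
    children = TREE[code][1]
    if not children:  # a leaf is a final code
        return [code]
    found: List[str] = []
    for child in choose(note, children):
        found.extend(walk(child, note))
    return found


print(walk("0", "pulmonary tuberculosis"))  # ['011']
```

Each call only has to pick among the handful of children of one node, which is the "many small questions" idea: the model never sees the full code list at once.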

  
This approach is explored further in this paper:  
 [Automated clinical coding using off-the-shelf large language models](https://arxiv.org/pdf/2310.06552) (Boyle et al.)

In [2]:
from src.tree import TaxonomyParser
from zensols.mednlp import ApplicationFactory

from nltk import flatten
from tqdm import tqdm
from dotenv import load_dotenv, find_dotenv
from textwrap import dedent
from openai import AzureOpenAI
from typing import List, Dict, Any

import pandas as pd
import ast
import functools
import logging
import os


doc_parser = ApplicationFactory.get_doc_parser()
logger = logging.getLogger(__name__)

load_dotenv(find_dotenv(), override=True)
print(os.getenv("AZURE_OPENAI_BASE"))

pd.set_option('display.max_colwidth', None)



https://medcode-aoai-useast.openai.azure.com/


In [3]:
# Initialize Code Tree
code_tree = TaxonomyParser()
code_tree.read_from_json("icd9_tax.json")

print(code_tree.find_by_name("00"))

Node('/root/0/00', description='Intestinal infectious diseases')


In [4]:
# View Tree
code_tree.visualize("0")

0        Infectious and Parasitic Diseases
├── 00   Intestinal infectious diseases
│   ├── 001 Cholera
│   ├── 002 Typhoid and paratyphoid fevers
│   ├── 003 Salmonella
│   ├── 004 Shigellosis
│   ├── 005 Other poisoning (bacterial)
│   ├── 006 Amebiasis
│   ├── 007 Other protozoal intestinal diseases
│   ├── 008 Intestinal infections due to other organisms
│   └── 009 Ill-defined intestinal infections
├── 01   Tuberculosis
│   ├── 010 Primary tuberculous infection
│   ├── 011 Pulmonary tuberculosis
│   ├── 012 Other respiratory tuberculosis
│   ├── 013 Tuberculosis of meninges and central nervous system
│   ├── 014 Tuberculosis of intestines, peritoneum, and mesenteric glands
│   ├── 015 Tuberculosis of bones and joints
│   ├── 016 Tuberculosis of genitourinary system
│   ├── 017 Tuberculosis of other organs
│   ├── 018 Miliary tuberculosis
│   └── 019 Respiratory tuberculosis unspecified
├── 02   Zoonotic bacterial diseases
│   ├── 020 Plague
│   ├── 021 Tularemia
│   ├── 022 Anthrax

#### Define Helper Functions

In [5]:
# Note Parsing Functions

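def format_icd9(x: str) -> str:
    """Zero-pad each code in a stringified list to three characters."""
    code_list = ast.literal_eval(x)
    # `0>3` right-aligns the code and fills with zeros, e.g. '41' -> '041'.
    new_codes = [f"{code:0>3}" for code in code_list]

    return str(new_codes)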

def parse_note(note: str) -> str:
    """Reduce a note to the distinct clinical concepts detected in it."""
    doc = doc_parser(note)

    # UMLS semantic types (TUIs) to keep, e.g. T047 (Disease or Syndrome) and T184 (Sign or Symptom).
    keep_tuis = ['T184', 'T047', 'T046', 'T033', 'T037', 'T191', 'T005', 'T004', 'T007', 'T008']

    new_note = set()
    for tok in doc.tokens:
        if tok.is_concept and tok.tuis_ in keep_tuis:
            # Keep both the surface form found in the text and the preferred concept name.
            new_note.add(tok.detected_name_.replace("~", " "))
            new_note.add(tok.pref_name_.lower())

    logger.info("Note Parsing Complete.")

    return " ".join(new_note)
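
The zero-padding in `format_icd9` relies on the format spec `0>3`, which right-aligns a value and fills it with zeros to a width of three. A minimal sketch of that behavior, assuming codes may arrive as integers or unpadded strings after a CSV round trip (`pad_code` is a hypothetical helper name, not part of the notebook):

```python
def pad_code(code) -> str:
    """Left-pad an ICD-9 category code with zeros to three characters."""
    return f"{code:0>3}"


print([pad_code(c) for c in [41, "38", "070"]])  # ['041', '038', '070']
```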

In [6]:
#Scoring Functions

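def recall_score(truth, generated):
    actual_list = ast.literal_eval(truth)
    generated_list = ast.literal_eval(generated)

    # Guard against an empty ground-truth list to avoid dividing by zero.
    if len(actual_list) == 0:
        return 0

    similar = len(set(actual_list) & set(generated_list))

    return similar / len(actual_list)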

def precision_score(truth, generated):
    actual_list = ast.literal_eval(truth)
    generated_list = ast.literal_eval(generated)

    if len(generated_list) == 0:
        return 0

    similar = len(set(actual_list) & set(generated_list))

    return similar / len(generated_list)

def f1_score(truth, generated):
    precision = precision_score(truth, generated)
    recall = recall_score(truth, generated)

    if precision + recall == 0:
        return 0
    else:
        return 2 * (precision * recall) / (precision + recall)
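
As a worked example of the three metrics, the sketch below scores one row by hand. It is a compact, set-based variant written for illustration (`prf` is a hypothetical name), so it treats duplicate generated codes as one, which the functions above do not.

```python
import ast
from typing import Tuple


def prf(truth: str, generated: str) -> Tuple[float, float, float]:
    """Return (precision, recall, f1) for two stringified code lists."""
    actual = set(ast.literal_eval(truth))
    predicted = set(ast.literal_eval(generated))
    overlap = len(actual & predicted)
    precision = overlap / len(predicted) if predicted else 0.0
    recall = overlap / len(actual) if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# One true code, two generated codes, one of them correct.
precision, recall, f1 = prf("['03']", "['01', '03']")
print(precision, recall, round(f1, 3))  # 0.5 1.0 0.667
```

Here the single true code is found (recall 1.0), but only one of the two generated codes is correct (precision 0.5), which gives an F1 of 2/3.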

In [7]:
# Make Call to AOAI

def call_aoai(sys:str, prompt:str) -> List:

    aoai_client = AzureOpenAI(
        azure_endpoint = os.getenv("AZURE_OPENAI_BASE"), 
        api_key=os.getenv("AZURE_OPENAI_KEY"),
        api_version="2024-02-01"
    )
    
    response = aoai_client.chat.completions.create(
        model=os.getenv("AZURE_DEPLOYMENT_NAME"), # model = "deployment_name".
        messages=[
            {"role": "system", "content": dedent(sys)},
            {"role": "user", "content": dedent(prompt)}
        ],
    )

    try:
        output = ast.literal_eval(response.choices[0].message.content)
        return output
    except Exception as e:
        logger.warning(f"{e}")
        return []
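
`call_aoai` falls back to an empty list whenever the reply is not a bare Python literal, for instance when the model adds explanatory text around its answer. One possible way to tolerate such replies is sketched below. This is an assumption about how the parsing could be hardened, not what the notebook does, and `parse_code_list` is a hypothetical helper.

```python
import ast
import re
from typing import List


def parse_code_list(text: str) -> List[str]:
    """Extract the first bracketed list from a model reply; return [] on failure."""
    match = re.search(r"\[[^\[\]]*\]", text)
    if match is None:
        return []
    try:
        value = ast.literal_eval(match.group(0))
    except (ValueError, SyntaxError):
        return []
    if not isinstance(value, list):
        return []
    # Coerce every item to a string so later length checks behave predictably.
    return [str(item) for item in value]


print(parse_code_list("Answer: ['0','2']"))        # ['0', '2']
print(parse_code_list("The note concerns TB."))    # []
```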

In [8]:
# Build Prompt Dynamically

def get_options(tree, parent_code):
    children = tree.get_children(parent_code)
    options = []
    for child in children:
        options.append(f"{child.name}: {child.description}")
    
    return '|'.join(options)

def build_prompt(tree, parent_code, note, categories):
    sys = """
    You are a medical expert. Your job is to classify notes of an event into one or more categories. ACCURACY is VERY IMPORTANT to your job.
    Choose the best option(s) based on the categories offered. ALWAYS return at least one index. ONLY choose from categories listed. 
    Respond with a list of quoted string indices of the categories the note belongs to.
    Think through your answer. 
    
    ### EXAMPLE ###
    Categories = 0: Infectious and Parasitic Diseases | 1: Neoplasms | 2: Endocrine, Nutritional and Metabolic Diseases, and Immunity Disorders
    Note = Patient has Tuberculosis and an Immunity Disorder
    Answer: ['0','2']
    ## END EXAMPLE ##
    """
    
    
    prompt = f"""
    Categories = {categories}
    Note = {note}
    Answer:
    """

    return sys, prompt
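
To make the prompt format concrete, the sketch below builds the `Categories` string and the user prompt for two child nodes. It assumes only that each child exposes `name` and `description`, as `get_options` expects; the `Child` namedtuple is a hypothetical stand-in for a tree node.

```python
from collections import namedtuple
from textwrap import dedent

# Hypothetical stand-in for a tree node with the two attributes get_options reads.
Child = namedtuple("Child", ["name", "description"])

children = [
    Child("00", "Intestinal infectious diseases"),
    Child("01", "Tuberculosis"),
]

# Same "code: description" pairs joined by '|' that get_options produces.
categories = "|".join(f"{c.name}: {c.description}" for c in children)

prompt = dedent(f"""
    Categories = {categories}
    Note = Patient has pulmonary tuberculosis
    Answer:
    """)

print(prompt)
```

Because the option labels are the codes themselves, the model's answer (for example `['01']`) can be fed straight back in as the next parent code.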

In [9]:
# Recursive Walk of tree and call aoai to get codes

def get_codes_for_note(parent_code, tree, note, level=3):
    
    categories = get_options(tree, parent_code)
    sys, prompt = build_prompt(tree, parent_code, note, categories)

    codes = call_aoai(sys, prompt)
    
    logger.info(f"Parent Code: {parent_code} | Found: {codes}")
    logger.info(f"Prompt: {prompt}")

    if codes == [] or codes == ['']:
        return ['X'*level]
    elif all(len(i) == level for i in codes):
        return codes
    else:
        return list(map(functools.partial(get_codes_for_note, tree=tree, note=note, level=level), codes))
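
Because each recursive call returns a list and `map` collects those lists, `get_codes_for_note` yields a nested structure whenever it descends more than one level, which is why the notebook passes the result through `flatten`. A pure-Python equivalent for lists of strings is sketched below; `flatten_codes` is a hypothetical name used for illustration.

```python
from typing import Any, List


def flatten_codes(nested: Any) -> List[str]:
    """Flatten arbitrarily nested lists of code strings into one flat list."""
    if isinstance(nested, str):
        return [nested]
    flat: List[str] = []
    for item in nested:
        flat.extend(flatten_codes(item))
    return flat


# A shape the walk could produce when starting above the chapter level:
# one chapter, two blocks, and a branch where no code was returned ('XXX').
print(flatten_codes([[["011"], ["038", "XXX"]]]))  # ['011', '038', 'XXX']
```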
    

## Prepare Data

In [10]:
# df = transform_data("data/") # Only re-run if change in preparation logic
df = pd.read_csv("data/joined/dataset_single_001_088.csv.gz")
print(df.shape)
display(df.dtypes)

(4855, 3)


HADM_ID       int64
TEXT         object
ICD9_CODE    object
dtype: object

In [11]:
# Get L1 and L2 codes for grading purposes

def get_parent_codes(code_tree, codes):
    code_list = ast.literal_eval(codes)
    parent_codes = []
    for code in code_list:
        parent_codes.append(code_tree.find_by_name(code).parent.name)
    
    parent_codes = list(set(parent_codes))
    return str(parent_codes)

df['L2_CODES'] = df['ICD9_CODE'].apply(lambda x: get_parent_codes(code_tree, x))
df['L1_CODES'] = df['L2_CODES'].apply(lambda x: get_parent_codes(code_tree, x))
display(df[['ICD9_CODE', 'L2_CODES', 'L1_CODES']].head(5))

| | ICD9_CODE | L2_CODES | L1_CODES |
|---|---|---|---|
| 0 | ['041'] | ['03'] | ['0'] |
| 1 | ['038', '070'] | ['08', '03'] | ['0'] |
| 2 | ['041'] | ['03'] | ['0'] |
| 3 | ['038'] | ['03'] | ['0'] |
| 4 | ['038'] | ['03'] | ['0'] |
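
The two `apply` calls climb the tree one level at a time: categories map to blocks (L2), and blocks map to chapters (L1). The sketch below reproduces that for one row using a plain dict in place of the tree. `PARENT` and `parents_of` are hypothetical names, and the map contains only codes that appear in the table above.

```python
import ast
from typing import Dict

# Hypothetical child -> parent map, limited to codes shown in the table above.
PARENT: Dict[str, str] = {
    "038": "03",
    "041": "03",
    "070": "08",
    "03": "0",
    "08": "0",
}


def parents_of(codes: str) -> str:
    """Map a stringified list of codes to the sorted, de-duplicated list of their parents."""
    code_list = ast.literal_eval(codes)
    return str(sorted({PARENT[code] for code in code_list}))


l2 = parents_of("['038', '070']")  # categories -> blocks
l1 = parents_of(l2)                # blocks -> chapters
print(l2, l1)  # ['03', '08'] ['0']
```

Sorting the parents is a choice made here for reproducible output; the notebook's `list(set(...))` does not guarantee an order, which is why the same row can show `['08', '03']` in one run and `['03', '08']` in another.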


In [12]:
# Take Final Subset

df = df[0:10].copy()  # copy so that adding columns later does not modify a view of the full frame
print(df.shape)

(10, 5)


In [13]:
# Add Parsed Text field
tqdm.pandas()
df['PARSED_TEXT'] = df['TEXT'].progress_apply(parse_note)

  meta_cat.model.load_state_dict(torch.load(model_save_path, map_location=device))
2024-08-28 13:51:56,692 filtering on tuis: 
2024-08-28 13:52:13,749 parsing: ['Admission Date Discharge Date Date Birth Sex Service [...]
 20%|██        | 2/10 [00:21<01:25, 10.69s/it]2024-08-28 13:52:14,525 parsing: ['Admission Date Discharge Date Date Birth Sex F [...]
 30%|███       | 3/10 [00:21<00:44,  6.42s/it]2024-08-28 13:52:15,839 parsing: ['Admission Date Discharge Date Date Birth Sex Service [...]
 40%|████      | 4/10 [00:23<00:28,  4.68s/it]2024-08-28 13:52:17,083 parsing: ['Admission Date Discharge Date Date Birth Sex Service [...]
 50%|█████     | 5/10 [00:24<00:16,  3.38s/it]2024-08-28 13:52:18,072 parsing: ['Admission Date Discharge Date Date Birth Sex Service [...]
 60%|██████    | 6/10 [00:25<00:10,  2.61s/it]2024-08-28 13:52:20,150 parsing: ['Admission Date Discharge Date Date Birth Sex F [...]
 70%|███████   | 7/10 [00:28<00:07,  2.55s/it]2024-08-28 13:52:20,842 parsing: ['Sinus rhyt

In [18]:
print(df.shape)
display(df.head(2))

(10, 7)


Unnamed: 0,HADM_ID,TEXT,ICD9_CODE,L2_CODES,L1_CODES,PARSED_TEXT,Parsed_Generated
0,100020,"['Admission Date Discharge Date Date Birth Sex Service MEDICINE Allergies Percocet / Bactrim Ds / Lisinopril AttendingFirst Name LF Chief Complaint hypotension Major Surgical Invasive Procedure none History Present Illness Mr . Known lastname yo w/ multiple sclerosis seizure disorder presented OSH delusions AMS x days . OSH , noted Na . history hyponatremia Na mid since . seen nephrology . OSH , approx sec generalized tonic clonic seizure , received mg Ativan , transferred ED Hospital . also history seizures especially setting infection hyponatremia . unclear seizures without inciting event . currently weaned Keppra Gabapentin started Tegretol . ER , VS . / % L. given L NS . Given AMS setting infection known chronic UTIs indwelling suprapubic catheter neurogenic bladder , blood urine cultures obtained well CXR . urine culture grew pseudomonas CXR showed possible infiltrate treated vancomycin cefepime . head CT negative . Past Medical History MS since , progressive , quadriplegic , neurogenic bladder suprapubic catheter , restrictive PFTs History Aspiration PNAs Esophageal Ulcer NSAIDs , , small bowel bx negative Recurrent UTIs CHF EF > % moderate LVH HTN Legally Blind Social History married years lives wife home . three children three grandchildren . professor First Name Titles Last Name Titles engineering University/College , retired disability spring semester due MS. Name STitle wheelchair bound . denies tobacco , alcohol , recreational drug use . personal care assistant . Family History Father CAD CVA . Mother Name NI disease . Brother diabetes . 
Physical Exam General Alert , oriented , acute distress HEENT Sclera anicteric , MMM , oropharynx clear Neck supple , JVP elevated , LAD Lungs Clear auscultation bilaterally , wheezes , rales , ronchi CV Regular rate rhythm , normal + , murmurs , rubs , gallops Abdomen soft , non tender , non distended , bowel sounds present , rebound tenderness guarding , organomegaly Ext Warm , well perfused , + pulses , clubbing , cyanosis edema Pertinent Results PM BLOOD WBC . RBC . Hgb . Hct . MCV MCH . MCHC . RDW . Plt Ct BLOOD WBC . RBC . Hgb . Hct . MCV MCH . MCHC . RDW . Plt Ct BLOOD PT . PTT . INRPT . PM BLOOD Glucose UreaN Creat . Na K . Cl HCO AnGap BLOOD Glucose UreaN Creat . Na K . Cl HCO AnGap PM BLOOD Na PM BLOOD Na BLOOD Glucose UreaN Creat . Na K . Cl HCO AnGap PM BLOOD Glucose UreaN Creat . Na K . Cl HCO AnGap BLOOD Glucose UreaN Creat . Na K . Cl HCO AnGap PM BLOOD Glucose UreaN Creat . Na K . Cl HCO AnGap BLOOD Glucose UreaN Creat . Na K . Cl HCO AnGap PM BLOOD Glucose UreaN Creat . Na K . Cl HCO AnGap BLOOD Glucose UreaN Creat . Na K . Cl HCO AnGap BLOOD Glucose UreaN Creat . Na K . Cl HCO AnGap BLOOD Glucose UreaN Creat . Na K . Cl HCO AnGap BLOOD Glucose UreaN Creat . Na K . Cl HCO AnGap BLOOD Glucose UreaN Creat . Na K . Cl HCO AnGap BLOOD Glucose UreaN Creat . Na K . Cl HCO AnGap BLOOD Glucose UreaN Creat . Na K . Cl HCO AnGap PM BLOOD Osmolal PM BLOOD Osmolal BLOOD ALT AST LDLDH AlkPhos TotBili . BLOOD Calcium . Phos . Mg . U/A nit + , LE + , WBC , RBC , Epi , bact U/A sm bld , prot/gluc WBC , RBC , Epi , bact mod U/A sm LE , WBC , RBC , Epi , bact none U/A prot , ket , lg LE WBC , RBC , Epi , bact U/A prot , mod LE WBC , RBC , Epi , bact none U/A neg leuk CULTURES BCx x neg BCx x neg UCx PSEUDOMONAS AERUGINOSA . > , ORGANISMS/ML UCx pseudomonas UCx yeast Ucx neg Ucx yeast Ucx neg c.diff neg x CXR Patchy opacity left base noted , significance setting low inspiratory volumes uncertain . CTA PE . 
Scattered patchy ground glass opacities may represent expiratory state air trapping . Renal u/s evidence abscess , hydronephrosis mass abd xray non specific bowel gas pattern , stool throughout colon , free air abd xray Stool air filled loops large small bowel consistent ileus . Liver u/s Hypoechoic right hepatic mass , measuring . cm size CT abd prelim read Arterially enhancing liver lesion fully characterized , may represent adenoma , FNH , less likely HCC . Brief Hospital Course yo male w/ progressive multiple sclerosis admitted AMS seizure GTC OSH responded mg Ativan . negative head CT found Na level . hyponatremic past often caused changes mental status . ED , treated L NS concern hypovolemic hyponatremia . time , urine osm serum osm . also CXR prelim concern pneumonia cause ADH like effect final read neagtive . Neurology consulted AMS seizure felt hyponatremia likely related recent initiation carbamezapine sensory illusions . Carbamezapine known ADH like effect cause hyponatremia . Following discontinuation carbamezapine along fluid restriction , Na increased . several days , pt appeared slightly dehydrated fluid restriction lifted . time discharge , serum Na . . past , seizures instigated underlying infection . However , upon admission afebrile leukocystosis . likely source either pneumonia UTI . suprapubic catheter neurogenic bladder day prior admission , urine sample grew pseudomonas , bacteria past . also several pneumonias past , likely frequent aspirations first CXR concerning lung infiltrate . treated one dose vancomycin cefepime pneumonia . Ultimately , repeat CXR CTA negative pneumonia . . pseudomonal bacteriuria , started ciprofloxacin . urine culture drawn prior abx inititian also grew pseudomonas . afebrile leukocytosis thought may actually colonization opposed infection . However , treated full course cipro complicated UTI . catheter changed cultures remained negative . . admission , pt afebrile hypertensive . 
However , shortly arriving floor , episode hypotension systolic . time mentating well , complaints , denied chest pain , headache , visual changes . IVFs given , however hypotension initially respond , however came eventually prior getting ICU . labile blood pressure likely secondary patients autonomic dysfunction secondary SPMS . considerations infection possible sepsis , however patient continued afebrile . Blood urine cultures negative . monitored ICU hours stable swings BP asymptomatic consistent autonomic dysfunction . Changed clonidine dosing .mg Hospital .mg TID . Maintained blood pressure medications home doses . . next day , transferred MICU returned floor . Shortly arrival , developed fever . blood urine cultures sent negative . Pneumonia ruled UTI treated medication appropriate per sensitivities . CTA negative PE . However , started meropenem treated days . still slightly febrile meropenem discontinued concern drug fever . defervesced without treatment . . However , mental status continued fluctuate despite afebrile , obvious source infection , eunatremic . occasionally aggressive would say murdered kidnapped . Neurology reconsulted feel symptoms related keppra think subclinical seizures . continued repetitive shaking moves head conscious able speak episodes . Also , despite Keppra , continued sensory illusions , mostly centered around feeling bowel movement actually . . work source infection source AMS , CTA revealed liver lesion . ultrasound multiphase liver CT describe lesion MRI implanted baclofen pump . Mr Known lastname family decided biopsy lesion time ruled completely malignancy , although unlikely . work also KUB concerning ileus continued BMs kept regular diet . . Prior discharge , mental status completely returned baseline alert oriented x longer aggressive towards staff . definite etiology elucidated hypothesized could result progression established disease . 
Medications Admission BACLOFEN , mcg/mL Kit pump BRIMONIDINE Dosage uncertain CARVEDILOL mg Tablet Hospital CARBAMEZAPINE mg Hospital CLONIDINE . mg Tablet Hospital CLOTRIMAZOLE BETAMETHASONE % . % Cream tid FENTANYL mcg/hour Patch hr FUROSEMIDE mg Tablet qd IPRATROPIUM ALBUTEROL prn LACTULOSE prn MINOCYCLINE mg Tablet Hospital MODAFINIL PROVIGIL Hospital OMEPRAZOLE Hospital OXYBUTYNIN CHLORIDE mg qhs SIMVASTATIN mg qd TRAVOPROST drop L eye day ACETAMINOPHEN prn ASCORBIC ACID Hospital BISACODYL hs CALCIUM mg Tid CRANBERRY mg Capsule Hospital ERGOCALCIFEROL VITAMIN Hospital MINERAL OIL prn OMEGA FATTY ACIDS Hospital PSYLLIUM METAMUCIL prn SENNA . mg Tablet prn Discharge Medications . Carvedilol . mg Tablet Sig Two Tablet PO BID times day . . Fentanyl mcg/hr Patch hr Sig One Patch hr Transdermal QH every hours . . Furosemide mg Tablet Sig One Tablet PO DAILY Daily . . Lactulose gram/ mL Syrup Sig Thirty ML PO QH every hours needed . . Acetaminophen mg Tablet Sig Tablets PO QH every hours needed . . Oxybutynin Chloride mg Tablet Sig Three Tablet PO QHS day bedtime . . Ascorbic Acid mg Tablet Sig One Tablet PO BID times day . . Docusate Sodium mg Capsule Sig One Capsule PO BID times day needed . . Senna . mg Tablet Sig One Tablet PO BID times day needed . . Calcium Carbonate mg Tablet , Chewable Sig One Tablet , Chewable PO TID times day . . Omeprazole mg Capsule , Delayed ReleaseE.C . Sig One Capsule , Delayed ReleaseE.C . PO BID times day . . Simvastatin mg Tablet Sig Four Tablet PO DAILY Daily . . Brimonidine . % Drops Sig One Drop Ophthalmic Hospital times day . . Modafinil mg Tablet Sig . Tablet PO BID times day . . Ciprofloxacin mg Tablet Sig One Tablet PO QH every hours days . . Clonidine . mg Tablet Sig One Tablet PO TID times day . . Bisacodyl mg Tablet , Delayed Release E.C . Sig Two Tablet , Delayed Release E.C . PO DAILY Daily needed . . Levetiracetam mg Tablet Sig Two Tablet PO BID times day . . 
Combivent mcg/Actuation Aerosol Sig One inh Inhalation twice day needed . . TRAVATAN Z . % Drops Sig One Ophthalmic day Left eye . . Cranberry mg Capsule Sig One Capsule PO twice day . . Omega Fatty Acids Capsule Sig One Capsule PO twice day . . Ergocalciferol Vitamin unit Tablet Sig One Tablet PO twice day . patient allergy listed ACE Inhibitors , therefore discharged ACE Inhibitor . communicated PCP . Discharge Disposition Home Service Facility Hospital Home Health Care Discharge Diagnosis . Multiple Sclerosis . Urinary Tract Infection , complicated . Hyponatremia . Secondary . Chronic Diastolic CHF Discharge Condition Stable vital signs . Discharge Instructions admitted altered mental status found low sodium urinary tract infection . started antibiotics urinary tract infection cipro complete week course . sodium corrected adjusting medications reducing water intake . . found abnormality liver . CT scan results pending final interpretation . provided phone number schedule appointment Hospital clinic . may necessary reimage liver take biopsy lesion seen CT scan . . medications changed . switched tegratol keppra . Please review recent medication list take medications , discard old medications list . . Please return hospital develop fevers , chills , worsening symptoms . Followup Instructions . First Name NamePattern First Name NamePattern Last Name NamePattern , MD PhoneTelephone/Fax Date/Time . . First Name Name Pattern Last Name NamePattern , MD PhoneTelephone/Fax Date/Time . . 
Hospital CLINIC Hospital Telephone/Fax Completed']",['041'],['03'],['0'],"cva communicated uti chf left ventricular hypertrophy leukocytosis congestive heart failure diabetes mellitus, insulin-dependent wheezing chief complaint ground glass opacity hypotension absence of fever urinary tract infection does communicate liver carcinoma afebrile discharge diagnosis chief complaint (finding) hydronephroses wheeze hcc cerebrovascular accident ggo tid hydronephrosis hyponatremia lvh","['00', '01', '04', '05', '08', '09']"
1,100074,"['Admission Date Discharge Date Date Birth Sex F Service SURGERY Allergies Ovral / Codeine / Sulfonamides AttendingDoctor First Name Chief Complaint bruising mild abdominal pain Major Surgical Invasive Procedure Exploratory laparotomy , debridement abdominal wall , small large bowel resection , closure Location un bag . Exploratory laparotomy . History Present Illness INDICATIONS SURGERY year old woman noted bruising mild abdominal pain large incisional hernia site . came emergency room developed profound sepsis CT scan showed intraperitoneal air . also found crepitance expanding hematoma bruising incisional hernia . patient taken emergently operating room . Past Medical History s/p MVC , s/p R AKA , ventral hernia repair w/ component seperation , anxiety Social History Mother son patients support system Family History noncontributory Physical Exam gen Intubated , secated CV +ss Pulm coarse BS diffusely Abd large Location un bag place Ext + edema Pertinent Results CT . Large ventral abdominal wall hernia two discrete defects . inferior hernia defect smaller defect contains several loops necrotic appearing bowel evidence pneumatosis possible perforation , suggesting strangulated ventral hernia . Large amount subcutaneous free air within ventral hernia sac inferiorly tracks retroperitoneally mesentery , necrotizing fascitis considered . . Likely aspiration lung bases , worse right side . Pathology Ventral hernial sac B Hernial sac acute inflammation serositis . II Abdominal wall C Skin subcutaneous tissue extensive necrosis abscess formation . III Distal ileum ascending colon , resection E L Extensive hemorrhagic necrosis transmural infarction small large intestine a. Transmural necrosis present proximal ileal resection margin . b . Viable distal colonic resection margin serositis acute inflammation focally extends subserosa muscularis . PM BLOOD WBC . RBC . Hgb . Hct . MCV MCH . MCHC . RDW . Plt Ct BLOOD WBC . RBC . Hgb . Hct . MCV MCH . MCHC . RDW . 
Plt Ct PM BLOOD WBC . RBC . Hgb . Hct . MCV MCH . MCHC . RDW . Plt Ct PM BLOOD Neuts Bands Lymphs Monos Eos Baso Atyps Metas Myelos BLOOD Neuts Bands Lymphs Monos Eos Baso Atyps Metas Myelos PM BLOOD ALT AST LDLDH AlkPhos Amylase TotBili . BLOOD ALT AST LDLDH AlkPhos Amylase TotBili . BLOOD ALT AST AlkPhos Amylase TotBili . BLOOD ALT AST AlkPhos TotBili . PM BLOOD Lipase BLOOD Lipase BLOOD Lipase PM BLOOD Cortsol . PM BLOOD Cortsol . PM BLOOD Lactate . K . BLOOD Glucose Lactate . Na K . Cl BLOOD Glucose Lactate . Na K . Cl BLOOD Lactate . BLOOD Glucose Lactate . K . Brief Hospital Course patient admitted , underwent aforementioned surgical procedures details , please see operative notes . patient returned SICU intubated sedated care . , family decided make patient CMO two exploratory laparotomies . Neuro patient sedated received paralytics times keep comfortable ventilated . received pain medications IV appropriate . CV patients vital signs routinely monitored , put vasopressin , norepinephrine epinephrine stay maintain appropriate hemodynamics . Pulmonary Vital signs routinely monitored . intubated sedated throughout admission , ventilation settings adjusted based ABG values . Serial chest x rays performed . bronchoscopy performed , aspiration feculant material right bronchus intermedius , blood clot adherent left main bronchus . GI/GU/FEN Post operatively , patient made NPO IVF . unable extubated receive nutrition . , patient made CMO . patients intake output closely monitored , IVF adjusted necessary . patients electrolytes routinely followed hospitalization , repleted necessary . ID patients white blood count fever curves closely watched signs infection . white blood count continued rise throughout admission trends , please see results section . patient septic shock multiorgan failure . vancomycin , fluconazole Zosyn stay , culture data routinely monitored . 
Endocrine patients blood sugar monitored throughout stay insulin dosing adjusted accordingly , put drip necessary . received cosyntropin cortisol stimulation test . Hematology patients complete blood count examined routinely multiple units transfusions required stay . Prophylaxis patient received subcutaneous heparin stay . patient made CMO , passed away . Medications Admission serax , amitryptiline Discharge Disposition Expired Discharge Diagnosis Perforated viscus , dead bowel , deep tissue infection . Discharge Condition deceased Discharge Instructions none Followup Instructions none Name MD Last Name NamePattern MD , MD Number']","['038', '070']","['08', '03']",['0'],death (finding) necrotizing fasciitis incisional hernia active inflammation discharge diagnosis chief complaint (finding) chief complaint septic shock blood clot passed away abdominal pain acute inflammation,['03']



## Get ICD9 Codes

### Part 1 - Get Codes from GPT-4o mini

In [12]:
### SIMPLE TEST ###
"""
res = flatten(get_codes_for_note("root", code_tree, "Tuberculosis of the bones and joints and HIV"))
print(res)
"""
#### END SIMPLE TEST ###

'\nres = flatten(get_codes_for_note("root", code_tree, "Tuberculosis of the bones and joints and HIV"))\nprint(res)\n'

In [1]:
results = []
df['Generated'] = ""
for index, row in df.iterrows():

    # Use the raw note text (stored as a single-element stringified list)
    note = ast.literal_eval(row['TEXT'])[0]
    print(f"Note: {note}")
    # Get Codes
    result = flatten(get_codes_for_note("0", code_tree, note, level=2)) # Change level here if needed

    # Add result to DF
    df.at[index, 'Generated'] = str(result)


In [14]:
# View Results

display(df[['ICD9_CODE','L1_CODES','L2_CODES', 'Generated']].head(10))

| | ICD9_CODE | L1_CODES | L2_CODES | Generated |
|---|---|---|---|---|
| 0 | ['041'] | ['0'] | ['03'] | ['01', '03'] |
| 1 | ['038', '070'] | ['0'] | ['03', '08'] | ['00', '01', '03'] |
| 2 | ['041'] | ['0'] | ['03'] | ['01', '03', '04', '08', '05'] |
| 3 | ['038'] | ['0'] | ['03'] | ['00', '01', '03'] |
| 4 | ['038'] | ['0'] | ['03'] | ['01', '03', '04', '05', '09'] |
| 5 | ['053'] | ['0'] | ['06'] | ['01', '00', '03', '04'] |
| 6 | ['038'] | ['0'] | ['03'] | ['01'] |
| 7 | ['038', '047'] | ['0'] | ['05', '03'] | ['01', '03'] |
| 8 | ['041', '038'] | ['0'] | ['03'] | ['01', '03', '04', '05', '08'] |
| 9 | ['038'] | ['0'] | ['03'] | ['01', '03', '04', '09'] |


## Score Results

#### Grade L2 Output

In [15]:
results = pd.DataFrame()


results['ICD9_CODE'] = df['ICD9_CODE'].apply(format_icd9)
results['Recall'] = df.apply(lambda x: recall_score(x['L2_CODES'], x['Generated']), axis=1)
results['Precision'] = df.apply(lambda x: precision_score(x['L2_CODES'], x['Generated']), axis=1)
results['F1 Score'] = df.apply(lambda x: f1_score(x['L2_CODES'], x['Generated']), axis=1)
display(results[['Recall', 'Precision', 'F1 Score']].mean(axis=0)*100)

Recall       70.000000
Precision    25.166667
F1 Score     34.666667
dtype: float64

#### Grade Final ICD-9 Code Output

In [24]:
results = pd.DataFrame()

results['ICD9_CODE'] = df['ICD9_CODE'].apply(format_icd9)
results['Recall'] = df.apply(lambda x: recall_score(x['ICD9_CODE'], x['Generated']), axis=1)
results['Precision'] = df.apply(lambda x: precision_score(x['ICD9_CODE'], x['Generated']), axis=1)
results['F1 Score'] = df.apply(lambda x: f1_score(x['ICD9_CODE'], x['Generated']), axis=1)

display(results[['Recall', 'Precision', 'F1 Score']].mean(axis=0)*100)

Recall       5.000000
Precision    3.333333
F1 Score     4.000000
dtype: float64

#### Results Summary

In [None]:
print(f"Recall = {round(results['Recall'].mean(),2)}")
print(f"Precision = {round(results['Precision'].mean(),2)}")

## Implement Med NLP Note Parsing

In [15]:
results = []
df['Parsed_Generated'] = ""
for index, row in df.iterrows():

    # Parse Note
    note = row['PARSED_TEXT']
    print(f"Note: {note}")

    # Get Codes
    result = flatten(get_codes_for_note("0", code_tree, note, level=2)) # Change level here if needed

    # Add result to DF
    df.at[index, 'Parsed_Generated'] = str(result)

Note: cva communicated uti chf left ventricular hypertrophy leukocytosis congestive heart failure diabetes mellitus, insulin-dependent wheezing chief complaint ground glass opacity hypotension absence of fever urinary tract infection does communicate liver carcinoma afebrile discharge diagnosis chief complaint (finding) hydronephroses wheeze hcc cerebrovascular accident ggo tid hydronephrosis hyponatremia lvh
Note: death (finding) necrotizing fasciitis incisional hernia active inflammation discharge diagnosis chief complaint (finding) chief complaint septic shock blood clot passed away abdominal pain acute inflammation
Note: cva communicated adverse drug reactions hyperkalemia rales acs - acute coronary syndrome acute renal failure chf kidney failure, acute leukocytosis congestive heart failure adverse reaction to drug severe cardiac valve stenosis wheezing chief complaint hypotension cholelithiasis constipated does communicate gestational thyrotoxicosis unstable angina severe stenosis

2024-08-28 13:53:01,804 invalid syntax (<unknown>, line 1)


Note: cva ulcerative colitis respiratory symptom gastric ulcer chorea chronic obstructive pulmonary disease hyperlipidemia uti chf dvts chronic obstructive airway disease congestive heart failure productive cough community acquired pneumonia deep vein thrombosis staphylococcus aureus rheumatic heart disease mass effect palpable mass diabetes mellitus, insulin-dependent staph wheezing chief complaint coughing productive hypotension pneumothorax constipated urinary tract infection gestational thyrotoxicosis masses palpable acquired community pneumonia discharge diagnosis chief complaint (finding) acquired hospital pneumonias respiratory signs symptoms iron deficiency anemia knee pain cerebrovascular accident tid vaginitis gtt nosocomial pneumonia constipation hypogammaglobulinemia abdominal pain pulmonary embolism
Note: left ventricular hypertrophy
Note: hypotension
Note: death (finding) aortic regurgitation adverse drug reactions esophageal cancer acute renal failure bacteria rales pass

2024-08-28 13:53:05,486 invalid syntax (<unknown>, line 1)


Note: acidosis bleeding, intracranial uti streak artifact rhabdomyolyses weight bearing azar kala suicide attempt staphylococcus aureus chronic pain streaking artifact hyperventilation ankylosing spondylitis aspiration pneumonia mass effect c . difficile diabetes mellitus, insulin-dependent staph wheezing chief complaint clostridium difficile (bacteria) pneumonia, ventilator associated hypotension absence of fever urinary tract infection gestational thyrotoxicosis tinea corporis rheumatoid arthritis intracranial hemorrhage afebrile chief complaint (finding) blood clot overdose wheeze unconscious iron deficiency anemia rhabdomyolysis tid gtt room air vap cvl tinea corporis (disorder) unconscious state


In [16]:
display(df[['ICD9_CODE','L1_CODES','L2_CODES', 'Parsed_Generated']].head(10))

| | ICD9_CODE | L1_CODES | L2_CODES | Parsed_Generated |
|---|---|---|---|---|
| 0 | ['041'] | ['0'] | ['03'] | ['00', '01', '04', '05', '08', '09'] |
| 1 | ['038', '070'] | ['0'] | ['08', '03'] | ['03'] |
| 2 | ['041'] | ['0'] | ['03'] | ['01', '04'] |
| 3 | ['038'] | ['0'] | ['03'] | ['00', '03'] |
| 4 | ['038'] | ['0'] | ['03'] | ['XX'] |
| 5 | ['053'] | ['0'] | ['06'] | ['00', '01', '04', '05', '06', '08', '13'] |
| 6 | ['038'] | ['0'] | ['03'] | ['XX'] |
| 7 | ['038', '047'] | ['0'] | ['03', '05'] | ['00'] |
| 8 | ['041', '038'] | ['0'] | ['03'] | ['XX'] |
| 9 | ['038'] | ['0'] | ['03'] | ['01', '03', '08'] |


In [17]:
results = pd.DataFrame()


results['ICD9_CODE'] = df['ICD9_CODE'].apply(format_icd9)
results['Recall'] = df.apply(lambda x: recall_score(x['L2_CODES'], x['Parsed_Generated']), axis=1)
results['Precision'] = df.apply(lambda x: precision_score(x['L2_CODES'], x['Parsed_Generated']), axis=1)
results['F1 Score'] = df.apply(lambda x: f1_score(x['L2_CODES'], x['Parsed_Generated']), axis=1)
display(results[['Recall', 'Precision', 'F1 Score']].mean(axis=0)*100)

Recall       35.000000
Precision    19.761905
F1 Score     20.833333
dtype: float64