ICD 10 CM Codes Seem Incorrect. #1710

Open
vanessailana opened this issue Feb 14, 2024 · 5 comments
Comments

@vanessailana

Hello, I am conducting a study on the ICD-10 codes from the PhysioNet dataset. I am examining a subset of the clinical notes, and I have noticed that some notes do not mention the ICD diagnosis associated with them. For instance, a patient can be assigned the ICD-10-CM code for hypothermia even though none of their clinical notes mention hypothermia.

How were these codes derived?

Best,
Vanessa

@alistairewj
Member

Can you give the hadm_id for your example?

The codes are more or less directly from the source EHR. These are billed after hospital discharge, and codes are determined by reviewing signed notes from healthcare providers. It is possible that it is an incorrect billing code, or it is possible that we did not include the relevant note that mentions the diagnosis.

@vanessailana
Author

vanessailana commented Feb 14, 2024

If it is an incorrect billing code, does that mean the dataset is still reliable for ICD code prediction? This has happened in maybe 20 notes, and also in the https://eicu-crd.mit.edu/ dataset. If you would like, I can show you the results I found.

@vanessailana
Author

vanessailana commented Feb 14, 2024

@alistairewj For example, for hadm_id=21037483, the only ICD-10-CM code I see associated with it is G43.801. However, when I read the note, I see that the individual has other diagnoses, such as:

Past Medical History:
Complex migraines (this was covered)
Asthma
Medullary kidney cysts
Strabismus s/p surgery
Vertigo/vertiginous migraine
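
For anyone who wants to reproduce this kind of check, here is a minimal sketch of listing the codes attached to a single admission. It assumes rows shaped like the `diagnoses_icd` table (`hadm_id`, `seq_num`, `icd_code`, `icd_version`) with codes stored without the decimal point. The helper names, the `hadm_id` values, and the sample rows are hypothetical and are not real records.

```python
def format_icd10cm(code: str) -> str:
    """Insert the decimal point after the third character, e.g. 'G43801' -> 'G43.801'."""
    code = code.strip().upper().replace(".", "")
    return code if len(code) <= 3 else f"{code[:3]}.{code[3:]}"


def codes_for_admission(rows, hadm_id, icd_version=10):
    """Return the formatted codes billed for one admission, ordered by seq_num."""
    matches = [
        r for r in rows
        if r["hadm_id"] == hadm_id and r["icd_version"] == icd_version
    ]
    matches.sort(key=lambda r: r["seq_num"])
    return [format_icd10cm(r["icd_code"]) for r in matches]


# Hypothetical rows for illustration only.
rows = [
    {"hadm_id": 1, "seq_num": 2, "icd_code": "Z87891", "icd_version": 10},
    {"hadm_id": 1, "seq_num": 1, "icd_code": "N183", "icd_version": 10},
    {"hadm_id": 2, "seq_num": 1, "icd_code": "G20", "icd_version": 10},
]

print(codes_for_admission(rows, 1))
```

Comparing this list against the diagnoses named in the note makes it easier to see which conditions were never billed.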

@alistairewj
Member

I should have been clearer - it's always possible for there to be a few errors (they will happen if you look at 100k+ hospitalizations). However, it should be rare. If you start to see a systematic issue that's when I'd start to worry that we made a mistake in the build. One ICD code for a hospitalization seems low, so I'll take a look. I will say these are only the hospital billed codes; we don't have any provider billing data in MIMIC.
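
One way to tell an isolated error from a systematic issue is to count how many codes each hospitalization carries and look at the admissions with very few. The following is only a sketch under assumptions: the rows mimic the `hadm_id` column of a diagnoses table, and the function name, threshold, and sample rows are hypothetical. Admissions with no billed codes at all would not appear in such a table, so they are not covered here.

```python
from collections import Counter


def admissions_with_few_codes(rows, max_codes=1):
    """Return hadm_ids that have at most `max_codes` diagnosis codes."""
    counts = Counter(r["hadm_id"] for r in rows)
    return sorted(h for h, n in counts.items() if n <= max_codes)


# Hypothetical rows for illustration only.
rows = [
    {"hadm_id": 1, "icd_code": "N183"},
    {"hadm_id": 1, "icd_code": "Z87891"},
    {"hadm_id": 1, "icd_code": "E872"},
    {"hadm_id": 2, "icd_code": "G20"},
    {"hadm_id": 3, "icd_code": "R51"},
]

print(admissions_with_few_codes(rows))
```

If the share of such admissions is large, that points toward a build problem rather than occasional billing errors.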

@Anaudia

Anaudia commented Feb 20, 2024

Hey, we are currently facing a similar problem. We tried to use the MIMIC-IV dataset to train a model for ICD-10-CM coding. However, we quickly realized that some of the codes in the dataset have no corresponding information in the discharge summaries. So far we have evaluated a few hundred discharge summaries and found that information is missing most often for the following codes:

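| icd_code | missing_count | count | missing_proportion |
|----------|---------------|-------|--------------------|
| I471     | 23            | 23    | 1.000000           |
| I482     | 22            | 22    | 1.000000           |
| M545     | 21            | 21    | 1.000000           |
| I272     | 20            | 20    | 1.000000           |
| I472     | 17            | 17    | 1.000000           |
| G20      | 12            | 12    | 1.000000           |
| Z9114    | 28            | 28    | 1.000000           |
| E872     | 65            | 65    | 1.000000           |
| R740     | 32            | 32    | 1.000000           |
| G92      | 40            | 40    | 1.000000           |
| N183     | 58            | 60    | 0.966667           |
| R51      | 19            | 20    | 0.950000           |
| T814XXA  | 11            | 12    | 0.916667           |
| Z23      | 26            | 40    | 0.650000           |
| Z87891   | 151           | 316   | 0.477848           |
| F17210   | 46            | 100   | 0.460000           |
| Z006     | 13            | 30    | 0.433333           |
| N400     | 25            | 63    | 0.396825           |
| Y929     | 35            | 90    | 0.388889           |
| Y92239   | 19            | 49    | 0.387755           |
| Y838     | 15            | 42    | 0.357143           |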

We believe some of these codes, such as Z87891, may be related to an issue that was previously raised. If you are interested, I can provide you with all the hadm_ids and the corresponding codes for which information is missing.
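
For reference, the table columns relate as `missing_proportion = missing_count / count`. A minimal sketch of that aggregation follows, assuming each review is recorded as a pair of the code and whether the discharge summary supports it. The function name and the input shape are hypothetical, and the sample input only rebuilds the N183 and Z23 counts from the table above.

```python
from collections import Counter


def missing_summary(reviews):
    """Aggregate (icd_code, mentioned) pairs into per-code missing statistics.

    Returns tuples of (icd_code, missing_count, count, missing_proportion),
    sorted by proportion (descending) and then by code.
    """
    total = Counter()
    missing = Counter()
    for icd_code, mentioned in reviews:
        total[icd_code] += 1
        if not mentioned:
            missing[icd_code] += 1
    rows = [(code, missing[code], n, missing[code] / n) for code, n in total.items()]
    rows.sort(key=lambda r: (-r[3], r[0]))
    return rows


# Rebuild two rows of the table: N183 (58 of 60 missing) and Z23 (26 of 40 missing).
reviews = (
    [("N183", False)] * 58 + [("N183", True)] * 2
    + [("Z23", False)] * 26 + [("Z23", True)] * 14
)

for code, miss, n, prop in missing_summary(reviews):
    print(f"{code:<8} {miss:>4} {n:>4} {prop:.6f}")
```

How "mentioned" is decided (manual review, string matching, or something else) is left open here; the aggregation is the same either way.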
