Hi all, this is my **medical review** of some annotation issues in the NBME clinical notes competition data. 

As you will see, **two features are annotated inconsistently**: feature 009 and feature 608. I assume there are more annotation issues, since annotations of textual data are rarely 100% consistent.


# Table of Contents <a class="anchor" id="toc"></a>

* [Introduction](#intro)
* [Case 6 feature 608: Shortness of breath vs. dyspnea](#issue1)
* [Case 0 feature 9: Heart racing vs. tachycardia](#issue2)
* [Case 0 feature 7:  No annotation for "shortness of breath"](#issue3)
* [Conclusions](#conclusions)

# Introduction <a class="anchor" id="intro"></a>

The question of annotation quality was raised by *Sindhu V* in a discussion thread - https://www.kaggle.com/competitions/nbme-score-clinical-patient-notes/discussion/314436.

It was brought to my attention by *something4kag* in a comment to a previous notebook - https://www.kaggle.com/code/zahaviguy/eda-what-is-the-medical-meaning-of-nbme-features/notebook.



In [None]:
# Imports

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import re

train = pd.read_csv("../input/nbme-score-clinical-patient-notes/train.csv")
patient_notes = pd.read_csv("../input/nbme-score-clinical-patient-notes/patient_notes.csv")

train.head()

# Case 6 feature 608: Shortness of breath vs. dyspnea <a class="anchor" id="issue1"></a>

From *Sindhu V's* original post:
**id=61718_608**

* Feature Text =No shortness of breath
* Ground truth: No annotations.
* pn-history snippet: Denies nausea, vomiting, dyspnea, changes in urine or bowel function, night sweats, chills
* Expected Ground truth: Denies Dyspnea?

My thought is that "dyspnea" might not have been treated as equivalent to "shortness of breath" in the instructions given to the annotators.


Are there any more cases where feature 608 is empty and "dyspnea" can be found in the patient notes? Or is this an annotation error?

In [None]:
FEATURE_NUMBER = 608

# Rows for this feature that have no annotation (empty location list)
feature_df = train.query("`feature_num` == @FEATURE_NUMBER and `location` == '[]'")

# Collect the notes that mention "dyspnea" even though nothing was annotated
patient_ids = []
text_snippets = []

for pn_number in feature_df['pn_num']:
    pn_history = patient_notes[patient_notes["pn_num"] == pn_number]["pn_history"].iloc[0]
    # Capture up to 30 characters of context before the term
    match = re.search('.{0,30}[Dd]yspnea', pn_history)
    if match:
        patient_ids.append(pn_number)
        text_snippets.append(match.group())

print(patient_ids)
print(text_snippets)


We can see that there are 6 `pn_num` values where "denies dyspnea" was not annotated as "denies shortness of breath", so this is a recurring issue.
Let's see whether "dyspnea" was ever accepted as a correct answer:

In [None]:
FEATURE_NUMBER = 608

# Rows for this feature that do have an annotation (non-empty location list)
feature_df = train.query("`feature_num` == @FEATURE_NUMBER and `location` != '[]'")

# Collect the annotations that contain "dyspnea"
patient_ids = []
text_snippets = []

for pn_number, annotation in zip(feature_df['pn_num'], feature_df['annotation']):
    if re.search('[Dd]yspnea', annotation):
        patient_ids.append(pn_number)
        text_snippets.append(annotation)

print(patient_ids)
print(text_snippets)

There are cases where "dyspnea" was annotated as a correct answer in feature 608.

# Conclusion regarding Case 6 feature 608
There are inconsistencies in the annotation of feature 608 - **"dyspnea" is inconsistently annotated as the correct answer.**

### [Back to table of contents](#toc)

# Case 0 feature 9: Heart racing vs. tachycardia <a class="anchor" id="issue2"></a>

From Sindhu V's original post:
**id=01133_009**

* Feature Text=heart pounding OR heart racing.
* Ground truth: heart racing.
* pn-history snippet: is a 17 yo M presenting with epidosidic heart racing. This started 2-3 mos ago and he has had 5-6 episodes of tachycardia during this time.
* Expected: heart racing and tachycardia?

In the [medical EDA for this competition](https://www.kaggle.com/code/zahaviguy/eda-what-is-the-medical-meaning-of-nbme-features) I proposed that the correct answer to feature 9 was "palpitations", which is different from "tachycardia".

Palpitations = feelings of having a fast-beating, fluttering or pounding heart. ([Mayo Clinic information page](https://www.mayoclinic.org/diseases-conditions/heart-palpitations/symptoms-causes/syc-20373196))
Tachycardia = a heart rate over 100 beats a minute. ([Mayo Clinic information page](https://www.mayoclinic.org/diseases-conditions/tachycardia/symptoms-causes/syc-20355127))

Let's examine feature 9 more closely and see some examples.

In [None]:
FEATURE_NUMBER = 9

# All rows for this feature, annotated or not
feature_df = train.query("`feature_num` == @FEATURE_NUMBER")
feature_df.head()

The first few examples do not have "tachycardia" annotated. Is "tachycardia" ever annotated as the correct answer for feature 9?

In [None]:
FEATURE_NUMBER = 9

# Rows for this feature that do have an annotation (non-empty location list)
feature_df = train.query("`feature_num` == @FEATURE_NUMBER and `location` != '[]'")

# Collect the annotations that contain "tachycardia"
patient_ids = []
text_snippets = []

for pn_number, annotation in zip(feature_df['pn_num'], feature_df['annotation']):
    if re.search('[Tt]achycardia', annotation):
        patient_ids.append(pn_number)
        text_snippets.append(annotation)

print(patient_ids)
print(text_snippets)

The answer is **yes**, "tachycardia" is sometimes annotated as the correct answer. Let's look at examples where "tachycardia" appears in the patient notes and compare them with the feature 9 annotation.

In [None]:
FEATURE_NUMBER = 9

# All rows for this feature, annotated or not
feature_df = train.query("`feature_num` == @FEATURE_NUMBER")

# Collect the notes that mention "tachycardia", together with their annotation
patient_ids = []
text_snippets = []
annotation_snippets = []

for pn_number, annotation in zip(feature_df['pn_num'], feature_df['annotation']):
    pn_history = patient_notes[patient_notes["pn_num"] == pn_number]["pn_history"].iloc[0]
    # Capture up to 30 characters of context before the term
    match = re.search('.{0,30}[Tt]achycardia', pn_history)
    if match:
        patient_ids.append(pn_number)
        text_snippets.append(match.group())
        annotation_snippets.append(annotation)

print(patient_ids)
print(text_snippets)
print(annotation_snippets)

Of the 4 patient notes containing the word "tachycardia", 2 had "tachycardia" annotated in feature 9 and 2 did not.
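The "annotated vs. not annotated" tally above can also be computed in one pass. The following is only a minimal sketch: `term_annotation_counts` and `toy_rows` are hypothetical names introduced here, the rows are made up for illustration, and the sketch assumes the `annotation` column stores the string form of a Python list (as the `location == '[]'` filter suggests).

```python
import ast
import re


def term_annotation_counts(rows, pattern):
    """Count notes that mention `pattern`, split by whether an annotation span matches it.

    `rows` is an iterable of (pn_history, annotation) pairs, where `annotation`
    is assumed to be the string form of a list, e.g. "['tachycardia']".
    """
    regex = re.compile(pattern, flags=re.IGNORECASE)
    counts = {"annotated": 0, "not_annotated": 0}
    for pn_history, annotation in rows:
        if not regex.search(pn_history):
            continue  # the note never mentions the term
        spans = ast.literal_eval(annotation)  # "['a', 'b']" -> ['a', 'b']
        if any(regex.search(span) for span in spans):
            counts["annotated"] += 1
        else:
            counts["not_annotated"] += 1
    return counts


# Made-up rows for illustration only (not taken from the competition data)
toy_rows = [
    ("Denies nausea, vomiting, dyspnea", "[]"),
    ("No dyspnea or chest pain", "['No dyspnea']"),
    ("No fever or chills", "[]"),
]
print(term_annotation_counts(toy_rows, r"dyspnea"))
# {'annotated': 1, 'not_annotated': 1}
```

With the notebook's dataframes, the pairs could be built by merging the rows of one feature in `train` with `patient_notes` on `pn_num` and zipping the `pn_history` and `annotation` columns.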

# Conclusion regarding Case 0 feature 9
There are inconsistencies in the annotation of feature 9 - **"tachycardia" is inconsistently annotated as the correct answer.**

### [Back to table of contents](#toc)

# Case 0 feature 7:  No annotation for "shortness of breath" <a class="anchor" id="issue3"></a>

From a comment by *something4kag* in the original discussion:
Another example is pn_num 16 case_num 0 feature_num 7 Shortness of breath with no annotation, no location.
But pn_history excerpt has "Denies shortness of breath, … "

These are the patient note and the annotations for pn_num 16:

In [None]:
# The following cell was copied from SANSKAR HASIJA's excellent EDA
PATIENT_IDX = 16

patient_df = train[train['pn_num'] == PATIENT_IDX]

# \033[94m and \033[92m are ANSI color codes (blue and green)
print("\033[94mPatient Notes - ")
print('\033[94m', patient_notes[patient_notes["pn_num"] == PATIENT_IDX]["pn_history"].iloc[0])
print("------------")
print('\033[92mAnnotations:')
for annotation in patient_df["annotation"]:
    print('\033[92m', annotation)

In the medical EDA for this competition I proposed that the correct answer to feature 7 was a complaint of "shortness of breath"/"SOB". In this patient note, however, the patient "Denies shortness of breath", which is not the expected answer.
The annotation here is consistent with the medical case, since "denies shortness of breath" is not a correct answer for this feature.

Let's examine feature 7 more closely and see some examples.


In [None]:
FEATURE_NUMBER = 7

# Rows for this feature that do have an annotation (non-empty location list)
feature_df = train.query("`feature_num` == @FEATURE_NUMBER and `location` != '[]'")
feature_df.head(10)

Are there additional cases where "shortness of breath" is not annotated?

In [None]:
FEATURE_NUMBER = 7

# Rows for this feature that have no annotation (empty location list)
feature_df = train.query("`feature_num` == @FEATURE_NUMBER and `location` == '[]'")

# Collect the notes that mention shortness of breath even though nothing was annotated
patient_ids = []
text_snippets = []
annotation_snippets = []

for pn_number, annotation in zip(feature_df['pn_num'], feature_df['annotation']):
    pn_history = patient_notes[patient_notes["pn_num"] == pn_number]["pn_history"].iloc[0]
    # Capture up to 30 characters of context before the term
    match = re.search('.{0,30}[Ss]hortness of breath|.{0,30}SOB', pn_history)
    if match:
        patient_ids.append(pn_number)
        text_snippets.append(match.group())
        annotation_snippets.append(annotation)

print(patient_ids)
print(text_snippets)
print(annotation_snippets)

Notes stating that the patient **did not** have shortness of breath are consistently left unannotated in feature 7.
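To separate denied mentions from affirmed ones automatically, one rough option is to look for a negation cue shortly before each mention. This is a minimal sketch under stated assumptions, not a validated clinical negation detector: the cue list, the 40-character window, and the names `NEGATION_CUES`, `SOB_PATTERN` and `classify_sob_mentions` are all hypothetical choices made for illustration, and the example text is made up.

```python
import re

# Hypothetical cue list; real notes will use many more phrasings
NEGATION_CUES = r"\b(?:denies|denied|no|not|without|negative for)\b"
# Same terms as the search above; SOB is matched case-sensitively as a whole word
SOB_PATTERN = r"[Ss]hortness of breath|\bSOB\b"


def classify_sob_mentions(text, window=40):
    """Label each shortness-of-breath mention as 'negated' or 'affirmed'.

    A mention counts as negated when a negation cue appears within `window`
    characters before it and after the last period preceding it.
    """
    labels = []
    for match in re.finditer(SOB_PATTERN, text):
        context = text[max(0, match.start() - window):match.start()]
        # Only look back as far as the start of the current sentence
        context = context.split(".")[-1]
        negated = re.search(NEGATION_CUES, context, flags=re.IGNORECASE) is not None
        labels.append((match.group(), "negated" if negated else "affirmed"))
    return labels


# Made-up text for illustration only (not taken from the competition data)
example = "Denies shortness of breath, chest pain. Reports SOB on exertion."
print(classify_sob_mentions(example))
# [('shortness of breath', 'negated'), ('SOB', 'affirmed')]
```

A heuristic like this will miss cues that sit far from the term in a long list of denied symptoms, so its output is best treated as a shortlist for manual review.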

# Conclusion regarding Case 0 feature 7
The annotations are consistent.

### [Back to table of contents](#toc)

# Conclusions <a class="anchor" id="conclusions"></a>

In this notebook we reviewed some issues with the annotation of the data. **We found 2 cases of inconsistent annotations of specific features.** I assume that there are more annotation issues as textual data annotations are usually not 100% consistent.
You can ask additional questions about annotation issues in the comments of this notebook.

**Good luck!**

### [Back to table of contents](#toc)