## Error Analysis for CAD and Hyperlipidemia Tag Predictions (BERT)

In [1]:
import os
import string
import random
import pandas as pd
from pandas import DataFrame
import matplotlib.pyplot as plt

In [2]:
# SK-learn libraries for evaluation.
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
from sklearn.metrics import accuracy_score


In [3]:
import numpy as np

### Test LABELS for TOKENS in TEST Dataset against BERT Outputs

The BERT classifier has returned results for the tokens passed in the 'test.tsv' file. The returned values are class probabilities, which need to be converted into class labels by taking the class with the highest probability. Each predicted label is then compared against the actual label produced by the code above, which extracts the IO-coding from the XML files. This is a brute-force, manual way of verifying the validity of the predictions.


Read in the results from the BERT predictions and add them to the above dataset.
The above dataset is derived by applying IO-coding in the same way as for the training set, so its labels reflect the annotation process. Now we have to read in the predictions from BERT, which are class probabilities across all classes, and merge them with the above dataset for comparison and error analysis.
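
As a minimal sketch of the conversion described above, the snippet below turns per-class probabilities into labels by picking the highest-probability column. The probabilities are made-up toy values, and the class-to-label mapping mirrors the CAD mapping used later in this notebook; both are illustrative assumptions, not the actual BERT output.

```python
import pandas as pd

# Toy probabilities (hypothetical values, not real BERT output).
probs = pd.DataFrame(
    [[0.90, 0.05, 0.03, 0.01, 0.01],
     [0.10, 0.70, 0.10, 0.05, 0.05],
     [0.20, 0.10, 0.60, 0.05, 0.05]],
    columns=["Class0", "Class1", "Class2", "Class3", "Class4"],
)

# Assumed mapping from output column to label (same as the CAD mapping below).
class_to_label = {
    "Class0": "Other",
    "Class1": "event",
    "Class2": "mention",
    "Class3": "symptom",
    "Class4": "test",
}

# Restrict idxmax to the probability columns so the cell stays re-runnable
# after extra columns such as predClass have been added.
prob_cols = list(class_to_label)
probs["predClass"] = probs[prob_cols].idxmax(axis=1)
probs["predLabel"] = probs["predClass"].map(class_to_label)
print(probs[["predClass", "predLabel"]])
```

Selecting the probability columns explicitly is a small design choice: `idxmax` over the whole frame would fail once a string column is present.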

### Data File Names

* Test files with labels and filenames: /data_for_bert_sent/test_files_with_labels/*_testfile.csv
* BERT label mapping: /data_for_bert_sent/test_files_with_labels/*_labelmapping.csv
* BERT evaluation: /data_for_bert_sent/BERT_run_results/*_eval_results.txt


In [4]:
# __file__ is not defined inside a notebook, so print the current working directory instead
print(os.getcwd())

C:\Users\Kalyan\Documents\Anu\W266 - NLP\Final Project\lheart-disease-risk-prediction\Code


### CAD Indicator

In [5]:
# read in the test files with labels

CI_test = pd.read_csv("data_for_bert_sent/test_files_with_labels/cad_ind_testfile.csv")

In [6]:
CI_test.rename( columns={'Unnamed: 0' :'sentenceId'}, inplace=True )

In [7]:
CI_test.head(10)

Unnamed: 0,sentenceId,sentence,label,file
0,0,Record date: 2080-02-18,Other,110-03.xml
1,1,SDU JAR Admission Note,Other,110-03.xml
2,2,Name: \t Yosef Villegas,Other,110-03.xml
3,3,MR:\t8249813,Other,110-03.xml
4,4,DOA: \t2/17/80,Other,110-03.xml
5,5,PCP: Gilbert Perez,Other,110-03.xml
6,6,Attending: YBARRA,Other,110-03.xml
7,7,CODE: FULL,Other,110-03.xml
8,8,HPI: 70 yo M with NIDDM admitted for cath aft...,test,110-03.xml
9,9,Pt has had increasing CP and SOB on exertion f...,Other,110-03.xml


In [11]:
# read in the test results captured for the BERT CAD model and assign column names
# NOTE: if the file has no header row, pass header=None (as done for the hyperlipidemia
# results below) so that the first prediction is not consumed as a header
bert_CI_results = pd.read_csv("data_for_bert_sent/BERT_run_results/sentence_based_output_data_cad_output_results_test_results.tsv", sep='\t')

bert_CI_results.columns=["Class0", "Class1", "Class2", "Class3", "Class4"]

In [12]:
bert_CI_results.head()

Unnamed: 0,Class0,Class1,Class2,Class3,Class4
0,0.081534,0.253442,0.007367,0.403286,0.25437
1,0.135596,0.200983,0.007128,0.403985,0.252308
2,0.146809,0.201183,0.007665,0.401277,0.243066
3,0.05985,0.297005,0.007313,0.415374,0.220459
4,0.073586,0.400443,0.007947,0.301205,0.216819


In [13]:
bert_CI_results['predClass'] = bert_CI_results.idxmax(axis=1)

In [14]:
bert_CI_results.head()

Unnamed: 0,Class0,Class1,Class2,Class3,Class4,predClass
0,0.081534,0.253442,0.007367,0.403286,0.25437,Class3
1,0.135596,0.200983,0.007128,0.403985,0.252308,Class3
2,0.146809,0.201183,0.007665,0.401277,0.243066,Class3
3,0.05985,0.297005,0.007313,0.415374,0.220459,Class3
4,0.073586,0.400443,0.007947,0.301205,0.216819,Class1


In [15]:
bert_CI_results['predClass'].value_counts()

Class0    20778
Class2      357
Class1      253
Class3       25
Name: predClass, dtype: int64

In [16]:
# Map BERT output columns to CAD indicator labels; anything else falls back to 'Other'
CI_LABELS = {'Class1': 'event', 'Class2': 'mention', 'Class3': 'symptom', 'Class4': 'test'}

def CI_set_labels(classlabel):
    return CI_LABELS.get(classlabel, 'Other')

In [18]:
bert_CI_results['predLabel'] = bert_CI_results['predClass'].apply(CI_set_labels)

bert_CI_results.head(10)


Unnamed: 0,Class0,Class1,Class2,Class3,Class4,predClass,predLabel
0,0.081534,0.253442,0.007367,0.403286,0.25437,Class3,symptom
1,0.135596,0.200983,0.007128,0.403985,0.252308,Class3,symptom
2,0.146809,0.201183,0.007665,0.401277,0.243066,Class3,symptom
3,0.05985,0.297005,0.007313,0.415374,0.220459,Class3,symptom
4,0.073586,0.400443,0.007947,0.301205,0.216819,Class1,event
5,0.047652,0.45068,0.007981,0.280313,0.213373,Class1,event
6,0.142763,0.186687,0.007067,0.455583,0.2079,Class3,symptom
7,0.162023,0.168344,0.006922,0.455268,0.207443,Class3,symptom
8,0.044175,0.469306,0.008372,0.272334,0.205813,Class1,event
9,0.081133,0.295469,0.006694,0.413121,0.203584,Class3,symptom


In [19]:
# validating the counts by label
bert_CI_results['predLabel'].value_counts()

Other      20778
mention      357
event        253
symptom       25
Name: predLabel, dtype: int64

In [20]:
CI_combined = pd.concat([CI_test, bert_CI_results], axis=1)

In [21]:
CI_combined.head()

Unnamed: 0,sentenceId,sentence,label,file,Class0,Class1,Class2,Class3,Class4,predClass,predLabel
0,0,Record date: 2080-02-18,Other,110-03.xml,0.081534,0.253442,0.007367,0.403286,0.25437,Class3,symptom
1,1,SDU JAR Admission Note,Other,110-03.xml,0.135596,0.200983,0.007128,0.403985,0.252308,Class3,symptom
2,2,Name: \t Yosef Villegas,Other,110-03.xml,0.146809,0.201183,0.007665,0.401277,0.243066,Class3,symptom
3,3,MR:\t8249813,Other,110-03.xml,0.05985,0.297005,0.007313,0.415374,0.220459,Class3,symptom
4,4,DOA: \t2/17/80,Other,110-03.xml,0.073586,0.400443,0.007947,0.301205,0.216819,Class1,event
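
`pd.concat(..., axis=1)` pairs rows by index label rather than by position, so a length or index mismatch between the test file and the prediction file produces NaN rows instead of an error. The helper below is a hedged sketch of a guard for that; `concat_checked` and the toy frames are hypothetical names introduced for illustration.

```python
import pandas as pd

def concat_checked(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Column-wise concat that refuses to combine frames of different length."""
    if len(left) != len(right):
        raise ValueError(f"row count mismatch: {len(left)} vs {len(right)}")
    # Reset both indexes so rows are paired by position, not by index label.
    return pd.concat(
        [left.reset_index(drop=True), right.reset_index(drop=True)], axis=1
    )

# Toy frames with deliberately different index labels.
sentences = pd.DataFrame(
    {"sentence": ["a", "b", "c"], "label": ["Other", "event", "Other"]}
)
preds = pd.DataFrame({"predLabel": ["Other", "Other", "Other"]}, index=[10, 11, 12])

combined = concat_checked(sentences, preds)
print(combined)
```

With a plain `pd.concat([sentences, preds], axis=1)` the toy frames above would yield six rows, half of them NaN, because the index labels do not overlap.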


In [28]:
CI_combined[CI_combined['predLabel']!='Other']

Unnamed: 0,sentenceId,sentence,label,file,Class0,Class1,Class2,Class3,Class4,predClass,predLabel
0,0,Record date: 2080-02-18,Other,110-03.xml,0.081534,0.253442,0.007367,0.403286,0.254370,Class3,symptom
1,1,SDU JAR Admission Note,Other,110-03.xml,0.135596,0.200983,0.007128,0.403985,0.252308,Class3,symptom
2,2,Name: \t Yosef Villegas,Other,110-03.xml,0.146809,0.201183,0.007665,0.401277,0.243066,Class3,symptom
3,3,MR:\t8249813,Other,110-03.xml,0.059850,0.297005,0.007313,0.415374,0.220459,Class3,symptom
4,4,DOA: \t2/17/80,Other,110-03.xml,0.073586,0.400443,0.007947,0.301205,0.216819,Class1,event
5,5,PCP: Gilbert Perez,Other,110-03.xml,0.047652,0.450680,0.007981,0.280313,0.213373,Class1,event
6,6,Attending: YBARRA,Other,110-03.xml,0.142763,0.186687,0.007067,0.455583,0.207900,Class3,symptom
7,7,CODE: FULL,Other,110-03.xml,0.162023,0.168344,0.006922,0.455268,0.207443,Class3,symptom
8,8,HPI: 70 yo M with NIDDM admitted for cath aft...,test,110-03.xml,0.044175,0.469306,0.008372,0.272334,0.205813,Class1,event
9,9,Pt has had increasing CP and SOB on exertion f...,Other,110-03.xml,0.081133,0.295469,0.006694,0.413121,0.203584,Class3,symptom


In [23]:
CI_test_labels = CI_combined['label']
CI_pred_labels = CI_combined['predLabel']

#print(type(CI_test_labels))

In [24]:
accuracy_score(CI_test_labels, CI_pred_labels)

0.9356932704431887

In [25]:
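# classification_report expects (y_true, y_pred). The report shown below was generated with
# the arguments in the opposite order, so its precision and recall columns are swapped and
# its support column counts predicted rather than true labels.
print(classification_report(CI_test_labels, CI_pred_labels))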

              precision    recall  f1-score   support

       Other       0.97      0.96      0.97     20778
       event       0.02      0.02      0.02       253
     mention       0.00      0.00      0.00       357
     symptom       0.00      0.00      0.00        25
        test       0.00      0.00      0.00         0

   micro avg       0.94      0.94      0.94     21413
   macro avg       0.20      0.20      0.20     21413
weighted avg       0.94      0.94      0.94     21413




In [26]:
unique_label = np.unique(CI_test_labels)
print(pd.DataFrame(confusion_matrix(CI_test_labels, CI_pred_labels, labels=unique_label), 
                   index=['true:{:}'.format(x) for x in unique_label], 
                   columns=['pred:{:}'.format(x) for x in unique_label]))

              pred:Other  pred:event  pred:mention  pred:symptom  pred:test
true:Other         20030         238           350            24          0
true:event           258           5             2             0          0
true:mention         321           7             1             0          0
true:symptom         104           0             4             0          0
true:test             65           3             0             1          0
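
The per-class figures can be re-derived directly from the confusion matrix above. The sketch below copies the CAD matrix (rows are true labels, columns are predicted labels) and computes accuracy, precision, recall, and an always-'Other' baseline using only the standard library; the baseline comparison is an added illustration, not part of the original analysis.

```python
labels = ["Other", "event", "mention", "symptom", "test"]

# Rows are true labels, columns are predicted labels (copied from the output above).
cm = [
    [20030, 238, 350, 24, 0],
    [258, 5, 2, 0, 0],
    [321, 7, 1, 0, 0],
    [104, 0, 4, 0, 0],
    [65, 3, 0, 1, 0],
]

total = sum(sum(row) for row in cm)
accuracy = sum(cm[i][i] for i in range(len(labels))) / total
# Accuracy of a trivial model that labels every sentence with the largest true class.
majority_baseline = max(sum(row) for row in cm) / total

print(f"accuracy={accuracy:.4f}  always-Other baseline={majority_baseline:.4f}")
for i, name in enumerate(labels):
    true_count = sum(cm[i])                 # support: sentences truly in this class
    pred_count = sum(row[i] for row in cm)  # sentences predicted as this class
    recall = cm[i][i] / true_count if true_count else 0.0
    precision = cm[i][i] / pred_count if pred_count else 0.0
    print(f"{name:8s} precision={precision:.3f} recall={recall:.3f} support={true_count}")
```

On these numbers the overall accuracy (about 0.936) sits below the always-'Other' baseline (about 0.964), which suggests that accuracy alone says little here given the class imbalance.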


### Hyperlipidemia Indicator

In [30]:
# read in the test files with labels

HI_test = pd.read_csv("data_for_bert_sent/test_files_with_labels/hyperlipidemia_ind_testfile.csv")

In [31]:
HI_test.rename( columns={'Unnamed: 0' :'sentenceId'}, inplace=True )

In [32]:
HI_test.head(10)

Unnamed: 0,sentenceId,sentence,label,file
0,0,Record date: 2080-02-18,Other,110-03.xml
1,1,SDU JAR Admission Note,Other,110-03.xml
2,2,Name: \t Yosef Villegas,Other,110-03.xml
3,3,MR:\t8249813,Other,110-03.xml
4,4,DOA: \t2/17/80,Other,110-03.xml
5,5,PCP: Gilbert Perez,Other,110-03.xml
6,6,Attending: YBARRA,Other,110-03.xml
7,7,CODE: FULL,Other,110-03.xml
8,8,HPI: 70 yo M with NIDDM admitted for cath aft...,Other,110-03.xml
9,9,Pt has had increasing CP and SOB on exertion f...,Other,110-03.xml


In [34]:
# read in the test results captured for BERT Hyperlipidemia model and specify columns as the actual file has no header
bert_HI_results = pd.read_csv("data_for_bert_sent/BERT_run_results/hyperlipidemia_test_results.tsv", sep='\t',header=None)
 
bert_HI_results.columns=["Class0", "Class1", "Class2", "Class3"]

In [35]:
bert_HI_results.head()

Unnamed: 0,Class0,Class1,Class2,Class3
0,0.999941,2.4e-05,1.1e-05,2.4e-05
1,0.999939,2.4e-05,1.1e-05,2.6e-05
2,0.999814,4.5e-05,2.2e-05,0.000119
3,0.99975,6.5e-05,2.9e-05,0.000156
4,0.999768,6.1e-05,2.7e-05,0.000144


In [36]:
bert_HI_results['predClass'] = bert_HI_results.idxmax(axis=1)

In [37]:
bert_HI_results.head()

Unnamed: 0,Class0,Class1,Class2,Class3,predClass
0,0.999941,2.4e-05,1.1e-05,2.4e-05,Class0
1,0.999939,2.4e-05,1.1e-05,2.6e-05,Class0
2,0.999814,4.5e-05,2.2e-05,0.000119,Class0
3,0.99975,6.5e-05,2.9e-05,0.000156,Class0
4,0.999768,6.1e-05,2.7e-05,0.000144,Class0


In [38]:
bert_HI_results['predClass'].value_counts()

Class0    24943
Class3      338
Name: predClass, dtype: int64

In [39]:
# Map BERT output columns to hyperlipidemia indicator labels; anything else falls back to 'Other'
HI_LABELS = {'Class1': 'high LDL', 'Class2': 'high chol.', 'Class3': 'mention'}

def HI_set_labels(classlabel):
    return HI_LABELS.get(classlabel, 'Other')

In [40]:
bert_HI_results['predLabel'] = bert_HI_results['predClass'].apply(HI_set_labels)

bert_HI_results.head(10)


Unnamed: 0,Class0,Class1,Class2,Class3,predClass,predLabel
0,0.999941,2.4e-05,1.1e-05,2.4e-05,Class0,Other
1,0.999939,2.4e-05,1.1e-05,2.6e-05,Class0,Other
2,0.999814,4.5e-05,2.2e-05,0.000119,Class0,Other
3,0.99975,6.5e-05,2.9e-05,0.000156,Class0,Other
4,0.999768,6.1e-05,2.7e-05,0.000144,Class0,Other
5,0.999943,2.2e-05,1e-05,2.5e-05,Class0,Other
6,0.999944,2e-05,1e-05,2.6e-05,Class0,Other
7,0.999944,2e-05,9e-06,2.6e-05,Class0,Other
8,0.999938,2.6e-05,1e-05,2.6e-05,Class0,Other
9,0.999939,2.4e-05,1e-05,2.6e-05,Class0,Other


In [41]:
# validating the counts by label
bert_HI_results['predLabel'].value_counts()

Other      24943
mention      338
Name: predLabel, dtype: int64

In [42]:
HI_combined = pd.concat([HI_test, bert_HI_results], axis=1)

In [43]:
HI_combined.head()

Unnamed: 0,sentenceId,sentence,label,file,Class0,Class1,Class2,Class3,predClass,predLabel
0,0,Record date: 2080-02-18,Other,110-03.xml,0.999941,2.4e-05,1.1e-05,2.4e-05,Class0,Other
1,1,SDU JAR Admission Note,Other,110-03.xml,0.999939,2.4e-05,1.1e-05,2.6e-05,Class0,Other
2,2,Name: \t Yosef Villegas,Other,110-03.xml,0.999814,4.5e-05,2.2e-05,0.000119,Class0,Other
3,3,MR:\t8249813,Other,110-03.xml,0.99975,6.5e-05,2.9e-05,0.000156,Class0,Other
4,4,DOA: \t2/17/80,Other,110-03.xml,0.999768,6.1e-05,2.7e-05,0.000144,Class0,Other


In [44]:
HI_combined[HI_combined['predLabel']!='Other']

Unnamed: 0,sentenceId,sentence,label,file,Class0,Class1,Class2,Class3,predClass,predLabel
18,18,Hyperlipidemia,mention,110-03.xml,0.005793,0.000736,0.000558,0.992913,Class3,mention
104,104,hyperlipidemia,mention,110-04.xml,0.005793,0.000736,0.000558,0.992913,Class3,mention
185,185,His past medical history is significant for hy...,mention,112-02.xml,0.005675,0.000727,0.000548,0.993050,Class3,mention
227,227,His past medical history is significant for hy...,mention,112-03.xml,0.005675,0.000727,0.000548,0.993050,Class3,mention
265,265,"He is a 54-year-old man with obesity, dyslipid...",mention,112-04.xml,0.005495,0.000734,0.000553,0.993219,Class3,mention
310,310,High cholesterol.,mention,114-03.xml,0.007165,0.000818,0.000588,0.991429,Class3,mention
357,357,Mr. Slater is an 83 yo w/ h/o bull...,mention,114-04.xml,0.005627,0.000726,0.000547,0.993100,Class3,mention
376,376,&#183; Hypercholesterolemia,mention,114-04.xml,0.005720,0.000727,0.000557,0.992996,Class3,mention
430,430,: Mr. Slater is an 83 yo w/ h/o bullous pemphi...,mention,114-04.xml,0.005652,0.000726,0.000548,0.993074,Class3,mention
497,497,Hyperlipidemia MAJOR,mention,115-04.xml,0.005867,0.000740,0.000559,0.992834,Class3,mention


In [45]:
HI_test_labels = HI_combined['label']
HI_pred_labels = HI_combined['predLabel']

#print(type(HI_test_labels))

In [46]:
accuracy_score(HI_test_labels, HI_pred_labels)

0.9960049048692694

In [47]:
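# classification_report expects (y_true, y_pred). The report shown below was generated with
# the arguments in the opposite order, so its precision and recall columns are swapped and
# its support column counts predicted rather than true labels.
print(classification_report(HI_test_labels, HI_pred_labels))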


              precision    recall  f1-score   support

       Other       1.00      1.00      1.00     24943
    high LDL       0.00      0.00      0.00         0
  high chol.       0.00      0.00      0.00         0
     mention       0.90      0.91      0.91       338

   micro avg       1.00      1.00      1.00     25281
   macro avg       0.48      0.48      0.48     25281
weighted avg       1.00      1.00      1.00     25281



In [48]:
unique_label = np.unique(HI_test_labels)
print(pd.DataFrame(confusion_matrix(HI_test_labels, HI_pred_labels, labels=unique_label), 
                   index=['true:{:}'.format(x) for x in unique_label], 
                   columns=['pred:{:}'.format(x) for x in unique_label]))

                 pred:Other  pred:high LDL  pred:high chol.  pred:mention
true:Other            24871              0                0            28
true:high LDL            30              0                0             1
true:high chol.           9              0                0             0
true:mention             33              0                0           309


In [49]:
HI_combined[HI_combined['label'] =='high LDL']

Unnamed: 0,sentenceId,sentence,label,file,Class0,Class1,Class2,Class3,predClass,predLabel
1641,1641,181/39/112 WITH TG 149 11/85.,high LDL,131-01.xml,0.999933,2.9e-05,1.3e-05,2.5e-05,Class0,Other
1642,1642,194/42/123/4.6 WITH TG 147 7/86.,high LDL,131-01.xml,0.999937,2.7e-05,1.2e-05,2.4e-05,Class0,Other
1643,1643,12/88 188/42/118/4.5.,high LDL,131-01.xml,0.99994,2.5e-05,1.1e-05,2.4e-05,Class0,Other
2825,2825,Cholesterol-LDL 05/15/2090 165,high LDL,134-03.xml,0.859804,0.078042,0.025564,0.036589,Class0,Other
3759,3759,Please see prior notes for full lipid analysis...,high LDL,138-03.xml,0.961764,0.022053,0.006548,0.009635,Class0,Other
3891,3891,LDL 138,high LDL,139-02.xml,0.791359,0.116614,0.03948,0.052547,Class0,Other
4543,4543,"I restarted her on lipitor 20 mg po qd, given ...",high LDL,162-04.xml,0.999779,9.3e-05,3.6e-05,9.2e-05,Class0,Other
4753,4753,and LDL from 09/15/83 was 154 with a total cho...,high LDL,163-03.xml,0.810808,0.101496,0.034903,0.052792,Class0,Other
6186,6186,"However, cholesterol now of 186, HDL 46, LDL 105.",high LDL,169-01.xml,0.770852,0.111234,0.041844,0.07607,Class0,Other
6994,6994,"11/95 TC 199, HDL 42, LDL 122, TG 171, and sim...",high LDL,193-05.xml,0.946269,0.027221,0.008476,0.018034,Class0,Other


The entire 'high LDL' class is predicted incorrectly (30 of its 31 sentences as 'Other' and 1 as 'mention'), because the model is not able to learn the numeric limits that determine high LDL levels.
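
To illustrate what a numeric limit looks like in practice, here is a minimal rule-based sketch that pulls an LDL value out of a sentence and compares it with a threshold. The threshold of 100 mg/dL, the regular expression, and the function names are all assumptions made for illustration; the actual limit should come from the annotation guidelines, and the pattern only handles the simple "LDL <value>" form.

```python
import re

# Assumed threshold in mg/dL (placeholder; take the real limit from the guidelines).
LDL_THRESHOLD = 100.0

# Matches forms such as "LDL 138", "LDL: 138", "LDL was 138".
# The negative lookahead skips numbers that continue as a date (e.g. "05/15/2090").
LDL_PATTERN = re.compile(
    r"\bLDL\b\s*(?:of|was|is|:|=)?\s*(\d{2,3})(?![\d/])", re.IGNORECASE
)

def ldl_value(sentence):
    """Return the LDL value found in the sentence, or None if none is recognised."""
    match = LDL_PATTERN.search(sentence)
    return float(match.group(1)) if match else None

def is_high_ldl(sentence, threshold=LDL_THRESHOLD):
    """True when a recognised LDL value exceeds the threshold."""
    value = ldl_value(sentence)
    return value is not None and value > threshold

for text in [
    "LDL 138",
    "However, cholesterol now of 186, HDL 46, LDL 105.",
    "LDL 85",
    "Cholesterol-LDL 05/15/2090 165",  # date between name and value: not handled
]:
    print(repr(text), ldl_value(text), is_high_ldl(text))
```

Sentences where a date or other text sits between "LDL" and the value are deliberately left unmatched rather than guessed at, so a rule like this would complement the classifier rather than replace it.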