# Noisy Labeling of Clinical Notes

This notebook allows you to assign "noisy" labels to clinical notes using heuristics known as labelling functions (LFs).

Because this is a largely exploratory process, it may be useful to run the following cell, which allows you to modify the `NoisyLabeler` code without restarting the kernel.

In [1]:
%load_ext autoreload
%autoreload 2

In [5]:
import sys
sys.path.append('/Users/karthik/Desktop/karthik-wanglab/deep-patient-cohorts/')

## Load the Data

First, you must load some text to label. You will want to have some source of "gold" labels to determine the accuracy of your labelling functions. Your labels should be `1`, indicating the presence of a disease, or `0`, indicating its absence. The following code assumes your data is in a [JSON Lines](https://jsonlines.org/) format, with the fields `"text"` and `"label"`, but you can load the data any way you like.

In [6]:
gold_data_filepath = "../data/MIMIC-III-HEART-DISEASE/valid.jsonl"

In [7]:
import json
from pathlib import Path

import numpy as np

valid = [json.loads(line) for line in Path(gold_data_filepath).read_text().strip().split("\n")]
texts = [example["text"] for example in valid]
labels = np.asarray([example["label"] for example in valid])

## (Noisy) Label the Data

First, initialize the labeller

In [8]:
from deep_patient_cohorts import NoisyLabeler

labeler = NoisyLabeler()

Although optional, it makes sense to preprocess the text with spaCy only one. We can do this easily like so

> note, this will take a few minutes per 1000 documents

In [9]:
texts = labeler.preprocess(texts)

5272it [07:50, 11.20it/s]


Finally, we can label the data and check the accuracy of each labelling function

In [10]:
noisy_labels = labeler(texts)

labeler.accuracy(noisy_labels=noisy_labels, gold_labels=labels)

100%|██████████| 2/2 [00:14<00:00,  7.36s/it]

LF 0: Accuracy 61%, Abstain rate 68%
LF 1: Accuracy 91%, Abstain rate 93%





### Adding New LFs

You may need to continually modify your LFs until they reach acceptable accuracy. The following example demonstrates how to add a new LF to the existing `labeler`, and evaluate its accuracy.

In [77]:
from typing import List
from deep_patient_cohorts import POSITIVE, NEGATIVE, ABSTAIN

import re

# heart_disease
def heart_disease(self, texts: List[str]) -> List[int]:
    return [POSITIVE if "heart disease" in text.text.lower() else ABSTAIN for text in texts]


# st elevation LF
def st_elevation(self, texts: List[str]) -> List[int]:
    search_list = ["stemi", "st elevation", "st elevation mi"]
    return [POSITIVE if any([x in text.text.lower() for x in search_list]) else ABSTAIN for text in texts]

# atherosclerosis
def atherosclerosis(self, texts: List[str]) -> List[int]:
    search_list = ["atherosclerosis", "arteriosclerosis", "atherosclerotic", "arterial sclerosis", "artherosclerosis", "atherosclerotic disease"]
    return [POSITIVE if any([x in text.text.lower() for x in search_list]) else ABSTAIN for text in texts]
'''
# heart_attack -- This labelling function is gives 0% accuracy thats why I commented it out

def heart_attack(self, texts: List[str]) -> List[int]:
    search_list = ["myocardial infarcation", "mi", "ischemic heart disease", "cardiac arrest", "coronary infarction", "asystole", "cardiopulmonary arrest", "coronary thrombosis", "heart arrest", "heart attack", "heart stoppage"]
    return [POSITIVE if any([x in text.text.lower() for x in search_list]) else ABSTAIN for text in texts]
'''
# heart_failure
def heart_failure(self, texts: List[str]) -> List[int]:
    search_list = ["congestive heart failure", "decomensated heart failure", "chf", "left-side heart failure", "right-sided heart failure"]
    return [POSITIVE if any([x in text.text.lower() for x in search_list]) else ABSTAIN for text in texts]


# angina LF

def angina(self, texts: List[str]) -> List[int]:
    search_list_1 = ["stable", "unstable",  "variant"]
    search_list_2 = ["angina", "chest_pain", "angina pectoris"]
    
    return [POSITIVE if (any([x in text.text.lower() for x in search_list_1])) and (any([x in text.text.lower() for x in search_list_2]))  else ABSTAIN for text in texts]


# abnormal diagnostic tests results LF
def abnormal_diagnostic_test(self, texts: List[str]) -> List[int]:
    search_list_1 = ["abnormal", "concerning"]
    search_list_2 = ["ecg", "echo", "echocarrdiogram"]
    
    return [POSITIVE if (any([x in text.text.lower() for x in search_list_1])) and (any([x in text.text.lower() for x in search_list_2]))  else ABSTAIN for text in texts]

# corelated procedures LF
def correlated_procedures(self, texts:List[str]) -> List[int]:
    search_list = ["coronary", "cardiac cath", "cardiac stent", "catheter", "catheterization", "stenting", "angioplasty",
                  "percutaneous coronary intervention","pci"]
    
    return [POSITIVE if any([x in text.text.lower() for x in search_list]) else ABSTAIN for text in texts]

#Common symptom of heart failure

def common_heart_failure(self, texts:List[str]) -> List[int]:
    pattern = re.compile(r"(swelling|edema|puffiness)[\s\w:<>=]+(in)?[\s\w:<>=]+(left|right|l|r)?[\s\w:<>=]+(ankle|leg|feet|foot|ankles|legs)", re.IGNORECASE)
    matches = [pattern.findall(text.text) for text in texts]
    #print(matches)
    return [ABSTAIN if not match else POSITIVE for match in matches]

labeler.add(heart_disease)
labeler.add(st_elevation)
labeler.add(atherosclerosis)
#labeler.add(heart_attack)
labeler.add(heart_failure)
labeler.add(angina)
labeler.add(abnormal_diagnostic_test)
labeler.add(correlated_procedures)
labeler.add(common_heart_failure)

noisy_labels = labeler(texts)
labeler.accuracy(noisy_labels=noisy_labels, gold_labels=labels)



  0%|          | 0/10 [00:00<?, ?it/s][A[A

 10%|█         | 1/10 [00:04<00:41,  4.64s/it][A[A

 20%|██        | 2/10 [00:11<00:42,  5.31s/it][A[A

 30%|███       | 3/10 [00:16<00:35,  5.09s/it][A[A

 40%|████      | 4/10 [00:28<00:44,  7.39s/it][A[A

 50%|█████     | 5/10 [00:52<01:01, 12.40s/it][A[A

 60%|██████    | 6/10 [01:13<00:59, 14.77s/it][A[A

 70%|███████   | 7/10 [01:35<00:50, 16.97s/it][A[A

 80%|████████  | 8/10 [01:49<00:32, 16.24s/it][A[A

 90%|█████████ | 9/10 [02:26<00:22, 22.37s/it][A[A

100%|██████████| 10/10 [02:40<00:00, 16.01s/it][A[A

LF 0: Accuracy 61%, Abstain rate 68%
LF 1: Accuracy 91%, Abstain rate 93%
LF 2: Accuracy 58%, Abstain rate 97%
LF 3: Accuracy 72%, Abstain rate 88%
LF 4: Accuracy 61%, Abstain rate 95%
LF 5: Accuracy 86%, Abstain rate 78%
LF 6: Accuracy 82%, Abstain rate 96%
LF 7: Accuracy 61%, Abstain rate 81%
LF 8: Accuracy 63%, Abstain rate 55%
LF 9: Accuracy 57%, Abstain rate 97%





Of course, you can also modify the `NoisyLabeler` code directly.

### Training a Label Model

Using [FlyingSquid](https://github.com/HazyResearch/flyingsquid), we can train a probablistic model to combine our LFs (assuming we have at least 3!)

In [78]:
from flyingsquid.label_model import LabelModel

m = noisy_labels.shape[1]
label_model = LabelModel(m)

label_model.fit(noisy_labels)

preds = label_model.predict(noisy_labels).reshape(labels.shape)
accuracy = np.sum(preds == labels) / labels.shape[0]

print(f"Label model accuracy: {int(100 * accuracy)}%")

Label model accuracy: 65%


### Removing LFs

If it turns out our new LF performs poorly, we can remove it and try again

In [75]:
del labeler.lfs[2:]

In [76]:
labeler.lfs

[<bound method NoisyLabeler._chest_pain of <deep_patient_cohorts.noisy_labeler.NoisyLabeler object at 0x7fef2d0bffd0>>,
 <bound method NoisyLabeler._ejection_fraction of <deep_patient_cohorts.noisy_labeler.NoisyLabeler object at 0x7fef2d0bffd0>>]