In [1]:
from project.data.data_loaders import get_clean_train_data, get_patient_notes

In [5]:
train_data = get_clean_train_data()
patient_notes = get_patient_notes()
# Keep only the notes whose pn_num does not appear in the training data.
unlabeled = patient_notes.loc[~patient_notes['pn_num'].isin(train_data['pn_num'])]

In [17]:
feature_data = train_data[train_data['feature_text'] == 'Meningococcal-vaccine-status-unknown']
# A sample is positive when it has at least one annotated location span.
has_location = feature_data['location'].map(len) > 0
positive_samples = feature_data[has_location]
negative_samples = feature_data[~has_location]
print(f'Number of positive samples: {len(positive_samples)}')
print(f'Number of negative samples: {len(negative_samples)}')
print(f'Case number: {feature_data["case_num"].unique()}')

Number of positive samples: 2
Number of negative samples: 98
Case number: [9]


In [22]:
r = positive_samples.iloc[0]
print(r['annotation'])
print(r['location'])
for s, e in r['location']:
    print(r['pn_history'][s:e])

['unsure of meningitis shot']
[(512, 521), (544, 559)]
unsure of
meningitis shot
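
The output above suggests that an annotation can be split across several (start, end) spans, and that joining the spans with a space gives back the annotation text. A minimal sketch of that mapping follows; `spans_to_text` is a hypothetical helper, the note is a toy string rather than a dataset row, and the space-joining rule is an assumption based on this single example.

```python
from typing import List, Tuple


def spans_to_text(text: str, spans: List[Tuple[int, int]], sep: str = " ") -> str:
    """Join the substrings selected by (start, end) offsets, in reading order."""
    return sep.join(text[s:e] for s, e in sorted(spans))


# Toy note (not taken from the dataset): the feature is split across two spans.
note = "She is unsure of whether she has had a meningitis shot."
pieces = ("unsure of", "meningitis shot")
# Offsets are computed here only to keep the example self-checking.
spans = [(note.index(p), note.index(p) + len(p)) for p in pieces]

print(spans_to_text(note, spans))
```

Sorting the spans first keeps the result stable even if the offsets are stored out of order.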


In [12]:
for i, r in positive_samples.iterrows():
    print(r['pn_history'])
    print(r['annotation'])
    print('*'*10)

Stephanie Madden is a 20 year old female who presents with a severe headache. She awoke yesterday morning with the headache, and it has been constant since then. It is now an 8/10 in pain, and feels dull. She has thrown up three times since then, and it has not improved with over the counter medications. She says that the headache is worse with bright lights, and denies any changes in vision. She denies any recent trauma, liquid running out of her nose with leaning forward, and denies sick contacts. She is unsure of whether she has had a meningitis shot. She has had headaches in the past, but never any of this severity.

Medications - Daily oral contraceptive pill use
PMH - None
Surgical history - None
FHx - Mother has migraine disorder, dad has hyperlipidemia
Social - Smokes 3-4 joints weekly, no tobacco, 2-3 beers or wine on weekends, uses condoms consistently with boyfriend
['unsure of meningitis shot']
**********
20yoF with no significant PMHx presents with headache. Dull achy qual

In [13]:
for i, r in negative_samples.sample(3).iterrows():
    print(r['pn_history'])
    print('*'*10)

20 yo F c/o headache since she woke up yesterday, holocranial, 8/10, dull, constant, worsen when walking and bending over, took Tylenol and Ibuprofen without any improvement. Patient reports this it the first time shes had a headache like this. Associated with nause and vomiting, #3 times since yesterday. Reports fever since this morning and feeling fatigued and achy. Disturbed by light and loud noises. Denies rash, recent travel, ill contacts, cough, chest pain, SOB, recent URI, changes in bowel movements, changes in urinary frequency.

ROS: as per HPI. PMH: None. PSH: None. Allergies: NKA. Medications: OCP for the past 2y. FH: Father had hypercholesterolemia; mother has migraines. SH: Works in a sporting good store. Sexually active with boyfriend, uses condoms consistently. Drinks 2-3 drinks/weekend. Smokes 3-4 ciggarettes of marijuana/week. Patient eats a healthy diet and exercises regularly.
**********
Ms. Stephanie Madden is a 20 year-old woman presenting with the chief complaint 

In [15]:
patient_notes.columns

Index(['pn_num', 'case_num', 'pn_history'], dtype='object')

In [18]:
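import pandas as pd

def might_be_positive(r: pd.Series) -> bool:
    """Heuristic: a case-9 note that mentions both meningitis and a vaccine."""
    if r['case_num'] != 9:
        return False
    text = r['pn_history'].lower()
    mentions_disease = 'meningococcal' in text or 'meningitis' in text
    mentions_vaccine = 'shot' in text or 'vaccine' in text
    return mentions_disease and mentions_vaccine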

In [19]:
possible_positives = patient_notes[patient_notes.apply(might_be_positive, axis=1)]
print(f'Number of possible positives: {len(possible_positives)}')

Number of possible positives: 55


In [20]:
for t in possible_positives.sample(4)['pn_history']:
    print(t)
    print('*'*10)

20 yo F presents with a headache beginning yesterday morning. It woke her up and has been getting progressively worse since yesterday. She is unable to rate the pain on a scale of 10, just saying it is "real bad." It is present all over her head. She has never had a headache like this before. It is associated with photophobia and nausea, and she has vomitied three times (non-bloody, non-bilious). She has tried ibuprofen, Tylenol, and sleep, but none of these have improved her headache. She has noticed that walking and bending forwards make her headache worse. She endorses feeling feverish. She is unsure if she received the meningococcal vaccine.
PMH: none
Meds: OCPs for 2 years
Soc Hx: no tobacco use. 2-3 drinks/week, 3-4 marijuana joints/week.
Fam Hx: mother with migraines, father with high cholesterol
**********
Stephanie Maddison is a 20 yo F presentign with 2 day hx of the worst headache that shes experienced.  Started in the morning 2 days ago.  assoicated with nausea and vomiting

In [38]:
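def get_all_idxs_of_occurrence(text: str, word: str):
    """Return (start, end) offsets of every occurrence of `word` in `text`."""
    occurrences = []
    idx = text.find(word)
    while idx != -1:
        occurrences.append((idx, idx + len(word)))
        idx = text.find(word, idx + 1)  # step by one so overlapping matches are kept
    return occurrences

def get_label(text: str):
    """Collect the spans of all keywords; `text` is expected to be lowercased."""
    words_to_label = {'meningococcal', 'meningitis', 'shot', 'vaccine'}
    # Spans are grouped per keyword; set iteration order is arbitrary, so they are not sorted.
    return [span for w in words_to_label for span in get_all_idxs_of_occurrence(text, w)]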
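
A possible alternative to the keyword scan above, shown only as a sketch: a single compiled regular expression can find every keyword in one pass, ignore case (so the note does not need to be lowercased first), and return the spans in left-to-right order. `find_keyword_spans` is a hypothetical name, and unlike the function above it reports non-overlapping matches only, which makes no difference for these four keywords.

```python
import re
from typing import Iterable, List, Tuple


def find_keyword_spans(text: str, keywords: Iterable[str]) -> List[Tuple[int, int]]:
    """Return (start, end) offsets of every keyword match, left to right, ignoring case."""
    # Try longer keywords first so a long keyword wins over a shorter prefix of it.
    alternatives = sorted((re.escape(k) for k in keywords), key=len, reverse=True)
    if not alternatives:
        return []
    pattern = re.compile("|".join(alternatives), re.IGNORECASE)
    return [m.span() for m in pattern.finditer(text)]


# Toy sentence (not taken from the dataset).
example = "Unaware of Meningitis vaccine. No college vaccines."
keywords = ["meningococcal", "meningitis", "shot", "vaccine"]
for start, end in find_keyword_spans(example, keywords):
    print((start, end), example[start:end])
```

Because the offsets refer to the original string, slicing the note with them preserves its casing.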

In [42]:
# Work on an explicit copy so the assignment does not trigger SettingWithCopyWarning.
possible_positives = possible_positives.copy()
possible_positives['location'] = possible_positives['pn_history'].str.lower().apply(get_label)


In [43]:
for idx, r in possible_positives.sample(4).iterrows():
    print(r['pn_history'])
    print(r['location'])
    for (s, e) in r['location']:
        print(r['pn_history'][s:e])
    print('*'*10)

20F with headaced that started yesterday when she woke up and has been constant and worsening in severity. It is dull and worse with walking and bending. She has tried ibuprofen, tylenol, and sleep and nothing has helped the pain. She did not rate the severity of the pain and said "it is really bad." She vomitted three times yesterady, feels warm, and achy since yesterday. Her neck pain started yesterday and is achy and constant and worse with movement. She has a runny nose and sore throat as well. She denies sick contacts, recent travel. She has received high school vaccines but not college vaccines. Unaware of meningitis vaccine. 
PMH: none
meds: birth control pill 
PSH none
allergies none
FH mom migraines and dad high cholesterol 
SH: 2-3 beers a week and no cigarette/tobacco, 3-4 joints of marijuana a week.
[(620, 630), (574, 581), (599, 606), (631, 638)]
meningitis
vaccine
vaccine
vaccine
**********
Ms. Madden is a 20yo F presenting with headache. Onset was yesterday, has been pro