<a href="https://colab.research.google.com/github/thotran2015/6.871/blob/master/Chart_Review.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<h1>Chart Review</h1>

Welcome to chart review!

We hope that you find this exercise useful in understanding the clinical care received by patients. When everything is just dataframes and vectors, it might be easy to lose sight of the fact that we are trying to use data to help real people with serious problems.

In this notebook, we will walk through one patient's hospital course together. Then you will analyze a second patient's course and include that description in your pset writeup.

(The code in this notebook was inspired by a visualization project by Anu Vajapey, Willie Boag, Emily Alsentzer, and Matthew McDermott.)

In [0]:
from google.colab import auth
from google.colab import widgets

from collections import Counter
import numpy as np
import pandas as pd
import pickle

In [0]:
auth.authenticate_user()
!gsutil cp gs://hst-956/chartreview.pkl ./

In [0]:
with open('chartreview.pkl', 'rb') as f:
    notes       = pickle.load(f)
    backgrounds = pickle.load(f)

In [0]:
def view_background(hadm_id):
    # Show the structured admission record for this admission as a column
    display(backgrounds[backgrounds.index==hadm_id].T)


def visualize_notes(hadm_id):
    # When did this patient arrive (useful for getting first 48 hours)
    admittime = notes[notes.hadm_id==hadm_id].admittime.values[0]

    # Get the notes for this patient
    notes_subject = notes.loc[notes.hadm_id==hadm_id]

    # How many notes for each category?
    category_counts = Counter(notes_subject.category.values)
    category_sorted = sorted(category_counts.keys(), 
                             key=lambda t:category_counts[t], reverse=True)

    # Outer tab is for different category of notes
    outer_tab = widgets.TabBar(category_sorted, location='top')
    for category in category_sorted:
      with outer_tab.output_to(category):
        notes_cat = notes_subject.loc[notes_subject.category==category]
        titles = []
        for num,(i,row) in enumerate(notes_cat.iterrows()):
          # Format the text with additional metadata
          time_offset = (row.charttime - admittime).total_seconds()/3600.
          time_offset = int(time_offset) if not np.isnan(time_offset) else 'n/a'
            
          # Tab title: category, note index, and hours since admission
          titles += ['%s Note #%d (%s Hours)' % (category,num,time_offset)]

        # Inner tab is for each note in a category
        inner_tab = widgets.TabBar(titles, location='start')
        for i in range(len(titles)):
          with inner_tab.output_to(titles[i]):
            print(notes_cat.iloc[i]["text"])
        

<h1>Patient 1</h1>

We will look at this patient together for you to get a sense for what information to focus on when trying to learn about the patient's trajectory. Let's begin! 

First, let's look at the overview of their stay.


In [0]:
# Admission_id
hadm_id = 142861

view_background(hadm_id)

We can see that this patient is a 79 year-old female who came to the hospital via the emergency room. Her admitting diagnosis was cellulitis.

Let's look at what information we can find in their notes:

In [0]:
visualize_notes(hadm_id)

**<h3>Kinds of Notes</h3>**

We can see that during her stay, there were 3 categories of notes written: nursing/progress notes, ECG reports, and a discharge summary. There are other note categories as well, including radiology reports, physician notes, echo, nutrition notes (rare), social work notes (rare), and others. Let's look a little closer at these 3 note categories that are here.

<h4>Nursing Notes</h4>

Nursing notes often tell the most comprehensive narrative of the patient's care. As is usually the case, the notes tend to be spaced 12 hours apart because that is the length of a nursing shift. These notes are how one nurse communicates the current state of the patient to the nurse on the next shift. As a result, these notes tend to be one of the most readable parts of the picture.

Another interesting observation is that the first note is timestamped 148 hours. That offset is measured from the time the patient was admitted to the hospital. As we will see when we dig deeper, this patient arrived at the emergency department and was admitted to a non-ICU ward in the hospital. Because MIMIC is an ICU database, we don't have access to the progress notes written while she was there. Once she transferred to the medical ICU (MICU), the MICU nurses began taking notes.

You might find the deidentification notation (e.g. \[\*\*Hospital Ward Name \*\*\] or \[\*\*Name (NI) 8830\*\*\]) unintuitive. These symbols indicate that a piece of Protected Health Information (PHI) was originally in the note collected at Beth Israel. As a condition of sharing the data through MIMIC, the notes were de-identified to comply with privacy laws like HIPAA.
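
If the placeholders get in the way when reading or processing a note, they can be located or masked with a regular expression. The sketch below is only an illustration: the helper names `find_phi_placeholders` and `mask_phi_placeholders` and the `[PHI]` token are our own choices, and the example sentence is made up rather than taken from a real note.

```python
import re

# Matches MIMIC-style placeholders such as [**Hospital Ward Name **].
PHI_PATTERN = re.compile(r"\[\*\*(.*?)\*\*\]")


def find_phi_placeholders(text):
    """Return the label inside each [** ... **] placeholder, whitespace-trimmed."""
    return [label.strip() for label in PHI_PATTERN.findall(text)]


def mask_phi_placeholders(text, token="[PHI]"):
    """Replace every [** ... **] placeholder with a single generic token."""
    return PHI_PATTERN.sub(token, text)


# Made-up sentence for illustration only.
example = "Transferred to [**Hospital Ward Name **]; spoke with [**Name (NI) 8830**]."
labels = find_phi_placeholders(example)
masked = mask_phi_placeholders(example)
print(labels)
print(masked)
```

Masking with one generic token loses the placeholder type (name, date, location), so keep the labels if that distinction matters for your analysis.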


<h4>ECG Reports</h4>

The ECG report is far less interpretable to non-experts. An electrocardiogram test was ordered to record the electrical activity of the patient's heart, and the report was written by a doctor. To be honest, for a variety of reasons, I tend to not use ECG notes very often.

Also, observe that the time offset is "n/a". This is because some note categories (e.g. ECG, echo, discharge summary) do not have their timestamp recorded in the EHR system. Such inconsistencies can make Machine Learning tasks hard. For instance, if we wanted to only use the first 48 hours' data as input, we can't readily tell whether a given ECG report actually was recorded in that timeframe. As a result, these notes are often excluded as a preprocessing step.

For more ECG info: https://www.medicinenet.com/electrocardiogram_ecg_or_ekg/article.htm#how_is_an_ecg_ekg_performed
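
The "first 48 hours" preprocessing step mentioned above can be sketched as follows. This is a minimal illustration, not the filtering used anywhere in this notebook: the function name `notes_in_first_hours` and the toy dataframe are our own, and we only assume the `admittime` and `charttime` columns that `visualize_notes` already relies on.

```python
import pandas as pd


def notes_in_first_hours(notes_df, hours=48):
    """Keep notes charted within `hours` of admission; drop notes with no charttime."""
    offset_hours = (notes_df["charttime"] - notes_df["admittime"]).dt.total_seconds() / 3600.0
    # Comparisons against NaN are False, so notes without a timestamp are excluded.
    keep = (offset_hours >= 0) & (offset_hours <= hours)
    return notes_df.loc[keep]


# Toy data with made-up values; column names mirror those used in visualize_notes.
admit = pd.Timestamp("2134-12-26 10:00")
toy_notes = pd.DataFrame({
    "hadm_id": [1, 1, 1],
    "category": ["Nursing/other", "Nursing/other", "ECG"],
    "admittime": [admit, admit, admit],
    "charttime": [admit + pd.Timedelta(hours=12), admit + pd.Timedelta(hours=148), pd.NaT],
    "text": ["early note", "late note", "untimed report"],
})

early = notes_in_first_hours(toy_notes)
print(early["text"].tolist())
```

Note that the untimestamped ECG row is dropped silently, which is exactly the exclusion described above.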


<h4>Discharge Summary</h4>

The discharge summary also does not have a timestamp, but we know that every discharge summary is written at the end of the hospital admission. Note that there is not a summary per ICU stay, so even if a patient transferred from ED -> general ward -> MICU -> general ward -> CICU, they would still only have one discharge summary at the end of their hospital stay.

The discharge summary is often the best place to start for trying to understand the care a patient received. Despite the messiness of the layout, there is usually a consistent set of very helpful sections for understanding the patient's path, such as: History of Present Illness (HPI), Hospital Course, Social History, etc.
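
Because these section headers recur, a rough way to jump to one of them is a regular expression. The sketch below assumes that a header sits at the start of a line, ends with a colon, and that the section runs until the next blank line. Real discharge summaries do not always follow that layout, so treat the hypothetical helper `extract_section` as a starting point only. The toy summary is made up for illustration.

```python
import re


def extract_section(summary_text, header):
    """Return the text under `header:` up to the next blank line, or None if absent."""
    pattern = re.compile(
        r"^" + re.escape(header) + r":[ \t]*\n?(.*?)(?=\n[ \t]*\n|\Z)",
        flags=re.IGNORECASE | re.DOTALL | re.MULTILINE,
    )
    match = pattern.search(summary_text)
    return match.group(1).strip() if match else None


# Toy summary with the assumed layout (header line, body, blank line).
toy_summary = (
    "Chief Complaint:\n"
    "leg pain\n"
    "\n"
    "History of Present Illness:\n"
    "Pt is a 79 yo female with chf, htn, chronic venous stasis of\n"
    "legs who comes in with inability to ambulate.\n"
    "\n"
    "Social History:\n"
    "lives alone\n"
)

hpi = extract_section(toy_summary, "History of Present Illness")
print(hpi)
```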


**<h3>Length of Stay</h3>**

The discharge summary says that the patient was admitted \[\*\*2134-12-26\*\*\] .

She was discharged on \[\*\*2135-1-2\*\*\].

Although these timestamps (e.g. Dec 26, 2134) are clearly fictitious, it's important to note that they are all coherent with one another. The actual dates were shifted forward into the future by the same constant offset. As a result, we can tell that her length-of-stay was 7 days.
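
The arithmetic can be checked directly with pandas. The sketch below uses the two shifted dates quoted above; the 100-year offset is a made-up value chosen only to show that subtracting any constant shift from both dates leaves the interval unchanged.

```python
import pandas as pd

# Shifted dates as they appear in the discharge summary.
admit_date = pd.Timestamp("2134-12-26")
discharge_date = pd.Timestamp("2135-01-02")

length_of_stay = discharge_date - admit_date
print(length_of_stay.days)

# Hypothetical offset for illustration; the real shift is not disclosed.
hypothetical_shift = pd.Timedelta(days=365 * 100)
unshifted_los = (discharge_date - hypothetical_shift) - (admit_date - hypothetical_shift)
print(unshifted_los == length_of_stay)
```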

**<h3>History of Present Illness</h3>**

It can be very helpful to understand the 2-3 sentence summary of the patient. Recall from the structured admission information that the patient is a 79 year-old female who came to the hospital via the emergency room with an admitting diagnosis of cellulitis. But if we navigate to the History of Present Illness section of the discharge summary, we can see even more detail:


> History of Present Illness:
> Pt is a 79 yo female with chf, htn, chronic venous stasis of
> legs who comes in with inability to ambulate since friday due to
> pain in her legs.  She says that the pain is much worse in the
> right leg ("like a knife") and worsened with ambulation. She
> denies fevers, chills, n/v/d.  On further questioning she does
> report intermittant sob and cp with ambulation.

She has congestive heart failure (chf), hypertension (htn), and chronic venous stasis, meaning the veins in her legs do not return blood to the heart effectively. She feels a lot of pain in her legs, and it gets worse when she walks. When she walks, she also feels shortness of breath (sob) and chest pain (cp).

**<h3>Care Timeline</h3>**

Let's look at this patient's Hospital Course to identify some of the major moments/events during her hospital stay.

**Dec 26**: Patient was admitted to the hospital.

**Dec 26**: Patient developed diarrhea.

**Dec 30**: White Blood Cell (wbc) count increased to 19.5.

**Jan 1**: Patient's condition started to deteriorate.

**Jan 1 or 2**: Patient was made Do Not Resuscitate / Do Not Intubate (DNR/DNI).

**Jan 2**: In the MICU she was treated supportively, but her condition deteriorated further and she passed away.

**<h3>End of Life</h3>**

Unfortunately, this patient passed away. We can see a fuller account of this moment in the last nursing note:
 `Nursing/other Note #3 (174 Hours)`


> Accepted patient at 1900, family at bedside.
> 
> Patient decided to stop all medication sustaining the patient's blood pressure and to remain at bedside.
> 
> All drips stopped at \[\*\*2059\*\*\] and patient expired at \[\*\*2069\*\*\] with family at bedside.
> 
> Physician spoke with family and they are remaining at bedside until funeral home comes to ICU to remove patient.  They are \[\*\*Hospital1 \*\*\] and this is their wishes.  Son is in from \[\*\*Country \*\*\] and at bedside along with his sister.


**<h1>Patient 2</h1>**

Now you can give it a shot.

1) What is the patient's History of Present Illness?

2) How long were they in the hospital?

3) What are some of the major events in the timeline of this patient's care?

4) What stood out to you most when reading the nursing notes?


1. The patient's History of Present Illness is that she has a history of CVA with right hemiparesis and expressive aphasia, vascular dementia, atrial fibrillation, status post pacemaker implantation, and lower GI bleed.
2. The length of her stay is 2 days.
3. Three major events in the timeline of this patient's care include
4. What stood out to you most when reading the nursing notes?

In [0]:
hadm_id_2 = 194001

view_background(hadm_id_2)

In [0]:
visualize_notes(hadm_id_2)