# (EX) Hospital readmission

## Background

A group of researchers {cite:p}`chan2016factors` is studying the financial burden of hospital readmissions after a total knee arthroplasty.  They are interested in the following questions:  
1. Do patients undergoing *total knee arthroplasty* get readmitted within a month (30 days) and within 3 months (90 days)?
2. How much cost do the patients in question 1 incur during readmission?
3. Are these readmissions predictable?
4. Are these readmissions preventable?
5. Can we predict the costs a patient will incur if they are readmitted?

*Why these questions?*

### What do we need to answer this set of questions?  
- ?

## Data description

In 2012, there were ${1,249,805}$ patient discharges in Michigan hospitals recorded in the Healthcare Cost and Utilization Project (HCUP) State Inpatient Database, representing $>97\%$ of the total discharges.

The database used in the original study is protected under the HCUP-SID data use agreement.  However, we can get a sense of what the data look like (without seeing the exact data values).

In [1]:
import pandas as pd
import tabula  # requires java

pd.set_option('display.max_rows', 500)

In [2]:
# Example of variables: read the summary-statistics tables from the HCUP PDF,
# falling back to a local copy if the download fails.
try:
    tables = tabula.read_pdf('https://hcup-us.ahrq.gov/db/state/sidc/tools/cdstats/MI_SIDC_2012_SummaryStats_CORE.PDF', pages='1-8')
except Exception:
    tables = tabula.read_pdf('ex-hospital-readmission/MI_SIDC_2012_SummaryStats_CORE.pdf', pages='1-8')
# Stack the per-page tables into a single DataFrame.
df = pd.concat(tables, ignore_index=True)

In [3]:
# Row positions of a few illustrative variables in the summary table
select_inds = [0, 6, 7, 10, 18, 19, 20, 61, 62, 67, 78, 79, 87, 170, 171, 179]

df.iloc[select_inds, 0]

0                     AGE : Age in years at admission
6               AWEEKEND : Admission day is a weekend
7                  DIED : Died during hospitalization
10     DISPUNIFORM : Disposition of patient (uniform)
18                          DXCCS1 : CCS: diagnosis 1
19                          DXCCS2 : CCS: diagnosis 2
20                          DXCCS3 : CCS: diagnosis 3
61                          FEMALE : Indicator of sex
62                HCUP_ED : HCUP Emergency Department
67                     LOS : Length of stay (cleaned)
78      MRN_R : Medical record number (re-identified)
79            NCHRONIC : Number of chronic conditions
87            PAY1 : Primary expected payer (uniform)
170                             RACE : Race (uniform)
171                  TOTCHG : Total charges (cleaned)
179      HOSPID : HCUP hospital identification number
Name: Variable / Label, dtype: object

## A (rough but specific) data engineering lifecycle

### Identifying the patients and relevant information
- How do we isolate knee arthroplasty patients?
- How do we capture readmissions?
    - How do we apply the 30- and 90-day thresholds?
- How do we identify the costs associated with the visits?
- Did we include all relevant cases?
    - Who do we miss by using the criteria above?
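
One way to make these questions concrete is sketched below on a toy table. The column names (`patient_id`, `admit_day`, `los`, `procedure`) and all values are hypothetical and are not HCUP-SID fields. The sketch also assumes one index stay per patient and measures the gap from the index discharge to the next admission, which is only one possible definition of a readmission.

```python
import pandas as pd

# Toy discharge records. Column names and values are hypothetical stand-ins,
# not actual HCUP-SID fields or data.
stays = pd.DataFrame({
    "patient_id": [1, 1, 2, 3, 3, 3],
    "admit_day": [10, 30, 15, 5, 60, 300],  # days since an arbitrary origin
    "los": [3, 2, 4, 3, 1, 2],              # length of stay in days
    "procedure": ["81.54", "other", "81.54", "81.54", "other", "other"],
})
stays["discharge_day"] = stays["admit_day"] + stays["los"]

TKA_CODE = "81.54"  # ICD-9 procedure code for total knee arthroplasty

# Index stays: admissions during which the knee arthroplasty was performed.
# Assumption: each patient has at most one index stay.
index_stays = stays.loc[stays["procedure"] == TKA_CODE,
                        ["patient_id", "discharge_day"]]
index_stays = index_stays.rename(columns={"discharge_day": "index_discharge"})

# Pair each index stay with every admission of the same patient and compute
# the gap between index discharge and that admission.
pairs = index_stays.merge(stays[["patient_id", "admit_day"]], on="patient_id")
pairs["gap"] = pairs["admit_day"] - pairs["index_discharge"]

# Keep admissions strictly after the index discharge; take the earliest one.
first_gap = pairs.loc[pairs["gap"] > 0].groupby("patient_id")["gap"].min()

result = index_stays.set_index("patient_id")
result["days_to_readmit"] = first_gap  # NaN if never readmitted
result["readmit_30"] = result["days_to_readmit"] <= 30
result["readmit_90"] = result["days_to_readmit"] <= 90
print(result)
```

Patients who are readmitted to a hospital outside the database, or whose records cannot be linked, would show up here as "never readmitted", which is one way the criteria can miss cases.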

### Storing the (relevant) data in accessible format for analysis
- Transformation of variables, e.g.,
    - Total charges vs incurred costs [(additional "data")](https://hcup-us.ahrq.gov/db/ccr/ip-ccr/ip-ccr.jsp#overview)
- Creation of new variables, e.g.,
    - Readmitted within 30 days
    - Readmitted within 90 days
- Inclusion / exclusion of data for analysis
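
As an illustration of the charges-versus-costs transformation, the sketch below multiplies total charges by a hospital-level cost-to-charge ratio (CCR). `TOTCHG` and `HOSPID` echo the variable list above, but the `CCR` column name, the ratios, and the charge values are all made up for this example.

```python
import pandas as pd

# Hypothetical visit records; every value here is invented for illustration.
visits = pd.DataFrame({
    "HOSPID": [101, 101, 202],
    "TOTCHG": [40000.0, 12000.0, 25000.0],
})

# Hypothetical hospital-level cost-to-charge ratios. Real ratios would come
# from the HCUP cost-to-charge ratio files linked above.
ccr = pd.DataFrame({
    "HOSPID": [101, 202],
    "CCR": [0.25, 0.50],
})

# Attach each hospital's ratio, then scale charges to estimated costs.
# validate="many_to_one" fails loudly if a hospital has duplicate CCR rows.
visits = visits.merge(ccr, on="HOSPID", how="left", validate="many_to_one")
visits["est_cost"] = visits["TOTCHG"] * visits["CCR"]
print(visits)
```

A left merge keeps visits whose hospital has no ratio, leaving `est_cost` missing for them, so the inclusion or exclusion decision stays explicit rather than happening silently in the join.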

### Analyzing the data
- Readmission analysis
    - Relationship between readmission and variables, $\textrm{Readmitted}_i = g(\boldsymbol{x}_i) + \varepsilon_i$?
- Cost analysis
    - Relationship between readmission cost and variables, $\textrm{Cost}_i = g(\boldsymbol{x}_i) + \varepsilon_i$?
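
To make the cost relationship concrete, the sketch below takes $g$ to be linear and fits it by ordinary least squares on simulated data. The linear form, the two covariates, and all coefficients are assumptions chosen for illustration, not the model used in the original study; a binary outcome such as readmission would usually call for a different $g$, for example a logistic model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated covariates x_i (hypothetical): age in years and number of
# chronic conditions for n readmitted patients.
n = 200
age = rng.uniform(50, 85, size=n)
nchronic = rng.integers(0, 6, size=n)

# Simulated costs from a known linear g plus noise, so the fit can be checked.
true_beta = np.array([5000.0, 40.0, 800.0])  # intercept, age, nchronic
X = np.column_stack([np.ones(n), age, nchronic])
cost = X @ true_beta + rng.normal(0, 500, size=n)

# Ordinary least squares: choose beta to minimize ||cost - X beta||^2.
beta_hat, *_ = np.linalg.lstsq(X, cost, rcond=None)
residuals = cost - X @ beta_hat
print(np.round(beta_hat, 1))
```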
 
### Reporting the findings
- e.g., {cite:t}`chan2016factors`
- Internal reports to surgeons, physicians, healthcare administrators, etc.

## Caveat
All seems good.  What is the problem?

**New data**

- It is common to conduct the same analyses with a new set of data
    - e.g., new data collected from 2012 onward.
  
**Change of data**

- Data format can change from one set of data to another
    - e.g., U.S. hospital coding switched from ICD-9 to ICD-10 (International Classification of Diseases) on October 1, 2015.
    - Specifically, total knee arthroplasty was `81.54` in ICD-9 and in ICD-10... [(online converter)](https://www.icd10data.com/Convert/81.54/)
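
One way to absorb such a change is to keep the cohort definition in a lookup keyed by coding system instead of hard-coding a single code. The sketch below is a minimal illustration; `TKA_CODES` and `is_tka` are hypothetical names, and the ICD-10 entry is deliberately left empty, to be filled in from a verified mapping.

```python
# Cohort definition per coding system. The ICD-9 code comes from the text;
# the ICD-10 set is a placeholder to be filled in from a verified crosswalk.
TKA_CODES = {
    "ICD-9": {"81.54"},
    "ICD-10": set(),
}


def is_tka(code, version, code_sets=TKA_CODES):
    """Return True if `code` marks the target procedure under `version`."""
    if version not in code_sets:
        raise ValueError(f"unknown coding system: {version!r}")
    codes = code_sets[version]
    if not codes:
        # Refuse to answer rather than silently report "no match".
        raise ValueError(f"no codes configured for {version!r}")
    return code in codes


print(is_tka("81.54", "ICD-9"))  # True
print(is_tka("81.55", "ICD-9"))  # False
```

Passing a different `code_sets` dictionary is also a simple way to reuse the same check for another procedure.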

**New question**

- Other analyses may follow a similar approach
    - e.g., what about total hip arthroplasty? other procedures?
  
**Continuous analysis**

- Some analyses may require continuous reporting (say daily, weekly, or monthly)
    - e.g., COVID-19 patient outcomes were analyzed daily during the peak of the pandemic.

```{bibliography}
:filter: docname in docnames
```