# Patient Survival Metrics and Prediction
### Amanda Foster

## Introduction

<img style="float: right; width: 300px;" src="https://www.mayoclinic.org/-/media/kcms/gbs/patient-consumer/images/2015/01/16/09/27/mcdc7_heart-failure.jpg">

Heart failure, a serious medical condition in which the heart fails to supply sufficient blood and oxygen to the body, poses significant challenges for patients and healthcare providers. Signs and symptoms include shortness of breath, rapid heart rate, and lower extremity edema. Although medical science has made remarkable strides over the past century, heart failure remains an intricate and often life-threatening condition that demands specialized and intensive care management.

A crucial knowledge gap remains in our understanding of the factors that contribute to the mortality of heart failure patients admitted to intensive care units (ICUs). The complexities surrounding the condition, along with the varied patient profiles and potential complications, have made it challenging to identify the determinants of in-hospital mortality. Consequently, there is a pressing need to address this gap and gain comprehensive insights into the factors influencing outcomes for this vulnerable patient population.

The primary objective of this tutorial is to explore these factors through data processing and exploratory data analysis (EDA), then to use machine learning techniques to build a model that predicts all-cause in-hospital mortality for this patient population. Armed with such knowledge, healthcare providers can better tailor their interventions, optimize resource allocation, and enhance decision-making processes, ultimately leading to improved patient outcomes.


### Data Source

We will source the patient survival data from Kaggle: https://www.kaggle.com/datasets/saurabhshahane/in-hospital-mortality-prediction

The data is drawn from the Medical Information Mart for Intensive Care III (MIMIC-III) Clinical Database, which contains data from over 40,000 patients who stayed in critical care units at Beth Israel Deaconess Medical Center in Boston, MA, between 2001 and 2012. The dataset used here covers heart failure patients from that database. More information on data collection and other background is available here: https://www.nature.com/articles/sdata201635

Once the dataset is downloaded, place it into the same directory as your analysis code.

## Getting Started

### Installing Libraries

We will be using the basic Python libraries pandas, numpy, and matplotlib. For more compact code, we can import the libraries under shorter aliases.

If you haven't installed pandas, numpy, or matplotlib yet, install them with the terminal commands below:

* pandas -- `pip install pandas`
* numpy -- `pip install numpy`
* matplotlib -- `pip install matplotlib`
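
To confirm the installs succeeded before continuing, a minimal sketch along these lines can report which packages are present. It relies only on the standard-library `importlib.metadata` module (Python 3.8+); the helper name `installed_versions` is hypothetical and not part of the original tutorial.

```python
from importlib import metadata


def installed_versions(packages=("pandas", "numpy", "matplotlib")):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the current environment.
            versions[name] = None
    return versions


# Print one line per package so missing installs are easy to spot.
for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

Any package reported as "not installed" can be added with the matching `pip install` command above.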

In [None]:
import pandas as pd 
import numpy as np 
import matplotlib.pyplot as plt 

## Data Processing

The dataset is already in CSV format, so it can immediately be read in using the pandas library. 

In [None]:
data = pd.read_csv('dataset.csv')
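
Before moving on, it can help to check that the file was found and to get a rough sense of its size and missing values. The sketch below is one possible way to do this, assuming the file is named `dataset.csv` as in the cell above; the helpers `load_dataset` and `summarize` are hypothetical names introduced for illustration only.

```python
from pathlib import Path

import pandas as pd


def load_dataset(path="dataset.csv"):
    """Read the CSV at `path`, failing early with a clear message if it is missing."""
    csv_path = Path(path)
    if not csv_path.is_file():
        raise FileNotFoundError(
            f"{csv_path} not found; place the downloaded file next to your analysis code."
        )
    return pd.read_csv(csv_path)


def summarize(df):
    """Return the row count, column count, and per-column missing-value counts."""
    return {
        "rows": df.shape[0],
        "columns": df.shape[1],
        "missing": df.isna().sum().to_dict(),
    }


# Only run the summary when the file is actually present.
if Path("dataset.csv").is_file():
    data = load_dataset("dataset.csv")
    print(summarize(data))
```

Raising `FileNotFoundError` with an explicit message makes a misplaced download easier to diagnose than the default pandas traceback, and the missing-value counts give an early indication of how much cleaning the data may need.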