# Forecasting of Staffing Needs

### Team Members:
- Marcelle Chiriboga
- Davy Guo
- Patrick Tung
- Iris Yang

## Agenda
- Introduction
- The Analysis
    - Predicted Number of Exceptions
    - Predicted Number of Urgent Exception Groups
    - Exceptions Classification
- The Dashboard

# Introduction

## The Partner - Providence Health Care

- Providence Health Care (PHC) is a non-profit organization.
- Almost 9,000 people working at their 16 facilities - 6,000 staff, 1,000 medical staff/physicians, 200 researchers, 1,600 volunteers.
- PHC is the provincial centre for the care of six groups of people with often-intensive health needs.

<div align="center"><img src="img/phc_logo.png"></div>

## The Problem

For most positions in health care, any staff absence must be backfilled by another staff member.

The purpose of this project is to help the People Analytics and Innovation Team at Providence Health Care (PHC) predict short-term staffing needs so that it can prepare for potential unexpected costs and staff shortages.
The predictions are based on the historical records of scheduled exceptions, i.e. staff absences due to unexpected or previously arranged reasons such as vacation, sickness, maternity leave, etc.


<div align="center"><img src="img/phc_strategy.png"></div>

## Objective

The project aimed to predict short-term staffing needs in order to give PHC insight into potential unexpected costs and staff shortages, by:
- forecasting staffing needs on a weekly basis, providing insight into how many backup staff PHC needs to be fully staffed;
- forecasting how many exceptions will result in overtime; and
- classifying exceptions.

# Exception Count Prediction

Forecasting the number of exceptions for Providence Health Care

## Methods for Exception Count Prediction

* Data
    * Training: 2013~2016
    * Validation: 2017
    * Testing: 2018
* Data Wrangling
    * Split data by SITE, JOB_FAMILY, and SUB_PROGRAM
    * e.g. St Paul's Hospital, Registered Nurse - DC1, Emergency
* Fit time series model for each “combination”
* Predict the number of exceptions for the combinations
* Adjust models based on Mean Absolute Error
* Output a .csv file containing the forecasts
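
The wrangling and evaluation steps above can be sketched as follows. This is a minimal illustration in plain Python, not the project's code: the record layout, the helper names `weekly_counts` and `mean_absolute_error`, and the sample values are assumptions, and the time series model itself is left out because the slides do not name it.

```python
from collections import Counter
from datetime import date


def weekly_counts(records):
    """Count exceptions per (SITE, JOB_FAMILY, SUB_PROGRAM) combination and ISO week."""
    counts = Counter()
    for site, job_family, sub_program, day in records:
        iso_year, iso_week, _ = day.isocalendar()
        counts[(site, job_family, sub_program), (iso_year, iso_week)] += 1
    return counts


def mean_absolute_error(actual, predicted):
    """Average absolute gap between observed and forecast weekly counts."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up exception records: (SITE, JOB_FAMILY, SUB_PROGRAM, date of the exception).
combo = ("St Paul's Hospital", "Registered Nurse - DC1", "Emergency")
records = [
    combo + (date(2017, 1, 2),),
    combo + (date(2017, 1, 4),),
    combo + (date(2017, 1, 10),),
]
counts = weekly_counts(records)

# Compare two validation weeks with a hypothetical forecast of 1.5 exceptions per week.
actual = [counts[combo, (2017, 1)], counts[combo, (2017, 2)]]
validation_mae = mean_absolute_error(actual, [1.5, 1.5])
```

In a setup like this, the validation year would be used to compare candidate models per combination, keeping the one with the lowest Mean Absolute Error before forecasting the test year.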

## Product/Interface

![](img/exception_gui.png)

## Output file

* .csv file containing all the predictions (on a weekly basis)

![](img/example_output.png)

## Difficulties

* Certain combinations of data had very few exceptions
    * Little to no pattern
    * Predictions are not meaningful

* e.g. SVH Langara, Registered Nurse - DC1, ALDER
![](img/SVHLangara-DC1-ALDER.png)

## Solution

* Fit meaningful data using a threshold
    * Must have 300 exceptions within the past 4 years

* e.g. St Paul's Hospital, Registered Nurse - DC1, EMERG
![](img/SPH-DC1-EMERGSPH.png)
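
The threshold rule can be expressed as a simple filter. The sketch below assumes the rule means "at least 300 exceptions" over the 4-year window; the function name `combinations_to_fit` and the counts are made up for illustration.

```python
MIN_EXCEPTIONS = 300  # threshold from the slide, read here as "at least 300"


def combinations_to_fit(total_exceptions, minimum=MIN_EXCEPTIONS):
    """Keep the combinations whose 4-year exception total meets the threshold."""
    return [combo for combo, total in total_exceptions.items() if total >= minimum]


# Hypothetical 4-year totals per (SITE, JOB_FAMILY, SUB_PROGRAM) combination.
totals = {
    ("St Paul's Hospital", "Registered Nurse - DC1", "EMERG"): 4200,
    ("SVH Langara", "Registered Nurse - DC1", "ALDER"): 12,
}
kept = combinations_to_fit(totals)
```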

# Urgent Exception Prediction

Predicting the number of urgent exceptions

## Urgent Exception

- Exceptions backfilled by **Overtime** and **Relief Not Found**
- Overtime: high cost that needs to be minimized
- Relief Not Found: needs to be avoided

## Motivation

- Give insight so that HR can arrange on-call staff and other backfills

## Method

- Linear Regression

## Data

- Dates: Until 2018, excluding 2014
- Job Family: DC1000, DC2A00, DC2B00
- Earning Category: Overtime & Relief Not Found

## Variables

- Day of week, day of month
- Week of year, month of year
- Productive hours
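
As a rough illustration of how these variables could be derived for one day, the sketch below builds a feature row from a date and a productive-hours value. The function name, the dictionary keys, the ISO week convention, and the sample numbers are assumptions, not the project's actual encoding.

```python
from datetime import date


def calendar_features(day, productive_hours):
    """Build one row of predictors for a given date."""
    return {
        "day_of_week": day.isoweekday(),  # 1 = Monday ... 7 = Sunday
        "day_of_month": day.day,
        "week_of_year": day.isocalendar()[1],  # ISO week number
        "month_of_year": day.month,
        "productive_hours": productive_hours,
    }


# Made-up example: a Thursday in March 2018 with 1250 productive hours.
features = calendar_features(date(2018, 3, 15), productive_hours=1250.0)
```

Rows like this, one per date and job family, would then be passed to the linear regression together with the observed urgent exception counts.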

## Input file

- Exception Hours for past years
- Productive Hours for past years
- Productive Hours for the period you want to predict

## Output file

- `.csv` file with dates, job family, predicted count

<img src="../imgs/urgent_2.png" align=middle>

## Difficulties

- Low correlation with the predictors
- Randomness on a daily basis

# Exception Label Prediction

Forecasting the label of exceptions for Providence Health Care

## Methods for Exception Label Prediction

* Classification: 
    * Random Forest

## Data Wrangling

* Target Group:
    * Nurse (LABOR_AGREEMENT: NURS)
* Training Set:
    * 2013 ~ 2018
* Test Set:
    * 2018
* Site:
    * 6 sites in total, as suggested by the partner (St Paul's Hospital, Mt St Joseph, etc.)

#### Label Grouping
* The original EARNING_CATEGORY has 12 values, which is too many for prediction


* 3 labels are more reasonable for prediction:


    * Straight Time: Regular Relief Utilized, Casual at Straight-Time, etc. 
    * Overtime and Beyond: Overtime, Insufficient Notice, etc.
    * Relief Not Needed: Relief Not Needed.
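
The grouping amounts to a lookup from EARNING_CATEGORY to one of the three labels. The sketch below maps only the categories named above; the remaining original values are not listed on the slide, so they are deliberately left unmapped. The names `LABEL_GROUPS` and `group_label` are illustrative.

```python
# Only the categories named on the slide; the other original values are not listed.
LABEL_GROUPS = {
    "Regular Relief Utilized": "Straight Time",
    "Casual at Straight-Time": "Straight Time",
    "Overtime": "Overtime and Beyond",
    "Insufficient Notice": "Overtime and Beyond",
    "Relief Not Needed": "Relief Not Needed",
}


def group_label(earning_category):
    """Return the grouped label, or None for a category that is not mapped here."""
    return LABEL_GROUPS.get(earning_category)


grouped = [group_label(c) for c in ["Overtime", "Casual at Straight-Time", "Relief Not Needed"]]
```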

## Model Training

* Natural Prediction Model:
    * Predicts 3 labels, assuming some exceptions do not require relief.
* Conservative Prediction Model:
    * Predicts 2 labels, assuming every exception requires relief.

### Feature Selection

    "EXCEPTION_HOURS" - How long the exception will be
    "EXCEPTION_CREATION_TO_SHIFTSTART_MINUTES" - The gap between submission time and shift start time
    "SITE" - location
    "PROGRAM" - 
    "SUB_PROGRAM" - 
    "EXCEPTION_GROUP" - The reason for the exception
    "MONTH" - natural month
    "DEPARTMENT" - 
    "NOTICE" - Staff response time
    "SHIFT" - Which shift the exception falls on

## Prediction Result Analysis
<img src="../imgs/rf_1.png" align=middle>

### Difficulties
#### Imbalanced Data
<img src="../imgs/rf_2.png" align=middle>

### Solution
`class_weight = "balanced"`
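
Assuming the classifier is scikit-learn's `RandomForestClassifier` (the slides do not name the library), the "balanced" setting weights each class inversely to its frequency, using `n_samples / (n_classes * class_count)`. The sketch below reproduces that arithmetic in plain Python on made-up labels; `balanced_class_weights` is an illustrative helper, not a library function.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Made-up imbalance: 8 straight-time exceptions for every 2 overtime exceptions.
labels = ["Straight Time"] * 8 + ["Overtime and Beyond"] * 2
weights = balanced_class_weights(labels)
```

The rarer class receives the larger weight, so misclassifying it costs more during training.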

#### Prediction result after adjustment
<img src="../imgs/rf_3.png" align=middle>

### Data Output
    
* One dataframe with the prediction result and suggestion
<img src="../imgs/rf_4.png" align=middle>

#### Remaining Challenge