# SARIMAX Sick Leave Predictor (Q Healthcare)

This model card is one of a series covering different sectors, including C Manufacturing and G Trade. Each card documents the parameters, performance, and considerations specific to applying the SARIMAX model to one sector.

## Model Card Version 1.0

---

## 🧪 Model Description
The SARIMAX Sick Leave Predictor is a time series forecasting model developed to predict quarterly sick leave percentages for the **Q Healthcare sector** in the Netherlands. The model uses historical sick leave data from the CBS dataset combined with exogenous variables, such as weather data, to forecast sick leave trends up to Q3 2024.

The model was tuned to handle **COVID-19 outliers** and improve predictions for **Q1 and Q2** using adjusted hyperparameters.

---

## 🜟 Intended Use
The model is intended to:
- Help the **UWV** (Employee Insurance Agency) better predict future sick leave percentages.
- Assist policymakers and healthcare managers in planning staffing levels and managing workloads in the healthcare sector.

---

## 🗲 Model Architecture
- **Model Type**: SARIMAX (Seasonal AutoRegressive Integrated Moving Average with eXogenous factors)
- **Order Parameters**: `(0, 1, 1)`
- **Seasonal Order Parameters**: `(0, 1, 2, 4)`

The model applies **rolling forecasts** for each quarter and uses a **recent 5-year rolling window** to improve predictions for recent years.
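
One possible reading of this setup is a one-step-ahead loop in which each quarter is forecast from only the 20 most recent quarters (5 years). The sketch below shows that windowing logic in plain Python. The `seasonal_naive` forecaster is a hypothetical placeholder so the example runs without dependencies; in the real pipeline a SARIMAX fit would take its place, and the exact refitting scheme used by the project is not documented here.

```python
from typing import Callable, List, Sequence, Tuple

WINDOW_QUARTERS = 20  # 5 years x 4 quarters


def seasonal_naive(history: Sequence[float]) -> float:
    """Placeholder forecaster: repeat the value from the same quarter last year."""
    return history[-4]


def rolling_forecast(
    series: Sequence[float],
    window: int = WINDOW_QUARTERS,
    forecaster: Callable[[Sequence[float]], float] = seasonal_naive,
) -> List[Tuple[int, float]]:
    """Forecast each point from the `window` observations immediately before it.

    Returns (target_index, forecast) pairs for every index that has a full window.
    """
    results = []
    for target in range(window, len(series)):
        history = series[target - window:target]  # most recent `window` quarters only
        results.append((target, forecaster(history)))
    return results


# Synthetic series: 6 years of quarterly values with a higher first quarter.
series = [5.0 + 0.5 * (q % 4 == 0) for q in range(24)]
forecasts = rolling_forecast(series)
for target, value in forecasts:
    print(f"quarter index {target}: forecast {value:.2f}, actual {series[target]:.2f}")
```

Dropping older observations keeps the fit focused on recent behaviour, at the cost of fewer data points per fit.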

---

## 📊 Evaluation
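The model was evaluated using the **Mean Absolute Error (MAE)** metric, reported per quarter and, for 2022 and 2023, across all quarters of the year. Because the target is a sick leave percentage, MAE values are in percentage points.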

| **Year**   | **Quarter** | **MAE**  |
|------------|-------------|----------|
| 2022       | Q1          | 0.1340   |
| 2022       | All Quarters| 0.1709   |
| 2023       | Q1          | 0.0453   |
| 2023       | All Quarters| 0.3183   |
| 2024       | Q1          | 0.0899   |
| 2024       | Q2          | 0.0541   |
| 2024       | Q3          | 0.1742   |
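
For reference, MAE is the average of the absolute differences between actual and predicted values. The sketch below computes it in plain Python on made-up numbers, not on the values behind the table. For a row covering a single quarter, the MAE reduces to that quarter's absolute error.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Illustrative sick leave percentages for four quarters (synthetic values).
actual = [5.8, 5.1, 4.9, 5.6]
predicted = [5.7, 5.3, 4.9, 5.9]

mae_all_quarters = mean_absolute_error(actual, predicted)
mae_q1_only = mean_absolute_error(actual[:1], predicted[:1])
print(f"All quarters: {mae_all_quarters:.4f}, Q1 only: {mae_q1_only:.4f}")
```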

---

## 🔧 Hyperparameters
- **Order**: `(0, 1, 1)`
- **Seasonal Order**: `(0, 1, 2, 4)`
- **Seasonal Order Q2**: `(0, 1, 1, 4)`


Special adjustments were made for Q1 and Q2 to handle **COVID-19-related anomalies** in 2022 and the sharp drop in sick leave after the pandemic.

| **Quarter** | **Adjustment**          |
|-------------|-------------------------|
| Q1          | Smoothed outlier for 2022 (7.65%) |
| Q2          | Smoothed outlier for 2022 (7.00%) |
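
The card does not specify how the outliers were smoothed. As a minimal sketch of one common approach, the example below replaces a flagged observation with the mean of the same quarter in the adjacent years. Both the method and the values are assumptions for illustration and should not be read as the procedure actually applied to the 2022 data.

```python
from typing import Dict, Tuple

Quarter = Tuple[int, int]  # (year, quarter number)


def smooth_outlier(values: Dict[Quarter, float], year: int, quarter: int) -> Dict[Quarter, float]:
    """Return a copy where (year, quarter) is replaced by the mean of the
    same quarter in the previous and following year, where available."""
    neighbours = [
        values[(y, quarter)]
        for y in (year - 1, year + 1)
        if (y, quarter) in values
    ]
    if not neighbours:
        raise ValueError("no neighbouring years available for smoothing")
    smoothed = dict(values)  # leave the input untouched
    smoothed[(year, quarter)] = sum(neighbours) / len(neighbours)
    return smoothed


# Synthetic Q1 values with an artificial spike in 2022.
q1_values = {(2021, 1): 6.0, (2022, 1): 8.0, (2023, 1): 6.5}
q1_smoothed = smooth_outlier(q1_values, year=2022, quarter=1)
print(q1_smoothed[(2022, 1)])
```

Smoothing is applied to the training data only, so the model learns the typical seasonal pattern rather than the pandemic spike.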

---

## 🧬 Limitations
- The model's performance may decrease for future quarters (beyond Q3 2024) as it relies heavily on historical data trends.
- **COVID-19 outliers** have been smoothed out, but future pandemics or unexpected events could impact predictions.
- The model assumes **seasonality** based on historical data, which might not hold in rapidly changing environments.

---

## 🛠️ Future Improvements
- Incorporate more **exogenous variables** (e.g., flu cases, temperature) to improve prediction accuracy.
- Extend the forecasting period beyond Q3 2024 with additional validation.
- Develop **branch-specific models** to account for unique sectoral trends.

---

## 🛎️ Ethical Considerations
The model outputs should be interpreted with caution. Predictions could affect staffing decisions, which may have real-world implications for healthcare workers' workloads and patient care. It’s essential to consider other non-quantitative factors in decision-making.

---

## 📞 Contact Information
For questions or feedback regarding this model, please contact:

**Name**: Caroline Hakker  
**Email**: c.hakker@vistacollege.nl  
**Organization**: Projectteam EAISI UWV

