# Final Report

## Forecasting Paediatric Emergency Department Attendances

---

## Executive Summary

**Background**

Accurate forecasting of paediatric emergency department (ED) attendances is essential for NHS Acute Trusts to optimise staffing and resource allocation. Although improving ED performance often requires additional staffing and infrastructure, the ability to accurately forecast patient attendances enables managers to optimise the use of available resources [[1]](https://www-sciencedirect-com.uoelibrary.idm.oclc.org/science/article/pii/S1386505618302429). This study aims to evaluate multiple time series forecasting models to predict daily attendances over a 28-day horizon, supporting workforce planning.

**Methods**

Historical daily attendance data from 2014 to 2017 was used to develop forecasting models. A benchmark model was selected after trialling Naive 1 and Seasonal Naive, with Naive 1 chosen for its lower Mean Absolute Error (MAE). Naive 1 forecasting assumes that the next day's attendance is equal to the last observed value. The benchmark model was compared against Auto-Regressive Integrated Moving Average (ARIMA) and Facebook's Prophet. Model performance was evaluated using MAE and Winkler Scores for 80% and 90% prediction intervals. Time series cross-validation was also employed to ensure robustness.

**Results**

The Naive 1 benchmark model achieved an MAE of 9.66, with Winkler scores of 96.15 and 123.41 for the 80% and 90% prediction intervals, respectively. Prophet outperformed Naive 1, achieving an MAE of 6.11, with Winkler scores of 24.43 and 25.59 for the 80% and 90% prediction intervals, respectively. Prophet therefore provides more accurate short-term forecasts with substantially tighter prediction intervals, making it better suited to operational decision-making and resource allocation. The data also show that attendances are typically higher on Sundays and Mondays, suggesting that increased ED staffing may be beneficial on these days.


**Conclusions**

This study establishes a validated forecasting approach for the ED, demonstrating that Prophet significantly improves upon naive forecasting methods. The recommended model incorporates UK public holidays, providing a practical and high-quality forecasting method. Future research could explore the impact of additional external factors, such as weather conditions, school term dates, or local outbreaks of infectious diseases, to further refine forecasting accuracy.

---

## 1. Introduction

#### **1.1 Problem Background**
Accurate forecasting of paediatric emergency department attendances is essential for optimising NHS staffing levels. Unanticipated surges in patient numbers can lead to longer waiting times, increased staff workload and resource strain.

The NHS Acute Trust requires a reliable forecasting model to predict the attendances over the next 28 days. This will help in resource planning and staffing adjustments.

This report aims to:

- Develop a reliable forecasting model for the next 28 days.
- Identify weekly and seasonal trends to inform workforce planning.
- Recommend an approach that NHS analysts can implement in practice.

#### **1.2 Data Overview**
The time series dataset provided (`paediatrics_train.csv`) contains:
- **Date column**: 'date' (YYYY-MM-DD format)
- **Attendance count**: 'paeds_ed_attends' (integer)

The time series data spans from **01/04/2014 to 19/02/2017**.
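
As a minimal sketch of how this file could be read and sanity-checked, the snippet below parses the two columns named above using only the Python standard library. The function names `load_attendances` and `missing_days` are illustrative and are not taken from the project notebooks.

```python
import csv
from datetime import date, timedelta


def load_attendances(lines):
    """Parse CSV lines into a date-sorted list of (date, attendances) pairs."""
    reader = csv.DictReader(lines)
    rows = [
        (date.fromisoformat(row["date"]), int(row["paeds_ed_attends"]))
        for row in reader
    ]
    return sorted(rows)


def missing_days(rows):
    """List calendar days with no record between the first and last date."""
    observed = {day for day, _ in rows}
    first, last = rows[0][0], rows[-1][0]
    span = (last - first).days + 1
    all_days = (first + timedelta(days=offset) for offset in range(span))
    return [day for day in all_days if day not in observed]
```

With the real file, `load_attendances` would be passed an open file handle, for example `open("paediatrics_train.csv", newline="")`. Checking `missing_days` first is useful because the forecasting models assume one observation per calendar day.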

---

## 2. Exploratory Data Analysis (EDA)

To better understand the patterns and seasonality of the paediatric ED attendances data, several analyses were conducted and these can be found **[here](01_data_exploration.ipynb)**.

#### **2.1 Seasonality Observations:**


- ##### **Monthly Seasonality:**
Attendances fluctuate across months, rising in the winter months (November and December) relative to the summer months. March is an exception, showing a slight rise in attendances. Some months exhibited greater variability, with outliers in March and low outliers in August.


- ##### **Weekly Attendance Trends:**
Attendances tend to be higher near weekends, particularly on Sundays and Mondays. The interquartile range (IQR) is also wider for these days, indicating greater variability in patient numbers.
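
The day-of-week comparison can be reproduced with a simple grouping. The sketch below assumes the data is available as `(date, attendances)` pairs. The helper name `mean_by_weekday` is hypothetical, and the notebook may well use a different approach, such as pandas box plots.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def mean_by_weekday(rows):
    """Return the mean attendance for each day of the week present in rows."""
    counts = defaultdict(list)
    for day, attendances in rows:
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        counts[day.weekday()].append(attendances)
    return {DAY_NAMES[index]: mean(values) for index, values in sorted(counts.items())}
```

Replacing `mean` with `statistics.quantiles` would give the quartiles needed to compare the interquartile range across days.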


#### **2.2 Stationarity Tests: Augmented Dickey-Fuller (ADF), Kwiatkowski-Phillips-Schmidt-Shin (KPSS):**
Although the visualisations suggest strong weekly and monthly seasonal fluctuations, the ADF and KPSS tests both indicate that the series is stationary. Differencing (the *d* term) is therefore unlikely to be needed when fitting the ARIMA model.

> **ADF Test** (Null Hypothesis: Data is non-stationary)  
> - ADF Statistic: **-5.1146**, p-value: **0.0000** (Since the p-value is less than the typical significance level of 0.05, the null hypothesis is rejected, indicating that the data is stationary.)  

> **KPSS Test** (Null Hypothesis: Data is stationary)  
> - KPSS Statistic: **0.3399**, p-value: **0.1000** (Since the p-value is greater than 0.05, we fail to reject the null hypothesis, which supports the conclusion that the data is stationary.)
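
Because the two tests have opposite null hypotheses, their p-values have to be read in opposite directions. The sketch below makes that decision rule explicit. The function `stationarity_verdict` is a hypothetical helper, and the p-values are assumed to come from a library such as statsmodels (for instance `adfuller` and `kpss` in `statsmodels.tsa.stattools`).

```python
def stationarity_verdict(adf_p, kpss_p, alpha=0.05):
    """Combine ADF and KPSS p-values into a single verdict.

    ADF null hypothesis:  the series is non-stationary (has a unit root).
    KPSS null hypothesis: the series is stationary.
    """
    adf_says_stationary = adf_p < alpha      # ADF rejects its null
    kpss_says_stationary = kpss_p >= alpha   # KPSS fails to reject its null

    if adf_says_stationary and kpss_says_stationary:
        return "stationary"
    if not adf_says_stationary and not kpss_says_stationary:
        return "non-stationary"
    # The tests disagree, so differencing or detrending needs a closer look.
    return "inconclusive"
```

Applied to the values reported above, `stationarity_verdict(0.0000, 0.1000)` returns `"stationary"`.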

#### **2.3 Autocorrelation (ACF) and Partial Autocorrelation (PACF) Plot Analysis:**
- The ACF shows a gradual decline in autocorrelation values, suggesting that past values influence future attendances.
- The PACF plot shows significant spikes for the first 2-4 lags, followed by a rapid drop-off.
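
For readers unfamiliar with the ACF, the following sketch computes the sample autocorrelation at a single lag in plain Python. It is an illustration of the quantity being plotted, not the plotting code used in the notebook, and the name `autocorrelation` is an assumption.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer `lag`."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must be between 0 and len(series) - 1")
    centre = sum(series) / n
    # Total variation around the mean (the lag-0 autocovariance times n).
    denominator = sum((value - centre) ** 2 for value in series)
    # Co-variation between each value and the value `lag` steps earlier.
    numerator = sum(
        (series[t] - centre) * (series[t - lag] - centre) for t in range(lag, n)
    )
    return numerator / denominator
```

For daily data with a weekly cycle, the value at lag 7 would be expected to stand out relative to neighbouring lags.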

---

## 3. Model Selection & Forecasting

### 3.1. <u>**Naive Models**</u> 

The model selection process began with the implementation of two simple forecasting models: Naive 1 and Seasonal Naive. These models were chosen as initial benchmarks due to their simplicity and ease of implementation, providing a clear reference point for evaluating the performance of more complex models. Naive 1 takes the most recent observation as the forecast for all future values, making it a useful baseline for comparison. Seasonal Naive extends this by using the last observed value from the same point in the season (for example, the same day of the previous week), making it better suited to seasonal patterns. Of the two, **Naive 1** had the lower MAE (9.66 vs. 10.48) and was used as the baseline for this study.

   
- **MAE:** 9.66, 80% PI: 96.15, 90% PI: 123.41
- **Strength:** Simple and computationally efficient
- **Weakness:** Unable to capture trends or seasonality

Formula: $\hat{y}_t = y_{t-1}$
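
Both benchmarks are simple enough to sketch in a few lines. The function names below are illustrative. The weekly period of 7 is taken from the day-of-week example above, and the 28-day horizon used elsewhere in this report would be passed as `horizon`.

```python
def naive1_forecast(history, horizon):
    """Repeat the last observed value for every step of the horizon."""
    return [history[-1]] * horizon


def seasonal_naive_forecast(history, horizon, period=7):
    """Repeat the last full season (default: the last 7 days) over the horizon."""
    if len(history) < period:
        raise ValueError("history must contain at least one full season")
    last_season = history[-period:]
    return [last_season[step % period] for step in range(horizon)]
```

Naive 1 therefore produces a flat line, whereas Seasonal Naive repeats the most recent weekly pattern.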

---
  
### 3.2 <u>**Auto-Regressive Integrated Moving Average (ARIMA)**</u>
The next model explored was ARIMA, a statistical model made up of three components:

- Autoregressive (AR): predicts future values based on a linear combination of past values.
- Integrated (I): applies differencing to remove any trends from the data.
- Moving Average (MA): incorporates past forecast errors into the prediction process.

Since ARIMA models work best with stationary data, the statistical tests described in Section 2 were used to assess stationarity. The Augmented Dickey-Fuller (ADF) test rejected non-stationarity *(p-value: 0.0000)*, while the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test found no evidence against stationarity *(p-value: 0.1000)*. This suggested that differencing (the I component) might not be necessary.

#### 3.2.1 **Auto ARIMA**
To optimise model selection, an Auto ARIMA approach was used, which automatically determines the best ARIMA configuration based on statistical criteria. The best-performing model was identified as ARIMA(0,1,1)(0,0,2,7). It combines first-order differencing, which accounts for trend, with one non-seasonal moving-average term and two seasonal moving-average terms at a weekly period of 7 days.

- **MAE:** 7.78, 80% PI: 35.03, 90% PI: 38.36
- **Order:** (0, 1, 1), seasonal order (0, 0, 2, 7)
- Captured trend and weekly seasonality through differencing and moving-average terms
- Performed worse than Prophet in terms of both MAE and prediction intervals
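
Written out with the backshift operator $B$ (where $B y_t = y_{t-1}$), the (0, 1, 1), (0, 0, 2, 7) specification takes the form below. This assumes no constant term, and $\theta_1$, $\Theta_1$ and $\Theta_2$ are placeholder symbols for the fitted coefficients, which are not reported here.

$$(1 - B)\,y_t = (1 + \theta_1 B)\,(1 + \Theta_1 B^{7} + \Theta_2 B^{14})\,\varepsilon_t$$

The left-hand side is the first difference of the series ($d = 1$). The right-hand side contains only moving-average terms: one at lag 1 ($q = 1$) and two seasonal terms at lags 7 and 14 ($Q = 2$ with a period of 7). Because $p = 0$ and $P = 0$, there are no autoregressive terms.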

#### 3.2.2 **Manual ARIMA Models**
To further validate the Auto ARIMA results, two additional manual ARIMA models were implemented with cross validation to explore whether slight modifications would lead to better or worse performance. These models were carefully chosen based on the stationarity results and seasonality of the dataset.

**ARIMA(1, 1, 0) (1, 0, 0, 7)** – Higher MAE of 8.79

Reason for selection:
- Introduced an autoregressive term (AR=1) to see if incorporating past values directly would improve predictions.
- Maintained differencing (I=1), though it may not have been necessary given stationarity.
- Excluded a moving average (MA=0), potentially reducing its ability to adjust for short-term fluctuations.
- Seasonal component (1, 0, 0, 7) was included to capture weekly patterns.

**ARIMA(2, 1, 0) (1, 0, 0, 7)** – Higher MAE of 8.89

Reason for selection:
- Increased the autoregressive component to AR=2, testing whether a longer memory of past values could improve performance.
- Maintained differencing (I=1) though, as stationarity tests suggested, this may have been unnecessary.
- No moving average component (MA=0), reducing short-term adaptability.
- Seasonal component (1, 0, 0, 7) was included to acknowledge weekly seasonality.

---

### 3.3 <u>**Prophet**</u>
The final model used in this study was Prophet, which is designed to handle components of a time series such as trend, seasonality and external factors such as holidays. Given that paediatric ED demand may shift seasonally, Prophet was a good candidate for this study. Computationally, it is much more intensive than the Naive and ARIMA models. Prophet also uses Monte Carlo sampling to produce its prediction intervals (PIs), so the intervals may vary between runs unless a random seed is fixed, although the predicted values (`yhat`) remain unaffected [[2]](https://facebook.github.io/prophet/docs/diagnostics.html).

   
- **MAE:** 6.11, 80% PI: 24.43, 90% PI: 25.59
- A flexible time series model incorporating trend, seasonality and holiday effects
- Configured with UK public holidays
- Outperformed both Naive 1 and ARIMA in all accuracy metrics
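
A configuration along these lines could look like the sketch below. It is not the notebook's exact code: the function name, the `seed` argument and the default `interval_width` are assumptions. It also assumes that `train` is a pandas DataFrame with the columns described in Section 1.2, and that Prophet draws its interval samples through NumPy's global random number generator.

```python
HORIZON_DAYS = 28  # forecast horizon used throughout this report


def forecast_with_uk_holidays(train, interval_width=0.8, seed=42):
    """Fit Prophet with UK public holidays and forecast HORIZON_DAYS ahead."""
    # Imported here so the sketch can be loaded without the packages installed.
    import numpy as np
    from prophet import Prophet

    # Assumption: seeding NumPy's global RNG makes the sampled intervals repeatable.
    np.random.seed(seed)

    # Prophet expects the date column to be named 'ds' and the target 'y'.
    frame = train.rename(columns={"date": "ds", "paeds_ed_attends": "y"})

    model = Prophet(interval_width=interval_width)
    model.add_country_holidays(country_name="UK")
    model.fit(frame)

    future = model.make_future_dataframe(periods=HORIZON_DAYS, freq="D")
    forecast = model.predict(future)
    # Keep only the out-of-sample rows with the point forecast and interval.
    columns = ["ds", "yhat", "yhat_lower", "yhat_upper"]
    return forecast[columns].tail(HORIZON_DAYS)
```

Calling the function once with `interval_width=0.8` and once with `interval_width=0.9` would give the two sets of intervals scored in this report.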

---

Each model was evaluated using forecast accuracy metrics. Cross-validation was applied to each model to ensure robustness and to give a more balanced view of its behaviour.
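
The evaluation procedure can be sketched as follows, using the standard definitions of MAE and the Winkler score together with an expanding-window (rolling-origin) split. The split parameters `min_train` and `step` are assumptions, since the report does not state the exact fold layout, and all function names are illustrative.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between observations and forecasts."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def winkler_score(lower, upper, actual, alpha):
    """Winkler score of one interval at level 1 - alpha (0.2 for an 80% PI).

    The score is the interval width, plus a penalty of 2 / alpha times the
    distance by which the observation falls outside the interval.
    """
    width = upper - lower
    if actual < lower:
        return width + (2 / alpha) * (lower - actual)
    if actual > upper:
        return width + (2 / alpha) * (actual - upper)
    return width


def rolling_origin_splits(n_obs, min_train, horizon=28, step=7):
    """Yield (train_indices, test_indices) with an expanding training window."""
    origin = min_train
    while origin + horizon <= n_obs:
        yield range(0, origin), range(origin, origin + horizon)
        origin += step


def cross_validated_mae(series, forecaster, min_train, horizon=28, step=7):
    """Average the MAE of `forecaster(history, horizon)` over all folds."""
    fold_scores = []
    for train_idx, test_idx in rolling_origin_splits(
        len(series), min_train, horizon, step
    ):
        history = [series[i] for i in train_idx]
        actual = [series[i] for i in test_idx]
        predicted = forecaster(history, horizon)
        fold_scores.append(mean_absolute_error(actual, predicted))
    if not fold_scores:
        raise ValueError("series is too short for a single fold")
    return sum(fold_scores) / len(fold_scores)
```

Lower values are better for both metrics. The Winkler score rewards narrow intervals but penalises them heavily when the observed value falls outside.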

#### **3.4 Model Evaluation Metrics using Cross Validation**

| Model | MAE | Winkler Score (80%) | Winkler Score (90%) |
|--------|------|------|----------------|
| **Prophet (with UK Holidays)** | 6.11 | 24.43 | 25.59 |
| Prophet (with Weekly Seasonality) |6.13|24.21|30.11|
| Prophet (with Monthly Seasonality) |6.19|25.45|32.68|
| Prophet (Baseline Model) |6.33|24.54|30.32|
| **ARIMA Auto (0, 1, 1), (0, 0, 2, 7)**| 7.78 | 35.03 | 38.36 |
| ARIMA (1, 1, 0), (1, 0, 0, 7)|8.79|60.91|78.18|
| ARIMA (2, 1, 0), (1, 0, 0, 7)|8.89|58.65|75.27|
| **Naive 1 (baseline)** | 9.66 | 96.15 | 123.41 |
| Seasonal Naive|10.48|42.16|51.99|


<i> Table 1: Results of various model configurations</i>

---

## 4. Results & Forecast Visualisation

- In *Image 1*, the Naive 1 model simply projects the last observed value into the future, which results in predictions that remain static and do not track the fluctuations of the test data. As time progresses, the prediction intervals widen, indicating growing uncertainty about future values.
  
- *Image 2* displays the Auto ARIMA model's predictions. Compared to Naive 1, this model performs better by capturing the fluctuations and trends in the data, following the peaks and troughs with greater accuracy. However, towards the end of the forecast, the prediction intervals expand, suggesting increased uncertainty as the forecast horizon lengthens.

  
- *Image 3* shows the performance of the Prophet model, which delivers promising results. The predictions closely align with the test data, and the prediction intervals are much narrower, indicating more confidence in the forecasts for the given period.

<img src="images/naive_graph.png" width="600"/>

<i>Image 1: Naive 1 Predictions</i>

---

<img src="images/auto_arima_graph.png" width="600"/>

<i>Image 2: ARIMA Predictions</i>

---

<img src="images/prophet_graph.png" width="600"/>

<i>Image 3: Prophet Predictions. See [here](images/prophet_predictions.png) for detailed predictions</i>

---

## 5. Discussion & Insights

#### **5.1 Key Observations**

- **Weekly seasonality:** Higher attendances on **Sundays and Mondays** as seen in *Image 4*.
- **Monthly trends:** Spikes observed in **March, November and December**.
- **Prophet performed best:** its MAE was about 37% lower than that of the baseline model (6.11 vs. 9.66), meaning it made smaller forecasting errors.

<img src="images/box_plots.png" width="600"/>

<i>Image 4: Weekly and Monthly Boxplots</i>

#### **5.2 Strengths**
- This study focuses on well-known forecasting methods that are free and open source, which helps the NHS keep costs to a minimum.
- Prophet's accuracy shows that it captures seasonal patterns well, especially in short-term forecasting, which is vital for NHS staffing decisions.
- As Monks et al. [[3]](https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-023-02218-z) describe, a benchmark forecasting method is critical for assessing other, more complex forecasting methods. This study followed that approach.

#### **5.3 Limitations**
- The recommended forecasting tool, Prophet, will still need further validation. A change in workflow or a new intervention within the ED could shift future patterns.
- The dataset does not include **external factors** (e.g. weather forecasts or major events such as sporting events).
- Model accuracy may decline when forecasting beyond 28 days, as longer-term predictions are more uncertain.
- Some outliers may impact forecast accuracy.

#### **5.4 Future Improvements**
- **Consider Neural Networks as a forecasting tool.** Although computationally much more expensive, they could improve on the model performance metrics reported here.
- **Incorporate exogenous variables** (e.g. school holidays or flu seasons).


---

## 6. Conclusion & NHS Recommendations

#### **6.1 Summary of Findings**
- Best Model: Prophet, especially when including UK public holidays, was the best performer in terms of MAE and prediction intervals.
- Reliability for Short-term Forecasting: The selected models, particularly Prophet, can support staffing decisions over a short period (28-day forecast). The forecast could be run at the beginning of each working week for the following 28 days.
- Peak attendances seem to occur on Sundays and Mondays, suggesting that increased staff allocation on these days would be beneficial.

#### **6.2 Recommendations**
- The Trust should increase staffing levels on high-attendance days, based on the weekly seasonality and trends found.
- Update forecasts monthly, or even weekly, to capture new seasonal trends.
- Consider integrating real-time data sources, such as local flu outbreaks and school holidays in the surrounding areas, to improve accuracy.
- Staff should be trained in using this forecasting tool and in adjusting or fine-tuning its parameters if necessary.

#### **6.3 Conclusion**
This study explored several time series forecasting models, including Naive, ARIMA and Prophet, for predicting the next 28 days of paediatric ED attendances. The Naive model served as a baseline, while ARIMA effectively captured trends and seasonality. However, the Prophet model, with its flexibility to handle seasonality and trends, outperformed the others for short-term forecasting.


While this study successfully demonstrated the application of several forecasting techniques for these predictions, it highlights the importance of continuous model evaluation and adaptation to real-time data. The findings from this research will help healthcare professionals make more informed decisions about staffing and resource allocation, ultimately improving service delivery within the emergency department.
