# Health policy evaluation method 2: Interrupted time series and regression discontinuity design

Simen Svenkerud

## Outline

* Interrupted time series - Principles and methods
* Regression discontinuity designs - Principles and methods
* When to use each of these methods
* Strengths and weaknesses
* A few examples

## Analysing interventions in 7 steps

1. Defining policies and interventions
2. Specifying a 'theory of change'
3. Defining outcomes, indicators and data collection
4. **Establishing the counterfactual**
5. **Analysing and interpreting the effect**
6. Disseminating findings
7. Transferring results

## Interrupted time series

{Insert a graph of interrupted time series}

## Interrupted time series with control

{Insert graph of time series with control}

## Using routine data to evaluate interventions

* Administrative datasets can be used to evaluate change using quasi-experimental methods
    * 'Second best' evaluation when RCTs are not feasible
* Particularly useful for interrupted time series
    * Depending on the data source, can take observations monthly, weekly or even daily

## Time series and interventions

* Time series - a 'sequence of values of a particular measure taken at regularly spaced intervals over time'
* Change points - specific points in time where values in a time series might change from a previously established pattern
    * Real world event
    * Policy change
    * Experimental intervention

## Interrupted time series analysis

* Even when no control group is possible, e.g. when a change is one-off and national, interrupted time series analysis is possible
* Most often used with administrative data rather than survey data
    * Needs regular and frequent measures, and numerous time points before and after the change
* Many examples of potential data sources - hospital episodes, prescription data, crime figures, road traffic accidents, etc.

## ITS Choices

* Choice of change point
    * Beginning and end of each segment requires careful thought, e.g. lag times to allow an intervention to take effect
* Choice of statistical analysis - a number of options
    * Segmented regression
        * e.g. Wagner AK et al. Segmented regression analysis of interrupted time series studies in medication use research. Journal of Clinical Pharmacy and Therapeutics 2002; 27: 299-309
    * ARIMA models (often hierarchical)

## Segmented regression analysis

* Method for statistically modelling ITS data to draw conclusions
* Identify any change in *level* or *trend* after some intervention point
    * Abrupt intervention effect
    * Gradual change

## Change in level (Intercept)

{Insert figure on level change}

## Change in trend (slope)

{insert figure with slope change}

## Estimating changes

* First step - scatter plot or line graph to explore trends and change points
    * Visual inspection is essential
* Fit a least squares regression line to each segment
    * Assuming a linear (or other) relationship between time and outcome
    
$$Y_t = \beta_0 + \beta_1 \cdot \text{time}_t + \beta_2 \cdot \text{intervention}_t + \beta_3 \cdot \text{time after intervention}_t + e_t$$
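
The segmented regression model above can be fitted by ordinary least squares. The following is a minimal sketch in Python, not a definitive implementation. The function names, the synthetic monthly series and the coding of "time after intervention" (0 up to and including the first post-change observation, then 1, 2, 3, ...) are assumptions made for illustration; other codings shift the interpretation of the level change slightly. It also ignores autocorrelation, which should be checked before the estimates are trusted.

```python
import numpy as np


def segmented_design(n_points, change_index):
    """Build the design matrix [1, time, intervention, time_after]."""
    time = np.arange(1, n_points + 1, dtype=float)  # 1, 2, ..., n
    # 0 before the change point, 1 from the change point onwards
    intervention = (np.arange(n_points) >= change_index).astype(float)
    # Assumed coding: 0 before and at the first post-change point, then 1, 2, ...
    time_after = np.maximum(np.arange(n_points) - change_index, 0).astype(float)
    return np.column_stack([np.ones(n_points), time, intervention, time_after])


def fit_segmented_regression(y, change_index):
    """Return OLS estimates (beta0, beta1, beta2, beta3)."""
    y = np.asarray(y, dtype=float)
    X = segmented_design(len(y), change_index)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Hypothetical monthly series: 24 points before and 24 after the change.
rng = np.random.default_rng(0)
true_beta = np.array([50.0, 0.5, -8.0, -0.4])  # illustrative values only
X = segmented_design(48, change_index=24)
y = X @ true_beta + rng.normal(scale=1.0, size=48)

beta0, beta1, beta2, beta3 = fit_segmented_regression(y, change_index=24)
print(f"baseline level (beta0): {beta0:.2f}")
print(f"baseline trend (beta1): {beta1:.2f}")
print(f"change in level (beta2): {beta2:.2f}")
print(f"change in trend (beta3): {beta3:.2f}")
```

With this coding, `beta2` is the estimated change in level at the first post-intervention observation and `beta3` is the change in slope relative to the pre-intervention trend.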

## Data example

{Table of data}

## Graphical presentation of the data

{Insert linechart}

## Results for the data

## More than one change point

{Insert graph, look at old slides}
{add on mathematical derivation}
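
As a placeholder for the derivation, one common way to handle two change points is to add a second pair of level and trend terms to the single-intervention model. This is a sketch under that assumption, with "intervention 1" and "intervention 2" as illustrative labels rather than names taken from the slides:

$$
Y_t = \beta_0 + \beta_1 \cdot \text{time}_t + \beta_2 \cdot \text{intervention 1}_t + \beta_3 \cdot \text{time after intervention 1}_t + \beta_4 \cdot \text{intervention 2}_t + \beta_5 \cdot \text{time after intervention 2}_t + e_t
$$

Here $\beta_2$ and $\beta_3$ are the changes in level and trend after the first change point, and $\beta_4$ and $\beta_5$ are the further changes in level and trend after the second change point, each relative to the segment immediately before it.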

## Checks and controls

* Autocorrelation
    * Correlated error terms - check residuals and the Durbin-Watson (DW) statistic
* Seasonality
    * Control for seasonal fluctuations when necessary
* Extreme values - 'wild' data points
    * Only control for them if you have an underlying explanation
* Bias and confounding
    * Use a control group if possible
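
The autocorrelation check can be illustrated with the Durbin-Watson statistic, which is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below computes it directly with NumPy on two hypothetical residual series; the function name, the series and the autocorrelation of 0.8 are assumptions for illustration only.

```python
import numpy as np


def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences / sum of squares."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)


rng = np.random.default_rng(2)

# Hypothetical residuals with no autocorrelation.
white_noise = rng.normal(size=500)

# Hypothetical AR(1) residuals with positive autocorrelation (0.8 is arbitrary).
ar1 = np.zeros(500)
for t in range(1, 500):
    ar1[t] = 0.8 * ar1[t - 1] + white_noise[t]

print(f"DW, independent residuals:    {durbin_watson(white_noise):.2f}")
print(f"DW, autocorrelated residuals: {durbin_watson(ar1):.2f}")
```

The statistic ranges from 0 to 4. Values near 2 suggest little first-order autocorrelation, values well below 2 suggest positive autocorrelation, and values well above 2 suggest negative autocorrelation.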

## Adjusting for seasonal effects

{Insert graph on seasonal effect and adjusting for it}
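
One possible way to adjust for seasonality is to add seasonal terms to the segmented regression. The sketch below uses a single sine and cosine pair with a 12-month period; this is an assumption chosen for brevity (monthly indicator variables are another common option), and the function names and synthetic data are hypothetical.

```python
import numpy as np


def seasonal_segmented_design(n_points, change_index, period=12):
    """Design matrix [1, time, intervention, time_after, sin, cos]."""
    t = np.arange(n_points, dtype=float)
    intervention = (t >= change_index).astype(float)
    time_after = np.maximum(t - change_index, 0.0)
    angle = 2 * np.pi * t / period
    return np.column_stack([
        np.ones(n_points), t + 1, intervention, time_after,
        np.sin(angle), np.cos(angle),
    ])


def fit(X, y):
    """Ordinary least squares coefficients."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Hypothetical monthly series with a seasonal cycle (illustrative values only).
rng = np.random.default_rng(1)
X_full = seasonal_segmented_design(48, change_index=24)
true_coef = np.array([50.0, 0.5, -8.0, -0.4, 5.0, 2.0])
y = X_full @ true_coef + rng.normal(scale=1.0, size=48)

unadjusted = fit(X_full[:, :4], y)  # ignores the seasonal columns
adjusted = fit(X_full, y)           # includes the seasonal columns

print(f"level change, unadjusted: {unadjusted[2]:.2f}")
print(f"level change, adjusted:   {adjusted[2]:.2f}")
```

Comparing the two fits shows how much of the apparent level change is attributable to the seasonal cycle rather than to the intervention.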

## Strengths of ITS

* Uses existing routinely collected data
    * Relatively low cost and timely to use
* Rigorous method, good scientific basis
* Can incorporate a control group but it is not essential

## Threats to validity

* History
    * Co-occurring events
* Testing
    * Frequent assessment influences the outcome
* Instrumentation
    * Changes in the measuring instrument over time
* Instability
    * Fluctuating measures, inherently unstable
* Regression to the Mean
    * Essential to have an adequate length of time to ensure real effect
* Selection
    * Pre-existing differences between intervention and control groups (regarding their response to the intervention)

## Other Problems

* Routine data is often not designed as a research tool
    * May lack useful information - e.g. hospital activity not outcomes
* Evaluations are retrospective
    * May lack timeliness to inform policy

## Quality criteria for ITS studies (Cochrane EPOC group)

* Protection against secular changes
    * Intervention is independent of other changes over time
* Data were analysed appropriately
    * ARIMA or time series regressions (test for serial correlation)
* Reason for the number of points pre and post intervention was given
* Shape of the intervention effect was specified
* Protection against detection bias (intervention was unlikely to affect measurement)
* Blinded or objective assessment of primary outcomes
* Completeness of dataset
* Reliable primary outcome measures

## Some examples (from UoY student dissertation)

### Interrupted time series: uptake of clinical guidance

### Impact of NICE guidance on hernia repair

### Wisdom teeth extraction

### Use of orlistat for obesity

{add more later}

## Regression discontinuity design

Credit to David Torgerson and Christian Stock

* Another robust, non-randomised evaluation design
* Rediscovered several times since Thistlethwaite and Campbell described it in 1960
    * Mainly used in social science and economics
* Sometimes known as a risk-based cut-off design
* Selects people into a group on the basis of a measurable continuous variable
    * e.g. age, BMI, Blood pressure, deprivation level

## Regression discontinuity - what is it?

* Assign participants to intervention or control based on a continuous variable
* Select a cut-off at a meaningful value
    * The median is the most efficient choice (highest statistical power)
* Participants are assigned on the basis of being different, not the same
    * Overcomes some ethical or practical obstacles to randomisation, e.g. when it is unethical to withhold treatment

## How does it work?

* Select on a pre-test variable
    * Can be related or unrelated to the desired outcome
    * Clearly quantifiable selection criterion, which cannot be manipulated or anticipated by participants
* Compare post-test scores on the intervention outcome
* Test to see if there is a 'discontinuity' in the regression line
    * Compares people just above and just below the cut-off score

{Graph of effective treatment - Shadish et al 2002}
{Graph of ineffective treatment - Shadish et al 2002}
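
The test for a discontinuity can be sketched as a regression with separate slopes on each side of the cut-off, where the coefficient on the treatment indicator estimates the jump at the cut-off. The example below is a minimal illustration, assuming a sharp design in which everyone at or above the cut-off receives the intervention and a linear relationship on both sides. The function name, the optional `bandwidth` argument and the synthetic data are hypothetical.

```python
import numpy as np


def sharp_rdd_estimate(x, y, cutoff, bandwidth=None):
    """Estimate the jump in y at the cut-off, allowing a different slope each side."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if bandwidth is not None:
        # Optionally restrict to observations close to the cut-off.
        keep = np.abs(x - cutoff) <= bandwidth
        x, y = x[keep], y[keep]
    centred = x - cutoff                      # assignment variable, 0 at the cut-off
    treated = (centred >= 0).astype(float)    # assumed: at or above cut-off is treated
    X = np.column_stack([np.ones_like(centred), centred, treated, treated * centred])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[2]                            # discontinuity at the cut-off


# Hypothetical pre-test scores and post-test outcomes (illustrative values only).
rng = np.random.default_rng(3)
pre_test = rng.uniform(0, 100, size=400)
post_test = 20 + 0.3 * pre_test - 4.0 * (pre_test >= 50) + rng.normal(scale=2.0, size=400)

print(f"estimated discontinuity: {sharp_rdd_estimate(pre_test, post_test, cutoff=50):.2f}")
```

Centring the assignment variable at the cut-off is a design choice that makes the treatment coefficient directly interpretable as the difference between the two regression lines at the cut-off.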

## Assignment and cut-offs

* Assignment variable - can be any quantitative variable (related or unrelated to the intervention)
    * E.g. age; quality of life; year; position on waiting list; income; clinical score; educational test score...
* Cut-off
    * Can be based on practical reasons; resource reasons (e.g., first X number on a waiting list); statistical reasons (the median cut gives best statistical power)

## Example

* Exercise and dietary advice for children
    * Might be thought unethical to withhold this from children with a weight problem.
    * Could this be addressed using RDD?
    * What decisions might you need to consider?

{Graph of hypothetical RDD}
{Combining an RDD and RCTs}

## Simulation

* Stock (2007) ran a simulation to see how the RD design performs in a health setting.
* UK BEAM was an RCT of manipulation for low back pain. Stock split the data at the median baseline pain score and used the control group for low scores

{UK BEAM simulation ?}

* The RD method showed a difference between the groups of just over 1 point in low back pain at the cut-point - this was not statistically significant
* In the trial we observed a 2.5 point statistically significant difference.
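
The logic of this kind of simulation can be sketched on synthetic data. The example below does not use UK BEAM data and does not reproduce Stock's analysis; every number in it is made up for illustration. It generates a hypothetical trial, estimates the effect as an RCT would, then mimics an RD design by keeping only treated participants at or above the median baseline score and only control participants below it.

```python
import numpy as np


def simulate_trial(n, true_effect, noise_sd, seed):
    """Generate a hypothetical two-arm trial with a baseline score."""
    rng = np.random.default_rng(seed)
    baseline = rng.normal(loc=9.0, scale=4.0, size=n)   # made-up baseline score
    treated = rng.integers(0, 2, size=n).astype(float)  # random assignment
    outcome = 0.6 * baseline + true_effect * treated + rng.normal(scale=noise_sd, size=n)
    return baseline, treated, outcome


def rct_estimate(treated, outcome):
    """Difference in post-test means between arms."""
    return outcome[treated == 1].mean() - outcome[treated == 0].mean()


def rdd_estimate_from_trial(baseline, treated, outcome):
    """Mimic an RD design: treated above the median, controls below it."""
    cutoff = np.median(baseline)
    high = baseline >= cutoff
    keep = (high & (treated == 1)) | (~high & (treated == 0))
    centred = baseline[keep] - cutoff
    t = treated[keep]
    X = np.column_stack([np.ones(keep.sum()), centred, t, t * centred])
    coef, *_ = np.linalg.lstsq(X, outcome[keep], rcond=None)
    return coef[2]  # jump at the cut-off


baseline, treated, outcome = simulate_trial(n=1000, true_effect=-2.0, noise_sd=3.0, seed=4)
print(f"RCT estimate: {rct_estimate(treated, outcome):.2f}")
print(f"RDD estimate: {rdd_estimate_from_trial(baseline, treated, outcome):.2f}")
```

Re-running with different seeds gives a feel for how much more variable the RD estimate is than the RCT estimate, given that it discards about half of the participants and relies on a correctly specified regression.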

## Comparing RDD to RCTs

#### RDD
* Cut-off assignment
* Assignment is perfectly known and measured
* Compare regressions
* Exchangeability at cutoff

#### RCT
* Random assignment
* Assignment is perfectly known and measured
* Compare post-test means
* Exchangeability

## Comparing RDD to ITS

#### ITS
* Time
* Population level data
* Compare level (Intercept) and trend
* Sensitive to competing interventions
* Sensitive to population composition
* Easy to do retrospectively

#### RDD
* Continuous variable
* Individual level data
* Compare level (Intercept) and trend
* Sensitive to competing interventions
* Robust to population composition
* Hard to do retrospectively

## Problems with RD

* **Low statistical power**
    * We typically need at least 2.75 times the sample size of an RCT to have the same power if the cut-point is the median
* **Fuzzy cut point**
    * If people do not keep to the cut-point then the break becomes fuzzy, and inference and power are lost
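
A factor of roughly 2.75 can be reproduced under assumptions that are not stated in these notes: a normally distributed assignment variable, a cut-point at the median, and a linear model that adjusts for the assignment variable. Under those assumptions the variance of the effect estimate is inflated by 1 / (1 - rho^2), where rho is the correlation between the assignment variable and the treatment indicator, and rho^2 equals 2 / pi. The sketch below computes this and checks the correlation numerically on simulated values.

```python
import math

import numpy as np

# Analytic value under the stated assumptions.
rho_squared = 2 / math.pi
inflation = 1 / (1 - rho_squared)
print(f"analytic inflation factor: {inflation:.2f}")  # about 2.75

# Numerical check: correlate a simulated normal variable with the
# indicator of being at or above its median.
rng = np.random.default_rng(6)
x = rng.normal(size=200_000)
treated = (x >= np.median(x)).astype(float)
rho_hat = np.corrcoef(x, treated)[0, 1]
inflation_hat = 1 / (1 - rho_hat ** 2)
print(f"simulated inflation factor: {inflation_hat:.2f}")
```

In an RCT the treatment indicator is independent of the baseline variable, so rho is about 0 and the factor is about 1, which is why the comparison is made against an RCT.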

## Wrong functional form

{Insert RDD with a cubic relationship in the data}
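
The risk from a wrong functional form can be shown with a small noise-free example. In the sketch below the outcome follows a smooth cubic curve with no treatment effect at all, yet fitting straight lines on each side of the cut-off produces an apparent discontinuity. The data and function names are hypothetical, and the cubic model is only correct here because the data were generated that way.

```python
import numpy as np

# Assignment variable centred at the cut-off (0); 200 points, none exactly at 0.
x = np.linspace(-1, 1, 200)
y = x ** 3                       # smooth cubic outcome, no jump at the cut-off
treated = (x >= 0).astype(float)


def jump_linear(x, y, treated):
    """Jump at the cut-off from separate straight lines on each side."""
    X = np.column_stack([np.ones_like(x), x, treated, treated * x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[2]


def jump_cubic(x, y, treated):
    """Jump at the cut-off from a common cubic plus a treatment indicator."""
    X = np.column_stack([np.ones_like(x), x, x ** 2, x ** 3, treated])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[4]


print(f"jump with linear model: {jump_linear(x, y, treated):.3f}")
print(f"jump with cubic model:  {jump_cubic(x, y, treated):.3f}")
```

The linear model reports a sizeable jump that is purely an artefact of curvature, while the correctly specified model reports essentially none. Plotting the data and comparing several functional forms, or restricting the analysis to a narrow window around the cut-off, are ways to guard against this.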

## Necessary conditions for using RDD

* 'Forcing variable' (on which selection is based) must be continuous or ordinal, with a sufficient number of unique values
* There must be no factor confounded with the forcing variable
    * To ensure that causal effects can be isolated
* Value of the variable cannot be manipulated by individuals (e.g. misreporting or deciding to remain one side of a threshold)

## Conclusions

* ITS is a useful technique to evaluate policy change retrospectively
    * Particularly useful with an existing administrative dataset
    * A control group is not essential - change can be one-off and national
* RDD is a useful technique to evaluate policy prospectively when randomisation is unfeasible
    * Requires a continuous variable and a cut-off point to assign participants
* Both need good data and careful analysis