Skip to content

FMenchetti/CausalArima

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

CausalArima

The goal of CausalArima is to estimate the causal effect of an intervention on a univariate time series using ARIMA models.

Installation

You can install the development version of CausalArima from GitHub with:

# install.packages("devtools")
devtools::install_github("FMenchetti/CausalArima")

Example

This is a basic example which shows you how to use the package:

library(CausalArima)
#> Loading required package: forecast
#> Registered S3 method overwritten by 'quantmod':
#>   method            from
#>   as.zoo.data.frame zoo
#> Loading required package: ggplot2
#> Loading required package: gridExtra

# simulate data
n<-100
set.seed(1)
x1 <- 100 + arima.sim(model = list(ar = 0.999), n = n)
y <- 1.2 * x1 + rnorm(n)
y[ floor(n*.71):n] <- y[ floor(n*.71):n] + 10
data <- cbind(y, x1)
dates <- seq.Date(from = as.Date("2014-01-05"), by = "days", length.out = n)
start<-as.numeric(strftime(as.Date(dates[1], "%Y-%m-%d"), "%u"))

# Adding a fictional intervention
int.date <- as.Date("2014-03-16")

# fit the model - Causal effect estimation
ce <- CausalArima(y = ts(y, start = start, frequency = 1), dates = dates, int.date = int.date,
                  xreg =x1, nboot = 1000)

How to obtain the plot of the forecast:

forecasted<-plot(ce, type="forecast")
forecasted

How to obtain the plot of the estimated effects and cumulative effects:

impact_p<-plot(ce, type="impact")
grid.arrange(impact_p$plot, impact_p$cumulative_plot)

How to obtain a quick summary of the estimated effect:

summary(ce)
#>                                       
#> Point causal effect            12.257 
#> Standard error                 1.211  
#> Left-sided p-value             1      
#> Bidirectional p-value          0      
#> Right-sided p-value            0      
#>                                       
#> Cumulative causal effect       310.709
#> Standard error                 6.634  
#> Left-sided p-value             1      
#> Bidirectional p-value          0      
#> Right-sided p-value            0      
#>                                       
#> Temporal average causal effect 10.357 
#> Standard error                 0.221  
#> Left-sided p-value             1      
#> Bidirectional p-value          0      
#> Right-sided p-value            0

How to obtain a detailed summary of the results, with an option to produce tables in html format (notice that to proper display the results as html on a Rmarkdown chunk you have to set result as ‘asis’). Other possible format include “numeric”, useful to retrieve the statistics and use them in calculations, and “latex”. Estimated model:

summary_model<-impact(ce, format="html")
summary_model$arima

$arima_order

p d q
arima_order 0 0 0
$param
coef se t value
xreg 1.199333 0.0016581 723.3279
$accuracy
ME RMSE MAE MPE MAPE MASE ACF1
Training set 0.0043228 1.202503 0.9464393 -0.0051705 0.9072633 0.5734012 0.1407503
$log_stats
loglik aic bic aicc
metrics -112.234 228.4681 232.9651 228.6472

Causal impact:

summary_model$impact_norm

$average

estimate sd p_value_left p_value_bidirectional p_value_right
10.35698 0.2211311 1 0 0
$sum
estimate sd p_value_left p_value_bidirectional p_value_right
310.7095 6.633933 1 0 0
$point_effect
estimate sd p_value_left p_value_bidirectional p_value_right
12.25715 1.211185 1 0 0

Causal impact based on boostrap:

summary_model$impact_boot

$average

estimates inf sup sd
observed 117.0485168 NA NA NA
forecasted 106.6915345 106.2768920 107.1550453 0.2184921
absolute_effect 10.3569824 9.8934716 10.7716249 0.2184921
relative_effect 0.0970741 0.0927297 0.1009604 0.0020479
$effect_cum
estimates inf sup sd
observed 3511.4555050 NA NA NA
forecasted 3200.7460337 3188.3067592 3214.6513578 6.5547623
absolute_effect 310.7094713 296.8041472 323.1487458 6.5547623
relative_effect 0.0970741 0.0927297 0.1009604 0.0020479
$p_values
x
alpha 0.05
p 0.00

How to inspect the residuals, with the plots of autocorrelation (ACF) and partial autocorrelation (PACF) functions and QQ-plots:

residuals<-plot(ce, type="residuals")
grid.arrange(residuals$ACF, residuals$PACF, residuals$QQ_plot)

Example with more horizons

# simulate data
n<-100
set.seed(1)
x1 <- 100 + arima.sim(model = list(ar = 0.999), n = n)
y <- 1.2 * x1 + rnorm(n)
y[ floor(n*.71):n] <- y[ floor(n*.71):n] + 10
data <- cbind(y, x1)
dates <- seq.Date(from = as.Date("2014-01-05"), by = "days", length.out = n)
start<-as.numeric(strftime(as.Date(dates[1], "%Y-%m-%d"), "%u"))

# Adding a fictional intervention
int.date <- as.Date("2014-03-16")
horizon<-as.Date(c("2014-03-25", "2014-04-05")) # add horizons

# fit the model - Causal effect estimation
ce <- CausalArima(y = ts(y, start = start, frequency = 1), ic = "aicc", dates = dates, int.date = int.date,
                  xreg =x1, nboot = 1000)

How to obtain the plot of the estimated effects and cumulative effects:

impact_p<-plot(ce, type="impact", horizon = horizon)
grid.arrange(impact_p$plot, impact_p$cumulative_plot)

How to obtain a quick summary of the estimated effect:

summary(ce, horizon = horizon)
#>                                2014-03-25 2014-04-05
#> Point causal effect            9.962      9.673     
#> Standard error                 1.211      1.211     
#> Left-sided p-value             1          1         
#> Bidirectional p-value          0          0         
#> Right-sided p-value            0          0         
#>                                                     
#> Cumulative causal effect       104.356    216.327   
#> Standard error                 3.83       5.55      
#> Left-sided p-value             1          1         
#> Bidirectional p-value          0          0         
#> Right-sided p-value            0          0         
#>                                                     
#> Temporal average causal effect 10.436     10.301    
#> Standard error                 0.383      0.264     
#> Left-sided p-value             1          1         
#> Bidirectional p-value          0          0         
#> Right-sided p-value            0          0

How to obtain a detailed summary of the results, with an option to produce tables in html format (notice that to proper display the results as html on a Rmarkdown chunk you have to set result as ‘asis’). Other possible format include “numeric”, useful to retrieve the statistics and use them in calculations, and “latex”. Estimated model:

summary_model<-impact(ce, format="html", horizon = horizon)
summary_model$arima

$arima_order

p d q
arima_order 0 0 0
$param
coef se t value
xreg 1.199333 0.0016581 723.3279
$accuracy
ME RMSE MAE MPE MAPE MASE ACF1
Training set 0.0043228 1.202503 0.9464393 -0.0051705 0.9072633 0.5734012 0.1407503
$log_stats
loglik aic bic aicc
metrics -112.234 228.4681 232.9651 228.6472

Causal impact:

summary_model$impact_norm

$average

horizon estimate sd p_value_left p_value_bidirectional p_value_right
2014-03-25 10.43560 0.3830103 1 0 0
2014-04-05 10.30127 0.2643022 1 0 0
$sum
horizon estimate sd p_value_left p_value_bidirectional p_value_right
2014-03-25 104.3560 3.830103 1 0 0
2014-04-05 216.3267 5.550346 1 0 0
$point_effect
horizon estimate sd p_value_left p_value_bidirectional p_value_right
2014-03-25 9.961521 1.211185 1 0 0
2014-04-05 9.673077 1.211185 1 0 0

Causal impact based on boostrap:

summary_model$impact_boot

$2014-03-25

estimates inf sup sd
observed 117.57705 NA NA NA
forecasted 107.08878 106.336204 107.8783426 0.3887218
absolute_effect 10.48827 9.698704 11.2408425 0.3887218
relative_effect 0.09794 0.090567 0.1049675 0.0036299
estimates inf sup sd
observed 1058.19342 NA NA NA
forecasted 963.79898 957.025838 970.9050836 3.4984965
absolute_effect 94.39444 87.288337 101.1675823 3.4984965
relative_effect 0.09794 0.090567 0.1049675 0.0036299
x
alpha 0.05
p 0.00
$`2014-04-05`
estimates inf sup sd
observed 117.8232981 NA NA NA
forecasted 107.4906190 106.9643496 108.0108449 0.2687184
absolute_effect 10.3326792 9.8124532 10.8589485 0.2687184
relative_effect 0.0961263 0.0912866 0.1010223 0.0024999
estimates inf sup sd
observed 2356.4659629 NA NA NA
forecasted 2149.8123792 2139.2869920 2160.2168984 5.3743690
absolute_effect 206.6535837 196.2490645 217.1789708 5.3743690
relative_effect 0.0961263 0.0912866 0.1010223 0.0024999
x
alpha 0.05
p 0.00

Modify the plots

The plotting functions have some graphical parameters that make easier to personalize the plots:

forecasted_2<-plot(ce, type="forecast", fill_colour="orange",
               colours=c("red", "blue"))
forecasted_2

All plotting functions return a ggplot object or a list of ggplot objects, which makes easy to modify any ggplot parameters of the theme. The ggthemes package can be useful to employ directly some pre-customized themes, for example we can use the Wall Street Journal theme simply typing:

library(ggthemes)
forecasted+theme_wsj()

Learn more

You can read more on Estimating the causal effect of an intervention in a time series setting: the C-ARIMA approach (Fiammetta Menchetti, Fabrizio Cipollini, Fabrizia Mealli, 2021).

It is also available on youtube a video of a webinar on the topic: Fiammetta Menchetti: Estimating the causal effect of an intervention in a time series setting.

About

An R package to estimate the effect of interventions on univariate time series using ARIMA models

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages