## Terrorism, Spoiling, and the Resolution of Civil Wars (Findley & Young 2015): Replication in R

We need the following packages to run this notebook:
```R
install.packages(c("haven", "stargazer", "easypackages", "coxed", "eha", "flexsurv", "survival", "ggplot2", "dplyr", "survminer"))
```

In [None]:
install.packages(c("coxed", "easypackages"))

In [None]:
easypackages::libraries("haven", "stargazer", "survival","ggplot2", "eha", "coxed", "dplyr", "flexsurv", "survminer")
options(scipen = 999)

### Log-normal Survival Models of War Ending: Models 1 & 2

Here we replicate two models reported in the original article by Findley & Young (2015, p. 1124) as an example of reproducing their results in R. Because the main data include zero durations that cannot be used in R, we load the data the authors saved after processing it in Stata with `stset`. Stata drops zero values in the duration column when fitting a log-normal model, whereas R does not. To recreate the original article's sample, we keep only the observations that were used in the Stata analysis (`_st == 1`).
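A minimal sketch of this filtering step, using a toy data frame with made-up values rather than the replication data. The column names `_t` and `_st` mimic the variables that `stset` creates; everything else is hypothetical.

```R
# Toy data with hypothetical values; `_t` and `_st` mimic the columns created by Stata's stset.
toy <- data.frame(id = 1:4, t = c(0, 3, 7, 12), st = c(0, 1, 1, 1))
names(toy)[2:3] <- c("_t", "_st")  # set the underscore names after creation

# A zero duration has no finite logarithm, so it cannot enter a log-normal likelihood.
log(toy$`_t`)

# Keep only the rows flagged as part of the estimation sample.
toy_used <- toy[toy$`_st` == 1, ]
nrow(toy_used)
```

The backticks are needed because names that start with an underscore are not syntactically valid R names.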

The authors use Stata's `streg` command to fit accelerated failure time (AFT) models. The usual R equivalent is `survreg` from the `survival` package, which also fits AFT models but does not accept the start-stop format needed for left-truncated data or time-varying covariates, so its results differ from those of `streg`. Instead, we use `aftreg` from the `eha` package and `flexsurvreg` from the `flexsurv` package, both of which handle left truncation and time-varying covariates, to reproduce the results in the article.
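The limitation of `survreg` can be checked directly. The sketch below uses a small hypothetical start-stop data set (not the replication data) and wraps the call in `tryCatch`, on the assumption that `survreg` stops with an error for this kind of response.

```R
library(survival)

# Hypothetical start-stop records: (start, stop] intervals with an event flag.
toy <- data.frame(start = c(0, 2, 0, 1, 3),
                  stop  = c(2, 5, 4, 3, 6),
                  event = c(0, 1, 1, 0, 1),
                  x     = c(0.5, 0.5, 1.2, 0.3, 0.3))

# Capture the error instead of halting the notebook.
res <- tryCatch(
  survreg(Surv(start, stop, event) ~ x, data = toy, dist = "lognormal"),
  error = function(e) e
)
inherits(res, "error")   # TRUE if survreg refused the start-stop response
conditionMessage(res)    # the reason given by survreg
```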

We prepare the data for survival analysis by creating a survival object with the `Surv` function, which takes the start time, the end time, and the failure indicator. The `Surv` object is then used as the dependent variable in the model. The models are fit using the `lognormal` distribution.
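As a small illustration of what such an object looks like, the sketch below builds a `Surv` object from three hypothetical spells. The vectors `start`, `end`, and `failed` are made up for this example and stand in for `start_date`, `end_date`, and `warend`.

```R
library(survival)

# Hypothetical spells: entry time, exit time, and whether the war ended (1) or not (0).
start  <- c(0, 1, 2)
end    <- c(1, 2, 5)
failed <- c(0, 0, 1)

y <- Surv(start, end, failed)
y                # censored spells are marked with "+"
attr(y, "type")  # three arguments give the start-stop ("counting") type
dim(y)           # one row per spell, with columns for start, stop, and status
```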

In [None]:
duration <- haven::read_dta('replication-data/duration_main_est.dta')
duration <- duration[duration$`_st` == 1,]

duration$start_date <- duration$`_t0`
duration$end_date <- duration$`_t`

Reading the file with `haven` returns a tibble, which causes problems with some of the functions used later. We therefore convert it back into a regular data frame.

In [None]:
duration <- as.data.frame(duration)

Here we use `aftreg` from the `eha` package with the `lognormal` distribution used by the authors. The coefficients from `aftreg` have the same magnitudes as those from `streg` in Stata and from the `flexsurv` package, but their signs are reversed.
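The sign difference can be understood from the two common ways of writing an AFT model. The derivation below is a sketch of the general log-normal AFT setup, not a statement taken from the article. The `eha` documentation describes a `param` argument of `aftreg` for choosing between the two conventions, so which convention a given call uses is worth verifying there.

$$
\log T_i = x_i^\top \beta + \sigma \varepsilon_i, \qquad \varepsilon_i \sim N(0, 1)
$$

Writing $S_0$ for the survival function of the baseline duration $e^{\sigma \varepsilon_i}$, the time metric gives

$$
S(t \mid x_i) = S_0\!\left(t \, e^{-x_i^\top \beta}\right),
$$

so a positive $\beta_k$ lengthens the expected duration. The acceleration metric instead writes

$$
S(t \mid x_i) = S_0\!\left(t \, e^{x_i^\top \gamma}\right), \qquad \gamma = -\beta,
$$

so a positive $\gamma_k$ shortens it. The two fits describe the same model, and only the sign of each coefficient changes.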

In [None]:
# Model 1: lagged (log) terrorism measure
model1 <- aftreg(Surv(start_date, end_date, warend) ~ lagLogTotalWarRelated + logpop + elf + lngdp +
                   uppsalaMaxed + logbattledeaths + mountains + guarantee,
                 data = duration, dist = 'lognormal')

# Model 2: smoothed (log) terrorism measure
model2 <- aftreg(Surv(start_date, end_date, warend) ~ smterrorwarrelated + logpop + elf + lngdp +
                   uppsalaMaxed + logbattledeaths + mountains + guarantee,
                 data = duration, dist = 'lognormal')

In [None]:
stargazer::stargazer(model1, model2, 
          covariate.labels = c('Terrorism (log/lag)', 'Terrorism (log/smooth)', 'Population(log)','Ethnic fractionalization', 'GDP(log)', 'Number of Actors', 
                     'Battle deaths (log)', 'Mountainous terrain', 'Security guarantee'),
         column.labels = c('Model 1', 'Model 2'), dep.var.labels.include = FALSE, 
          keep.stat = c('aic', 'll'), dep.var.caption="",
          model.names = FALSE,  type='text')

Next, we fit the same two models with the `flexsurvreg` function from the `flexsurv` package.

In [None]:
model1a <- flexsurv::flexsurvreg(Surv(start_date, end_date, warend)~ lagLogTotalWarRelated+logpop+elf+lngdp+uppsalaMaxed+logbattledeaths+mountains+guarantee,
                 data = duration, dist = 'lognormal'); model1a

model2a <- flexsurv::flexsurvreg(Surv(start_date, end_date, warend)~ smterrorwarrelated+logpop+elf+lngdp+uppsalaMaxed+logbattledeaths+mountains+guarantee,
                 data = duration, dist = 'lognormal'); model2a

### Log-normal Survival Models of War Recurrence: Models 3 & 4

In [None]:
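recurrence <- haven::read_dta('replication-data/recurrence_main_est.dta')
recurrence <- recurrence[recurrence$`_st` == 1,]
# Convert the tibble to a regular data frame, as for the duration data
recurrence <- as.data.frame(recurrence)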

recurrence$start_date <- recurrence$`_t0`
recurrence$end_date <- recurrence$`_t`

In [None]:
model3 = aftreg(Surv(start_date, end_date, pcens) ~ 
                lagLogTotalWarRelated+lpopns+ethfrac+ln_gdpen+inst2+ regd4+absent, data=recurrence, dist = 'lognormal')

model4 = aftreg(Surv(start_date, end_date, pcens) ~ 
                smterrorWarRelated+lpopns+ethfrac+ln_gdpen+inst2+ regd4+absent, data=recurrence, dist = 'lognormal')

In [None]:
# Labels cover the shared covariates; inst2, regd4, and absent keep their raw variable names
stargazer::stargazer(model3, model4, 
          covariate.labels = c('Terrorism (log/lag)', 'Terrorism (log/smooth)', 'Population (log)', 'Ethnic fractionalization', 'GDP (log)',
                     'inst2', 'regd4', 'absent'),
          column.labels = c('Model 3', 'Model 4'), dep.var.labels.include = FALSE, 
          keep.stat = c('aic', 'll'), dep.var.caption="",
          model.names = FALSE,  type='text')

In [None]:
# Refit Models 3 and 4 with flexsurvreg, using the recurrence data and the same formulas as above
model3a <- flexsurv::flexsurvreg(Surv(start_date, end_date, pcens) ~ lagLogTotalWarRelated+lpopns+ethfrac+ln_gdpen+inst2+regd4+absent,
                 data = recurrence, dist = 'lognormal'); model3a

model4a <- flexsurv::flexsurvreg(Surv(start_date, end_date, pcens) ~ smterrorWarRelated+lpopns+ethfrac+ln_gdpen+inst2+regd4+absent,
                 data = recurrence, dist = 'lognormal'); model4a