In [5]:
coups$coupriskLOG <- log(coups$couprisk)

In [7]:
coups$date <- as.Date(coups$date, "%m/%d/%Y")

In [8]:
coups$Year <- format(coups$date, "%Y")

*In this video, you'll try out a mixed measures ANOVA scenario. What if you didn't just want to see whether couprisk decreased over time, but also wanted to know whether military career had any influence on the risk of a coup? Well then, you'd have yourself one within-subjects independent variable (year) and one between-subjects independent variable (military career), and you'd need to perform a mixed measures ANOVA.*

## Load in Libraries

In [10]:
library("rcompanion")
library("car")
library("dplyr")


Attaching package: 'dplyr'

The following object is masked from 'package:car':

    recode

The following objects are masked from 'package:stats':

    filter, lag

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union



## Load in Data

In [2]:
coups <- read.csv('../data/african_coups.csv')

In [3]:
head(coups)

date,country,commodities,commodities_excl_energy,energy,minerals,forestry,agriculture,fish,age,...,loss,irregular,prev_conflict,pt_suc,pt_attempt,precip,global_policy_uncertainty_current,couprisk,pctile_risk,relative_risk_classification
1/1/1997,Angola,313.45,254.08,609.35,285.31,341.36,170.16,870.14,55,...,5.545177,3.970292,1,0,0,-0.41474915,79.9359,0.001705101,0.7225125,High
1/1/1997,Algeria,313.45,254.08,609.35,285.31,341.36,170.16,870.14,56,...,4.564348,2.70805,1,0,0,0.70164311,79.9359,0.001822458,0.7359829,High
1/1/1997,South Africa,313.45,254.08,609.35,285.31,341.36,170.16,870.14,79,...,3.526361,3.526361,0,0,0,0.9881689,79.9359,0.001837309,0.7371534,High
1/1/1997,Uganda,313.45,254.08,609.35,285.31,341.36,170.16,870.14,53,...,5.267858,2.197225,1,0,0,0.07549572,79.9359,0.001940203,0.7469584,High
1/1/1997,Guinea Bissau,313.45,254.08,609.35,285.31,341.36,170.16,870.14,58,...,5.272999,3.401197,0,0,0,0.04635003,79.9359,0.002256377,0.7715377,High
1/1/1997,Liberia,313.45,254.08,609.35,285.31,341.36,170.16,870.14,58,...,4.343805,4.343805,0,0,0,-0.27843896,79.9359,0.002272123,0.7725644,High


*Since the dependent variable is the same as in the repeated measures ANOVA you just ran, you have already tested most of the assumptions. Just use the log of couprisk, so that you meet the assumption of normality, and note that you did violate the assumption of homogeneity of variance. The only assumption that has really changed is sample size: since you have added a second independent variable, you now need a sample size of at least 40. You meet that, so it's time to jump straight into calculating your first mixed measures ANOVA, which you'll do on the next page.*
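
*Here is a minimal sketch of what those checks might look like. It runs on a simulated data frame (`toy`, a hypothetical stand-in for `coups`), and it uses base R's `fligner.test()` as a stand-in for whichever homogeneity test you ran earlier, so treat it as an illustration rather than the exact procedure from the previous lesson.*

```r
# Simulated stand-in for the coups data: right-skewed risk values in two groups
set.seed(1)
toy <- data.frame(
  militarycareer = factor(rep(c(0, 1), each = 30)),
  couprisk = rlnorm(60, meanlog = -5.5, sdlog = 1)
)
toy$coupriskLOG <- log(toy$couprisk)

# Sample size: with two independent variables you want at least 40 observations
nrow(toy) >= 40

# Normality: compare skewness before and after the log transform
# (skew() is a small helper defined here, not a built-in function)
skew <- function(x) mean((x - mean(x))^3) / sd(x)^3
skew(toy$couprisk)
skew(toy$coupriskLOG)

# Homogeneity of variance across the between-subjects groups
hov <- fligner.test(coupriskLOG ~ militarycareer, data = toy)
hov$p.value
```

*A skewness closer to zero after the transform is what you are hoping for, and a small p-value from the variance test would flag the same kind of violation noted above.*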

# end pg 6 video

*Welcome back! Now that you've done all the prep work, it's time to create your first mixed measures ANOVA in R. You will continue to use the same aov function as before, but there will be some additional elements added to the model to make it mixed measures and to add in the factor of whether or not there was a military career in that country.*

*So start off by giving your ANOVA a name. How about keeping it straightforward, with MixedMeasures? Then pull out that aov function. Next you want to put in your DV, the one you have log transformed. After that, you'll place a tilde, indicating "by", and include your IVs. You want to cross your IVs with an asterisk, so you can look at the interaction effect. So, include your between-subjects variable, military career, first, then your within-subjects variable, year. Then use the plus sign to indicate that you want to do more with this model! You still need to include the error term to make this a mixed model with the within-subjects variable, rather than a two-way between-subjects ANOVA. You want to do this by country, since that is the unique ID, and by year, which is nested within country using the slash.*

In [9]:
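# Between-subjects IV (militarycareer) crossed with within-subjects IV (Year);
# Error() nests Year within country, the unique ID
MixedMeasures <- aov(coupriskLOG ~ militarycareer * Year + Error(country / Year), data = coups)
summary(MixedMeasures)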

"Error() model is singular"


Error: country
                    Df Sum Sq Mean Sq F value Pr(>F)  
militarycareer       1  118.6  118.64   8.737 0.0317 *
Year                22 2554.8  116.13   8.552 0.0127 *
militarycareer:Year 20 1863.7   93.19   6.863 0.0210 *
Residuals            5   67.9   13.58                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Error: country:Year
                      Df Sum Sq Mean Sq F value Pr(>F)    
militarycareer         1    0.1    0.10   0.035  0.851    
Year                  22 1102.2   50.10  16.826 <2e-16 ***
militarycareer:Year   22   61.3    2.79   0.936  0.547    
Residuals           1018 3031.2    2.98                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Error: Within
                       Df Sum Sq Mean Sq F value Pr(>F)    
militarycareer          1    0.0  0.0144   0.069  0.793    
militarycareer:Year    21   26.2  1.2469   5.999 <2e-16 ***
Residuals           11903 2474.0  0.2078                   
---


*Alright! You now have a lot of output to digest. You want to pay attention to the third block, the one with the heading "Error: Within". This gives you the information you need to examine the differences within country. In the first row, you see that military career by itself does not significantly affect the risk of coup. However, when it is crossed with year (that is, when the two interact, as indicated by the colon), you see that there is a significant effect. That means that in some years military career did influence coup risk, and in other years it did not.*
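
*If the separate error blocks feel opaque, it can help to fit the same kind of model on a tiny balanced data set where you know the design. The sketch below is an illustration on simulated data, not the coups model: `id`, `group`, `time`, and `y` are made-up names, each subject is measured exactly once per time point, and every grouping variable is stored as a factor.*

```r
set.seed(42)

# 6 subjects, each measured once at 3 time points (18 rows in total)
toy <- expand.grid(
  id = factor(1:6),
  time = factor(c("t1", "t2", "t3"))
)

# Between-subjects factor: subjects 1-3 in group "a", subjects 4-6 in group "b"
toy$group <- factor(ifelse(as.integer(toy$id) <= 3, "a", "b"))

# Pure noise as the outcome, so no effect is expected to be real
toy$y <- rnorm(nrow(toy))

# Same formula shape as the coups model: crossed IVs plus Error(subject / within IV)
toyMixed <- aov(y ~ group * time + Error(id / time), data = toy)
summary(toyMixed)
```

*Under those assumptions the summary has just two blocks. The `Error: id` block tests the between-subjects factor against subject-to-subject variation, and the `Error: id:time` block tests `time` and `group:time` against within-subject variation. There is no separate "Within" block here because each subject has only one row per time point.*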

*It will take a little exploration of the means to determine what is going on here. That is where your good friend dplyr comes into play. You may not have needed to examine means across two grouping variables before, but it is relatively simple: just add a comma in your group_by argument and put in the second variable name.*

In [17]:
coupsMeans <- coups %>% group_by(Year, militarycareer) %>% summarize(Mean = mean(couprisk))

In [18]:
coupsMeans

Year,militarycareer,Mean
1997,0,0.00403867
1997,1,0.006728392
1998,0,0.005278594
1998,1,0.005888893
1999,0,0.004716584
1999,1,0.006059192
2000,0,0.004665321
2000,1,0.006338446
2001,0,0.003905755
2001,1,0.004684326


*It looks like in the early years, from 1997 to about 2001, countries with a military career had a higher mean risk of a coup. But as the years went on, those differences all but vanished, and in a few cases, 2015 for instance, the risk of coup was even lower with a military career. What caused those changes? As a data scientist, you would want to start exploring how the early years differed from the later years with the data you have, and you'd want to use your expertise in African political science (or let's be honest, someone else's expertise) to think through the problem and maybe even test out a few hypotheses.*
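
*One way to make that comparison concrete is to put the two group means side by side and subtract them. The sketch below retypes the ten rows printed above (1997 to 2001) into a small data frame, so it only covers the early years; `shown`, `wide`, and `gap` are illustrative names, and on the real data you would start from `coupsMeans` instead.*

```r
# The ten Year-by-militarycareer means printed above, retyped by hand
shown <- data.frame(
  Year = rep(1997:2001, each = 2),
  militarycareer = rep(c(0, 1), times = 5),
  Mean = c(0.00403867, 0.006728392, 0.005278594, 0.005888893,
           0.004716584, 0.006059192, 0.004665321, 0.006338446,
           0.003905755, 0.004684326)
)

# One row per year, one column per militarycareer value ("0" and "1")
wide <- xtabs(Mean ~ Year + militarycareer, data = shown)

# Positive gap: higher mean coup risk in the military career group that year
gap <- wide[, "1"] - wide[, "0"]
round(gap, 5)
```

*Running the same subtraction across all years would show where the gap shrinks toward zero or flips sign, which is the pattern the interaction term is picking up.*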