# POLSCI PS137L Spring 2025

## Data Homework 4: Challenges with objective measures of democracy

This is the second part of a two-part notebook on the debate over the current state of democracy, as well as broader questions of measuring democratic health (and backsliding). We will first dig deeper into how aggregate scores are produced from expert judgement, and then explore a recent debate within the discipline over whether there is, in fact, global democratic backsliding.

**Question 1.1** Imagine two countries, A and B. In Country A, elections are held regularly, and incumbents regularly lose. However, many citizens in Country A report feeling that the parties do not represent their interests, and as a result they generally don't report placing much value on living in a democracy. In Country B, the incumbent party has won every national election since its founding, but polls show high levels of trust in government.

How can you assess which country is more democratic, and why? How do you think expert coders will compare these countries?

*Answer to 1.1*

## Part 2: Comparing objective and subjective indices

In Little and Meng (2024), the authors caution against putting too much stock in the score assigned to a particular country-year. Let's explore why, and what this tells us about the limitations of relying on more objective measures of democracy.

Let's load up some replication data from the paper, which also includes some V-Dem indices.

In [None]:
all <- read.csv("data/all.csv", stringsAsFactors=FALSE)

In [None]:
# There is a lot of data here,
# including the individual objective variables you may want to explore,
# but for now let's just look at the key indices
tokeep <- subset(all, select=c(country_name, year, oindex, v2x_polyarchy))
head(tokeep)

To keep things simple, let's compare how these two indices changed from 2000 to 2020. I'll do the data manipulation for you here, but check out the comments if you want to know how it works.

In [None]:
# Making individual data frames for the 2000 and 2020 data
data2000 <- subset(tokeep, year==2000, select=c("country_name", "oindex", "v2x_polyarchy"))
data2020 <- subset(tokeep, year==2020, select=c("country_name", "oindex", "v2x_polyarchy"))
# Renaming the columns to add the year
names(data2000) <- c("country_name",  "oindex2000", "polyarchy2000")
names(data2020) <- c("country_name", "oindex2020", "polyarchy2020")
# Creating a data frame with both by merging on the country_name variable
slices <- merge(data2000, data2020, by=c("country_name"))
head(slices)

As you can see, "slices" is a data frame where each row is a country, and there are separate variables for the 2000 and 2020 values of the indices.

**Question 2.1. Create a variable in `slices` called `d_oindex` which is the change in `oindex` from 2000 to 2020. (Hint: positive numbers should correspond to getting more democratic.) Then create a variable called `d_poly` which is the change in the polyarchy score. Finally, make a plot with `d_oindex` on the x-axis and `d_poly` on the y-axis.**

In [2]:
# Answer to 2.1

**Question 2.2. Interpret this graph. Do these two measures generally agree about which countries are getting more or less democratic?**

*Answer to 2.2*

From eyeballing the graph above, we can see that the countries with the biggest decreases in the polyarchy score went down by about 0.2 or more. We can see which countries these are with the following code:

In [None]:
subset(slices, d_poly < -.2)
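
A fixed threshold depends on eyeballing the plot. As an alternative, here is a minimal sketch that sorts the rows by the size of the change, so the biggest declines come first. The `toy_changes` data frame and its numbers are made up for illustration and are not from the replication data.

```r
# Hypothetical changes in a democracy score (made-up numbers)
toy_changes <- data.frame(country_name = c("A", "B", "C", "D"),
                          d_poly = c(-0.25, 0.05, -0.31, NA))

# order() sorts from most negative to most positive; NAs go last by default
ranked <- toy_changes[order(toy_changes$d_poly), ]

# Show the two largest declines
head(ranked, 2)
```

The same pattern should work on any data frame with a numeric change column.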

**Question 2.3. Does this list of countries seem consistent with the set of countries we have talked about in class as examples of backsliding? Pick one country that we haven't talked a lot about and do a quick internet search (using an LLM is fine here too!) to figure out why it might be coded as backsliding.**

*Answer to 2.3*

**Question 2.4. Now do the same for the change in the oindex (while the two indices have somewhat different scales, using a threshold of -.2 is fine). Do any of the countries on here surprise you?**

*Answer to 2.4*

## Part 3. An Events Approach

Let's delve into the data from the Baron et al. paper we read, which might also be helpful for your case study.

This data focuses on concrete events related to backsliding. In a sense, these events are generally quite objective, though as we will see, differential media coverage may affect how many events enter the data set.

Let's load up the data, which I've cleaned a bit:

In [None]:
deed <- read.csv("data/DEED_v6_final_1.17.24.csv", stringsAsFactors=FALSE)

Here is a random set of entries:

In [None]:
deed[sample(1:nrow(deed), 10),]

As you can see, each row corresponds to an event, which has a general type (resistance, symptom, precursor), a more precise category, and then a longer description of the event.

Let's see which countries show up most in this data set:

In [None]:
table(deed$country)

An easier way to see this is to sort the table. The R code for this is short, if somewhat wonky: we use the "order" function to reorder the table by the number of entries for each country, from most to fewest:

In [None]:
table(deed$country)[order(-table(deed$country))]
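
To see what "order" is doing, here is a small sketch on a made-up vector of country labels (the `toy_countries` values are invented and are not from DEED). It also shows `sort()` with `decreasing=TRUE`, which gives the same ordering here in a single call.

```r
# Hypothetical toy data, not taken from DEED
toy_countries <- c("A", "B", "B", "C", "C", "C")
counts <- table(toy_countries)

# order(-counts) gives the positions of the entries from largest to smallest
idx <- order(-counts)
by_order <- counts[idx]

# sort() with decreasing=TRUE reorders the table directly
by_sort <- sort(counts, decreasing=TRUE)

by_order
by_sort
```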

**Question 3.1 How does this list compare to the countries that had the biggest decrease in the polyarchy score and the objective index?**

*Answer to 3.1*

Now let's look more deeply into the different kinds of events. Let's make a table of the type:

In [None]:
table(deed$type)

Note that we have a separate count for "precursor" and "Precursor". This is probably because there is a typo and one of these was incorrectly entered with a lower case p. Let's fix this.

In [None]:
deed$type[deed$type=="precursor"] <- "Precursor"

In [None]:
table(deed$type)
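
The fix above handles one known typo. If there were more case variants, a more general approach might look like the sketch below, which capitalizes the first letter of every label. The `toy_type` vector is made up for illustration; it is not the real DEED column, and this only repairs first-letter case, not other kinds of typos.

```r
# Hypothetical labels with inconsistent capitalization (not the real data)
toy_type <- c("Precursor", "precursor", "Symptom", "Resistance", "symptom")

# List the distinct labels to spot variants
unique(toy_type)

# Upper-case the first character and keep the rest of each label unchanged
toy_fixed <- paste0(toupper(substr(toy_type, 1, 1)),
                    substr(toy_type, 2, nchar(toy_type)))
table(toy_fixed)
```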

Now let's look at the trends in these different kinds of events over time. We can make a data frame that counts the number of "Symptom" events by year like this:

In [None]:
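# Count "Symptom" events within each year (TRUE sums as 1, FALSE as 0)
symptom_by_year <- tapply(deed$type=="Symptom", deed$year, sum)
# Take the years from the data itself so the two columns always line up
byyear <- data.frame(year=as.integer(names(symptom_by_year)),
                     symptom=as.vector(symptom_by_year))
byyear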
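
If the `tapply` step feels opaque, here is a minimal sketch of the same counting logic on a handful of made-up events (the `toy` data frame is invented for illustration). Note that a year with no rows at all simply has no entry in the result.

```r
# Hypothetical events, invented for illustration
toy <- data.frame(year = c(2000, 2000, 2001, 2003, 2003, 2003),
                  type = c("Symptom", "Resistance", "Symptom",
                           "Symptom", "Symptom", "Precursor"))

# The comparison gives TRUE/FALSE; summing within each year counts the TRUEs
counts <- tapply(toy$type == "Symptom", toy$year, sum)
counts

# 2002 never appears in the data, so it is missing from the names
names(counts)
```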

**Question 3.2. Plot the trend in symptoms of backsliding over time. Interpret this graph.**

In [None]:
# Code for 3.2

**Question 3.3. Add the number of "resistance" events to the byyear data. Plot the trend of these over time, and interpret the graph.**

In [None]:
# Code for 3.3

## Part 4 (Optional) Comparing events to the indices

Finally, let's do a more detailed comparison between the indices described above and the events data. First I'll create a data frame which counts the number of each type of event for each country:

In [None]:
res_country <- tapply(deed$type=="Resistance", deed$country, sum)
by_country <- data.frame(country_name=names(res_country),
                        resistance=res_country)
by_country$precursor <- tapply(deed$type=="Precursor", deed$country, sum)
by_country$symptom <- tapply(deed$type=="Symptom", deed$country, sum)
by_country$de <- tapply(deed$type=="Destabilizing Event", deed$country, sum)
head(by_country)

Now we can merge this with the previous "slices" data by the country name. (Note this won't be perfect, since country names may not always match across the two data sets and the DEED data doesn't cover every country, but let's roll with it.)
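
As a rough sketch of what that merge could look like, here is a version on two made-up data frames standing in for "slices" and `by_country` (all values are invented). Using `all.x=TRUE` is one possible choice, not necessarily the intended one: it keeps every country from the first data frame and fills in `NA` where there is no match. The `setdiff` calls list the names that fail to match, which are worth checking by hand.

```r
# Hypothetical stand-ins for the real data frames (values are made up)
toy_slices <- data.frame(country_name = c("A", "B", "C"),
                         d_poly = c(-0.25, 0.05, -0.10))
toy_events <- data.frame(country_name = c("A", "C", "D"),
                         symptom = c(12, 3, 7))

# all.x=TRUE keeps every row of toy_slices; unmatched countries get NA counts
merged <- merge(toy_slices, toy_events, by = "country_name", all.x = TRUE)
merged

# Names that appear in one data frame but not the other
setdiff(toy_slices$country_name, toy_events$country_name)
setdiff(toy_events$country_name, toy_slices$country_name)
```

Leaving out `all.x=TRUE` would instead drop any country that does not appear in both data frames.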

**Question 4.1 (optional). Assess whether the change in polyarchy or the change in the objective index seems to correspond more closely with the counts of events from DEED. You can do this with some graphs (e.g., which index change correlates more closely with symptoms?), or some regressions (e.g., run regressions predicting the change in the indices with the event counts, and see which makes more sense).**