# Estimate Party Identification

The dataset is fairly large: about 410,000 observations from 2000 to 2020. Below I load the data and estimate a simple Bayesian multilevel multinomial logit model. The dependent variable is three-point party identification (Democrat, Independent, Republican). We could use more categories, but the survey questions vary somewhat, so three categories seems reasonable. The intercepts vary across years and states, and the model call below also adds a varying intercept for survey organization (ANES, NAES, CCES, VSG). The model is estimated with the `brms` package in R. Full sampling is slow on data of this size, so I use variational Bayes (the mean-field algorithm), which takes about 14 minutes.





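\begin{align}
\Pr(y_i = k) &= \frac{\exp(\eta_{ik})}{\sum_{j=1}^{3} \exp(\eta_{ij})}, \qquad \eta_{i1} = 0 \\
\eta_{ik} &= \beta_{k} + \alpha_{Year[i],k} + \alpha_{State[i],k} + \alpha_{org[i],k}, \qquad k = 2, 3 \\
\alpha_{Year,k} &\sim \text{Normal}(0, \sigma_{Year,k}) \\
\alpha_{State,k} &\sim \text{Normal}(0, \sigma_{State,k}) \\
\alpha_{org,k} &\sim \text{Normal}(0, \sigma_{org,k})
\end{align}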
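
To make the link between the varying intercepts and the category probabilities concrete, here is a minimal base-R sketch of the softmax step in a multinomial logit. The offsets are made-up illustrative values, not estimates from the model, and treating Democrat as the reference category is an assumption based on the recoding in the data step.

```r
# Hypothetical varying-intercept offsets on the logit scale (illustrative only).
alpha_year  <- c(Independent = 0.10, Republican = -0.20)
alpha_state <- c(Independent = -0.05, Republican = 0.30)

# Linear predictors; the reference category (assumed: Democrat) is fixed at 0.
eta <- c(Democrat = 0, alpha_year + alpha_state)

# Softmax: exponentiate and normalize so the probabilities sum to 1.
# Subtracting max(x) keeps exp() numerically stable.
softmax <- function(x) {
  z <- exp(x - max(x))
  z / sum(z)
}

p <- softmax(eta)
print(round(p, 3))
```

Because only differences between linear predictors matter, fixing one category at zero is what identifies the model.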

I use the `brms` default priors. More informative priors on the group-level standard deviations may be worth trying.
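
One way to judge whether a different prior is worth trying is a quick prior simulation in base R. The sketch below assumes the default prior on a group-level standard deviation is a half-Student-t(3, 0, 2.5), which should be confirmed with `get_prior()`; the half-Normal(0, 1) alternative is purely hypothetical.

```r
set.seed(1)
n_sim <- 5000

# Assumed default: half-Student-t(3, 0, 2.5) on a group-level sd.
sigma_default <- abs(2.5 * rt(n_sim, df = 3))
# Hypothetical alternative: half-Normal(0, 1).
sigma_tight <- abs(rnorm(n_sim, mean = 0, sd = 1))

# One varying intercept per simulated sd, on the logit scale.
alpha_default <- rnorm(n_sim, mean = 0, sd = sigma_default)
alpha_tight   <- rnorm(n_sim, mean = 0, sd = sigma_tight)

# Share of simulated intercepts more than 5 logits from zero.
share_default <- mean(abs(alpha_default) > 5)
share_tight   <- mean(abs(alpha_tight) > 5)
print(c(default = share_default, tight = share_tight))
```

Shifts of that size push a category probability very close to 0 or 1, so a large share under the assumed default would suggest that a tighter prior is at least defensible.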



In [1]:
library(dplyr)
library(brms)
library(tidybayes)

# Load the pooled survey data
dat <- read.csv("/Users/Chris/Dropbox/masterData/pooledData/pooled.csv")

dat <- dat %>%
  # Treat empty strings as missing
  mutate(pid3 = ifelse(pid3 == "", NA, pid3)) %>%
  # Recode: 1 = Democrat, 3 = Republican, 2 = all other non-missing responses
  mutate(pid3 = ifelse(pid3 == "Democrat", 1, ifelse(pid3 == "Republican", 3, 2))) %>%
  # categorical() treats the response as unordered, so a plain factor is enough
  mutate(pid3 = factor(pid3)) %>%
  # Keep presidential election years only
  filter(year %in% c(2000, 2004, 2008, 2012, 2016, 2020)) %>%
  # Drop DC and records without a state code
  filter(stateFIPS != "DC") %>%
  filter(stateFIPS != "0") %>%
  select(stateFIPS, year, pid3, org) %>%
  na.omit()

# Multinomial logit with varying intercepts for year, state, and survey
# organization, fit by mean-field variational inference
model1 <- brm(
  pid3 ~ 1 + (1 | year) + (1 | stateFIPS) + (1 | org),
  data = dat,
  family = categorical("logit"),
  algorithm = "meanfield",
  # Sampler tuning settings; not expected to matter for a variational fit
  control = list(adapt_delta = .99, max_treedepth = 15)
)



Compiling Stan program...

Start sampling



In [2]:
coef(model1)$year %>% length()

In [5]:
table(dat$year)


 2000  2004  2008  2012  2016  2020 
56260 81641 91808 60449 68870 69280 

In [38]:
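# Increase the plot size
options(repr.plot.width = 14, repr.plot.height = 14)

library(ggplot2)

# Expected category probabilities for every row of the model data.
# This produces one row per observation x draw x category, which is very large.
df <- model1$data %>%
  group_by(year, stateFIPS) %>%
  add_epred_draws(model1, ndraws = 1000)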



#
  # summarize(
  #   count = n()
  # ) %>%
  # ggplot(aes(x = year, y = count, fill = pid3)) + facet_wrap(~stateFIPS) + geom_col() + theme_bw() + theme(legend.position = "none")
  # #  %>%
  # # # Change colors
  # # scale_fill_manual(values = c("red", "blue", "purple"))





  # add_linpred_draws(model1, draws = 1000) %>%
  # head()

ERROR: Error: vector memory exhausted (limit reached?)
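
The memory error is consistent with the size of the long-format output when predictions are requested for every row of the data. The back-of-the-envelope sketch below uses the row counts from the `table(dat$year)` output above; the figure of 50 states remaining after dropping DC is an assumption.

```r
# Row counts per year, copied from the table(dat$year) output.
n_by_year <- c(`2000` = 56260, `2004` = 81641, `2008` = 91808,
               `2012` = 60449, `2016` = 68870, `2020` = 69280)
n_rows  <- sum(n_by_year)
n_draws <- 1000  # ndraws passed to add_epred_draws()
n_cat   <- 3     # Democrat, Independent, Republican

# Long format: one row per observation x draw x category.
n_long_full   <- n_rows * n_draws * n_cat
gb_per_column <- n_long_full * 8 / 1024^3  # a single double-precision column

# Assumption: 50 states crossed with 6 election years.
n_cells     <- 6 * 50
n_long_grid <- n_cells * n_draws * n_cat

cat("Full data:", format(n_long_full, big.mark = ","), "rows,",
    round(gb_per_column, 1), "GB per numeric column\n")
cat("Year-by-state grid:", format(n_long_grid, big.mark = ","), "rows\n")
```

Since the model has no individual-level predictors, the expected probabilities only differ across year, state, and organization cells. One option, offered as a sketch rather than a tested fix, is to predict on the unique cells, for example `model1$data %>% distinct(year, stateFIPS) %>% add_epred_draws(model1, ndraws = 1000, re_formula = ~ (1 | year) + (1 | stateFIPS))`. Dropping `org` from `re_formula` averages over no particular survey organization, which is a modeling choice and not something the original analysis specifies.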


In [34]:
df

pid3,year,stateFIPS
<chr>,<int>,<int>
Independent,2000,27
Independent,2000,26
Democrat,2000,17
Independent,2000,23
Republican,2000,25
Democrat,2000,25
Republican,2000,42
Democrat,2000,34
Democrat,2000,36
Democrat,2000,19


In [5]:
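# List the default priors brms would use for this model
get_prior(pid3 ~ 1 + (1 | year) + (1 | stateFIPS) + (1 | org),
  data = dat,
  family = categorical("logit"))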