---
output: 
  bookdown::pdf_document2:
    fig_caption: yes
    keep_tex: yes
    toc: no
    number_sections: no
    latex_engine: xelatex
    pandoc_args: --lua-filter=multibib.lua
    
title: |
  | Trust in Bureaucracy: 
  | Public Opinion on Public Servants in
  | Dynamic Comparative Perspective

date: "`r format(Sys.time(), '%B %d, %Y')`"
editor_options: 
  markdown: 
    wrap: sentence
tables: true # enable longtable and booktabs
citation_package: natbib
citeproc: false
fontsize: 12pt
indent: true
linestretch: 1.5 # double spacing using linestretch 1.5
bibliography:
  text: dcpo-trust-bureaucracy.bib
  app: dcpo-trust-bureaucracy-app.bib
biblio-style: apsr
citecolor: black
linkcolor: black
endnote: no
header-includes:
      - \usepackage{array}
      - \usepackage{caption}
      - \usepackage{graphicx}
      - \usepackage{siunitx}
      - \usepackage{colortbl}
      - \usepackage{multirow}
      - \usepackage{hhline}
      - \usepackage{calc}
      - \usepackage{tabularx}
      - \usepackage{threeparttable}
      - \usepackage{wrapfig}
      - \usepackage{fullpage}
      - \usepackage{lscape} #\usepackage{lscape} better for printing, page displayed vertically, content in landscape mode, \usepackage{pdflscape} better for screen, page displayed horizontally, content in landscape mode
      - \newcommand{\blandscape}{\begin{landscape}}
      - \newcommand{\elandscape}{\end{landscape}}
      - \usepackage{titlesec}
      - \titleformat*{\section}{\normalsize\bfseries}
      - \titleformat*{\subsection}{\normalsize\itshape}
      - \usepackage{titling} #use \maketitle repeatedly  
---

\pagenumbering{gobble}

# Authors {.unnumbered}

-   Yuehong Cassandra Tai, ORCID: <https://orcid.org/0000-0001-7303-7443>, Postdoctoral Fellow, Center for Social Data Analytics, Pennsylvania State University, [yhcasstai\@psu.edu](mailto:yhcasstai@psu.edu){.email}
-   Frederick Solt, corresponding author, ORCID: <https://orcid.org/0000-0002-3154-6132>, Associate Professor, Department of Political Science, University of Iowa, [frederick-solt\@uiowa.edu](mailto:frederick-solt@uiowa.edu){.email}

\pagebreak

```{=tex}
\renewcommand{\baselinestretch}{1}
\selectfont
\maketitle
\renewcommand{\baselinestretch}{1.5}
\selectfont
```

```{=tex}
\begin{abstract}
blah blah blah
\end{abstract}
```
\pagebreak

\pagenumbering{arabic}

```{r setup, include=FALSE}
options(tinytex.verbose = TRUE)

knitr::opts_chunk$set(
  echo = FALSE,
  message = FALSE,
  warning = FALSE,
  cache = TRUE,
  dpi = 600,
  fig.width=7,
  fig.height = 2.5,
  plot = function(x, options)  {
    hook_plot_tex(x, options)
  }
)

# If `DCPOtools` is not yet installed:
# remotes::install_github("fsolt/DCPOtools")

if (!require(pacman)) install.packages("pacman")
library(pacman)
# load all the packages you will use below 
p_load(
  DCPOtools,
  cmdstanr,
  tidyverse,
  here,
  countrycode,
  patchwork,
  ggthemes,
  rsdmx,
  osfr,
  kableExtra
) 

# define functions
validation_plot <- function(v_data_raw,
                            lab_x = .38, lab_y = 92,
                            theta_summary, theta_results) {
    
    # defaults per https://stackoverflow.com/a/49167744/2620381
    if ("theta_summary" %in% ls(envir = .GlobalEnv) && missing(theta_summary))
        theta_summary <- get("theta_summary", envir = .GlobalEnv)
    if ("theta_results" %in% ls(envir = .GlobalEnv) && missing(theta_results))
        theta_results <- get("theta_results", envir = .GlobalEnv)

    median_val <- Vectorize(function(x) median(1:x),
                            vectorize.args = "x")
    
    v_vars <- v_data_raw %>% 
      select(item0 = item) %>% 
      unique() %>% 
      mutate(v_val = str_extract(item0, "\\d+") %>% 
               as.numeric() %>% 
               median_val(.) %>%
               {. + .1} %>% 
               round())
    
    validation_summarized <- v_data_raw %>% 
      DCPOtools::format_dcpo(scale_q = v_vars$item0[[1]], # these arguments are required
                             scale_cp = 1) %>% # but they don't matter
      pluck("data") %>% 
      mutate(item0 = str_remove(item, " \\d or higher"),
             title = factor(title, 
                            levels = v_data_raw %>%
                              pull(title) %>%
                              unique())) %>% 
      right_join(v_vars, by = "item0") %>%
      arrange(title) %>% 
      filter(str_detect(item, paste(v_val, "or higher"))) %>%
      mutate(iso2c = countrycode::countrycode(country,
                                              origin = "country.name",
                                              destination = "iso2c"),
             prop = if_else(neg, 1-y_r/n_r, y_r/n_r),
             se = sqrt((prop*(1-prop))/n),
             prop_90 = prop + qnorm(.9)*se,
             prop_10 = prop - qnorm(.9)*se) %>%
      inner_join(theta_summary %>% select(-kk, -tt), by = c("country", "year"))
    
    validation_cor <- theta_results %>%
      inner_join(validation_summarized %>%
                   select(country, year, title, prop, se),
                 by = c("country", "year")) %>% 
      rowwise() %>% 
      mutate(sim = rnorm(1, mean = prop, sd = se)) %>% 
      ungroup() %>% 
      select(title, theta, sim, draw) %>% 
      nest(data = c(theta, sim)) %>% 
      mutate(r = lapply(data, function(df) cor(df) %>% nth(2)) %>% 
               unlist()) %>%
      select(-data) %>% 
      group_by(title) %>% 
      summarize(r = paste("R =", round(mean(r), 2)))

    if ({validation_summarized %>%
        pull(country) %>%
        unique() %>% 
        length()} > 1) {
      val_plot <- validation_summarized %>%
        ggplot(aes(x = mean,
                   y = prop * 100)) +
        geom_segment(aes(x = q10, xend = q90,
                         y = prop * 100, yend = prop * 100),
                     na.rm = TRUE,
                     alpha = .2) +
        geom_segment(aes(x = mean, xend = mean,
                         y = prop_90 * 100, yend = prop_10 * 100),
                     na.rm = TRUE,
                     alpha = .2) +
        geom_smooth(method = 'lm', formula = 'y ~ x', se = FALSE) +
        facet_wrap(~ title, ncol = 4) +
        geom_label(data = validation_cor, aes(x = lab_x,
                                              y = lab_y,
                                              label = r),
                   size = 2)
    } else {
      val_plot <- validation_summarized %>%
        ggplot(aes(x = year,
                   y = mean)) +
        geom_line() +
        geom_ribbon(aes(ymin = q10,
                        ymax = q90,
                        linetype = NA),
                    alpha = .2) +
        geom_point(aes(y = prop),
                   fill = "black",
                   shape = 21,
                   size = .5,
                   na.rm = TRUE) +
        geom_path(aes(y = prop),
                  linetype = 3,
                  na.rm = TRUE,
                  alpha = .7) +
        geom_segment(aes(x = year, xend = year,
                         y = prop_90, yend = prop_10),
                     na.rm = TRUE,
                     alpha = .2) +
        facet_wrap(~ title, ncol = 4) +
        geom_label(data = validation_cor, aes(x = lab_x,
                                              y = lab_y,
                                              label = r),
                   size = 2)
    }
    
    return(val_plot)
}

covered_share_of_spanned <- function(dcpo_input_raw) {
  n_cy <- dcpo_input_raw %>%
    distinct(country, year) %>% 
    nrow()
  
  spanned_cy <- dcpo_input_raw %>% 
    group_by(country) %>% 
    summarize(years = max(year) - min(year) + 1) %>% 
    summarize(n = sum(years)) %>% 
    pull(n)
  
  {(n_cy/spanned_cy) * 100}
}

set.seed(324)
```


# Intro

"[Y]et the comparative aspects of public administration have largely been ignored, and as long as the study of public administration is not comparative, claims for a 'science of public administration' sound rather hollow" [@Dahl1947, p. 8].
Comparative studies provide novel insights and innovative concepts but face methodological challenges [@Pollitt2011].
Measurement equivalence is critical among these challenges.
Specifically, non-equivalent measures across countries pose a threat to comparative public administration research, yielding biased results, incorrect theoretical conclusions, and misleading policy implications [@Jilke2015].

The lack of comparable data is especially prominent in comparative public administration survey research.
As a core topic in comparative administration, analyses of trust connect classic administration theories and behavioral perspectives [@VanRyzin2011].
However, cross-national studies of trust and public administration suffer from a lack of comparable data and have been restricted to a small number of country-level units, mainly OECD countries [see summary in @VandeWalle2022].
@Jilke2015 found that ignoring the incomparability of trust measures across countries can produce misleading conclusions.
For example, without accounting for item bias, Swedes appear to have higher trust in public institutions than citizens of Canada and the United States, but exactly the opposite holds when the non-equivalence caused by item bias is modeled.

The causes of trust in government, legislatures, and the judiciary, and its consequences for democratic governance, legitimacy, the direction of public policy, and public compliance in emergencies, have been widely discussed for decades [@Easton1975;@Chanley2000;@Rogowski2021;@Goldstein2021].
However, trust in public administration has long been neglected by scholars [@Rogowski2021], although public trust is particularly important for administrative agencies.
Since the public cannot monitor and control agencies and civil servants directly, trust in bureaucracy is required to grant agencies and officials the latitude to act in the public's interest [@Thomas1998].
Because agencies and civil servants are the agents who implement policies, deliver public goods and services, and interact with citizens directly and frequently, public trust shapes the public's acceptance of delivered goods and compliance with public policies [@Morelock2021].
With trust in bureaucracy, the public can support "the implementation of policy programs" [@Kim2005, p. 611].
In contrast, without public trust, public officials may struggle to perform their tasks and to secure the public's cooperation in emergencies [@Yates1982;@VanRyzin2011].

In the few existing studies, data are limited in both geographic scope and time span [@Morelock2021; @Choi2018; @Houston2016], which severely impedes causal inference about the dynamic relationship between trust in bureaucracy and the quality of administration.
Specifically, the lack of comparative data on trust in bureaucracy makes it impossible to test the competing theories about what affects trusting attitudes, government performance, or the quality of governance [@Bouckaert2012; @Kettl2000; @VandeWalle2022; @Morelock2021].

```{r dcpo_input_raw, include=FALSE}
surveys_tb <- read_csv(here::here("data-raw",
                                  "surveys_bureaucracy.csv"),
                       col_types = "cccccc")

dcpo_input_raw <- DCPOtools::dcpo_setup(vars = surveys_tb,
                                         datapath = here("..",
                                                         "data",
                                                         "dcpo_surveys"),
                                         file = here("data",
                                                     "dcpo_input_raw.csv"))
```

```{r tb_summary_stats}
dcpo_input_raw <- read_csv(here::here("data", "dcpo_input_raw.csv"),
                                  col_types = "cdcddcd")

process_dcpo_input_raw <- function(dcpo_input_raw_df) {
  dcpo_input_raw_df %>% 
    with_min_yrs(3) %>% 
    with_min_cy(5) %>% 
    with_min_yrs(3) %>% #?
    filter(year >= 1985 & n > 0) %>% 
    group_by(country) %>% 
    mutate(cc_rank = n()) %>% 
    ungroup() %>% 
    arrange(-cc_rank)
}

dcpo_input_raw1 <- dcpo_input_raw %>% 
  process_dcpo_input_raw()

n_surveys <- surveys_tb %>% 
  distinct(survey) %>% 
  nrow()

n_items <- dcpo_input_raw1 %>%
  distinct(item) %>% 
  nrow()

n_countries <- dcpo_input_raw1 %>%
  distinct(country) %>% 
  nrow()

n_cy <- dcpo_input_raw1 %>%
  distinct(country, year) %>% 
  nrow() %>% 
  scales::comma()

n_years <- as.integer(summary(dcpo_input_raw1$year)[6]-summary(dcpo_input_raw1$year)[1])

spanned_cy <- dcpo_input_raw1 %>% 
  group_by(country) %>% 
  summarize(years = max(year) - min(year) + 1) %>% 
  summarize(n = sum(years)) %>% 
  pull(n) %>% 
  scales::comma()

total_cy <- {n_countries * n_years} %>% 
  scales::comma()

year_range <- paste("from",
                    min(dcpo_input_raw1$year),
                    "to",
                    max(dcpo_input_raw1$year))

n_cyi <- dcpo_input_raw1 %>% 
  distinct(country, year, item) %>% 
  nrow() %>% 
  scales::comma()

back_to_numeric <- function(string_number) {
  string_number %>% 
    str_replace(",", "") %>% 
    as.numeric()
}

covered_share <- covered_share_of_spanned(dcpo_input_raw1)
```

In this letter, we present the Trust in Civil Servants (TCS) dataset, which draws on the host of available national and cross-national survey data and on recent advances in latent variable modeling of public opinion that allow us to make use of these sparse and incomparable data.
It provides comparable estimates of the trust and confidence the public places in civil servants and public administration across countries and over time.
We validate the data by showing that these TCS scores are strongly correlated with responses to single survey items as well as with measures of [perceived corruption, unemployment, income inequality, and internal and external efficacy, the rule of law, government effectiveness].
We expect that the TCS data will become an invaluable source for broadly cross-national and longitudinal research on the causes and effects of trust in the civil service.


# Examining the Source Data on Trust in Bureaucracy

National and cross-national surveys have asked questions about trust in public administration over the past half-century, but the resulting data are both sparse, that is, unavailable for many countries and years, and incomparable, that is, generated by many different survey items.
In all, we identified `r n_items` such survey items that were asked in no fewer than five country-years in countries surveyed at least twice; these items were drawn from `r n_surveys` different survey datasets.^[
The complete list of trust in civil servants/public administration survey items is included in online Appendix A.]

Together, the survey items in the source data were asked in `r n_countries` different countries in at least two time points over `r n_years` years, `r year_range`, yielding a total of `r n_cyi` country-year-item observations.
Observations for every year in each country surveyed would number `r total_cy`, and a complete set of country-year-items would encompass `r {n_countries * n_years * n_items} %>% scales::comma()` observations.
Compared to this complete set of country-year-items, the available data are extremely sparse.
From a more optimistic standpoint, we note that there are `r n_cy` country-years in which we have at least _some_ information about the population's trust in civil servants, that is, some `r round(covered_share)`% of the `r spanned_cy` country-years spanned by the data we collected.
But there can be no denying that the many different survey items employed render these data incomparable and difficult to use together.
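
As a minimal illustration of how the covered share of spanned country-years is calculated, the following sketch applies the same logic to a small toy data frame.
The object names and all values here are hypothetical and chosen only for illustration; they are not drawn from the source data.

```r
# Toy source data (hypothetical values, for illustration only)
toy <- data.frame(
  country = c("A", "A", "A", "B", "B"),
  year    = c(2000, 2000, 2004, 2010, 2011),
  item    = c("trust4", "conf4", "trust4", "trust4", "trust4")
)

# Country-years with at least one observed item
observed_cy <- nrow(unique(toy[, c("country", "year")]))

# Country-years spanned: last year minus first year plus one, per country
span_by_country <- tapply(toy$year, toy$country,
                          function(y) max(y) - min(y) + 1)
spanned_cy_toy <- sum(span_by_country)

# Share of the spanned country-years with any data, in percent
covered_share_toy <- 100 * observed_cy / spanned_cy_toy
```

In this toy case, country A is observed in two of the five years it spans and country B in both of its two years, so four of seven spanned country-years are covered.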

```{r itemcountry, fig.cap="Countries and Years with the Most Observations in the Source Data \\label{item_country_plots}", fig.height=3.5, fig.pos='h', cache=FALSE}
items_plot <- dcpo_input_raw1 %>%
  distinct(country, year, item) %>%
  count(item) %>%
  arrange(desc(n)) %>% 
  # head(12) %>% 
  ggplot(aes(forcats::fct_reorder(item, n, .desc = TRUE), n)) +
  geom_bar(stat = "identity") +
  theme_bw() +
  theme(axis.title.x = element_blank(),
        axis.text.x  = element_text(angle = 90, vjust = .45, hjust = .95),
        axis.title.y = element_text(size = 9),
        plot.title = element_text(hjust = 0.5, size = 11)) +
  ylab("Country-Years\nObserved") +
  ggtitle("Items")

trust4_cy <- dcpo_input_raw1 %>% 
  filter(item == "trust4") %>%
  distinct(country, year) %>%
  nrow()

trust4_surveys <- dcpo_input_raw1 %>%
  filter(item == "trust4") %>%
  distinct(survey) %>%
  pull(survey) %>% 
  str_split(", ") %>% 
  unlist() %>% 
  unique() %>% 
  sort()


countries_plot <- dcpo_input_raw1 %>%
  mutate(country = if_else(stringr::str_detect(country, "United"),
                           stringr::str_replace(country, "((.).*) ((.).*)", "\\2.\\4."),
                           country)) %>% 
  distinct(country, year, item) %>% 
  count(country) %>%
  arrange(desc(n)) %>% 
  head(12) %>% 
  ggplot(aes(forcats::fct_reorder(country, n, .desc = TRUE), n)) +
  geom_bar(stat = "identity") +
  theme_bw() +
  theme(axis.title.x = element_blank(),
        axis.text.x  = element_text(angle = 90, vjust = .45, hjust = .95),
        axis.title.y = element_text(size = 9),
        plot.title = element_text(hjust = 0.5, size = 11)) +
  ylab("Year-Items\nObserved") +
  ggtitle("Countries")

cby_plot <- dcpo_input_raw1 %>%
  mutate(country = if_else(stringr::str_detect(country, "United"),
                           stringr::str_replace(country, "((.).*) ((.).*)", "\\2.\\4."),
                           country),
         country = stringr::str_replace(country, "South", "S.")) %>% 
  distinct(country, year) %>%
  count(country) %>% 
  arrange(desc(n)) %>% 
  head(12) %>% 
  ggplot(aes(forcats::fct_reorder(country, n, .desc = TRUE), n)) +
  geom_bar(stat = "identity") +
  theme_bw() +
  theme(axis.title.x = element_blank(),
        axis.text.x  = element_text(angle = 90, vjust = .45, hjust = .95),
        axis.title.y = element_text(size = 9),
        plot.title = element_text(hjust = 0.5, size = 11)) +
  ylab("Years\nObserved") +
  ggtitle("Countries")


ybc_plot <- dcpo_input_raw1 %>%
  distinct(country, year) %>%
  count(year, name = "nn") %>%
  ggplot(aes(year, nn)) +
  geom_bar(stat = "identity") +
  theme_bw() +
  theme(axis.title.x = element_blank(),
        # axis.text.x  = element_text(angle = 90, vjust = .45, hjust = .95),
        axis.title.y = element_text(size = 9),
        plot.title = element_text(hjust = 0.5, size = 11)) +
  ylab("Countries\nObserved") +
  ggtitle("Years")

us_obs <- dcpo_input_raw1 %>% 
  distinct(country, year, item) %>%
  count(country) %>%
  filter(country == "United States") %>%
  pull(n)

others <- dcpo_input_raw1 %>%
  distinct(country, year, item) %>%
  count(country) %>%
  arrange(desc(n)) %>%
  slice(2:5) %>%
  pull(country) %>% 
  knitr::combine_words()

countries_cp <- dcpo_input_raw1 %>%
  mutate(country = if_else(stringr::str_detect(country, "United"),
                           stringr::str_replace(country, "((.).*) ((.).*)", "\\2.\\4."),
                           country),
         country = stringr::str_replace(country, "South", "S.")) %>% 
  distinct(country, year, item) %>%
  count(country) %>% 
  arrange(desc(n)) %>% 
  head(12) %>% 
  pull(country)

countries_cbyp <- dcpo_input_raw1 %>%
  mutate(country = if_else(stringr::str_detect(country, "United"),
                           stringr::str_replace(country, "((.).*) ((.).*)", "\\2.\\4."),
                           country),
         country = stringr::str_replace(country, "South", "S.")) %>% 
  distinct(country, year) %>%
  count(country) %>% 
  arrange(desc(n)) %>% 
  head(12) %>% 
  pull(country)

adding <- setdiff(countries_cbyp, countries_cp) %>% 
  knitr::combine_words()

dropping <- setdiff(countries_cp, countries_cbyp) %>% 
  knitr::combine_words()

y_peak_year <- dcpo_input_raw1 %>%
  distinct(country, year) %>%
  count(year, name = "nn") %>% 
  filter(nn == max(nn)) %>% 
  pull(year)

y_peak_nn <- dcpo_input_raw1 %>%
  distinct(country, year) %>%
  count(year, name = "nn") %>% 
  filter(nn == max(nn)) %>% 
  pull(nn)

data_poorest <- dcpo_input_raw1 %>%
  distinct(country, year, item) %>%
  count(country) %>%
  arrange(n) %>%
  filter(n == 2) %>%
  pull(country) %>% 
  knitr::combine_words() %>% 
  paste0("---", ., "---")

wordify_numeral <- function(x) setNames(c("one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"), 1:19)[x]

n_data_poor <- {data_poorest %>%
    str_split(",") %>% 
    first()} %>% 
  length() 

  if(n_data_poor < 20) {
    n_data_poorest <- n_data_poor %>% 
      wordify_numeral()
  } else {
    n_data_poorest <- n_data_poor
    data_poorest <- " "
  }

(countries_plot + cby_plot) / (ybc_plot)
```

\pagebreak
```{r obs, fig.cap = "Source Data Observations by Country and Year \\label{obs_by_cy}", fig.height = 9}
dcpo_input_raw1 %>% 
  mutate(country = str_replace(country, "’", "'")) %>% 
  distinct(country, year, item, cc_rank) %>% 
  group_by(country, year) %>% 
  summarize(n = n(),
            cc_rank = first(cc_rank)) %>% 
  ungroup() %>% 
  distinct() %>% 
  ggplot(aes(x = year, 
             y = forcats::fct_reorder(country, cc_rank),
             fill = n)) + 
  geom_tile() +
  scale_fill_stepsn(colors = rev(hcl.colors(5, "inferno")),
                    n.breaks = 5,
                    show.limits = TRUE,
                    right = FALSE,
                    name = "Observations") +
  labs(x = NULL, y = NULL) +
  scale_x_continuous(breaks=seq(1984, 2024, 4),
                     sec.axis = dup_axis()) +
  scale_y_discrete(position = "right") +
  theme(legend.justification=c(0, 0), 
        legend.position=c(0.01, 0.01),
        axis.text.y  = element_text(size = 7)) 
```


Consider the most frequently asked item in the data we collected: "I am going to name a number of institutions. For each one, could you tell me how much trust you have in them. Is it a great deal of trust, some trust, not very much trust or none at all? Civil service."^[Question text may vary slightly across survey datasets, but not, roughly speaking, by more than the translation differences across languages found within the typical cross-national survey dataset. In this case, some questions ask about "the public administration" or "government officials" rather than "the civil service," and some refer to "confidence" rather than "trust."  These words are often translated identically.]
Employed by the Arab Barometer, the Asia Europe Survey, the Asian Barometer, the British Social Attitudes Survey, the Latinobarómetro, the East Asian Social Survey, the European Values Study, the Italian National Election Study, the South Asian Barometer, and the World Values Survey, this question was asked in a total of `r trust4_cy` different country-years.
That this constitutes only `r {trust4_cy*100/(spanned_cy %>% str_replace(",", "") %>% as.numeric())} %>% round()`% of the country-years spanned by our data (and, again, this is the _most common_ survey item) underscores just how sparse and incomparable the available public opinion data are on this topic.

The upper left panel of Figure\nobreakspace{}\ref{item_country_plots} shows the dozen countries with the highest count of country-year-item observations.
The United States, with `r us_obs` observations, is far and away the best represented country in the source data, followed by `r others`.
At the other end of the spectrum, `r n_data_poorest` countries`r data_poorest`have only the minimum two observations required to be included in the source dataset at all.
The upper right panel shows the twelve countries with the most years observed; this group is similar, but with `r adding` joining the list and `r dropping` dropping off.
The bottom panel counts the countries observed in each year and reveals just how few relevant survey items were asked before 1990.
Country coverage reached its peak in `r y_peak_year`, when respondents in `r y_peak_nn` countries were asked items about trust in civil servants.
In the next section, we describe how we are able to make use of all of this sparse and incomparable survey data to generate complete, comparable time-series TCS scores using a latent variable model.

# Estimating Trust in Civil Servants

A number of recent studies have developed latent variable models of public opinion based on cross-national survey data [see @Claassen2019; @Caughey2019; @McGann2019; @Kolczynska2020].
To estimate trust in civil servants across countries and over time, we employ the latest of these methods that is appropriate for data that is not only incomparable but also sparse, the Dynamic Comparative Public Opinion (DCPO) model elaborated in @Solt2020c.^[
@Solt2020c demonstrates that the DCPO model provides a better fit to survey data than the models put forward by @Claassen2019 or @Caughey2019.
The @McGann2019 model depends on dense survey data unlike the sparse data on trust in civil servants described in the preceding section.
@Kolczynska2020 is the most recent of these five works and builds on each of the others, but the MRP approach developed in that piece is suitable only when the available survey data are dense and ancillary data on population characteristics are also available, so it is similarly inappropriate to this application.]
The DCPO model is a population-level two-parameter ordinal logistic item response theory (IRT) model with country-specific item-bias terms.
For a detailed description of the DCPO model, see Appendix B and @Solt2020c [, 3-8]; here, we focus on how it deals with the principal issues raised by our source data, incomparability and sparsity.

The DCPO model accounts for the incomparability of different survey questions with two parameters.
First, it incorporates the _difficulty_ of each question's responses, that is, how much trust in civil servants is indicated by a given response. 
That each response evinces more or less of our latent trait is most easily seen with regard to the ordinal responses to the same question: strongly agreeing with the statement "you can generally trust the people who run our government to do what is right," exhibits more trust in civil servants than responding "agree," which in turn is more trusting than responding "disagree," which is a more trusting response than "strongly disagree."
But this is also true across questions.
For example, expressing "great trust" in civil servants "to look after your interests" likely expresses even more trust than just strongly agreeing that civil servants can be trusted to do what is right.
Second, the DCPO model accounts for each question's _dispersion_, its noisiness with regard to our latent trait.
The lower a question's dispersion, the better changes in responses to the question map onto changes in trust in civil servants.
Together, the model's difficulty and dispersion estimates work to generate comparable estimates of the latent variable of trust in civil servants from the available but incomparable source data.
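
As a rough sketch of how these two parameters enter a model of this kind, consider the following simplified expression.
The notation is introduced here only for illustration and omits other components of the full specification, which is given in Appendix B.

$$
\Pr(y_{ktq} \geq r) = \mathrm{logit}^{-1}\left(\frac{\theta_{kt} - (\beta_{qr} + \delta_{kq})}{\alpha_{q}}\right)
$$

Here $\theta_{kt}$ stands for latent trust in civil servants in country $k$ and year $t$, $\beta_{qr}$ for the difficulty of response $r$ to question $q$, $\delta_{kq}$ for the country-specific item bias, and $\alpha_{q}$ for the dispersion of question $q$.
Under this simplified form, a higher difficulty lowers the probability of giving response $r$ or higher at any given level of trust, and a smaller dispersion makes that probability respond more sharply to differences in trust.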

To address the sparsity of the source data---the fact that there are gaps in the time series of each country, and even many observed country-years have only one or few observed items---DCPO uses simple local-level dynamic linear models, i.e., random-walk priors, for each country.
That is, within each country, each year's value of trust in civil servants is modeled as the previous year's estimate plus a random shock.
These dynamic models smooth the estimates of trust in civil servants over time and allow estimation even in years for which little or no survey data is available, albeit at the expense of greater measurement uncertainty.
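
To illustrate why uncertainty grows across gaps in the time series, the following sketch works through the arithmetic of a random walk.
The shock standard deviation used here is a hypothetical value chosen for illustration, not an estimate from our model.

```r
# Hypothetical standard deviation of the yearly random shock
sigma <- 0.05

# Years elapsed since the last year with survey data
years_since_obs <- 1:5

# Under a random walk, independent shocks accumulate, so the prior variance
# grows linearly with the gap and the standard deviation with its square root
prior_sd <- sigma * sqrt(years_since_obs)

round(prior_sd, 3)
```

With these assumed values, the prior standard deviation four years after the last observation is twice what it is after one year, which is why estimates for unobserved years carry wider intervals.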

```{r dcpo_input, eval=FALSE, include=FALSE, results=FALSE}
dcpo_input <- DCPOtools::format_dcpo(dcpo_input_raw1,
                                     scale_q = "trust4",
                                     scale_cp = 2)
save(dcpo_input, file = here::here("data", "dcpo_input.rda"))
```

```{r dcpo, eval=FALSE, include=FALSE, results=FALSE}
iter <- 1000
 
dcpo <- cmdstan_model("~/Documents/Projects/DCPO/inst/stan/dcpo.stan")

dcpo_output <- dcpo$sample(
   data = dcpo_input[1:13], 
   max_treedepth = 14,
   adapt_delta = 0.99,
   step_size = 0.005,
   seed = 324, 
   chains = 4, 
   parallel_chains = 4,
   iter_warmup = iter/2,
   iter_sampling = iter/2,
   refresh = iter/50
 )

results_path <- here::here(file.path("data", 
                                     iter, 
                                  {str_replace_all(Sys.time(),
                                                      "[- :]",
                                                   "") %>%
                                         str_replace("\\d{2}$",
                                                     "")}))

dir.create(results_path, 
           showWarnings = FALSE, 
           recursive = TRUE)

dcpo_output$save_data_file(dir = results_path,
                           random = FALSE)
dcpo_output$save_output_files(dir = results_path,
                              random = FALSE)
```

```{r dcpo_results}
 if (!exists("results_path")) {
   latest <- "202303030905"
   results_path <- here::here("data", "1000", latest)
   
   # Define OSF_PAT in .Renviron: https://docs.ropensci.org/osfr/articles/auth
   if (!file.exists(file.path(results_path, paste0("dcpo-", latest, "-1.csv")))) {
     dir.create(results_path, showWarnings = FALSE, recursive = TRUE)
     osf_retrieve_node("82w36") %>% 
       osf_ls_files() %>% 
       filter(name == latest) %>% 
       osf_download(path = here::here("data", "1000"))
   }
}
 
dcpo_output <- as_cmdstan_fit(here(results_path,
                                   list.files(results_path,
                                              pattern="csv$")))

```

```{r dcpo_summary}
load(file = here::here("data", "dcpo_input.rda"))
theta_summary <- DCPOtools::summarize_dcpo_results(dcpo_input,
                                                   dcpo_output,
                                                   "theta")

res_cy <- nrow(theta_summary) %>% 
  scales::comma()

res_c <- theta_summary %>% 
  pull(country) %>% 
  unique() %>% 
  length()

save(theta_summary, file = here::here("data",
                                      "theta_summary.rda"))
```

```{r theta_results}
theta_results <- extract_dcpo_results(dcpo_input,
                                      dcpo_output,
                                      par = "theta")
```


```{r cs, fig.cap="TCS Scores, Most Recent Available Year \\label{cs_mry}", fig.height=10, fig.width=8}
n_panes <- 2
axis_text_size <- 10

p1_data <- theta_summary %>%
  group_by(country) %>%
  top_n(1, year) %>%
  ungroup() %>%
  arrange(mean) %>%
  transmute(country_year = paste0(country, " (", year, ")") %>% 
              str_replace("’", "'"),
            estimate = mean,
            conf.high = q90,
            conf.low = q10,
            pane = n_panes - (ntile(mean, n_panes) - 1),
            ranked = as.factor(ceiling(row_number())))

p_theta <- ggplot(p1_data,
                  aes(x = estimate, y = ranked)) +
  geom_segment(aes(x = conf.low, xend = conf.high,
                   y = ranked, yend = ranked),
               na.rm = TRUE,
               alpha = .4) +
  geom_point(fill = "black", shape = 21, size = .5, na.rm = TRUE) +
  theme_bw() + theme(legend.position="none",
                     axis.text.x  = element_text(size = axis_text_size,
                                                 angle = 90,
                                                 vjust = .45,
                                                 hjust = .95),
                     axis.text.y  = element_text(size = axis_text_size),
                     axis.title = element_blank(),
                     strip.background = element_blank(), 
                     strip.text = element_blank(),
                     panel.grid.major = element_line(size = .3),
                     panel.grid.minor = element_line(size = .15)) +
  scale_y_discrete(breaks = p1_data$ranked, labels=p1_data$country_year) +
  coord_cartesian(xlim=c(0, 1)) +
  facet_wrap(vars(pane), scales = "free", nrow = 1)


p_theta +
  plot_annotation(caption = "Note: Gray whiskers represent 80% credible intervals.")

bottom5 <- p1_data %>% 
  arrange(ranked) %>% 
  slice(1:5) %>% 
  pull(country_year) %>% 
  str_replace(" \\(.*", "") %>% 
  knitr::combine_words()

```

We estimated the model using the `DCPOtools` package for R [@Solt2020a], running four chains for 1,000 iterations each and discarding the first half as warmup, which left us with 2,000 samples.
The $\hat{R}$ diagnostic had a maximum value of 1.01, indicating that the model converged.

<!--The dispersion parameters of the survey items indicate that all of them load well on the latent variable (see Appendix A).-->

The result is a set of estimates of mean trust in civil servants, which we call TCS scores, for all `r res_cy` country-years spanned by the source data.
Figure\nobreakspace{}\ref{cs_mry} displays the most recent available TCS score for each of the `r res_c` countries and territories in the dataset.

[The Nordic countries Finland and Denmark, which have high quality of governance, and the Asian countries China, India, and Singapore, which have a history of meritocracy, are at the top of the list.]
<!--Niger in input date only has two time points. The question is about tax.2015's data of trust in institutions  was higher than data in later years https://bti-project.org/en/reports/country-report/NER -->
According to their latest scores, `r bottom5` are the places where the public has the lowest trust in civil servants.

```{r ts, fig.cap="TCS Scores Over Time Within Selected Countries \\label{ts_plots}", fig.height=3.5}
countries <- c("Finland", "Germany", "Luxembourg", "Bangladesh",
               "Malaysia", "Turkey", "United Kingdom", "South Korea",
               "Belgium", "Spain", "United States", "Chile", 
               "Ukraine", "Italy", "Argentina", "Mexico")

countries2 <- countries %>% 
  str_replace("United States", "U.S.") %>% 
  str_replace("United Kingdom", "U.K.")

c_res <- theta_summary %>% 
  filter(country %in% countries) %>%
  mutate(country = str_replace(country, "United States", "U.S.") %>% 
           str_replace("United Kingdom", "U.K.") %>% 
           factor(levels = countries2))

ggplot(data = c_res, aes(x = year, y = mean)) +
  theme_bw() +
  theme(legend.position = "none") +
  coord_cartesian(xlim = c(1980, 2025), ylim = c(0, 1)) +
  labs(x = NULL, y = "TCS Scores") +
  geom_ribbon(data = c_res, aes(ymin = q10, ymax = q90, linetype=NA), alpha = .25) +
  geom_line(data = c_res) +
  facet_wrap(~country, nrow = 2) +
  theme(axis.text.x  = element_text(size=7,
                                    angle = 90,
                                    vjust = .45,
                                    hjust = .95),
        strip.background = element_rect(fill = "white", colour = "white")) +
  plot_annotation(caption = "Note: Countries are ordered by their TCS scores in their most recent\navailable year; gray shading represents 80% credible intervals.")
```

Figure\nobreakspace{}\ref{ts_plots} shows how TCS scores have changed over time in sixteen countries.
As displayed in Figure\nobreakspace{}\ref{cs_mry}, the dataset has wide geographic breadth, allowing comparative studies of countries and regions too often neglected [see @Wilson2021].
Figure\nobreakspace{}\ref{ts_plots} also shows that while TCS scores have risen markedly in some countries, such as Finland and the U.K., they have remained consistently high or low over time in others, like China and Turkey, or fallen steadily, as in the U.S. and Mexico.
Elsewhere they have advanced and then retreated, as in Spain, or declined and then recovered, as in Argentina.
It is worthwhile to further explore the causes and consequences of these trends.


# Validating Trust in Civil Servants

```{r internal_val_dat, include=FALSE}
internal_tscs_dat <- dcpo_input_raw1 %>% 
  filter(item == "trust4") %>%  
  mutate(title = "All Country-Years",
         neg = FALSE)

internal_cs_dat <- dcpo_input_raw1 %>% 
  filter(survey == "eb963") %>%  
  mutate(title = "Eurobarometer 96.3 (2022)",
         neg = FALSE)

internal_ts_dat <- dcpo_input_raw1 %>% 
  filter(survey == "usgss" & item == "trust3") %>%  
  mutate(title = "United States",
         neg = FALSE)
```

```{r internalval, fig.height = 4, fig.cap = "Convergent Validation: Correlations Between TCS Scores and Individual TCS Source-Data Survey Items \\label{internal_val}"}
internal_tscs_plot <- validation_plot(internal_tscs_dat,
                                lab_x = .1,
                                lab_y = 95) +
  theme_bw() +
  theme(legend.position="none",
        axis.text  = element_text(size=8),
        axis.title = element_text(size=9),
        plot.title = element_text(hjust = 0.5, size = 9),
        strip.background = element_blank()) +
  coord_cartesian(xlim = c(0,1), ylim = c(0,100)) +
  labs(x = "TCS Score",
       y = "% Expressing Quite a Lot or a Great Deal of Trust")

internal_cs_plot <- validation_plot(internal_cs_dat,
                                lab_x = .1,
                                lab_y = 95) +
  theme_bw() +
  theme(legend.position="none",
        axis.text  = element_text(size=8),
        axis.title = element_text(size=9),
        plot.title = element_text(hjust = 0.5, size = 9),
        # strip.text.x = element_text(size=5),
        strip.background = element_blank()) +
  coord_cartesian(xlim = c(0,1), ylim = c(0,100)) +
  labs(x = "TCS Score",
       y = "% Tending to Trust Public Administration in Country")

internal_ts_plot <- validation_plot(internal_ts_dat,
                                    lab_x = 1989,
                                    lab_y = .95) +
  theme_bw() +
  theme(legend.position="none",
        axis.text  = element_text(size=8),
        axis.title = element_text(size=9),
        plot.title = element_text(hjust = 0.5, size = 9),
        strip.background = element_blank()) +
  coord_cartesian(ylim = c(0,1)) +
  labs(x = "Year",
       y = "Score") +
  annotate("text", x = 2005, y = .8, size = 2,
           label = 'U.S. GSS') + #confidence in executive branch
  annotate("text", x = 2008, y = .32, size = 2,
           label = "TCS Score")

internal_tscs_plot + internal_cs_plot + internal_ts_plot  +
  patchwork::plot_annotation(caption = "Note: Gray whiskers and shading represent 80% credible intervals.")
```

Figure\nobreakspace{}\ref{internal_val}

```{r ext_val1_dat, include=FALSE}
ext_wvs_gov_dat <- tibble(survey = c("wvs_combo", "wvs7"),
                          item = "trust_gov4",
                          variable = c("e069_11", "q71"),
                          values = "c(4:1)") %>% 
  DCPOtools::dcpo_setup(datapath = here("..",
                                        "data",
                                        "dcpo_surveys")) %>%  
  mutate(title = "World Values Survey",
         neg = FALSE)

ext_lb_courts_dat <- read_csv(here("data-raw",
                                   "surveys_lb_courts.csv")) %>% 
    DCPOtools::dcpo_setup(datapath = here("..",
                                        "data",
                                        "dcpo_surveys")) %>%   
  mutate(title = "Latinobarómetro",
         neg = FALSE) 

ext_evs_parl_dat <- read_csv(here("data-raw",
                     "surveys_evs_parliament.csv")) %>% 
  DCPOtools::dcpo_setup(datapath = here("..",
                                        "data",
                                        "dcpo_surveys")) %>%  
    mutate(title = "European Values Study",
           neg = FALSE)
```

```{r extval1, fig.cap="Construct Validation: Correlations Between TCS Scores and Trust in Institutions Survey Items \\label{ext_val1}", fig.height=4}
ext_wvs_gov_plot <- validation_plot(ext_wvs_gov_dat,
                                    lab_x = .1,
                                    lab_y = 95) +
  theme_bw() +
  theme(legend.position="none",
        axis.text  = element_text(size=8),
        axis.title = element_text(size=9),
        plot.title = element_text(hjust = 0.5, size = 9),
        strip.background = element_blank()) +
  coord_cartesian(xlim = c(0,1), ylim = c(0,100)) +
  labs(x = "TCS Score",
       y = "% Expressing Quite a Lot of Trust or More\nin National Government")

ext_lb_courts_plot <- validation_plot(ext_lb_courts_dat,
                                      lab_x = .1,
                                      lab_y = 95) +
  theme_bw() +
  theme(legend.position="none",
        axis.text  = element_text(size=8),
        axis.title = element_text(size=9),
        plot.title = element_text(hjust = 0.5, size = 9),
        strip.background = element_blank()) +
  coord_cartesian(xlim = c(0,1), ylim = c(0,100)) +
  labs(x = "TCS Score",
       y = "% Expressing Some Trust or More\nin Judiciary")

ext_evs_parl_plot <- validation_plot(ext_evs_parl_dat,
                                       lab_x = .1,
                                       lab_y = 95) +
  theme_bw() +
  theme(legend.position="none",
        axis.text  = element_text(size=8),
        axis.title = element_text(size=9),
        plot.title = element_text(hjust = 0.5, size = 9),
        strip.background = element_blank()) +
  coord_cartesian(xlim = c(0,1), ylim = c(0,100)) +
  labs(x = "TCS Score",
       y = "% Expressing Quite a Lot of Confidence or More\nin Parliament")

ext_wvs_gov_plot + ext_evs_parl_plot + ext_lb_courts_plot +
  plot_annotation(caption = "Note: Gray whiskers and shading represent 80% credible intervals.")
```

Figure\nobreakspace{}\ref{ext_val1}

```{r ext_val2_dat, include=FALSE}
ext_dat <- read_csv(here("data-raw",
                         "surveys_cor.csv"))

ext_wvs_cor_dat <- ext_dat %>%
  filter(str_detect(survey, "wvs")) %>% 
  DCPOtools::dcpo_setup(datapath = here("..",
                                        "data",
                                        "dcpo_surveys")) %>%
  mutate(title = "World Values Survey", 
         neg = FALSE)

ext_cor_dat <- ext_dat %>% 
    filter(!str_detect(survey, "wvs") &
             !str_detect(survey, "issp")) %>% 
  DCPOtools::dcpo_setup(datapath = here("..",
                                        "data",
                                        "dcpo_surveys")) %>%
  mutate(title = "Arab, Asian & New Europe Barometers\nand Latinobarómetro",
         neg = FALSE)

ext_issp_cor_dat <- ext_dat %>%
  filter(str_detect(survey, "issp")) %>% 
  DCPOtools::dcpo_setup(datapath = here("..",
                                        "data",
                                        "dcpo_surveys")) %>%  
  mutate(title = "ISSP Citizenship I & II",
         neg = FALSE)
```

```{r extval2, fig.height = 4, fig.cap = "Construct Validation: Correlations Between TCS Scores and Corruption of Public Servants Survey Items \\label{ext_val2}"}
ext_wvs_cor_plot <- validation_plot(ext_wvs_cor_dat,
                                    lab_x = .1,
                                    lab_y = 95) +
  theme_bw() +
  theme(legend.position="none",
        axis.text  = element_text(size=8),
        axis.title = element_text(size=9),
        plot.title = element_text(hjust = 0.5, size = 9),
        strip.background = element_blank()) +
  coord_cartesian(xlim = c(0,1), ylim = c(0,100)) +
  labs(x = "TCS Score",
       y = "% Saying Most or All State Authorities\nAre Involved in Corruption")

ext_cor_plot <- validation_plot(ext_cor_dat,
                                    lab_x = .1,
                                    lab_y = 95) +
  theme_bw() +
  theme(legend.position="none",
        axis.text  = element_text(size=8),
        axis.title = element_text(size=9),
        plot.title = element_text(hjust = 0.5, size = 9),
        strip.background = element_blank()) +
  coord_cartesian(xlim = c(0,1), ylim = c(0,100)) +
  labs(x = "TCS Score",
       y = "% Saying Most or Almost All\nGovernment Officials Are Corrupt")
 
ext_issp_cor_plot <- validation_plot(ext_issp_cor_dat,
                                    lab_x = .1,
                                    lab_y = 95) +
  theme_bw() +
  theme(legend.position="none",
        axis.text  = element_text(size=8),
        axis.title = element_text(size=9),
        plot.title = element_text(hjust = 0.5, size = 9),
        strip.background = element_blank()) +
  coord_cartesian(xlim = c(0,1), ylim = c(0,100)) +
  labs(x = "TCS Score",
       y = "% Saying a Lot or Almost Everyone in the\nCountry's Public Service Is Involved in Corruption")

ext_wvs_cor_plot + ext_cor_plot + ext_issp_cor_plot + patchwork::plot_annotation(caption = "Note: Gray whiskers and shading represent 80% credible intervals.")
```

Figure\nobreakspace{}\ref{ext_val2}

\pagebreak

# Appendix A: Survey Items Used to Estimate Trust in Civil Servants
National and cross-national surveys have often included questions tapping trusting attitudes over the past half-century, but the resulting data are both sparse, that is, unavailable for many countries and years, and incomparable, generated by many different survey items.
In all, we identified `r n_items` such survey items that were asked in no fewer than five country-years in countries surveyed at least twice; these items were drawn from `r n_surveys` different survey datasets.
These items are listed in the table below, along with the dispersion ($\alpha$) and difficulty ($\beta$) scores estimated for each from the DCPO model.
Question text may vary slightly across survey datasets, but not, roughly speaking, by more than the translation differences across languages found within the typical cross-national survey dataset.
Lower values of dispersion indicate questions that better distinguish publics with higher levels of trust from those with lower levels.
Items have one fewer difficulty score than the number of response categories.
Survey dataset codes correspond to those used in the `DCPOtools` R package; they appear in decreasing order of country-years contributed.
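
The count of difficulty scores follows from how the model described in Appendix B treats ordinal responses: for each response category above the lowest, it counts the respondents who answered at least that positively.
A minimal sketch in R, using hypothetical response counts for a single four-category item in one country-year (the counts and category names are assumptions made only for illustration):

```r
# Hypothetical response counts for one four-category trust item in one
# country-year, ordered from least trust (category 1) to most trust (category 4).
counts <- c(none = 120, not_much = 340, quite_a_lot = 410, great_deal = 130)

R <- length(counts)   # number of response categories
n <- sum(counts)      # total respondents surveyed

# Respondents answering at least as positively as category r, for r = 1, ..., R
at_least <- rev(cumsum(rev(counts)))

# r = 1 always equals n and carries no information, so only r = 2, ..., R are kept
y <- at_least[-1]

y          # cumulative counts: 880, 540, 130
length(y)  # R - 1 = 3, one per difficulty score
```

Each of the three cumulative counts corresponds to one cutpoint between adjacent response categories, and so to one difficulty score.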

Together, the survey items in the source data were asked in `r n_countries` different countries in at least two time points over `r n_years` years, `r year_range`, yielding a total of `r n_cyi` country-year-item observations.
The number of items observed in the source data for each country-year is plotted in Figure\nobreakspace{}\@ref(fig:obs_by_cy) below.
The TCS scores of country-years with more observed items are likely to be estimated more precisely.
The estimates for country-years with fewer (or no) observed items rely more heavily (or entirely) on the random-walk prior and are therefore less certain.

\noindent Table A1: Indicators Used in the Unidimensional Latent Variable Model of Trust in Civil Servants

```{r tsc_items}
alpha_results <- summarize_dcpo_results(dcpo_input,
                                        dcpo_output = dcpo_output,
                                        pars = "alpha") %>% 
  transmute(item = question,
            dispersion = mean)

beta_results <- summarize_dcpo_results(dcpo_input,
                                       dcpo_output,
                                       "beta") %>% 
  group_by(question) %>% 
  summarize(difficulties0 = paste0(sprintf("%.2f", round(mean, 2)),
                                   collapse = ", ")) %>% 
  mutate(item = question,
         cp = if_else(str_detect(item, "threestate"),
                      2, 
                      as.numeric(str_extract(item, "\\d+")) - 1),
         term = str_glue("(( ?-?[0-9]\\.[0-9][0-9]?,?){{{cp}}})"),
         difficulties = str_extract(difficulties0, 
                                    term) %>%
           str_replace(",$", "") %>% 
           str_trim()) %>% 
  transmute(item, difficulties)
                                    
items_summary <- dcpo_input_raw1 %>%
  dplyr::select(country, year, item, survey) %>%
  distinct() %>%
  separate(survey, c("surv1", "surv2", "surv3"), sep=", ", fill = "left") %>%
  pivot_longer(cols = starts_with("surv"), values_to = "survey") %>%
  filter(!is.na(survey)) %>% 
  group_by(item) %>% 
  mutate(survey = str_extract(survey, "^[a-z]*"),
         all_surveys = paste0(unique(survey), collapse = ", ")) %>% 
  ungroup() %>% 
  distinct(country, year, item, .keep_all = TRUE) %>% 
  group_by(item) %>% 
  mutate(n_cy = n()) %>% 
  ungroup() %>%
  distinct(item, n_cy, all_surveys) %>% 
  left_join(surveys_tb %>%
              select(item, question_text, response_categories) %>%
              distinct(item, .keep_all = TRUE),
            by = "item") %>% 
  left_join(alpha_results, by = "item") %>% 
  left_join(beta_results, by = "item") %>% 
  arrange(-n_cy)
```

```{r tsc_items_table}  
items_summary %>% 
  transmute(`Survey\nItem\nCode` = item,
            `Country-Years` = as.character(n_cy),
            `Question Text` = str_replace(question_text, "([^(]*)\\(.*", "\\1"),
            `Response Categories` = response_categories,
            `Dispersion` = dispersion,
            `Difficulties`= difficulties,
            `Survey Dataset Codes` = all_surveys) %>% 
  modelsummary::datasummary_df(output = "kableExtra",
                               longtable = TRUE) %>% 
  kableExtra::column_spec(1, width = "7em") %>%
  kableExtra::column_spec(2, width = "4em") %>%
  kableExtra::column_spec(3, width = "13em") %>%
  kableExtra::column_spec(4, width = "16em") %>%
  kableExtra::column_spec(5, width = "4em") %>%
  kableExtra::column_spec(c(6, 7), width = "8em") %>% 
  kableExtra::kable_styling(font_size = 7) %>%
  kableExtra::kable_styling(latex_options = c("repeat_header")) %>%
  kableExtra::kable_styling(latex_options = "striped")
```

\clearpage
\pagebreak

# Appendix B: The DCPO Model
A number of recent studies have developed latent variable models of public opinion based on cross-national survey data [see @Claassen2019; @Caughey2019; @McGann2019; @Kolczynska2020].
To estimate trust in civil servants across countries and over time, we employ the latest of these methods that is appropriate for data that are not only incomparable but also sparse, the Dynamic Comparative Public Opinion (DCPO) model elaborated in @Solt2020c.^[
@Solt2020c demonstrates that the DCPO model provides a better fit to survey data than the models put forward by @Claassen2019 or @Caughey2019.
The @McGann2019 model depends on dense survey data unlike the sparse data on trust in civil servants described in the preceding section.
@Kolczynska2020 is the most recent of these five works and builds on each of the others, but the MRP approach developed in that piece is suitable only when the available survey data are dense and ancillary data on population characteristics are also available, so it is similarly inappropriate to this application.]
The DCPO model is a population-level two-parameter ordinal logistic item response theory (IRT) model with country-specific item-bias terms.

DCPO models the total number of survey responses expressing at least as much trust in civil servants as response category $r$ to each question $q$ in country $k$ at time $t$, $y_{ktqr}$, out of the total number of respondents surveyed, $n_{ktqr}$, using the beta-binomial distribution:

\begin{equation}
a_{ktqr} = \phi\eta_{ktqr} \label{eq:bb_a}
\end{equation}
\begin{equation}
b_{ktqr} = \phi(1 - \eta_{ktqr}) \label{eq:bb_b}
\end{equation}
\begin{equation}
y_{ktqr} \sim \textrm{BetaBinomial}(n_{ktqr}, a_{ktqr}, b_{ktqr}) \label{eq:betabinomial}
\end{equation}

where $\phi$ represents an overall dispersion parameter to account for additional sources of survey error beyond sampling error and $\eta_{ktqr}$ is the expected probability that a random person in country $k$ at time $t$ answers question $q$ with a response at least as positive as response $r$.^[
The ordinal responses to question $q$ are coded to range from 1 (expressing the least trust in civil servants) to $R$ (expressing the most trust in civil servants), and $r$ takes on all values greater than 1 and less than or equal to $R$.]
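
Under this parameterization, $a_{ktqr} + b_{ktqr} = \phi$, so the standard moments of the beta-binomial distribution simplify to

\begin{equation}
\textrm{E}(y_{ktqr}) = n_{ktqr}\eta_{ktqr}, \qquad \textrm{Var}(y_{ktqr}) = n_{ktqr}\eta_{ktqr}(1 - \eta_{ktqr})\frac{\phi + n_{ktqr}}{\phi + 1}
\end{equation}

The final fraction is the factor by which the variance exceeds that of a binomial distribution with the same $n_{ktqr}$ and $\eta_{ktqr}$; it shrinks toward one as $\phi$ grows, so smaller values of $\phi$ correspond to more survey error beyond sampling error.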

This expected probability, $\eta_{ktqr}$, is in turn estimated as follows:

\begin{equation}
\eta_{ktqr} = \textrm{logit}^{-1}\left(\frac{\bar{\theta'}_{kt} - (\beta_{qr} + \delta_{kq})}{\sqrt{\alpha_{q}^2 + (1.7\sigma_{kt})^2}}\right) \label{eq:dcpo}
\end{equation}

In this equation, $\beta_{qr}$ represents the difficulty of response $r$ to question $q$, that is, the degree of trust in civil servants the response expresses.
The $\delta_{kq}$ term represents country-specific item bias: the extent to which all responses to a particular question $q$ may be more (or less) difficult in a given country $k$ due to translation issues, cultural differences in response styles, or other idiosyncrasies that render the same survey item not equivalent across countries.^[
Estimating $\delta_{kq}$ requires repeated administrations of question $q$ in country $k$, so
when responses to question $q$ are observed in country $k$ in only a single year, the DCPO model sets $\delta_{kq}$ to zero by assumption, increasing the error of the model by any country-item bias that is present.
Questions that are asked repeatedly over time in only a single country pose no risk of country-specific item bias, so $\delta_{kq}$ in such cases are also set to zero.]
The dispersion of question $q$, its noisiness in relation to our latent variable, is $\alpha_{q}$.
The mean and standard deviation of the unbounded latent trait of trust in civil servants are $\bar{\theta'}_{kt}$ and $\sigma_{kt}$, respectively.
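
As a numerical sketch of the equation for $\eta_{ktqr}$, the following R code evaluates it for one hypothetical country-year and one four-category question.
The function name `dcpo_eta` and every parameter value are assumptions made for illustration; they are neither part of `DCPOtools` nor estimates from the model.

```r
# Inverse-logit link from the DCPO equation; every input value used below is
# hypothetical and chosen only for illustration.
dcpo_eta <- function(theta_bar, sigma, beta, delta, alpha) {
  plogis((theta_bar - (beta + delta)) / sqrt(alpha^2 + (1.7 * sigma)^2))
}

# Three increasing difficulties for a four-category item (r = 2, 3, 4)
beta_q <- c(-1.0, 0.2, 1.5)

eta <- dcpo_eta(theta_bar = 0.5,  # mean of the unbounded latent trait
                sigma = 0.8,      # its standard deviation in this country-year
                beta = beta_q,
                delta = -0.1,     # country-specific item bias
                alpha = 0.6)      # dispersion of the question
round(eta, 2)
```

Because the difficulties increase with $r$, the resulting probabilities decrease: fewer respondents are expected to give an answer at least as trusting as each successively higher category.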

Random-walk priors are used to account for the dynamics in $\bar{\theta'}_{kt}$ and $\sigma_{kt}$, and weakly informative priors are placed on the other parameters.^[
The dispersion parameters $\alpha_{q}$ are drawn from standard half-normal prior distributions, that is, the positive half of N(0, 1).
The first difficulty parameters for each question, $\beta_{q1}$, are drawn from standard normal prior distributions, and the differences between $\beta$s for each $r$ for the same question $q$ are drawn from standard half-normal prior distributions.
The item-bias parameters $\delta_{kq}$ receive normally-distributed hierarchical priors with mean 0 and standard deviations drawn from standard half-normal prior distributions.
The initial value of the mean unbounded latent trait for each country, $\bar{\theta'}_{k1}$, is assigned a standard normal prior, as are the transition variances $\sigma_{\bar{\theta'}}^2$ and $\sigma_{\sigma}^2$; the initial value of the standard deviation of the unbounded latent trait for each country, $\sigma_{k1}$, is drawn from a standard lognormal prior distribution.
The overall dispersion, $\phi$, receives a somewhat more informative prior drawn from a gamma(4, 0.1) distribution that yields values that are well scaled for that parameter.]
The dispersion parameters $\alpha_q$ are constrained to be positive and all survey responses are coded with high values indicating more trust in civil servants to fix direction.
The difficulty $\beta$ of one response to a single reference question is set to 1 to identify location, and for each question $q$ the difficulties for increasing response categories $r$ are constrained to be increasing.
The sum of $\delta_{kq}$ across all countries $k$ is set to zero for each question $q$:

\begin{equation}
\sum_{k = 1}^K \delta_{kq} = 0
\end{equation}

Finally, the logistic function is used to transform $\bar{\theta'}_{kt}$ to the unit interval and so give the bounded mean of latent trust in civil servants, $\bar{\theta}_{kt}$, which is our parameter of interest here [see @Solt2020c, 3-8].
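
Written out, this transformation is

\begin{equation}
\bar{\theta}_{kt} = \textrm{logit}^{-1}(\bar{\theta'}_{kt}) = \frac{1}{1 + \exp(-\bar{\theta'}_{kt})}
\end{equation}

so an unbounded mean of zero corresponds to a bounded score of .5, and increasingly positive (negative) unbounded means approach 1 (0).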


The DCPO model accounts for the incomparability of different survey questions with two parameters.
First, it incorporates the _difficulty_ of each question's responses, that is, how much trust in civil servants is indicated by a given response. 
That each response evinces more or less of our latent trait is most easily seen with regard to the ordinal responses to the same question: reporting "a great deal" of trust in civil servants exhibits more trust than responding "quite a lot," which in turn expresses more trust than responding "not very much," which expresses more trust than "none at all."
But this is also true across questions.
For example, reporting "a great deal" of trust on such a four-point scale likely expresses even more trust in civil servants than merely saying that one "tends to trust" the public administration in response to a two-category question.
Second, the DCPO model accounts for each question's _dispersion_, its noisiness with regard to our latent trait.
The lower a question's dispersion, the better that changes in responses to the question map onto changes in trust in civil servants.
Together, the model's difficulty and dispersion estimates work to generate comparable estimates of the latent variable of trust in civil servants from the available but incomparable source data.

To address the sparsity of the source data (there are gaps in the time series of each country, and even many observed country-years have only one or a few observed items), DCPO uses simple local-level dynamic linear models, i.e., random-walk priors, for each country.
That is, within each country, each year's value of trust in civil servants is modeled as the previous year's estimate plus a random shock.
These dynamic models smooth the estimates of trust in civil servants over time and allow estimation even in years for which little or no survey data are available, albeit at the expense of greater measurement uncertainty.
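
As a sketch of this local-level model, and assuming normally distributed shocks with the transition variance $\sigma_{\bar{\theta'}}^2$ noted among the priors above, the prior for the unbounded mean in each year after the first can be written as

\begin{equation}
\bar{\theta'}_{kt} \sim \textrm{N}(\bar{\theta'}_{k,t-1}, \sigma_{\bar{\theta'}}^2)
\end{equation}

Iterating this prior across a gap of $h$ years without survey data gives

\begin{equation}
\bar{\theta'}_{k,t+h} \mid \bar{\theta'}_{kt} \sim \textrm{N}(\bar{\theta'}_{kt}, h\sigma_{\bar{\theta'}}^2)
\end{equation}

so the prior variance grows linearly with the length of the gap, which is the source of the greater uncertainty in sparsely observed country-years.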