# Frozen Conflict: Geographic Clustering and Polarization in Minneapolis Mayoral Elections

Isak Dai  
December 12, 2025

# 1. Literature Review

# 2. Spatial Patterns - Moran’s I and Local Indicators of Spatial Association (LISA)

Moran’s I quantifies the degree of geographic clustering in a variable. It ranges from roughly -1 to 1: values near -1 indicate a checkerboard pattern of anti-clustering (precincts with very high Frey vote share next to precincts with very low Frey vote share), values near 0 indicate no spatial pattern, and values near 1 indicate strong clustering (precincts surrounded by precincts with similar Frey vote shares). For instance, the Moran’s I of 0.629 for Frey’s final-round vote share in 2017 indicates a high degree of clustering, akin to findings from legislative elections in countries like South Korea and Spain. In other words, geographic clustering in Frey’s vote share is similar to the levels found in elections contested by well-established political parties. The p-value of 0.001 means that if vote shares were distributed across precincts at random, clustering at least this strong would arise in about 1 in 1,000 cases.
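To make the statistic concrete, the following is a minimal base-R sketch of Moran’s I and a permutation-based p-value on a toy 4 x 4 grid. The grid, the vote shares, and the helper `morans_i()` are hypothetical and are not the paper’s data or pipeline; in practice a package such as `spdep` (for example its `moran.mc` function) would typically be used.

``` r
# Toy illustration of global Moran's I (hypothetical data, not the paper's precincts).
n_side <- 4
coords <- expand.grid(row = 1:n_side, col = 1:n_side)

# Binary adjacency: cells sharing an edge are neighbours (Manhattan distance 1).
W <- as.matrix(dist(coords, method = "manhattan")) == 1
W <- W / rowSums(W)                     # row-standardize the weights

# Moran's I: cross-product of deviations between neighbours, scaled by total variance.
morans_i <- function(x, W) {
  z <- x - mean(x)
  (length(x) / sum(W)) * sum(W * outer(z, z)) / sum(z^2)
}

# Clustered pattern: high shares in the left half, low shares in the right half.
clustered <- ifelse(coords$col <= 2, 70, 30)
# Checkerboard pattern: every cell differs from all of its edge neighbours.
checker <- ifelse((coords$row + coords$col) %% 2 == 0, 70, 30)

i_clustered <- morans_i(clustered, W)
i_checker   <- morans_i(checker, W)

# Permutation test: shuffle shares across cells and recompute the statistic.
set.seed(1)
n_perm <- 999
perm <- replicate(n_perm, morans_i(sample(clustered), W))
p_value <- (sum(perm >= i_clustered) + 1) / (n_perm + 1)
```

On this grid the checkerboard pattern yields a Moran’s I of -1 and the half-and-half pattern yields roughly 0.7. The permutation p-value is the share of random reshufflings that cluster at least as strongly as the observed pattern.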

Moran’s I is a single citywide statistic. To determine where the clustering takes place, we use Local Indicators of Spatial Association (LISA), which identify the precincts that contribute most to the global statistic. LISA flags “High-High” precincts, where Frey won a high percentage of the vote and is surrounded by other precincts where he also won a high percentage, and “Low-Low” precincts, where he won a low percentage and is surrounded by other precincts where he won a low percentage. There are also “High-Low” and “Low-High” precincts, where Frey won a high percentage of the vote surrounded by precincts where he won a low percentage, and vice versa.
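The four cluster labels can be illustrated with a small base-R sketch. The 5 x 5 grid and vote shares below are hypothetical, and the sketch only assigns quadrant labels; a full LISA analysis also tests each local statistic for significance (for instance with `spdep::localmoran`) and labels only the significant precincts.

``` r
# Toy illustration of LISA quadrant labels (hypothetical data).
n_side <- 5
coords <- expand.grid(row = 1:n_side, col = 1:n_side)

# Queen contiguity on a grid: neighbours share an edge or a corner.
W <- as.matrix(dist(coords, method = "maximum")) == 1
W <- W / rowSums(W)

# Hypothetical vote shares: high in the two left columns, low elsewhere,
# plus one isolated high-support cell inside the low-support area.
share <- ifelse(coords$col <= 2, 75, 35)
share[coords$row == 3 & coords$col == 5] <- 80

z     <- as.numeric(scale(share))   # standardized vote share
lag_z <- as.numeric(W %*% z)        # mean standardized share among neighbours

# Local Moran's I (up to a constant scaling): positive when a cell resembles
# its neighbours, negative when it stands out from them.
local_i <- z * lag_z

quadrant <- ifelse(z > 0 & lag_z > 0, "High-High",
            ifelse(z < 0 & lag_z < 0, "Low-Low",
            ifelse(z > 0 & lag_z < 0, "High-Low", "Low-High")))

table(quadrant)
```

The isolated high-support cell is labeled “High-Low” because its own share is above average while its neighbours’ average is below it.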

## 2.1 2017 Mayoral Election: Jacob Frey beats Betsy Hodges

Jacob Frey has run in and won three Minneapolis mayoral elections, starting by unseating incumbent Betsy Hodges in 2017. After winning the city’s first election featuring ranked choice voting (RCV) in 2013, Hodges was embroiled in continual scandals involving the Minneapolis Police Department (Sepic 2017). Critics from the left found her response to the police killings of Jamar Clark and Justine Damond lacking in substance, while those on the right were upset by her firing of MPD Chief Janeé Harteau (Dally 2017). Progressive State Rep. Raymond Dehn and moderate Councilman Jacob Frey squeezed out Hodges, forcing her out of the final round of RCV that Frey won with 57.2% of the vote (Figure 1.1). Looking at a map of the final round, Frey dominated in his council district in North Loop and Downtown East as well as Southwest Minneapolis, while Dehn performed strongest near the University of Minnesota campus and along Lake Street in South Minneapolis.

<figure>
<img src="attachment:resources/plots/2017_frey_share.png" alt="Figure 1.1: Jacob Frey’s vote share by precinct, 2017" />
<figcaption aria-hidden="true">Figure 1.1: Jacob Frey’s vote share by precinct, 2017</figcaption>
</figure>

<figure>
<img src="attachment:resources/plots/2017_lisa_clusters.png" alt="Figure 1.2: LISA Clusters of Frey’s vote share, 2017" />
<figcaption aria-hidden="true">Figure 1.2: LISA Clusters of Frey’s vote share, 2017</figcaption>
</figure>

LISA shows where the clustering takes place. Frey has two distinct strongholds, depicted in red (Figure 1.2). The first is the core of Downtown Minneapolis, home to more affluent renters in high-rise buildings. The second is Southwest Minneapolis, with high-income homeowners living largely in single-family homes. The anti-Frey bloc is a large swath stretching from the Lake Street corridor up through the University of Minnesota campus, inhabited by younger renters and large Latine, Indigenous, and Black communities.

## 2.2 2021 Mayoral Election: Jacob Frey fends off Kate Knuth

Four years later, Frey ran as an incumbent, defending a record that had been battered by the twin crises of the COVID-19 pandemic and the murder of George Floyd by Derek Chauvin, a Minneapolis police officer. The ensuing protests, along with episodes of looting and the torching of a Minneapolis police station, once again put police violence and racial discrimination at the forefront of both local and global discourse. Like Hodges, Frey entered his re-election campaign dogged by critics on the left and right who accused him of an alternately heavy-handed or inadequate response to the protests and approach to reforming the MPD. On the same day as the mayoral election, Minneapolis voters also decided three referenda, the most prominent of which was a proposal to replace the MPD with a Department of Public Safety. Opponents derided the measure as an act of “defunding the police”, and with Frey and MPD Chief Medaria Arradondo firmly in the No camp, the measure was soundly defeated, receiving less than 45% of the vote (Kaste 2021).

<figure>
<img src="attachment:resources/plots/2021_frey_share.png" alt="Figure 2.1: Jacob Frey’s vote share by precinct, 2021" />
<figcaption aria-hidden="true">Figure 2.1: Jacob Frey’s vote share by precinct, 2021</figcaption>
</figure>

<figure>
<img src="attachment:resources/plots/2021_lisa_clusters.png" alt="Figure 2.2: LISA Clusters of Frey’s vote share, 2021" />
<figcaption aria-hidden="true">Figure 2.2: LISA Clusters of Frey’s vote share, 2021</figcaption>
</figure>

Frey faced two major opponents, Kate Knuth and Sheila Nezhad, who both supported the police reform measure. Knuth reached the final round of RCV against Frey, which he won with 56.2% of the vote. Geographic clustering in Frey’s vote share, measured by Moran’s I, slipped from 0.629 to 0.613, which still represents a level of clustering unlikely to occur through random chance. Looking at the LISA clusters of high and low Frey support, Frey expanded his base into North Minneapolis while shedding support in Southwest. North Minneapolis, a working-class region with a large African-American population, saw moderate challenger Latrisha Vetaw rout a progressive councilmember who supported the police amendment (Ansari 2021). There was a single “High-Low” precinct, which represents a neighborhood with a spike in Frey support surrounded by areas of low Frey support. This precinct is home to Minneapolis’ Third Police Precinct and other buildings that were torched during the late-May 2020 protests.

Taken together, these shifts suggest Frey successfully capitalized on voter skepticism of the police reform push, especially in areas that witnessed violent confrontations between protestors and police as well as instances of looting or vandalism. At the same time, Frey shed support in Southwest Minneapolis, which is highly educated, lower density, and heavily White. Despite the heightened salience of discourse around racism and police violence, there is superficial evidence of racial depolarization in Minneapolis’ 2021 mayoral election given these countervailing trends. However, the degree of geographic clustering in Frey’s vote share dropped only slightly, suggesting a shift within coalitions rather than a general depolarization of the electorate.

## 2.3 2025 Mayoral Election: Jacob Frey narrowly beats Omar Fateh

By 2025, Frey had been mayor for eight years, leading to both hardened opposition and hardened support. City councilors, while officially non-partisan, have organized into ideological blocs defined largely by their positions on Frey’s tenure. Proto-parties sprang up in the form of outside groups that have spent millions on municipal elections since 2021 (Nace 2025). While policing remained a major concern for voters, the 2025 mayoral election became a referendum on the increasingly antagonistic relationship between Frey and the progressive majority on the City Council. Tensions peaked with Frey’s unprecedented veto of the 2025 city budget, complicating the city’s efforts to improve public safety and address rising housing costs and homelessness (Spencer 2024). A slate of candidates emerged to challenge Frey, led by State Senator Omar Fateh. Despite their ideological differences (the candidates ranged from business-friendly entrepreneur Jazz Hampton to the democratic socialist Fateh), the challengers urged their supporters to rank the rest of the slate on their ballots, deepening the Frey-centered cleavage in Minneapolis politics.

<figure>
<img src="attachment:resources/plots/2025_frey_share.png" alt="Figure 3.1: Jacob Frey’s vote share by precinct, 2025" />
<figcaption aria-hidden="true">Figure 3.1: Jacob Frey’s vote share by precinct, 2025</figcaption>
</figure>

<figure>
<img src="attachment:resources/plots/2025_lisa_clusters.png" alt="Figure 3.2: LISA Clusters of Frey’s vote share, 2025" />
<figcaption aria-hidden="true">Figure 3.2: LISA Clusters of Frey’s vote share, 2025</figcaption>
</figure>

Frey once again won in the final round of ranked choice voting, but with his lowest-ever vote share: 53.0%. Fateh reached the final round, consolidating support in a large swath of South Minneapolis stretching up through Northeast. Frey expanded his base in Southwest Minneapolis, building a powerful coalition of older homeowners. Geographic clustering in Frey vote share reached a high water mark of 0.681, driven by Fateh’s massive gains with Somali communities in Cedar-Riverside that connected the two previous hotbeds of progressive politics in South Minneapolis and the University of Minnesota. Frey also lost ground with younger professionals in Downtown Minneapolis.

Even as Frey’s vote share decreased, geographic clustering in his vote share reached an all-time high. Together, these trends indicate a city increasingly divided along geographic lines in its opinion of Frey. Comparing the LISA clusters between 2021 and 2025 reveals an increasingly consolidated split between anti-Frey voters in South Minneapolis and Frey supporters in Southwest Minneapolis, meaning both opposition and support hardened over the four years of Frey’s second term. An open question is whether the city’s geographic divisions will persist after Frey leaves the political scene in 2029, when he has indicated he will not run for re-election. Already, media speculation has swirled around two councilmembers in Frey’s bloc on the council: Latrisha Vetaw of North Minneapolis and Linea Palmisano of Southwest Minneapolis. Each represents a different geographic cluster of Frey’s coalition, and if past is prologue, Palmisano may have the upper hand given the increasing depth of support for Frey in Southwest Minneapolis.

## 2.4 Summary Statistics by LISA Cluster

The following tables provide detailed demographic statistics for High-High and Low-Low LISA clusters of Frey vote share in each election. Demographic statistics are computed using areal weighted interpolation, similar to the procedure used for election imputation, to ensure accurate representation of population characteristics within each cluster.
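As a rough illustration of areal weighted interpolation, the base-R sketch below apportions tract-level counts to a cluster by the share of each tract’s area that falls inside it. The tracts, areas, and counts are hypothetical, and the sourced summary script may differ in its details; with real geometries, `sf::st_interpolate_aw` offers a comparable calculation.

``` r
# Hypothetical overlaps between census tracts (sources) and one LISA cluster (target).
overlaps <- data.frame(
  tract        = c("A", "B", "C"),
  tract_area   = c(2.0, 1.0, 4.0),    # total tract area (sq km)
  overlap_area = c(2.0, 0.5, 1.0),    # area of the tract inside the cluster
  population   = c(4000, 3000, 8000),
  renters      = c(1000, 2400, 2000)
)

# Share of each tract that falls inside the cluster.
overlaps$weight <- overlaps$overlap_area / overlaps$tract_area

# Counts are apportioned by area share, assuming residents are spread
# evenly within each tract.
pop_in_cluster     <- sum(overlaps$population * overlaps$weight)
renters_in_cluster <- sum(overlaps$renters * overlaps$weight)

# Percentages are recomputed from the apportioned counts rather than
# averaged across tracts.
pct_renters <- 100 * renters_in_cluster / pop_in_cluster
```

The key assumption is uniform density within each source tract; where population is concentrated in one corner of a tract, area shares will misallocate residents.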

``` r
library(tidyverse)
```

``` r
library(knitr)
library(kableExtra)
```


``` r
# Source the cluster summary script (suppress all output)
suppressMessages(
  suppressWarnings(
    invisible(capture.output(
      source("../scripts/R/lisa_cluster_summary.R", echo = FALSE, verbose = FALSE),
      type = "output"
    ))
  )
)

# Format and display tables for each year
for (yr in c(2017, 2021, 2025)) {
  year_data <- all_summaries |>
    filter(year == yr) |>
    select(-year) |>
    mutate(across(where(is.numeric), ~ round(.x, 1) |> prettyNum(big.mark = ","))) |>
    select(
      `Cluster Type` = cluster_type,
      `N Precincts` = n_precincts,
      `Moran's I` = morans_i,
      `White` = pct_white,
      `Black` = pct_black,
      `Asian` = pct_asian,
      `Hispanic` = pct_hispanic,
      `Bachelor's+` = pct_bachelors_plus,
      `Higher Ed` = pct_higher_ed,
      `Age 0-17` = pct_age_0_17,
      `Age 18-34` = pct_age_18_34,
      `Age 35-49` = pct_age_35_49,
      `Age 50-64` = pct_age_50_64,
      `Age 65+` = pct_age_65_plus,
      `Owner Occupied` = pct_owner_occupied,
      `Rent Burdened` = pct_rent_burdened,
      `English Only` = pct_english_only,
      `Median HH Income ($)` = med_hh_income
    )
  
  if (nrow(year_data) > 0) {
    cat("\n\n### ", yr, " Election\n\n")
    
    # First table: up to and including Higher Ed
    table1 <- year_data |>
      select(
        `Cluster Type`,
        `N Precincts`,
        `Moran's I`,
        `White`,
        `Black`,
        `Asian`,
        `Hispanic`,
        `Bachelor's+`,
        `Higher Ed`
      ) |>
      kbl(
        caption = paste0(yr, " Minneapolis Mayoral Election - LISA Cluster Demographics (Part 1)"),
        align = c("l", rep("r", 8)),
        booktabs = TRUE
      ) |>
      kable_styling(
        latex_options = c("striped", "hold_position"),
        full_width = FALSE
      ) |>
      column_spec(1, bold = TRUE) |>
      row_spec(0, bold = TRUE)
    
    cat(table1)
    cat("\n\n")
    
    # Second table: from Age 0-17 to the end
    table2 <- year_data |>
      select(
        `Cluster Type`,
        `Age 0-17`,
        `Age 18-34`,
        `Age 35-49`,
        `Age 50-64`,
        `Age 65+`,
        `Owner Occupied`,
        `Rent Burdened`,
        `English Only`,
        `Median HH Income ($)`
      ) |>
      kbl(
        caption = paste0(yr, " Minneapolis Mayoral Election - LISA Cluster Demographics (Part 2)"),
        align = c("l", rep("r", 9)),
        booktabs = TRUE
      ) |>
      kable_styling(
        latex_options = c("striped", "hold_position"),
        full_width = FALSE
      ) |>
      column_spec(1, bold = TRUE) |>
      row_spec(0, bold = TRUE)
    
    cat(table2)
    cat("\n\n")
  }
}
```

### 2.4.1 2017 Election

| Cluster Type | N Precincts | Moran’s I | White | Black | Asian | Hispanic | Bachelor’s+ | Higher Ed |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|
| High-High | 21 | 0.4 | 80.5 | 7.1 | 3.2 | 6.3 | 72.5 | 5.2 |
| Low-Low | 23 | 0.4 | 58.5 | 15 | 6.4 | 14.9 | 52.4 | 23.8 |

2017 Minneapolis Mayoral Election - LISA Cluster Demographics (Part 1)

| Cluster Type | Age 0-17 | Age 18-34 | Age 35-49 | Age 50-64 | Age 65+ | Owner Occupied | Rent Burdened | English Only | Median HH Income (\$) |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| High-High | 20.1 | 25.1 | 21.3 | 19.4 | 14 | 64.2 | 35.3 | 88.2 | 131,582.4 |
| Low-Low | 16.9 | 45.7 | 17.2 | 12.4 | 7.8 | 38.6 | 48.9 | 72.1 | 66,448.7 |

2017 Minneapolis Mayoral Election - LISA Cluster Demographics (Part 2)

### 2.4.2 2021 Election

| Cluster Type | N Precincts | Moran’s I | White | Black | Asian | Hispanic | Bachelor’s+ | Higher Ed |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|
| High-High | 16 | 0.4 | 59.5 | 22.5 | 5.7 | 8 | 56.2 | 6 |
| Low-Low | 25 | 0.4 | 61.7 | 14.2 | 6.5 | 12.6 | 55.4 | 26.2 |

2021 Minneapolis Mayoral Election - LISA Cluster Demographics (Part 1)

| Cluster Type | Age 0-17 | Age 18-34 | Age 35-49 | Age 50-64 | Age 65+ | Owner Occupied | Rent Burdened | English Only | Median HH Income (\$) |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| High-High | 21.3 | 29.2 | 19.9 | 17.9 | 11.7 | 54.5 | 42.2 | 81.3 | 104,586.6 |
| Low-Low | 13.7 | 51.6 | 15.9 | 11.4 | 7.4 | 32.6 | 48.6 | 73.3 | 64,397.8 |

2021 Minneapolis Mayoral Election - LISA Cluster Demographics (Part 2)

### 2.4.3 2025 Election

| Cluster Type | N Precincts | Moran’s I | White | Black | Asian | Hispanic | Bachelor’s+ | Higher Ed |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|
| High-High | 22 | 0.4 | 83 | 5.4 | 2.6 | 6 | 74.5 | 3.9 |
| Low-Low | 31 | 0.4 | 52.3 | 21.7 | 5.2 | 16 | 47.1 | 21.2 |

2025 Minneapolis Mayoral Election - LISA Cluster Demographics (Part 1)

| Cluster Type | Age 0-17 | Age 18-34 | Age 35-49 | Age 50-64 | Age 65+ | Owner Occupied | Rent Burdened | English Only | Median HH Income (\$) |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| High-High | 22.3 | 21.6 | 22.9 | 18.8 | 14.4 | 72.6 | 34.7 | 89.4 | 140,670.1 |
| Low-Low | 17.4 | 46.1 | 16.4 | 12.2 | 7.9 | 29.3 | 51.6 | 67.9 | 57,392 |

2025 Minneapolis Mayoral Election - LISA Cluster Demographics (Part 2)

# 3. Spatial Regression: Modeling Support for Jacob Frey in Minneapolis Mayoral Elections

## 3.1 Methodology

The evidence of spatial clustering in Frey’s vote share raises questions about the underlying factors that contribute to support or opposition to him. We can use spatial regression to model the relationship between Frey’s average vote share in 2017, 2021, and 2025 and the demographic and socioeconomic characteristics of different census tracts.

Ordinary Least Squares (OLS) regression is a common method for modeling the relationship between a dependent variable and one or more independent variables. In OLS regression, we assume that the relationship between the dependent variable and the independent variables is linear. The model takes the form

$$
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n + \epsilon
$$

where $y$ is the dependent variable, $x_1, x_2, \ldots, x_n$ are the independent variables, $\beta_0, \beta_1, \ldots, \beta_n$ are the coefficients, and $\epsilon$ is the error term, which is assumed to be normally distributed with mean 0 and variance $\sigma^2$.

In spatial regression, we account for the spatial autocorrelation of the error term. When there is spatial clustering in the error term (as diagnosed by calculating Moran’s I of the residuals of an OLS regression), we will systematically underestimate the standard errors of regression coefficients. Thus, we need to use a spatial error model to account for the spatial autocorrelation of the error term.

When we are most interested in the relationship between the dependent variable and the independent variables, we use a spatial error model (SEM). The SEM model takes the form:

$$
y = X\beta + u
$$

$$
u = \lambda W u + \epsilon
$$

where $y$ is the dependent variable, $X$ is the matrix of independent variables, $\beta$ is the vector of coefficients, $u$ is the spatial error term, $\lambda$ is the spatial autoregressive parameter, $W$ is the spatial weight matrix, and $\epsilon$ is the “white noise” error term, which is assumed to be normally distributed with mean 0 and variance $\sigma^2$.
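It can help to solve the second equation for $u$. Assuming $I - \lambda W$ is invertible (with $I$ the identity matrix, a symbol introduced here for illustration), the two equations combine into a single reduced form:

$$
u = (I - \lambda W)^{-1}\epsilon, \qquad y = X\beta + (I - \lambda W)^{-1}\epsilon
$$

$$
\operatorname{Var}(u) = \sigma^2 \left[(I - \lambda W)^{\top}(I - \lambda W)\right]^{-1}
$$

When $\lambda = 0$, the covariance reduces to $\sigma^2 I$ and the model collapses to OLS. When $\lambda \neq 0$, the errors of neighboring units are correlated, which is why standard errors computed under the OLS independence assumption are unreliable.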

The spatial weight matrix $W$ is a matrix of weights for the neighboring units. In our case, we use the Queen contiguity weight matrix, which is a binary matrix that is 1 if the units are neighbors and 0 otherwise, defined by the adjacency of the units’ polygons (including corners touching). In particular, we use the row-standardized Queen contiguity weight matrix that is obtained by dividing each row of the binary Queen contiguity weight matrix by the sum of the row.
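As a small illustration of the construction just described, the base-R sketch below builds a binary Queen contiguity matrix for a hypothetical 3 x 3 grid and row-standardizes it. With real polygons, a common route is `spdep::poly2nb(queen = TRUE)` followed by `spdep::nb2listw(style = "W")`; the grid shortcut here is only for illustration.

``` r
# Hypothetical 3 x 3 grid of units; cell 5 is the centre.
coords <- expand.grid(row = 1:3, col = 1:3)

# Queen contiguity on a regular grid: units are neighbours when they touch
# along an edge or at a corner (Chebyshev distance of exactly 1).
W_binary <- (as.matrix(dist(coords, method = "maximum")) == 1) * 1

# Row-standardize: divide each row by its number of neighbours.
W <- W_binary / rowSums(W_binary)

# With row-standardized weights, W %*% x is each unit's neighbourhood average.
x <- seq(10, 90, by = 10)
lag_x <- as.numeric(W %*% x)
```

The centre cell has eight neighbours, each weighted 1/8, while a corner cell has three, each weighted 1/3. Row-standardization therefore keeps units with many neighbours from dominating the spatial lag.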

## 3.2 Dependent Variable: Frey’s Average Vote Share

To estimate support for Frey across the past three mayoral elections, we use the average of Frey’s vote share in the final round of ranked choice voting (RCV) in 2017, 2021, and 2025. As a reminder, Frey received 57.2% of the vote in 2017, 56.2% of the vote in 2021, and 53.0% of the vote in 2025.

### 3.2.1 Jacob Frey Average Vote Share by Census Tract (2017-2025)

``` r
library(tidyverse)
library(sf)
library(mapview)

# Load census data
census_data_pct <- read_csv("../data/processed/census/mpls_tracts_census.csv", show_col_types = FALSE) |>
  mutate(GEOID = as.character(GEOID)) |>
  select(c(1:10, 91, contains("pct_"), unemployment_rate))

# Load Frey vote share shapefile
frey_data_sf <- read_sf("../data/shapefiles/tracts/mpls_tract_frey_votes.shp") |>
  st_transform(26915)

# Join data sources
joined_data_sf <- census_data_pct |>
  left_join(frey_data_sf, by = "GEOID") |>
  st_as_sf()

# Mapview plotting function
mapview_map_plot <- function(data, variable, title = NULL) {
  color_palette <- colorRampPalette(c("#2166ac", "#f7f7f7", "#b2182b"))
  
  # Create tooltip label with title and rounded value (1 decimal place)
  label_values <- paste0(title, ": ", round(data[[variable]], 1))
  
  mapview(
    data,
    zcol = variable,
    layer.name = title,
    legend = TRUE,
    col.regions = color_palette(100),
    label = label_values
  )
}

# Create and display interactive map
mapview_map_plot(joined_data_sf, "fry_sh_", "Frey Average Vote Share")
```

### 3.2.2 Descriptive Statistics and Spatial Autocorrelation: Jacob Frey Average Vote Share by Census Tract (2017-2025)

``` r
library(tidyverse)
library(knitr)
library(kableExtra)

# Read dependent variable statistics
dep_stats <- read_csv("../outputs/spatial_regression_dependent_var_stats.csv", show_col_types = FALSE)

# Format and display the table
dep_stats |>
  select(
    `Variable` = Variable,
    `Mean` = Mean,
    `Median` = Median,
    `Min` = Min,
    `Max` = Max,
    `SD` = SD,
    `Moran's I` = Moran_I,
    `p-value` = Moran_I_Pvalue
  ) |>
  mutate(across(where(is.numeric), ~ round(.x, 2) |> prettyNum(big.mark = ","))) |>
  kable(
    format = "html",
    caption = "Descriptive Statistics and Spatial Autocorrelation: Jacob Frey Average Vote Share by Census Tract (2017-2025)"
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```

| Variable | Mean  | Median | Min   | Max   | SD    | Moran's I | p-value |
|:---------|:------|:-------|:------|:------|:------|:----------|:--------|
| fry_sh\_ | 53.08 | 55.44  | 24.54 | 78.25 | 12.95 | 0.72      | 0       |

Descriptive Statistics and Spatial Autocorrelation: Jacob Frey Average Vote Share by Census Tract (2017-2025)

## 3.3 Predictor Variables

The following predictor variables, collected from the 2019-2023 American Community Survey (ACS) 5-year estimates, are used in the spatial regression models. Detailed descriptions of the predictor variables are provided in the appendix.

### 3.3.1 Descriptive Statistics and Spatial Autocorrelation: Predictor Variables

``` r
# Read predictor variables statistics
predictor_stats <- read_csv("../outputs/spatial_regression_predictor_vars_stats.csv", show_col_types = FALSE)

# Format and display the table
predictor_stats |>
  select(
    `Variable` = Variable,
    `Mean` = Mean,
    `Median` = Median,
    `Min` = Min,
    `Max` = Max,
    `SD` = SD,
    `Moran's I` = Moran_I,
    `p-value` = Moran_I_Pvalue
  ) |>
  mutate(across(where(is.numeric), ~ round(.x, 2) |> prettyNum(big.mark = ","))) |>
  kable(
    format = "html",
    caption = "Descriptive Statistics and Spatial Autocorrelation: Predictor Variables"
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```

| Variable | Mean | Median | Min | Max | SD | Moran's I | p-value |
|:---|:---|:---|:---|:---|:---|:---|:---|
| pct_white | 58.63 | 63.23 | 6.43 | 92.58 | 22.57 | 0.63 | 0 |
| pct_bachelors_plus | 51.94 | 54.06 | 4.66 | 86.16 | 19.86 | 0.63 | 0 |
| med_hh_income | 83,711.62 | 78,750 | 21,489 | 212,679 | 36,685.45 | 0.64 | 0 |
| pct_higher_ed | 10.47 | 5.67 | 0.95 | 94.51 | 15.87 | 0.77 | 0 |
| pct_english_only | 78.45 | 81.51 | 15.77 | 97.93 | 13.45 | 0.45 | 0 |
| pct_owner_occupied | 50.53 | 49.41 | 0 | 95.42 | 27.55 | 0.73 | 0 |
| pct_rent_burdened | 46.41 | 44.19 | 10.97 | 89.29 | 16.74 | 0.28 | 0 |
| pct_built_post2000 | 13.95 | 7.93 | 0 | 73.33 | 16.07 | 0.39 | 0 |
| pct_multifamily | 50.25 | 45.66 | 0.77 | 98.76 | 31.77 | 0.74 | 0 |
| pct_moved_2021_later | 15.25 | 12.37 | 1.39 | 53.13 | 9.17 | 0.63 | 0 |
| pct_same_sex_hh | 1.07 | 0.84 | 0 | 6.22 | 1.04 | 0.14 | 0 |
| pct_age_18_34 | 33.46 | 27.36 | 8.58 | 96.89 | 18.21 | 0.72 | 0 |
| pct_age_65_plus | 11.05 | 10.44 | 0 | 29.4 | 6.06 | 0.26 | 0 |
| unemployment_rate | 6.02 | 4.81 | 0.87 | 24.17 | 3.99 | 0.31 | 0 |

Descriptive Statistics and Spatial Autocorrelation: Predictor Variables

Results from comparing Census variables to election results should be interpreted carefully. First, there is the ecological inference problem: aggregated demographic and socioeconomic characteristics may not be representative of individual voters in a tract. For instance, if we were to find a high correlation between a tract’s percentage of renters and Frey’s vote share, we could not conclude that renters themselves are more likely to support Frey, only that tracts with more renters are more likely to vote for Frey (a subtle yet important distinction). Second, the population that votes in a mayoral election is not the same as the population living in a tract. For instance, some statistics, such as racial demographics, include children under the age of 18, who are not eligible to vote. Other residents are not eligible to vote because of their citizenship status. Finally, not every eligible voter in a tract votes in a mayoral election, and in general, older and highly educated residents are more likely to do so.

## 3.4 Linear Regression Model (OLS)

### 3.4.1 Model Specification

$$
\text{Frey Share} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n + \epsilon
$$

where $x_1, x_2, \ldots, x_n$ are the census variables, $\beta_0, \beta_1, \ldots, \beta_n$ are the coefficients, and $\epsilon$ is the error term, which is assumed to be normally distributed with mean 0 and variance $\sigma^2$.

### 3.4.2 Linear Regression Model (OLS) Coefficients

``` r
library(tidyverse)
library(knitr)
library(kableExtra)

# Read OLS coefficients
ols_coef <- read_csv("../outputs/spatial_regression_ols_coefficients.csv", show_col_types = FALSE)

# Format and display coefficients table
ols_coef |>
  mutate(
    Estimate = round(Estimate, 4),
    StdError = round(StdError, 4),
    TValue = round(TValue, 3),
    PValue = round(PValue, 4),
    Significance = case_when(
      PValue < 0.001 ~ "***",
      PValue < 0.01 ~ "**",
      PValue < 0.05 ~ "*",
      PValue < 0.1 ~ ".",
      TRUE ~ ""
    )
  ) |>
  select(Variable, Estimate, StdError, TValue, PValue, Significance) |>
  kable(
    format = "html",
    caption = "OLS Regression Coefficients: Frey Average Vote Share",
    col.names = c("Variable", "Estimate", "Std. Error", "t-value", "p-value", ""),
    align = c("l", "r", "r", "r", "r", "c")
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive")) |>
  footnote(
    general = "Significance codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1",
    general_title = "Note:"
  )
```

| Variable | Estimate | Std. Error | t-value | p-value |  |
|:---|---:|---:|---:|---:|:--:|
| (Intercept) | 29.2634 | 14.6998 | 1.991 | 0.0491 | \* |
| pct_white | -0.2502 | 0.1106 | -2.263 | 0.0257 | \* |
| pct_bachelors_plus | 0.0310 | 0.1131 | 0.274 | 0.7848 |  |
| med_hh_income | 0.0002 | 0.0001 | 3.066 | 0.0028 | \*\* |
| pct_higher_ed | 0.0086 | 0.1372 | 0.063 | 0.9503 |  |
| pct_english_only | 0.3384 | 0.1187 | 2.851 | 0.0052 | \*\* |
| pct_owner_occupied | -0.0819 | 0.1452 | -0.564 | 0.5739 |  |
| pct_rent_burdened | 0.0774 | 0.0728 | 1.063 | 0.2903 |  |
| pct_built_post2000 | 0.2217 | 0.0790 | 2.807 | 0.0060 | \*\* |
| pct_multifamily | 0.0244 | 0.1057 | 0.230 | 0.8183 |  |
| pct_moved_2021_later | 0.1013 | 0.2103 | 0.482 | 0.6309 |  |
| pct_same_sex_hh | -1.2625 | 0.8819 | -1.432 | 0.1552 |  |
| pct_age_18_34 | -0.3724 | 0.1531 | -2.432 | 0.0167 | \* |
| pct_age_65_plus | 0.5213 | 0.2161 | 2.413 | 0.0176 | \* |
| unemployment_rate | -0.2092 | 0.2918 | -0.717 | 0.4751 |  |

*Note:* Significance codes: 0 '\*\*\*' 0.001 '\*\*' 0.01 '\*' 0.05 '.' 0.1 ' ' 1

OLS Regression Coefficients: Frey Average Vote Share

``` r
# Read OLS fit statistics
ols_fit <- read_csv("../outputs/spatial_regression_ols_fit_stats.csv", show_col_types = FALSE)

# Format and display fit statistics table
# Helper function to safely round numeric values (vectorized)
safe_round <- function(x, digits) {
  num_val <- suppressWarnings(as.numeric(x))
  ifelse(is.na(num_val), as.character(x), as.character(round(num_val, digits)))
}

ols_fit |>
  mutate(
    Value = case_when(
      Statistic %in% c("R-squared", "Adjusted R-squared") ~ safe_round(Value, 4),
      Statistic == "F-statistic" ~ safe_round(Value, 2),
      Statistic == "F p-value" ~ safe_round(Value, 4),
      Statistic %in% c("AIC", "BIC") ~ safe_round(Value, 2),
      Statistic == "Residual SE" ~ safe_round(Value, 4),
      Statistic == "Moran's I" ~ safe_round(Value, 4),
      TRUE ~ as.character(Value)
    )
  ) |>
  kable(
    format = "html",
    caption = "OLS Regression Model Fit Statistics",
    col.names = c("Statistic", "Value"),
    align = c("l", "r")
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```

| Statistic          |      Value |
|:-------------------|-----------:|
| R-squared          |     0.5505 |
| Adjusted R-squared |     0.4911 |
| F-statistic        |       9.27 |
| F p-value          |          0 |
| AIC                |     897.37 |
| BIC                |     942.11 |
| Residual SE        |     9.2365 |
| Moran's I          |     0.4667 |
| Moran's I p-value  | \< 2.2e-16 |

OLS Regression Model Fit Statistics

### 3.4.3 Moran’s I Test for OLS Residuals

Moran’s I for the residuals of the OLS model is 0.467, with a p-value below 2.2e-16, far beyond the 0.01 significance threshold. This indicates that the residuals are spatially clustered, violating the assumption that the error terms are independent. Under this violation, the OLS standard errors are unreliable (typically too small), so we use a spatial error model to account for the spatial autocorrelation of the error terms and obtain valid inference on the regression coefficients.
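The diagnostic can be sketched in base R on simulated data. Everything below (the 10 x 10 grid, the value of `lambda`, the variable names) is hypothetical and is not the paper’s estimation code; with real data, `spdep::lm.morantest` applies a comparable test directly to a fitted `lm` object.

``` r
# Simulate spatially autocorrelated errors, fit OLS, then test the residuals.
set.seed(42)
n_side <- 10
coords <- expand.grid(row = 1:n_side, col = 1:n_side)
n <- nrow(coords)

# Row-standardized Queen contiguity weights on the grid.
W <- (as.matrix(dist(coords, method = "maximum")) == 1) * 1
W <- W / rowSums(W)

# Errors follow u = lambda * W u + eps, i.e. u = (I - lambda W)^{-1} eps.
lambda <- 0.8
x <- rnorm(n)
u <- solve(diag(n) - lambda * W, rnorm(n))
y <- 50 + 2 * x + u

fit <- lm(y ~ x)
e <- residuals(fit)

# Moran's I of a (mean-zero) vector under weights W.
morans_i <- function(v, W) {
  z <- v - mean(v)
  (length(v) / sum(W)) * sum(W * outer(z, z)) / sum(z^2)
}
moran_resid <- morans_i(e, W)

# Permutation p-value: how often do shuffled residuals cluster this strongly?
n_perm <- 499
perm <- replicate(n_perm, morans_i(sample(e), W))
p_value <- (sum(perm >= moran_resid) + 1) / (n_perm + 1)
```

A clearly positive `moran_resid` with a small `p_value` signals that the OLS model has left spatial structure in its errors. Note that shuffling residuals is a simplification: a formal test accounts for the fact that OLS residuals are themselves estimated.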

<figure>
<img src="attachment:./resources/plots/ols_residuals.png" alt="Map of OLS Residuals by Census Tract" />
<figcaption aria-hidden="true">Map of OLS Residuals by Census Tract</figcaption>
</figure>

Looking at the map of OLS residuals by census tract, there is clear evidence of spatial clustering. The OLS model underestimates Frey’s vote share in North Minneapolis, Southwest Minneapolis, Downtown Minneapolis, and Cedar-Riverside. It overestimates his share in South Minneapolis, Northeast, and Hiawatha/Longfellow. As a general trend, the model underestimates Frey’s vote share in his core support areas and overestimates it in his less supportive areas.

## 3.5 Spatial Error Model (SEM)

To account for the spatial autocorrelation of the error terms, we use the spatial error model (SEM) described in Section 3.1.

### 3.5.1 Model Specification

$$
\text{Frey Share} = X\beta + u
$$

$$
u = \lambda W u + \epsilon
$$

where $u$ is the spatial error term, $\lambda$ is the spatial autoregressive parameter, $W$ is the spatial weight matrix, and $\epsilon$ is the “white noise” error term.
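To show how the two equations can be estimated jointly, here is a minimal base-R sketch of maximum likelihood estimation for an SEM on simulated data. The grid, parameter values, and the helper `conc_loglik()` are hypothetical; this is not the author’s estimation code. A standard implementation is `spatialreg::errorsarlm`.

``` r
# Simulate data from a known SEM, then recover lambda and beta by maximum likelihood.
set.seed(7)
n_side <- 12
coords <- expand.grid(row = 1:n_side, col = 1:n_side)
n <- nrow(coords)

# Row-standardized Queen contiguity weights on the grid.
W <- (as.matrix(dist(coords, method = "maximum")) == 1) * 1
W <- W / rowSums(W)
I_n <- diag(n)

lambda_true <- 0.7
beta_true <- c(50, 2)
X <- cbind(1, rnorm(n))                        # intercept and one predictor
u <- solve(I_n - lambda_true * W, rnorm(n))    # u = (I - lambda W)^{-1} eps
y <- as.numeric(X %*% beta_true + u)

# Concentrated log-likelihood in lambda (additive constants dropped).
# For a given lambda, filtering y and X by (I - lambda W) removes the spatial
# dependence, so beta and sigma^2 follow from OLS on the filtered data.
conc_loglik <- function(lambda) {
  A <- I_n - lambda * W
  fit <- lm.fit(A %*% X, as.numeric(A %*% y))
  sigma2 <- sum(fit$residuals^2) / n
  as.numeric(determinant(A, logarithm = TRUE)$modulus) - (n / 2) * log(sigma2)
}

lambda_hat <- optimize(conc_loglik, interval = c(-0.99, 0.99), maximum = TRUE)$maximum

# Coefficients at the estimated lambda.
A_hat <- I_n - lambda_hat * W
beta_hat <- lm.fit(A_hat %*% X, as.numeric(A_hat %*% y))$coefficients
```

The design choice here is to search over the single parameter $\lambda$ and let $\beta$ and $\sigma^2$ follow from a regression on spatially filtered data, which keeps the optimization one-dimensional.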

### 3.5.2 Model Estimation

``` r
library(tidyverse)
library(knitr)
library(kableExtra)

# Read SEM coefficients
sem_coef <- read_csv("../outputs/spatial_regression_sem_coefficients.csv", show_col_types = FALSE)

# Format and display coefficients table
sem_coef |>
  mutate(
    Estimate = round(Estimate, 4),
    StdError = round(StdError, 4),
    ZValue = round(ZValue, 3),
    PValue = round(PValue, 4),
    Significance = case_when(
      PValue < 0.001 ~ "***",
      PValue < 0.01 ~ "**",
      PValue < 0.05 ~ "*",
      PValue < 0.1 ~ ".",
      TRUE ~ ""
    )
  ) |>
  select(Variable, Estimate, StdError, ZValue, PValue, Significance) |>
  kable(
    format = "html",
    caption = "Spatial Error Model (SEM) Coefficients: Frey Average Vote Share",
    col.names = c("Variable", "Estimate", "Std. Error", "z-value", "p-value", ""),
    align = c("l", "r", "r", "r", "r", "c")
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive")) |>
  footnote(
    general = "Significance codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1",
    general_title = "Note:"
  )
```

| Variable | Estimate | Std. Error | z-value | p-value |  |
|:---|---:|---:|---:|---:|:--:|
| (Intercept) | 46.6032 | 9.5801 | 4.865 | 0.0000 | \*\*\* |
| pct_white | -0.0601 | 0.0647 | -0.930 | 0.3526 |  |
| pct_bachelors_plus | 0.0255 | 0.0642 | 0.398 | 0.6906 |  |
| med_hh_income | 0.0000 | 0.0000 | 0.848 | 0.3966 |  |
| pct_higher_ed | -0.0964 | 0.0952 | -1.013 | 0.3112 |  |
| pct_english_only | 0.0841 | 0.0668 | 1.258 | 0.2084 |  |
| pct_owner_occupied | 0.0353 | 0.0810 | 0.436 | 0.6625 |  |
| pct_rent_burdened | 0.0134 | 0.0341 | 0.391 | 0.6956 |  |
| pct_built_post2000 | 0.1504 | 0.0459 | 3.277 | 0.0010 | \*\* |
| pct_multifamily | 0.0537 | 0.0614 | 0.874 | 0.3820 |  |
| pct_moved_2021_later | -0.0660 | 0.1037 | -0.636 | 0.5246 |  |
| pct_same_sex_hh | -0.5214 | 0.5091 | -1.024 | 0.3057 |  |
| pct_age_18_34 | -0.2164 | 0.0823 | -2.628 | 0.0086 | \*\* |
| pct_age_65_plus | 0.2748 | 0.1171 | 2.346 | 0.0190 | \* |
| unemployment_rate | 0.0274 | 0.1523 | 0.180 | 0.8574 |  |

*Note:* Significance codes: 0 '\*\*\*' 0.001 '\*\*' 0.01 '\*' 0.05 '.' 0.1 ' ' 1

Spatial Error Model (SEM) Coefficients: Frey Average Vote Share

``` r
# Read SEM fit statistics
sem_fit <- read_csv("../outputs/spatial_regression_sem_fit_stats.csv", show_col_types = FALSE)

# Format and display fit statistics table
sem_fit |>
  mutate(
    Value = case_when(
      Statistic == "Pseudo R-squared" ~ round(Value, 4),
      Statistic %in% c("AIC", "BIC") ~ round(Value, 2),
      Statistic == "Log-likelihood" ~ round(Value, 2),
      Statistic == "Lambda" ~ round(Value, 4),
      Statistic == "Lambda Std Error" ~ round(Value, 4),
      Statistic == "Lambda z-value" ~ round(Value, 3),
      Statistic == "Lambda p-value" ~ round(Value, 4),
      Statistic == "Moran's I" ~ round(Value, 4),
      TRUE ~ Value
    )
  ) |>
  kable(
    format = "html",
    caption = "Spatial Error Model (SEM) Fit Statistics",
    col.names = c("Statistic", "Value"),
    align = c("l", "r")
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```

| Statistic         |     Value |
|:------------------|----------:|
| Pseudo R-squared  |    0.8714 |
| AIC               |  790.7000 |
| BIC               |  838.2300 |
| Log-likelihood    | -378.3500 |
| Lambda            |    0.9173 |
| Lambda Std Error  |    0.0336 |
| Lambda z-value    |   27.2760 |
| Lambda p-value    |    0.0000 |
| Moran's I         |    0.0571 |
| Moran's I p-value |    0.1106 |

Spatial Error Model (SEM) Fit Statistics
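The fit statistics can be cross-checked with simple arithmetic. The sketch below assumes the standard definitions AIC = 2k − 2·logLik and BIC = k·log(n) − 2·logLik, and assumes the parameter count `k` includes the 15 regression coefficients, $\lambda$, and the error variance. Neither assumption is stated in the outputs, so treat the implied sample size as a consistency check rather than a reported figure.

``` r
# Values copied from the SEM fit statistics table above.
log_lik <- -378.35
aic_reported <- 790.70
bic_reported <- 838.23

# Assumed parameter count: 15 coefficients (intercept + 14 predictors),
# plus lambda and the error variance sigma^2.
k <- 15 + 1 + 1

# Standard AIC: 2k - 2 * logLik.
aic_check <- 2 * k - 2 * log_lik

# Standard BIC is k * log(n) - 2 * logLik; solve for the implied n.
n_implied <- exp((bic_reported + 2 * log_lik) / k)

c(aic_check = aic_check, n_implied = n_implied)
```

Under these assumptions the reported AIC is reproduced, and the BIC is consistent with roughly 121 observations.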

### 3.5.3 Moran’s I Test for SEM Residuals

Moran’s I for the residuals of the spatial error model (SEM) is 0.0571 (p = 0.1106), which is not statistically significant at the 0.05 level. We therefore cannot reject the null hypothesis that the residuals are spatially random. A map of the SEM residuals by census tract shows a more random distribution of residuals across the city than the OLS residuals.

<figure>
<img src="attachment:./resources/plots/sem_residuals.png" alt="Map of SEM Residuals by Census Tract" />
<figcaption aria-hidden="true">Map of SEM Residuals by Census Tract</figcaption>
</figure>

While it is still possible to make out some spatial patterns in the SEM residuals, they are much less pronounced than in the OLS residuals. This suggests that the SEM accounts for the spatial correlation of the error terms better than the OLS model does.
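For readers who want to see the mechanics, here is a minimal base R sketch of maximum likelihood estimation for a spatial error model on simulated data. It is not the code that produced the tables above. The 7 x 7 grid, the weights, the true parameter values, and the helper names `fit_at()` and `neg_loglik()` are all hypothetical. A packaged routine such as `spatialreg::errorsarlm()` would normally be used instead.

``` r
# Toy SEM fit by concentrated maximum likelihood (base R only).
set.seed(1)

# Hypothetical 7 x 7 grid with row-standardized rook weights.
side <- 7
n <- side^2
coords <- expand.grid(row = 1:side, col = 1:side)
W <- (as.matrix(dist(coords, method = "manhattan")) == 1) * 1
W <- W / rowSums(W)
I_n <- diag(n)

# Simulate y = X beta + u with u = lambda * W u + eps.
lambda_true <- 0.7
X <- cbind(1, rnorm(n))
u <- solve(I_n - lambda_true * W, rnorm(n))
y <- drop(X %*% c(50, 2) + u)

# For a given lambda, filter both sides by (I - lambda W) and run OLS.
fit_at <- function(lambda) {
  A <- I_n - lambda * W
  fit <- lm.fit(A %*% X, drop(A %*% y))
  list(beta = fit$coefficients, eps = fit$residuals)
}

# Negative concentrated log-likelihood (constants dropped):
# log|I - lambda W| - (n / 2) * log(sigma^2), with sigma^2 = mean(eps^2).
neg_loglik <- function(lambda) {
  A <- I_n - lambda * W
  sigma2 <- mean(fit_at(lambda)$eps^2)
  logdet <- as.numeric(determinant(A, logarithm = TRUE)$modulus)
  -(logdet - n / 2 * log(sigma2))
}

# Search lambda over an interval where I - lambda W is invertible.
lambda_hat <- optimize(neg_loglik, interval = c(-0.99, 0.99))$minimum
sem <- fit_at(lambda_hat)

# Compare Moran's I of the raw OLS residuals with the filtered SEM residuals.
moran_i <- function(e, W) {
  e <- e - mean(e)
  (length(e) / sum(W)) * sum(e * drop(W %*% e)) / sum(e^2)
}
ols_resid <- lm.fit(X, y)$residuals
c(lambda_hat = lambda_hat,
  moran_ols = moran_i(ols_resid, W),
  moran_sem = moran_i(sem$eps, W))
```

The residuals checked after fitting are the filtered ones, $(I - \lambda W)(y - X\beta)$, because those are the residuals the model assumes to be free of spatial structure.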

# 4. Discussion

The Moran’s I test for the residuals of the OLS model provides evidence of spatial autocorrelation in the residuals, which means the OLS standard errors are likely biased and, with positive autocorrelation, systematically understate the variance of the regression coefficients. The SEM accounts for the spatial correlation of the error terms and provides more reliable estimates of the significance of the regression coefficients.

Comparing the coefficients and their standard errors between the OLS and SEM models, we see that fewer coefficients are significant in the SEM. Two of those that remain significant relate to age: *pct_age_18_34* is negatively associated with Frey’s vote share, while *pct_age_65_plus* is positively associated with it, suggesting that a major cleavage in support for Frey is age. Because *pct_higher_ed* is included as a control, the negative coefficient on *pct_age_18_34* also suggests that opposition to Frey among young voters is not driven only by students at the University of Minnesota, but extends to young adults who are not students. Frey’s support among older voters perhaps explains his success in three consecutive mayoral elections: older voters are, in general, more likely to vote in mayoral elections and are thus a powerful voting bloc.

The only other significant coefficient in the SEM is *pct_built_post2000*, which is positively associated with Frey’s vote share. One of the most impactful policies adopted under Frey’s tenure has been the Minneapolis 2040 Comprehensive Plan, which opened up nearly the entire city to multifamily housing development and created ambitious affordable housing goals. The YIMBY-NIMBY divide over support for new housing is a common cleavage in urban politics, and it appears that Frey’s support for new housing development may have helped him win over people living in areas that have been most open to new development. Alternatively, people moving into new areas as a result of redevelopment could also be more supportive of him, but the non-significant coefficient of *pct_moved_2021_later* does not provide evidence of this.

The $\lambda$ parameter in the SEM is 0.917, which is statistically significant at the 0.01 level. This indicates strong spatial autocorrelation in the error term, that is, in the part of Frey’s vote share left unexplained by the demographic variables included in the model. In other words, there are geographic patterns in Frey’s vote share that the demographic variables do not capture. Given his long tenure as mayor, it is possible that opinions of Jacob Frey are becoming entrenched and associated more with personal ideology and geographic patterns than with demographic trends.

Future research might consider using a spatial lag model to account for spatial dependence in the dependent variable itself. Because election results were interpolated to census tracts, apparent spatial diffusion in Frey’s vote share could have been created artificially, so I decided not to include a spatial lag model in this analysis. Using only precinct-level data from a single election would be one way to determine whether one precinct’s support for Frey is significantly correlated with support for Frey in its surrounding precincts, after accounting for the demographic trends in the surrounding precincts. This could provide more evidence that support for Frey is becoming a force shaping the political landscape of Minneapolis, rather than simply a product of demographic trends.
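For reference, a spatial lag model in its standard textbook form would place the spatial dependence on the outcome rather than on the error term. The symbol $\rho$ is introduced here for illustration and is not estimated anywhere in this analysis:

$$
\text{Frey Share} = \rho W \,\text{Frey Share} + X\beta + \epsilon
$$

Here $\rho$ measures how strongly a unit’s vote share depends on the weighted average vote share of its neighbors. Interpolating precinct results to tracts averages values across neighboring units, which could inflate $\rho$ without any real diffusion taking place, and this is the concern raised above.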

# 5. Appendix

## 5.1 Predictor Variable Descriptions

| Variable | Description |
|----------------|--------------------------------------------------------|
| pct_white | Percentage of a tract’s population that is non-Hispanic White. |
| pct_bachelors_plus | Percentage of a tract’s population aged 25 or older that has a bachelor’s degree or higher. |
| med_hh_income | Median household income in the tract. |
| pct_higher_ed | Percentage of a tract’s population that is enrolled in higher education (college or higher). |
| pct_english_only | Percentage of households in a tract that speak only English at home. |
| pct_owner_occupied | Percentage of a tract’s housing units that are owner-occupied. |
| pct_rent_burdened | Percentage of a tract’s housing units that are rent burdened (paying more than 30% of income on rent). |
| pct_built_post2000 | Percentage of a tract’s housing units that were built after 2000. |
| pct_multifamily | Percentage of a tract’s housing units that are multifamily (2+ units). |
| pct_moved_2021_later | Percentage of a tract’s population that moved in 2021 or later. |
| pct_same_sex_hh | Percentage of a tract’s households that are same-sex couples. |
| pct_age_18_34 | Percentage of a tract’s population that is 18-34 years old. |
| pct_age_65_plus | Percentage of a tract’s population that is 65 years old or older. |
| unemployment_rate | Unemployment rate in the tract. |