In [None]:
library(lmtest)
library(sandwich)
library(stargazer)  # provides stargazer(), used below to build the regression table

# Builds our base model
final_model1 <- lm(log(crmrte) ~ log(prbarr) + log(prbconv) + log(prbpris), data=clean)
# Heteroskedasticity-robust standard errors (vcovHC() defaults to the HC3 estimator)
se.final_model1 <- sqrt(diag(vcovHC(final_model1)))


# Builds model 2
final_model2 <- lm(log(crmrte) ~ log(prbarr) + log(prbconv) + log(prbpris)
             + density + I(density^2) + log(polpc), data=clean)
se.final_model2 <- sqrt(diag(vcovHC(final_model2)))


# Builds model 3
final_model3 <- lm(log(crmrte) ~ log(prbarr) + log(prbconv) + log(prbpris)
             + density + I(density^2) + log(polpc) + avgsen + taxpc 
             + west + central + urban + pctmin80 + wcon + wtuc + wfir 
             + wser + wmfg + wfed + wsta + wloc + mix + pctymle, data=clean)
se.final_model3 <- sqrt(diag(vcovHC(final_model3)))


# Generates a stargazer table for all three models
stargazer(final_model1, final_model2, final_model3, type="html", omit.stat="f", 
          se = list(se.final_model1, se.final_model2, se.final_model3),
          title="Building Our Base Model",
          star.cutoffs=c(0.05 , 0.01, 0.001))

# Calculates the AIC for each model
paste("AIC (base model):", round(AIC(final_model1), 0))
paste("AIC (model 2):", round(AIC(final_model2), 0))
paste("AIC (model 3):", round(AIC(final_model3), 0))


<table style="text-align:center"><caption><strong>Building Our Base Model</strong></caption>
<tr><td colspan="4" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td colspan="3"><em>Dependent variable:</em></td></tr>
<tr><td></td><td colspan="3" style="border-bottom: 1px solid black"></td></tr>
<tr><td style="text-align:left"></td><td colspan="3">log(crmrte)</td></tr>
<tr><td style="text-align:left"></td><td>(1)</td><td>(2)</td><td>(3)</td></tr>
<tr><td colspan="4" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">log(prbarr)</td><td>-0.724<sup>***</sup></td><td>-0.475<sup>*</sup></td><td>-0.476<sup>***</sup></td></tr>
<tr><td style="text-align:left"></td><td>(0.116)</td><td>(0.201)</td><td>(0.105)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">log(prbconv)</td><td>-0.472<sup>***</sup></td><td>-0.321</td><td>-0.258<sup>*</sup></td></tr>
<tr><td style="text-align:left"></td><td>(0.138)</td><td>(0.169)</td><td>(0.131)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">log(prbpris)</td><td>0.148</td><td>0.043</td><td>-0.045</td></tr>
<tr><td style="text-align:left"></td><td>(0.242)</td><td>(0.232)</td><td>(0.171)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">density</td><td></td><td>0.305<sup>*</sup></td><td>0.362<sup>*</sup></td></tr>
<tr><td style="text-align:left"></td><td></td><td>(0.130)</td><td>(0.141)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">I(density2)</td><td></td><td>-0.024</td><td>-0.031</td></tr>
<tr><td style="text-align:left"></td><td></td><td>(0.020)</td><td>(0.022)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">log(polpc)</td><td></td><td>0.176</td><td>0.214</td></tr>
<tr><td style="text-align:left"></td><td></td><td>(0.270)</td><td>(0.177)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">avgsen</td><td></td><td></td><td>-0.023</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td>(0.016)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">taxpc</td><td></td><td></td><td>0.007</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td>(0.008)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">west</td><td></td><td></td><td>-0.078</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td>(0.131)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">central</td><td></td><td></td><td>-0.101</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td>(0.086)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">urban</td><td></td><td></td><td>-0.190</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td>(0.202)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">pctmin80</td><td></td><td></td><td>0.012<sup>***</sup></td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td>(0.003)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">wcon</td><td></td><td></td><td>0.001</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td>(0.001)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">wtuc</td><td></td><td></td><td>0.0001</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td>(0.001)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">wfir</td><td></td><td></td><td>-0.001</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td>(0.001)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">wser</td><td></td><td></td><td>-0.0005</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td>(0.002)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">wmfg</td><td></td><td></td><td>-0.0002</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td>(0.001)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">wfed</td><td></td><td></td><td>0.002<sup>*</sup></td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td>(0.001)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">wsta</td><td></td><td></td><td>-0.001</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td>(0.001)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">wloc</td><td></td><td></td><td>0.0002</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td>(0.002)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">mix</td><td></td><td></td><td>-0.045</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td>(0.667)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">pctymle</td><td></td><td></td><td>1.939</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td>(2.375)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">Constant</td><td>-4.708<sup>***</sup></td><td>-3.563</td><td>-4.159<sup>*</sup></td></tr>
<tr><td style="text-align:left"></td><td>(0.289)</td><td>(2.090)</td><td>(1.812)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
<tr><td colspan="4" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Observations</td><td>90</td><td>90</td><td>90</td></tr>
<tr><td style="text-align:left">R<sup>2</sup></td><td>0.415</td><td>0.597</td><td>0.846</td></tr>
<tr><td style="text-align:left">Adjusted R<sup>2</sup></td><td>0.394</td><td>0.568</td><td>0.796</td></tr>
<tr><td style="text-align:left">Residual Std. Error</td><td>0.427 (df = 86)</td><td>0.361 (df = 83)</td><td>0.248 (df = 67)</td></tr>
<tr><td colspan="4" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"><em>Note:</em></td><td colspan="3" style="text-align:right"><sup>*</sup>p<0.05; <sup>**</sup>p<0.01; <sup>***</sup>p<0.001</td></tr>
</table>

'AIC (base model): 108'
'AIC (model 2): 80'
'AIC (model 3): 80'

## (3 and 4) Model Building and Regression Analysis

### Model 1: Our Base Model Measures the Elasticity of $crmrte$ with Respect to $prbarr$, $prbconv$, and $prbpris$

We wanted to understand if an increasing probability of incarceration for offenders ($prbpris$) is associated with lower crime rates ($crmrte$); however, to land in prison, offenders must first be arrested ($prbarr$) and convicted of a crime ($prbconv$). Our base model regresses our log-transformed outcome variable $crmrte$ on log transformations of $prbarr$, $prbconv$, and $prbpris$ to account for this natural sequence of events. The coefficients associated with each regressor capture how each regressor varies individually with $log(crmrte)$ when the other two are held constant.   

#### An Elastic Model Is Easier to Interpret and Better Conforms to the CLM
Although the maximum value of $crmrte$ is only ~18 times greater than its minimum value (0.099 vs. 0.0055), we discovered that regressions of $\log{crmrte}$ on $prbarr$ or $prbconv$ individually had better fits ($prbarr$: $R^2 = 0.22$; $prbconv$: $R^2 = 0.20$) than regressions of $crmrte$ on $prbarr$ or $prbconv$ ($prbarr$: $R^2 = 0.15$; $prbconv$: $R^2 = 0.16$). Thus, we chose to employ $\log{crmrte}$ in our base model.

We also chose to log-transform all of our regressors. Doing so sacrifices a slightly better fit ($R^2 = 0.41$ for the log-log model vs. $R^2 = 0.45$ with untransformed regressors), but it improves our model in three ways.

**1) It makes our model easier to interpret.** Although they are presented as probabilities, some values of $prbarr$ and $prbconv$ exceed one. Additionally, scaling issues hinder comparisons between counties; for example, what does it mean to make a 0.1-unit change in a county with $prbconv=0.1$ as compared to one with $prbconv=0.5$?  Employing a log-log model circumvents these complications and lets us compare the elasticity of $crmrte$ with respect to each of our key variables.  

**2) It better fits the zero-mean assumption of the CLM.** The log-level model of $crmrte$ against our regressors has a fitted values vs. residuals curve that noticeably bows upward at either end. The log-log model is flat on the left and bows upward much less on the right (data not shown). 

**3) It lowers the maximum Cook's distance to below 0.5.** County #51 greatly influences the log-level model (Cook's distance $\geq$ 1); the log-log model increases the influence of other points, but in no case to a Cook's distance of more than 0.5. 
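
The second and third checks above can be approximated with base R diagnostics. The sketch below is illustrative only: it runs on simulated data (the data frame `sim`, its coefficients, and the helper names are assumptions, not the county dataset) and shows one way to compare the largest Cook's distance and the residual pattern across a log-level and a log-log specification.

```r
set.seed(1)

# Simulated stand-in for the county data (hypothetical; NOT the real dataset)
n <- 90
sim <- data.frame(prbarr = runif(n, 0.1, 1.1), prbconv = runif(n, 0.1, 2.0))
sim$crmrte <- exp(-4.7 - 0.7 * log(sim$prbarr) - 0.5 * log(sim$prbconv) +
                  rnorm(n, sd = 0.4))

# Log-level and log-log specifications of the same relationship
log_level <- lm(log(crmrte) ~ prbarr + prbconv, data = sim)
log_log   <- lm(log(crmrte) ~ log(prbarr) + log(prbconv), data = sim)

# Largest Cook's distance under each specification
max_cook <- c(log_level = max(cooks.distance(log_level)),
              log_log   = max(cooks.distance(log_log)))
print(round(max_cook, 3))

# Mean residual within the lower, middle, and upper thirds of the fitted values;
# positive means at both ends with a negative middle would mirror an upward bow
thirds <- cut(fitted(log_level),
              breaks = quantile(fitted(log_level), c(0, 1/3, 2/3, 1)),
              include.lowest = TRUE, labels = c("low", "mid", "high"))
print(round(tapply(resid(log_level), thirds, mean), 3))
```

On the real data, `plot(final_model1, which = 1)` would draw the residuals vs. fitted values plot directly.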

Having fully specified our base model, we calculated the least-squares fit below. 

$$ \widehat{\log{crmrte}} = -4.71 - 0.724*\log prbarr - 0.472*\log prbconv + 0.148*\log prbpris $$


#### $prbpris$ Is Not Associated with Crime Rate in Any Practical Way, but $prbarr$ and $prbconv$ Are Negatively Associated

$\hat{\beta}_{\log{prbarr}}$, $\hat{\beta}_{\log{prbconv}}$, and $\hat{\beta}_0$ are statistically significant at the 0.1% level, but $\hat{\beta}_{\log{prbpris}}$ is not, even at the 25% level; thus, our model suggests that changes in incarceration rates are not associated with changes in crime rate. Our model further suggests that 1% relative increases (not absolute increases) in $prbarr$ and $prbconv$ are associated with 0.724% and 0.472% relative reductions in $crmrte$, respectively, when the other two regressors are held fixed. Although it is statistically significant at the 0.1% level, $\hat{\beta}_0$ only indicates the value of $\log{crmrte}$ when $prbarr=prbconv=prbpris=1$ (i.e., when the log of each regressor is zero). This only occurs when authorities make arrests for all reported offenses, convict everyone who is arrested, and incarcerate all convicts. Clearly, this is not a realistic situation across an entire county.
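
Reading a log-log coefficient as "a 1% increase is associated with a $\hat{\beta}$% change" is an approximation that works best for small changes. As a rough check, the sketch below computes the exact percentage change implied by each base-model coefficient; the helper `pct_change` is a hypothetical name introduced here, and the coefficients are the rounded values from column (1) of the table.

```r
# Coefficients copied from column (1) of the regression table
beta <- c(prbarr = -0.724, prbconv = -0.472, prbpris = 0.148)

# Exact percentage change in crmrte implied by a log-log coefficient when one
# regressor rises by `pct_increase` percent and the others are held fixed
pct_change <- function(beta, pct_increase) {
  100 * ((1 + pct_increase / 100)^beta - 1)
}

print(round(pct_change(beta, 1), 3))   # 1% increase in each regressor
print(round(pct_change(beta, 10), 2))  # 10% increase in each regressor
```

For a 1% increase the exact figures are nearly identical to the coefficients themselves; for a 10% increase, simply multiplying the coefficient by 10 overstates the associated reduction somewhat.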

We speculate that $\hat{\beta}_{\log{prbarr}}$ and $\hat{\beta}_{\log{prbconv}}$ describe practically significant associations with crime rate, since comparatively large reductions in crime rate are associated with modest increases in arrest and conviction rates. In contrast, the probability of being incarcerated upon conviction does not appear to be associated with crime rate at any statistically significant level; thus, to a first approximation, the answer to our research question is no.

We cannot claim that these relationships indicate causality; however, they suggest that we should introduce additional regressors to isolate the associations of $\log{crmrte}$ with $\log{prbarr}$ and $\log{prbconv}$. Doing so could provide support for launching controlled pilot programs that are aimed at increasing arrest and conviction rates and would test if any causal relationships exist.

In [None]:
# We showed scatter plots for the log transformations in the EDA above; 
# here we are simply calculating R^2

# Generate bivariate regressions on untransformed regressors
model10 <- lm(log(crmrte) ~ prbarr, data=clean)
model11 <- lm(log(crmrte) ~ prbconv, data=clean)
model12 <- lm(log(crmrte) ~ prbpris, data=clean)

# Generate bivariate regressions on log-transformed regressors
model13 <- lm(log(crmrte) ~ log(prbarr), data=clean)
model14 <- lm(log(crmrte) ~ log(prbconv), data=clean)
model15 <- lm(log(crmrte) ~ log(prbpris), data=clean)

# Report the formula and R^2 of each bivariate regression
for (m in list(model10, model11, model12, model13, model14, model15)) {
  print(paste("Model:", deparse(formula(m)), " | ",
              "R-squared:", round(summary(m)$r.squared, 3)))
}

### Model 2: Adding $density$ and $polpc$ Controls for Confounding Effects and Improves Our Base Model's Fit

Having established our base model, we sought to introduce covariates that might (i) control for confounding effects and (ii) parsimoniously improve the model's fit to the underlying data.

Given that we observed $\log{prbarr}$ is negatively associated with $\log{crmrte}$, we speculated that reduced crime rates might be associated just with having a higher concentration of police officers relative to population size as opposed to the act of arresting people itself. We sought to address this issue by including $\log{polpc}$ as a regressor to partial out the effect of police presence per capita from $prbarr$ (as well as the other variables in our base model). 
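
"Partialling out" has a concrete meaning in least squares: the coefficient on one regressor in a multiple regression equals the coefficient obtained after removing the linear influence of the other regressors from both the outcome and that regressor. The sketch below illustrates this on simulated data; the variable names mirror the text for readability, but the values and coefficients are assumptions and have nothing to do with the county dataset.

```r
set.seed(42)
n <- 90

# Simulated stand-in data (hypothetical; NOT the county dataset)
log_polpc  <- rnorm(n)
log_prbarr <- 0.5 * log_polpc + rnorm(n)
log_crmrte <- -0.5 * log_prbarr + 0.3 * log_polpc + rnorm(n, sd = 0.3)

# Coefficient on log_prbarr from the multiple regression
full <- lm(log_crmrte ~ log_prbarr + log_polpc)

# Same coefficient via partialling out: residualize both variables on log_polpc
y_res   <- resid(lm(log_crmrte ~ log_polpc))
x_res   <- resid(lm(log_prbarr ~ log_polpc))
partial <- lm(y_res ~ x_res)

b_full    <- unname(coef(full)["log_prbarr"])
b_partial <- unname(coef(partial)["x_res"])
print(c(full = b_full, partial = b_partial))
```

The two estimates agree up to floating-point error, which is why adding $\log{polpc}$ to the model is described as removing its influence from the remaining coefficients.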

Counties in North Carolina are spread across a wide (29-fold) range of population densities. (This excludes county 173, which we retained in the dataset but disregarded in calculating the range here.) We speculate that counties with relatively high densities could experience different challenges than counties with low densities; for instance, because of disparities in unemployment, educational attainment, or opportunities to commit crime. In fact, we observed that $density$ has a strong positive correlation with $\log{crmrte}$ and a strong negative correlation with both $\log{prbarr}$ and $\log{prbconv}$. Thus, we included it to reduce the bias that its omission might otherwise impart to $\hat{\beta}_{\log{prbarr}}$ and $\hat{\beta}_{\log{prbconv}}$.


#### We Added a Quadratic Term to $density$ and Log-Transformed $polpc$ to Improve Our Model's Fit

We incorporated $\log{polpc}$ as well as $density$ and $density^2$ as regressors. We previously identified county #51 as an outlier. Log-transforming $polpc$ ameliorates the influence that county #51 has on a regression with $\log{crmrte}$. It also yields a better-fitting regression relative to the untransformed variable ($R^2 = 0.08$ vs. $R^2=0$) and a lower Akaike Information Criterion (AIC) score (145 vs. 152). A scatter plot indicates that $density$ is positively associated with $\log{crmrte}$, but at $density \geq 2$ the strength of this association starts to lessen. We added a quadratic term to account for this, which yielded a better fit than regressing against $density$ by itself ($R^2 = 0.46$ vs. $R^2 = 0.40$) and a better AIC score (100 vs. 106).

Having fully specified our second model, we calculated the least-squares fit below.

$$ \widehat{\log crmrte} = -3.563 - 0.475*\log prbarr - 0.321*\log prbconv + 0.043*\log prbpris + 0.305*density - 0.024*density^2 + 0.176*\log polpc$$

#### $\log{prbarr}$ and $\log{prbconv}$ Still Show Practically Significant Associations with Crime Rate After Accounting for Population Density and Police per Capita

Relative to our base model ($R^2=0.415$ and $AIC=108$), model 2 yields a better fit ($R^2=0.597$) and loses less information ($AIC=80$). The coefficients for our key variables are likely at least somewhat biased in our base model, since adding the controls in model 2 drives their magnitudes closer to zero. Specifically, $|\hat{\beta}_{\log{prbarr}}|$ and $|\hat{\beta}_{\log{prbconv}}|$ are reduced by 34% and 32%, respectively. Furthermore, the statistical significance of each coefficient is reduced: $\hat{\beta}_{\log{prbarr}}$ is only significant at the 5% level, while $\hat{\beta}_{\log{prbconv}}$ is only significant at the 10% level, which could indicate that they are each collinear with some combination of $\log{polpc}$, $density$, or $density^2$. $|\hat{\beta}_{\log{prbpris}}|$ dropped closer to zero and still is not statistically significant. $\hat{\beta}_0$ is no longer statistically significant.
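
As a worked check, the two relative reductions follow directly from the rounded coefficients in columns (1) and (2) of the table:

$$ \frac{0.724 - 0.475}{0.724} = \frac{0.249}{0.724} \approx 0.34, \qquad \frac{0.472 - 0.321}{0.472} = \frac{0.151}{0.472} \approx 0.32 $$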

We speculate that $\hat{\beta}_{\log{prbarr}}$ and $\hat{\beta}_{\log{prbconv}}$ are still probably of practical significance and indicate more actionable associations with crime rate than $\hat{\beta}_{\log{prbpris}}$. Even after accounting for differences in $density$ and $polpc$, modest percentage increases in either variable are still associated with relatively substantial percentage reductions in crime rate (e.g., a 10% increase in $prbarr$ from its original level is associated with a roughly 4.75% reduction in $crmrte$ from its previous level). Additionally, $\hat{\beta}_{\log{prbconv}}$ just misses the threshold of being statistically significant at the 5% level ($\text{p-value}=0.059$). Conversely, $\hat{\beta}_{\log{prbpris}}$ still is not statistically significant and the magnitude of its effect is even smaller in this model, which argues that it is still not practically significant.
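
The significance levels discussed here can be roughly re-derived from the table alone. The sketch below recomputes two-sided p-values from the rounded estimates and robust standard errors in column (2); using a t reference distribution with the reported residual degrees of freedom is an assumption, and because the inputs are rounded to three decimals the results only approximate the p-values from the full-precision fit.

```r
# Estimates and robust standard errors copied from column (2) of the table
est <- c(log_prbarr = -0.475, log_prbconv = -0.321, log_prbpris = 0.043)
se  <- c(log_prbarr =  0.201, log_prbconv =  0.169, log_prbpris = 0.232)
df_resid <- 83  # residual degrees of freedom reported for model 2

# Two-sided p-values from the t distribution (assumed reference distribution)
t_stat <- est / se
p_val  <- 2 * pt(-abs(t_stat), df = df_resid)
print(round(cbind(t_stat, p_val), 3))
```

With these inputs, the p-value for $\log{prbarr}$ falls below 0.05, the one for $\log{prbconv}$ lands between 0.05 and 0.10, and the one for $\log{prbpris}$ is far above any conventional threshold.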

In total, having controlled for the effects of police presence and population density, we find that arrest and conviction rates are still meaningfully associated with crime rate. These results strengthen the notion that there is value in launching pilot programs to test if North Carolina can increase arrests or convictions to reduce crime rates.

#### $\hat{\beta}_{density}$ Is Statistically Significant and Both $density$ Terms Describe Our Data Well

$\hat{\beta}_{density}$ is statistically significant at the 5% level, while $\hat{\beta}_{density^2}$ is not. The two $density$ terms indicate that $\widehat{\log{crmrte}}$ rises in association with $density$ up until $density \approx 6.5$, which fits our data well, given that only three counties have $density > 6$. $\hat{\beta}_{\log{polpc}}$ is not statistically significant; thus, we do not consider its practical significance.
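
The turning point follows from setting the derivative of the fitted quadratic to zero. As a worked check using the rounded coefficients from column (2) of the table:

$$ \frac{\partial\, \widehat{\log crmrte}}{\partial\, density} = 0.305 - 2(0.024)\,density = 0 \quad\Longrightarrow\quad density^{*} = \frac{0.305}{2(0.024)} \approx 6.4 $$

This is close to the value of roughly 6.5 quoted above; the small gap plausibly reflects rounding in the tabulated coefficients.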

In [None]:
# Shows that log(polpc) reduces effect of outlier 
par(mfrow = c(1, 2), cex.main = 0.8, cex.lab = 0.8, cex.axis = 0.8)
options(repr.plot.width=7, repr.plot.height=5)
hist(clean$polpc, main = "Police per Capita", col = rgb(0.4, 0.4, 0.4, 1), ylim = c(0, 30), xlab = "Police per capita",
     breaks = sqrt(nrow(clean)))
hist(log(clean$polpc), main = "log(Police per Capita)", col = rgb(0.4, 0.4, 0.4, 1), ylim = c(0, 30), 
     xlab = "log(Police per capita)", breaks = sqrt(nrow(clean)))

# Regresses log(crmrte) on polpc and log(polpc)
model.p1 <- lm(log(crmrte) ~ polpc, data=clean)
paste("Model:", summary(model.p1)$call[2])
paste("R-squared:", round(summary(model.p1)$r.squared, 2))
paste("Akaike Information Criterion:", round(AIC(model.p1), 0))
paste("*******************************")

model.p2 <- lm(log(crmrte) ~ log(polpc), data=clean)
paste("Model:", summary(model.p2)$call[2])
paste("R-squared:", round(summary(model.p2)$r.squared, 2))
paste("Akaike Information Criterion:", round(AIC(model.p2), 0))
paste("*******************************")

# Plots regression lines for polpc models
plot(clean$polpc, log(clean$crmrte), main="Untransformed polpc", 
     xlab="Police per capita", ylab="log(Crime rate)")
abline(model.p1)
plot(log(clean$polpc), log(clean$crmrte), main="log(polpc)", 
     xlab="log(Police per capita)", ylab="log(Crime rate)")
abline(model.p2)

# Regresses log(crmrte) on density 
model.d1 <- lm(log(crmrte) ~ density, data=clean)
summary(model.d1)
paste("Model:", summary(model.d1)$call[2])
paste("R-squared:", round(summary(model.d1)$r.squared, 2))
paste("Akaike Information Criterion:", round(AIC(model.d1), 0))
paste("*******************************")

# Regresses log(crmrte) on density and density^2
model.d2 <- lm(log(crmrte) ~ density + I(density^2), data=clean)
summary(model.d2)
paste("Model:", summary(model.d2)$call[2])
paste("R-squared:", round(summary(model.d2)$r.squared, 2))
paste("Akaike Information Criterion:", round(AIC(model.d2), 0))
paste("*******************************")

# Plots density regressions against data
# (first panel: raw scatter; second panel: scatter with linear and quadratic fits)
plot(x=clean$density, y=log(clean$crmrte), xlab="Density", ylab="log(Crime rate)")

# Predicts log(crmrte) from the quadratic model over a fine grid of density values
newdata <- data.frame(density=seq(min(clean$density), max(clean$density), 0.01))
newdata$pred1 <- predict(model.d2, newdata)
plot(log(crmrte) ~ density, data=clean, main="Quadratic vs. Linear Regressions on Density", 
     xlab="Density", ylab="log(Crime rate)")
lines(newdata$density, newdata$pred1, col="red")  # quadratic fit
abline(model.d1)                                   # linear fit

### Model 3: $\hat{\beta}_{\log{prbarr}}$ and (to a Lesser Extent) $\hat{\beta}_{\log{prbconv}}$ Exhibit Robustness to Model Specification, but $\hat{\beta}_{\log{prbpris}}$ Does Not

Finally, we tested the robustness of our base model coefficients to perturbations in model specification. To do so, we added untransformed regressors that correspond to every variable remaining in the dataset that we did not specifically address in model 2, with the exception of $year$ and $county$ (see column (3) of the regression table).

#### Model 3 Reinforces the Notion that the Probability of Incarceration Is Not Associated with Crime Rate, but Probabilities of Arrest and Conviction Are

With respect to our motivating question (i.e., Is incarceration associated with crime rate?), comparatively larger reductions in crime are associated with simply arresting suspects and convicting offenders; incarceration, which is more costly, is not associated with higher or lower crime rates. Additionally, the fact that $\hat{\beta}_{\log{prbarr}}$ and $\hat{\beta}_{\log{prbconv}}$ are relatively unchanged in value relative to model 2 argues that they are robust to model specification. In contrast, $\hat{\beta}_{\log{prbpris}}$ appears to be sensitive.

* $\log{prbarr}$: We argued above in model 2 that $\hat{\beta}_{\log{prbarr}}$ is of practical significance; this conclusion also holds in model 3, since $\hat{\beta}_{\log{prbarr}}$ is of even greater statistical significance and has essentially the same value ($-0.476$ vs. $-0.475$).


* $\log{prbconv}$: $\hat{\beta}_{\log{prbconv}}$ is statistically significant at the 5% level in model 3, but its magnitude is somewhat smaller in model 3 (-0.258) than it is in model 2 (-0.321). We argue that this could still represent a practically significant effect, as it implies that a 10% relative increase in convictions is associated with a roughly 2.58% relative reduction in crime rate.


* $\log{prbpris}$: As before, $\hat{\beta}_{\log{prbpris}}$ is not statistically significant, and its value is even closer to zero. These two observations suggest that the probability of incarceration is not associated with crime rate in any meaningful way (i.e., the answer to our research question is no).
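
The robustness comparison in the bullets above can be summarized numerically. The sketch below is a rough illustration that uses only the rounded coefficients from the regression table; the object names `coefs`, `pct_shift`, and `sign_flip` are hypothetical and introduced here for the example.

```r
# Key coefficients copied from columns (1)-(3) of the regression table
coefs <- rbind(
  log_prbarr  = c(m1 = -0.724, m2 = -0.475, m3 = -0.476),
  log_prbconv = c(m1 = -0.472, m2 = -0.321, m3 = -0.258),
  log_prbpris = c(m1 =  0.148, m2 =  0.043, m3 = -0.045)
)

# Percentage shift in each coefficient when moving from model 2 to model 3
pct_shift <- 100 * (coefs[, "m3"] - coefs[, "m2"]) / abs(coefs[, "m2"])
print(round(pct_shift, 1))

# Flags coefficients whose sign differs between the two models
sign_flip <- sign(coefs[, "m3"]) != sign(coefs[, "m2"])
print(sign_flip)
```

Only the $\log{prbpris}$ coefficient changes sign between models 2 and 3, which is consistent with describing it as sensitive to model specification.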

#### $pctmin80$, $density$, and $wfed$ Are Statistically Significant at the 5% Level or Better

* **$pctmin80$:** Holding all other regressors constant, a one-unit increase in $pctmin80$ is associated with a 1.2% increase in $crmrte$; given that the median of $pctmin80$ is 24, this represents a substantial increase in $crmrte$. Thus, researchers may need to control for $pctmin80$ if other factors do not absorb its effects, but it is of limited use in recommending specific policies. Efforts to alter this quantity (e.g., by evicting minorities) are likely unrealistic to undertake, morally repugnant, and legally dubious.


* **$density$:** We previously accounted for $density$ in model 2. Compared to model 2, model 3 captures a slightly steeper rise in $crmrte$ as a function of population density. As with $pctmin80$, researchers may need to control for $density$ if other factors do not absorb its effects, but it is of limited use in the sense that we cannot reasonably recommend policies that test if reducing population density reduces crime in existing areas. However, in the long term, North Carolina might be able to test if changing zoning laws to reduce population density lowers crime rate in new communities.


* **$wfed$:** Holding all other regressors constant, every additional dollar in the average weekly wage of federal employees is associated with a 0.2% increase in $crmrte$; given that the median of $wfed$ is \\$449.8, this represents a substantial increase in $crmrte$. Thus, researchers may need to control for $wfed$ if other factors do not absorb its effects; however, we suspect that it is of limited use in recommending specific policies. For example, an increasing federal wage could simply represent greater federal investment in an area (e.g., to decrease poverty or perhaps even crime rate).
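
Because $pctmin80$ and $wfed$ enter model 3 in levels while the outcome is logged, their coefficients can be read as semi-elasticities. As a worked check under that reading, the exact percentage change in $crmrte$ associated with a change $\Delta x$ in such a regressor is

$$ \%\Delta crmrte = 100\left(e^{\hat{\beta}\,\Delta x} - 1\right) $$

$$ pctmin80:\; 100\left(e^{0.012 \times 1} - 1\right) \approx 1.2\%, \qquad 100\left(e^{0.012 \times 10} - 1\right) \approx 12.7\% $$

$$ wfed:\; 100\left(e^{0.002 \times 1} - 1\right) \approx 0.2\%, \qquad 100\left(e^{0.002 \times 100} - 1\right) \approx 22.1\% $$

The 10-point and 100-dollar differences are illustrative choices, not values taken from the analysis; the coefficients are the rounded entries from column (3) of the table.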

## (6) Conclusions

We found that within this dataset, $prbpris$ does not show any meaningful association with $crmrte$ and that the answer to our research question (Are crime rates associated with the probability of being incarcerated?) is no. In contrast, our analysis indicates that increasing probabilities of arrest and conviction (measured by $prbarr$ and $prbconv$, respectively) are associated with lower crime rates. 

#### We Cannot Claim that Causal Relationships Exist between $crmrte$ and $prbarr$ or $prbconv$ 

Importantly, we cannot claim that any causal effects exist in these relationships. For example, areas with lower crime rates might simply have higher rates of arrests and convictions because lower crime rates facilitate these effects. Stated another way, other factors (e.g., poverty rates, education) could play a more direct role in reducing crime and may enable higher arrest and conviction rates. 

#### $prbpris$ Could Be Associated with Crime Rate in Ways that Are Masked By Our Dataset
THIS COULD DEPEND ON KATIE'S SECTION

#### We Recommend Establishing Pilot Programs to Test the Associations that We Identified

In light of the fact that we cannot define a causal relationship between these variables, we recommend that you establish pilot programs in some counties that attempt to test some of these observations. If retrospective analyses indicate that the programs appear to have reduced crime rates (especially in comparison with similar counties that are not executing a similar pilot), then you can launch these initiatives statewide. We recommend that you campaign on pilot policy prescriptions that are designed to manipulate these variables. Additionally, we recommend implementing programs that test if the probability of prison really is unrelated to crime rate.

**Recommendation #1: Process backlogged evidence and invest in new technology to increase arrest rates.** The results of model 2 suggest that larger police forces in and of themselves may be insufficient to reduce crime rates. Consequently, we advocate for creative measures that enable them to act more efficiently. For example, you could advocate increasing funding to process backlogged rape kits to make more arrests. Similarly, you could push to upgrade analytic capabilities to process other types of DNA samples. 

**Recommendation #2: Invest in professional development to increase conviction rates.** North Carolina could probably add more judges, prosecutors, defense attorneys, and attendant staff to process more cases each year; however, this could be prohibitively expensive. A more viable alternative could revolve around identifying and promulgating best practices (e.g., via mentorship programs) that help judges clear cases from dockets more quickly and prosecutors present cases more effectively. 

**Recommendation #3: Offer alternatives to prison for offenders.** Our analysis suggested that incarceration rates might not be associated with crime rate; we argue that it is worth exploring if offering cheaper alternatives in lieu of prison (e.g., fines, community service, probation) for non-violent offenders has an adverse effect on crime rate.  

**Recommendation #4: Fund additional studies to resolve confounding effects.** DOES NOT DISTINGUISH VIOLENT FROM NON-VIOLENT CRIME. THIS COULD DEPEND ON KATIE'S SECTION.
