# A Bayesian Behavioral Model of Police Misconduct
### Project Description
We've been hearing the same old song that not all police are bad for decades now, in spite of law enforcement's consistent overapplication of force against BIPOC communities around the nation. That this has continued for literally hundreds of years is an argument against there being just a few bad cops in PDs, and a massive argument in favor of a broken culture inside police departments.

The statistical models below are a direct check against the argument that the problem lies in "just a few bad apples". Anyone with eyes knows that's not the case, but the truly obstinate policy makers out there are going to need one heck of a push to do the right thing--a push that includes taking away any room for argument as to the cause of this ongoing terrorism.

The following is the R code and JAGS scripts for a set of Bayesian Cognitive models analyzing the rate of violent interactions between police officers and communities. Note: the data used here is fabricated. If you intend to use these models legitimately, you'll need to work with your community to collect and use your own data--I'll walk through how the data needs to be structured to seamlessly use these scripts in the first section below.

I've included five different models that people can apply to this question. I'll describe each model below so that you can decide which one fits your situation.

All of these are 100% open for you to use. I drafted these models to be used as a statistical cudgel against obstinate policy makers, and I made sure that they're general enough to work with any data that people might have available. Policy makers are currently defending the continued misallocation of tax dollars to violent PDs with arguments based on sky-high views of national police statistics (I talk about why this is a problem in my description of models 3 and 4). I want to make sure that these arguments are properly debunked so that the real leaders out there right now on the picket lines have one less hurdle to jump over. These models are just tools to help do that.

The beauty of Bayesian modeling is that it allows us to make strong statistical inferences with limited data. Right now, authorities are hiding that data, and Bayesian stats are one way of getting around that using sources of data that we do have access to--the stories within communities. It's worth noting that the model used in a court of law to prove malingering--lying about test results usually in criminal fraud cases--is a Bayesian model, for exactly the reasons I just stated.

If you want to take these scripts, play with them, apply them to other problems, you by all means can do so! If you want any help in implementing these models, feel free to reach out to me at zrosen@uci.edu.

This isn't anywhere near a solution to systemic racism. The only real solution is to force policy change by whatever means are possible, and to push for policy that finally puts BIPOC on an even playing field with white folks. I'm not going to act like these models are more noble than they are. If these can be used as a tool for people out there fighting the good fight, then I'm glad.

### Data Structure
Let's say you want to swap out my made-up data with real data. How would you do that? The models all assume that you have a known number of officers, with a known number of complaints levied against each of them in each of the communities they have patrolled. This is reflected in how the data is structured. To swap in your own data, assign a data frame or matrix to the variable "y" in the R script for the model that you want to use, with one row per officer and one column per community you're interested in. Each cell holds the number of complaints against that officer in that community. So, for example . . .

| Officer #   | Community 1 | Community 2 | Community 3 | . . . | Community C |
|-------------|-------------|-------------|-------------|-------|-------------|
| 001         | 0           | 1           | 5           | . . . | 3           |
| 002         | 2           | 3           | 6           | . . . | 2           |
| 003         | 1           | 1           | 3           | . . . | 2           |
| . . .       | . . .       |  . . .      | . . .       | . . . | . . .       |
| 100         | 4           | 1           | 10          | . . . | 4           |
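
As a rough illustration of that layout, here is a minimal sketch in Python (used only so the example is self-contained; the repository's scripts are R and JAGS). The counts are copied from the first three officers and first three communities of the table above, and the helper name `check_complaint_matrix` is hypothetical, not something the scripts define.

```python
# Rows are officers, columns are communities; each cell is a complaint count.
# Values mirror officers 001-003 and communities 1-3 of the example table.
y = [
    [0, 1, 5],  # officer 001
    [2, 3, 6],  # officer 002
    [1, 1, 3],  # officer 003
]


def check_complaint_matrix(matrix):
    """Return (n_officers, n_communities) after basic sanity checks.

    Hypothetical helper: it only verifies the shape described in the text.
    """
    if not matrix:
        raise ValueError("need at least one officer row")
    n_communities = len(matrix[0])
    for row in matrix:
        if len(row) != n_communities:
            raise ValueError("every officer needs one count per community")
        if any(not isinstance(v, int) or v < 0 for v in row):
            raise ValueError("counts must be non-negative integers")
    return len(matrix), n_communities


n_officers, n_communities = check_complaint_matrix(y)
```

Running a check like this before handing the matrix to a model catches ragged rows or negative counts early, which are easy mistakes to make when assembling community-collected data by hand.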


### Model 1
Model one assumes that each individual officer has a rate of violent interactions in each community they serve, independent of external factors. We use the following prior and likelihood to do so:

$\theta_{c, o} \thicksim uniform(0,1)$

$y_{c, o} \thicksim binomial(\theta_{c, o}, N_{c, o})$

Where theta is the rate of violent altercations, y is the number of complaints recorded in the data, N is the total number of interactions, c stands for the community, and o for the officer in question.

This model will work well if you already know the total number of interactions that an officer has had. Given that we can't honestly be sure of this every time, the next model implements a mechanism to infer the total number of interactions.
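
For intuition about what Model 1 estimates: a uniform(0,1) prior is a Beta(1,1) distribution, and the standard conjugacy result says that after observing y events in N trials the posterior is Beta(y + 1, N - y + 1). The sketch below (Python, not the repository's JAGS code) computes that posterior for a single officer-community cell. The function name and the example values y = 5, N = 20 are hypothetical.

```python
def model1_posterior(y, n):
    """Closed-form posterior for one cell under a uniform prior and binomial likelihood.

    Returns the Beta parameters (a, b) and the posterior mean a / (a + b).
    """
    if not 0 <= y <= n:
        raise ValueError("need 0 <= y <= n")
    a = y + 1          # prior pseudo-count of 1 plus observed violent interactions
    b = n - y + 1      # prior pseudo-count of 1 plus non-violent interactions
    return a, b, a / (a + b)


# Hypothetical cell: 5 complaints out of 20 known interactions.
a, b, mean = model1_posterior(5, 20)
```

The posterior mean, (y + 1) / (N + 2), is pulled slightly toward 0.5 compared with the raw proportion y / N, and that pull shrinks as N grows.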


### Model 2
We add one layer of complexity here and set a prior on the total number of interactions an officer has, rather than supplying that number. Doing this gives us an estimate for both the rate at which officers interact violently with communities and the total number of interactions with a community. The two trade off against each other: the same complaint count can come from a worst-case number of patrols in that community paired with a best-case rate of violent altercations, or from a best-case number of patrols paired with a worst-case rate. We model this using the following priors, where the categorical prior spreads equal probability, 1/50 each, across the candidate values of N:

$\theta_{c,o} \thicksim uniform(0,1)$

$N_{c,o} \thicksim categorical(1/50)$

$y_{c,o} \thicksim binomial(\theta_{c,o}, N_{c,o})$

This model still assumes that each officer has their own unique rate of violent altercations, per each community, and that there is zero outside influence on their actions. The next model starts to assume that, hey, maybe the PD as a whole has an effect on officers' behavior.
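
To see how much the data alone can say about N, theta can be integrated out analytically: under a uniform prior on theta, the probability of y events given N trials is 1 / (N + 1) for any N >= y. The Python sketch below uses that result to compute the posterior over N for a single cell. It assumes the categorical prior covers the integers 1 through 50, which is my reading of `categorical(1/50)` rather than something stated in the text, and the function name is hypothetical.

```python
def model2_posterior_n(y, n_max=50):
    """Posterior over N for one cell, with theta marginalized out.

    Assumes N is uniform on 1..n_max a priori (an assumption about the
    categorical prior) and theta ~ uniform(0, 1).
    """
    if y < 0 or y > n_max:
        raise ValueError("need 0 <= y <= n_max")
    # Marginal likelihood of y given N is 1 / (N + 1) when N >= y, else 0.
    weights = {n: 1.0 / (n + 1) for n in range(max(y, 1), n_max + 1)}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


# Hypothetical cell with 5 complaints.
post = model2_posterior_n(5)
```

The result rules out any N below y and only gently favors smaller N above it, so a single cell pins down N weakly. That is the trade-off described above: many combinations of N and theta explain the same count.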

### Model 3
Full disclosure, I included this model in part to show why aggregating data for the entirety of a PD is a terrible idea, let alone aggregating data for the entirety of the country. The only publicly available data source is a document of aggregate counts of police interactions across the entire nation, with self-reported data not on the complaints levied against officers, but on the number of complaints that resulted in any disciplinary action at all. That is horrifying. This model shows how aggregate data can conceal actual trends that affect communities.

We add a new prior here for the general rate at which the PD influences officers' actions in communities. We assume that an officer's rate of violent altercations is centered on the mean rate of violent altercations for the whole PD, and that each officer can deviate from that mean by some individual amount. Those priors are represented below. Delta is the PD rate of violent altercations, sigma is the standard deviation of an officer's rate around delta (the second argument of the normal is written as a precision, 1/sigma^2, following the JAGS convention), T(0,1) truncates theta to the interval from 0 to 1, N is still the number of total interactions an officer has with a community, and theta is the rate of violent altercations in that community, for the officer in question.

$\delta \thicksim uniform(0,1)$

$\sigma_{c,o} \thicksim uniform(0,1)$

$\theta_{c,o} \thicksim normal(\delta, \frac{1}{\sigma_{c,o}^2})T(0,1)$

$N_{c,o} \thicksim categorical(1/50)$

$y_{c,o} \thicksim binomial(\theta_{c,o}, N_{c,o})$

When you run this model, you'll notice that delta tends towards lower rates. Fun fact, once we split delta up by community, that changes dramatically.
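
One way to get a feel for these priors is to run them forward and generate data from them. The Python sketch below draws one PD-wide delta and then a complaint count for every officer-community cell. It is a simulation of the priors only, not the repository's JAGS inference code. It assumes N ranges over the integers 1 through 50, and the function names are hypothetical.

```python
import random


def truncated_normal(mu, sd, rng, lo=0.0, hi=1.0):
    """Rejection-sample a normal(mu, sd) restricted to [lo, hi]."""
    while True:
        x = rng.gauss(mu, sd)
        if lo <= x <= hi:
            return x


def simulate_model3(n_officers, n_communities, rng, n_max=50):
    """Draw delta once, then (y, N) for every officer-community cell."""
    delta = rng.random()                      # delta ~ uniform(0, 1), shared by all cells
    cells = []
    for _ in range(n_officers):
        row = []
        for _ in range(n_communities):
            sigma = rng.random()              # sigma ~ uniform(0, 1), used as a standard deviation
            theta = truncated_normal(delta, sigma, rng)
            n = rng.randint(1, n_max)         # assumed support 1..n_max for N
            y = sum(rng.random() < theta for _ in range(n))  # binomial(theta, N) draw
            row.append((y, n))
        cells.append(row)
    return delta, cells


delta, cells = simulate_model3(n_officers=4, n_communities=3, rng=random.Random(1))
```

Because a single delta sits behind every cell, any difference between communities in the simulated counts can only come from officer-level noise. Model 3 cannot express that one community is policed more violently than another.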

### Model 4
We know that cultures in PDs drastically affect the behavior of officers. We also know that bias is not directed at all citizens equally. The following model, I think, is the most honest and most accurate model without knowing the exact number of interactions an officer has with a given community. It assumes that the PD does treat communities differently--a known fact at this point--and that this treatment can push an officer's rate of violent altercations either up or down.

If you want validation for whether this model or the previous one is more accurate, compare the two models using the Savage-Dickey method at the actual rate of violent altercation complaints levied at the PD. You're absolutely going to notice that Model 4 outperforms Model 3--it captures that systemic racism exists in the form of specific biases against communities, which increases Model 4's predictive accuracy.
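
For readers who haven't used it, the Savage-Dickey method compares the posterior density with the prior density at one point of interest. The ratio of the two is a Bayes factor for fixing the parameter at that point. The Python sketch below shows the idea in the simplest possible setting, a single rate with a uniform prior and known N, where the posterior is a Beta distribution. It is not the Model 3 versus Model 4 comparison itself, which would need posterior samples from JAGS. The function names and example numbers are hypothetical.

```python
import math


def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x, for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def savage_dickey_bf01(y, n, theta0):
    """Bayes factor for theta == theta0 versus theta ~ uniform(0, 1).

    Posterior under the uniform prior is Beta(y + 1, n - y + 1); the
    uniform prior density is 1 everywhere on (0, 1).
    """
    posterior_density = beta_pdf(theta0, y + 1, n - y + 1)
    prior_density = 1.0
    return posterior_density / prior_density


# Hypothetical cell: 5 complaints in 20 interactions.
bf_near = savage_dickey_bf01(5, 20, theta0=0.25)   # point close to the observed rate
bf_far = savage_dickey_bf01(5, 20, theta0=0.90)    # point far from the observed rate
```

A ratio above 1 means the data made that value of theta more plausible than it was a priori, and a ratio below 1 means the data counted against it.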

We use the following priors to capture this difference. Theta is still the rate of violent altercations, per officer per community, but we now assume that delta can vary from community to community. Officers can vary from the PD-level rate by some standard deviation, so their rates of altercation are normally distributed around the PD's mean for each community, either above or below it (this actually checks back the "good cop" argument some people keep trying to kick around, and we can talk about how via email if you'd like).

$\delta_{c} \thicksim uniform(0,1)$

$\sigma_{c,o} \thicksim uniform(0,1)$

$\theta_{c,o} \thicksim normal(\delta_c, \frac{1}{\sigma_{c,o}^2})T(0,1)$

$N_{c,o} \thicksim categorical(1/50)$

$y_{c,o} \thicksim binomial(\theta_{c,o}, N_{c,o})$
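
To illustrate why a per-community delta matters, the Python sketch below simulates data from a simplified version of these priors and compares per-community rates with the pooled, PD-wide rate. Everything numeric in it is hypothetical: the three delta values are made up for illustration, sigma is fixed at a single value instead of being drawn per cell, and N is assumed to range over 1 through 50. It is a forward simulation, not the repository's JAGS code.

```python
import random


def truncated_normal(mu, sd, rng, lo=0.0, hi=1.0):
    """Rejection-sample a normal(mu, sd) restricted to [lo, hi]."""
    while True:
        x = rng.gauss(mu, sd)
        if lo <= x <= hi:
            return x


def simulate_model4(deltas, n_officers, sd, n_max, rng):
    """Return one (violent, total) pair per community, summed over officers."""
    totals = []
    for delta_c in deltas:
        violent = total = 0
        for _ in range(n_officers):
            theta = truncated_normal(delta_c, sd, rng)   # officer varies around delta_c
            n = rng.randint(1, n_max)                    # assumed support 1..n_max for N
            violent += sum(rng.random() < theta for _ in range(n))
            total += n
        totals.append((violent, total))
    return totals


rng = random.Random(0)
deltas = [0.05, 0.05, 0.40]   # hypothetical community-level rates
totals = simulate_model4(deltas, n_officers=100, sd=0.05, n_max=50, rng=rng)
per_community = [v / t for v, t in totals]
pooled = sum(v for v, _ in totals) / sum(t for _, t in totals)
```

The pooled rate is a weighted average of the community rates, so it always lands between the lowest and highest of them. A community with a much higher rate gets diluted by the others, which is the concealment that aggregate statistics produce.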

### Model 5
Model 5 is an exact re-hashing of model 4, but it includes knowledge of the total number of interactions N with the community that an officer has. Again, that N is really unlikely to be found in any formal record, and it does not significantly change results for delta--the community level rate of violent altercations for the whole PD--which is really the main point of interest for this model. We care deeply about the effect of policing on the community, and so delta is our compass here.

$\delta_{c} \thicksim uniform(0,1)$

$\sigma_{c,o} \thicksim uniform(0,1)$

$\theta_{c,o} \thicksim normal(\delta_c, \frac{1}{\sigma_{c,o}^2})T(0,1)$

$y_{c,o} \thicksim binomial(\theta_{c,o}, N_{c,o})$