# Abstract

TBD

# Introduction

Public perception of the police is incredibly important to police effectiveness and legitimacy but extremely difficult to measure. Public perception offers insight into how well a police department is functioning and may suggest adherence to tenets of procedural justice. Yet, compared to traditional performance metrics, metrics to evaluate public opinion are poorly defined and documented. 

Very little existing research measures real-world public sentiment toward the police in US cities. This project therefore serves first and foremost as a proof of concept that publicly available data can be easily acquired and used to study common issues in policing.

## Clearance rates

Homicide clearance rates, or the share of murders a police department "solves", are a key performance metric for police departments. Chicago has one of the lowest homicide clearance rates in the country: only about 1 in 6 murders leads to an arrest. Moreover, Chicago's clearance rate has steadily declined, from about 40% in 2000 to under 20% in 2017. By comparison, other large police departments have markedly higher clearance rates: over the past decade, Los Angeles has solved 51% of murders and New York has solved 61%.

There are several potential reasons for the low clearance rate in Chicago, some of which suggest that non-traditional metrics of policing like procedural justice or public opinion may be related to traditional metrics. Police officers tend to cite the historically fraught relationship between the public and the police, believing that someone who already views the police negatively because police seem inept may be less likely to cooperate with an investigation; more bluntly, many police officers lament a "no snitch" policy among victimized communities in Chicago. The evidence is mixed: the National Crime Victimization Survey reports that these communities are no less likely to report crimes to the police, but a Cato Institute survey shows a race and education gap in crime reporting. There are also other viable explanations for Chicago's abysmal clearance rate, most notably that Chicago's police force has limited manpower per murder. Chicago has more murders than New York and Los Angeles combined, yet its police department (12,000 officers) is dwarfed by New York's (36,000) and is only slightly larger than Los Angeles' (10,000).

## Public opinion

Public opinion can also help measure procedural justice, or how police officers enforce laws.  Although procedural justice is difficult to measure directly, past research has evaluated procedural justice through the lens of public opinion survey data.   

Procedural justice is necessary for effective policing. A civilian who considers the law enforcement process fair and just is likely to consider any related consequences fair and just, too. Conversely, when civilians perceive a lack of procedural justice, they are more likely to file complaints and to view their police force as illegitimate. For example, one study of New York Police Department Stop, Question, and Frisk stops showed that civilians who believed their stop to be fair were less likely to file a complaint than those who believed their stop was unjust. Finally, a lack of procedural justice in just a few encounters can severely damage public opinion of the police: negative interactions with the police shape citizen perception up to fourteen times more strongly than positive ones (Skogan 2006).

Public perception of the police offers an additional metric to assess police performance. While hard metrics like clearance rates are easy to measure, assessing how the public feels toward the police is far more complex. Indeed, most work that tries to assess public sentiment relies on survey-based or experimental research. More recent work has used the sentiment of tweets to assess public opinion of the police. As a caveat, public perception of the police is complicated and interacts with policing in myriad ways.

## Research goals

The goal of this work is to assess the extent to which Twitter data can be reliably used to evaluate public opinion of the police department. I could not find an existing dataset of police-related tweets to draw on, and very little work exists on classifying tweets as police-related or not. This work therefore primarily explores whether tweets can be reliably categorized as "police relevant, positive", "police relevant, negative", "police relevant, neutral", or "not police relevant" with respect to public sentiment.

Why might such work be important? Thousands of tweets about policing are posted every day in the United States. Identifying which tweets reflect public sentiment (rather than being unrelated tweets that happen to use similar acronyms) provides a foundation for further research. More importantly, understanding the precise sentiment that tweets express at scale can better enable researchers to measure public sentiment toward the police.

More concretely, this is simply the beginning. Once we can reliably predict how a tweet relates to policing, we can begin to assess public perception of the police. More specifically, I'm interested in assessing the extent to which public sentiment reflects traditional metrics of police effectiveness, where effectiveness here is roughly equivalent to clearance.

# Past Work - UPDATE from Amitabh comments

## Using Twitter data to measure public sentiment towards the police

Although there has been limited work using data science techniques to study criminal justice, the Urban Institute used sentiment analysis of police-related tweets to measure how perception of the police changed after the death of Freddie Gray, using the following methods:

- Obtaining the data: researchers used a set of relevant tweets from 2014 and 2015 acquired through Twitter.
- Processing the data: researchers removed mentions, hashtags, links, punctuation, and stop words from all tweets. They also used CoreNLP to tag tweets (e.g., to identify whether "cop" was a noun or a verb in each tweet).
- Learning models: researchers classified over 4,000 tweets manually to identify whether the tweet was positive, negative, neutral, or not applicable to their research for use in training and validation sets. They then used several types of models to predict the sentiment of new tweets and selected a gradient-boosted regression classifier as their model based on its accuracy (63%). 
- Conclusions: researchers then used their newly labeled set of all tweets to assess the shift in public sentiment over time. 
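
The processing step described above can be sketched in a few lines of Python. This is only an illustration of that kind of cleaning, not the Urban Institute's actual code: the regular expressions, the function name `clean_tweet`, and the tiny stop-word list are all assumptions made here, and the CoreNLP tagging step is not shown.

```python
import re
import string
from typing import List

# Hypothetical, deliberately tiny stop-word list (an assumption for illustration).
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on", "for"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
# Translation table that deletes ASCII punctuation (curly quotes etc. are left alone).
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_tweet(text: str) -> List[str]:
    """Return lowercased tokens with links, mentions, hashtags,
    punctuation, and stop words removed."""
    # Strip links first, since URLs can contain '#' and '@' characters.
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = HASHTAG_RE.sub(" ", text)
    text = text.translate(PUNCT_TABLE).lower()
    return [tok for tok in text.split() if tok not in STOP_WORDS]


tokens = clean_tweet("@CPD The cops are great! #chicago https://t.co/abc")
```

In practice a fuller stop-word list (for example, one shipped with an NLP library) would likely replace the small set used here.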

## Using Twitter data to connect public opinion with tweet sentiment

As a more general example, researchers at Carnegie Mellon University determined that public opinion surveys correlate with Twitter sentiment on several key issues. They used Twitter data with two end goals: to identify relevant tweets and to estimate sentiment (positive and negative) about a given topic.

In their work, researchers obtained tweets from 2008 and 2009 using the Twitter API. They then used key words (like "obama" to measure presidential approval) to ensure that their tweets were relevant. A tweet was classified as positive, negative, or both depending on whether it contained positive words, negative words, or both. Finally, to get a stable measure of sentiment, they computed a moving average of sentiment ratios, where the sentiment ratio was defined as the ratio of the number of positive relevant tweets to the number of negative relevant tweets. The moving averages allowed them to smooth otherwise volatile data. They then investigated correlations between the sentiment they measured and traditional public opinion surveys.
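
To make the smoothing step concrete, the sketch below computes a daily sentiment ratio and a trailing moving average over it. It is a minimal illustration rather than the original authors' implementation: the daily counts are made up, the window length is arbitrary, and both the trailing (rather than centered) window and the handling of days with no negative tweets are assumptions.

```python
from collections import deque
from typing import List, Optional, Sequence


def sentiment_ratio(pos: int, neg: int) -> Optional[float]:
    """Positive-to-negative tweet ratio for one day.

    Returns None when there are no negative tweets (an assumption made
    here to avoid dividing by zero)."""
    return pos / neg if neg else None


def moving_average(values: Sequence[Optional[float]], window: int) -> List[Optional[float]]:
    """Trailing mean over the last `window` days, ignoring undefined days."""
    recent = deque(maxlen=window)  # keeps only the most recent `window` values
    smoothed = []
    for value in values:
        recent.append(value)
        defined = [v for v in recent if v is not None]
        smoothed.append(sum(defined) / len(defined) if defined else None)
    return smoothed


# Made-up (positive, negative) relevant-tweet counts for four consecutive days.
daily_counts = [(10, 5), (3, 6), (8, 4), (9, 3)]
ratios = [sentiment_ratio(p, n) for p, n in daily_counts]
smoothed = moving_average(ratios, window=2)
print(ratios)    # [2.0, 0.5, 2.0, 3.0]
print(smoothed)  # [2.0, 1.25, 1.25, 2.5]
```

Note how the dip on the second day (0.5) is pulled toward its neighbors after smoothing, which is the volatility reduction the moving average is meant to provide.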

# Methods 

## Extracting tweets using the Twitter API

Tweets were extracted from the Twitter API using the following criteria:

- Tweets contained at least one of the following search terms: "Chicago Police", "CPD", "chicago police department", "second city cop", "chicago cop"
- Tweets were extracted in two rounds: one in early April, which was dominated by the Jussie Smollett case, and one in mid-May.

Ultimately, tweets in the second set were used for modelling. Each dataset included over 15,000 tweets. 
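
As a rough illustration of the search-term criterion, the sketch below checks already-downloaded tweet text against the five search terms. It does not show the Twitter API query itself and does not claim to reproduce how Twitter's search matches text: the case-insensitive, whole-word matching and the helper names `matches_search_terms` and `filter_tweets` are assumptions made for this example.

```python
import re
from typing import Iterable, List

SEARCH_TERMS = [
    "Chicago Police",
    "CPD",
    "chicago police department",
    "second city cop",
    "chicago cop",
]

# Case-insensitive pattern with word boundaries, so that "CPD" is not matched
# inside a longer token. As written, plurals such as "chicago cops" do NOT match.
TERM_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(term) for term in SEARCH_TERMS) + r")\b",
    re.IGNORECASE,
)


def matches_search_terms(text: str) -> bool:
    """True if the tweet text contains at least one search term."""
    return TERM_RE.search(text) is not None


def filter_tweets(texts: Iterable[str]) -> List[str]:
    """Keep only the tweets that mention a search term."""
    return [text for text in texts if matches_search_terms(text)]


sample = [
    "CPD officers responded quickly last night",
    "I love deep dish pizza",
    "The Chicago police department released a statement",
]
kept = filter_tweets(sample)
```

A check like this is also a reminder of why a relevance classifier is needed: a tweet can contain "CPD" and still have nothing to do with the Chicago Police Department.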

## Preprocessing tweets 

Preprocessing and labeling large amounts of text data posed a non-trivial challenge. Several approaches, as listed below, were attempted in order to efficiently and accurately label the data.

### Hand labeling 
First, I tried to hand label 

### CoreNLP


## Classifying tweets as relevant or irrelevant 

## Feature engineering 

## Modeling 

# Results 

# Selected sources

## Papers

Ekins, Emily. 2016. "Policing in America: Understanding Public Attitudes Toward the Police. Results from a National Survey." SSRN Electronic Journal. doi:10.2139/ssrn.2919449.

Rengifo, A. F., and K. Fowler. 2016. "Stop, Question, and Complain: Citizen Grievances Against the NYPD and the Opacity of Police Stops Across New York City Precincts, 2007-2013." Journal of Urban Health 93 (Suppl 1): 32-41.

O'Connor, Brendan, Ramnath Balasubramanyan, Bryan R. Routledge, and Noah A. Smith. 2010. "From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series." International AAAI Conference on Weblogs and Social Media.

Skogan, Wesley G. 2006. "Asymmetry in the Impact of Encounters With Police." Policing and Society.

Tyler, Tom R. 2004. "Enhancing Police Legitimacy." The Annals of the American Academy of Political and Social Science 593: 84-99.

## News and articles (quick links)

- https://www.washingtonpost.com/graphics/2018/investigations/unsolved-homicide-database/?utm_term=.8da8a801878a&city=indianapolis
- https://chicago.suntimes.com/news/murder-clearance-rate-in-chicago-hit-new-low-in-2017/
- https://www.theatlantic.com/ideas/archive/2018/05/quis-custodiet-ipsos-custodes/560324/
- https://datasmart.ash.harvard.edu/news/article/map-monday-unsolved-homicides