# Oakland Crime Rate 2016-2022 (Part 1)

Leqi Zhong

Programming in Journalism, Winter 2023



## Introduction

### The Project

Oakland, California, is often described as one of the most dangerous cities in the U.S. Since the pandemic in particular, residents have complained of declining safety, attributing a rise in crime to a stagnant economy and inflation. Some neighborhoods, such as Oakland's Chinatown, have seen a rise in crime targeting the Asian and Pacific Islander community at a time when the Oakland Police Department lacked the budget to deploy patrol teams. In response, the community organized volunteer patrols. This project uses crime statistics from the Oakland Police Department, restricted to the full years 2016 through 2022, to analyze policing trends in Oakland and the impact of the pandemic on crime rates.

### Data Source

City of Oakland, CrimeWatch data:
https://data.oaklandca.gov/Public-Safety/CrimeWatch-Data/ppgh-7dqv/data

Oakland Police Beats Map: 
https://oakgis.maps.arcgis.com/apps/OnePane/basicviewer/index.html?appid=12ae8a087be44043abc6996c5e499d5c

### Findings

1. In terms of total cases, the number of reported crimes has increased in the last two years (the lower count in 2020 may reflect the fact that most people were homebound). In particular, 2022 has the highest total number of crimes in six years and the largest increase.

2. Regarding the timing of crime, no correlation was found between crime and month or season.

3. Auto-related crime is the most common category. Auto burglary is the most common type, followed by stolen vehicle and vandalism. Petty theft, misdemeanor assault, domestic violence, and robbery are also among the more common types.

4. The most dangerous areas in Oakland are Downtown Oakland (04X), the west side of Lake Merritt (08X), and Fruitvale (19X).

5. 2020 is the only year in which stolen vehicle cases outnumbered auto burglary cases, perhaps because people stayed home for long periods and went out less.

6. The number of auto burglary and stolen vehicle cases is rising sharply, from about 15,000 in 2016 to nearly 20,000 in 2022.

7. In beat 03X, where Chinatown is located, the number of cases rose from 2020 to 2022, but the totals for those three years are still significantly lower than pre-pandemic totals. By case count, the security situation is better than before the pandemic.

8. For Chinatown, the top crime types are auto burglary, vandalism, misdemeanor assault, and robbery. Because the latter two occur more frequently in Chinatown than in Oakland's overall crime type trend, this may be why people feel that "Chinatown is not safe."
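The comparisons behind findings like these can be sketched in a few lines of pandas. The example below is only an illustration on made-up records, not the analysis that produced the findings: the toy rows are hypothetical, and only the column names (`date_new`, `CrimeType`, `PoliceBeat`) are taken from the dataset.

```python
import pandas as pd

# Hypothetical toy records; only the column names come from the real dataset.
cases = pd.DataFrame({
    "date_new": pd.to_datetime([
        "2019-03-01", "2019-07-15", "2020-02-10",
        "2021-05-05", "2021-09-09", "2021-12-24",
    ]),
    "CrimeType": ["BURG - AUTO", "ROBBERY", "BURG - AUTO",
                  "VANDALISM", "ROBBERY", "BURG - AUTO"],
    "PoliceBeat": ["04X", "03X", "08X", "03X", "03X", "19X"],
})

# Cases per calendar year, in chronological order.
per_year = cases["date_new"].dt.year.value_counts().sort_index()

# Year-over-year change in the number of cases (first year has no baseline).
yearly_change = per_year.diff()

# Share of each crime type citywide and within a single beat (03X here).
citywide_share = cases["CrimeType"].value_counts(normalize=True)
beat_share = cases.loc[cases["PoliceBeat"] == "03X", "CrimeType"].value_counts(normalize=True)

print(per_year)
print(yearly_change)
print(beat_share - citywide_share.reindex(beat_share.index))
```

Comparing shares rather than raw counts is one way to check whether a crime type is over-represented in a beat relative to the city as a whole.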

### Limitations

1. The police department uses the Uniform Crime Reporting (UCR) model to record crimes. Its hierarchy rule classifies each incident into a single category, even if the incident involves multiple crimes. Therefore, the numbers cannot fully reflect residents' safety concerns.

2. Due to errors and limitations in the data, about 5% of the cases do not specify the exact crime type. In addition, the classification of crime types is not always clear.

3. Some records are missing PoliceBeat or location data. These account for only 0.4% of the total, so I believe they do not affect the big picture.
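Percentages like the ones quoted above can be checked directly from the DataFrame. The following is a minimal sketch on hypothetical rows (the values are invented, the column names match the dataset) showing one way to measure the share of missing values per column and the share of rows missing either a beat or a location.

```python
import pandas as pd

# Hypothetical sample rows; None marks a missing value.
records = pd.DataFrame({
    "CrimeType": ["VANDALISM", None, "BURG - AUTO", "ROBBERY"],
    "PoliceBeat": ["04X", "77X", None, "03X"],
    "Location": [None, "POINT (-122.2609 37.84867)", None,
                 "POINT (-122.27307 37.80508)"],
})

# Percentage of missing values in each column.
missing_pct = records.isna().mean().mul(100).round(1)

# Percentage of rows missing PoliceBeat or Location (or both).
rows_missing_place_pct = (
    records[["PoliceBeat", "Location"]].isna().any(axis=1).mean() * 100
)

print(missing_pct)
print(rows_missing_place_pct)
```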

## Importing Tools and Data

In [3]:
import pandas as pd
import requests

In [4]:
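# Download the full CrimeWatch export and save it locally, then show its size in bytes.
url = 'https://data.oaklandca.gov/api/views/ppgh-7dqv/rows.csv?accessType=DOWNLOAD'
r = requests.get(url, allow_redirects=False)
with open('CrimeWatch_Data.csv', 'wb') as f:
    f.write(r.content)
len(r.content)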

138473055
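The file is roughly 138 MB, and the cell above holds the whole response in memory before writing it. As an alternative sketch, the download can be streamed to disk in chunks. This version deliberately swaps in the standard library's `urllib.request` in place of `requests`; the helper name `download_to_file` and its parameters are hypothetical.

```python
import shutil
import urllib.request
from pathlib import Path


def download_to_file(url: str, destination: str, chunk_size: int = 1024 * 1024) -> int:
    """Stream the resource at `url` into `destination`; return the bytes on disk."""
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out_file:
        # Copy in fixed-size chunks so the full payload never sits in memory.
        shutil.copyfileobj(response, out_file, chunk_size)
    return Path(destination).stat().st_size


# Example call (not executed here, since it needs network access):
# download_to_file(
#     "https://data.oaklandca.gov/api/views/ppgh-7dqv/rows.csv?accessType=DOWNLOAD",
#     "CrimeWatch_Data.csv",
# )
```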

In [5]:
crime_watch = pd.read_csv('CrimeWatch_Data.csv')

In [6]:
crime_watch.head()

| | CrimeType | DateTime | CaseNumber | Description | PoliceBeat | Address | City | State | Location |
|---|---|---|---|---|---|---|---|---|---|
| 0 | VANDALISM | 10/02/2021 10:30:00 PM | 21-916959 | VANDALISM | 04X | 17TH ST | Oakland | CA | |
| 1 | BURG - AUTO | 10/11/2021 12:00:00 AM | 21-917387 | BURGLARY-AUTO | 77X | TELEGRAPH AVE | Oakland | CA | |
| 2 | BURG - AUTO | 11/09/2021 07:45:00 PM | 21-919287 | BURGLARY-AUTO | 77X | COLLEGE AVE | Oakland | CA | |
| 3 | BURG - AUTO | 11/10/2021 07:00:00 PM | 21-919396 | BURGLARY-AUTO | 77X | 48TH ST | Oakland | CA | |
| 4 | GRAND THEFT | 11/15/2021 04:00:00 PM | 21-919633 | GRAND THEFT | 77X | PALISADE | Oakland | CA | |


## Data Cleaning

### Datetime

In [7]:
crime_watch.DateTime[0]

'10/02/2021 10:30:00 PM'

In [8]:
crime_watch[crime_watch.DateTime.isnull()]

| | CrimeType | DateTime | CaseNumber | Description | PoliceBeat | Address | City | State | Location |
|---|---|---|---|---|---|---|---|---|---|
| 450 | | | 08-055152 | BURGLARY-AUTO | 77X | UNKNOWN | Oakland | CA | |
| 477 | | | 08-065205 | GRAND THEFT:MISCELLANEOUS (AMENDED) | 11X | 500 63RD ST | Oakland | CA | POINT (-122.2609 37.84867) |
| 490 | | | 08-070104 | THEFT | 77X | UNKNOWN | Oakland | CA | |
| 1351 | | | 13-035215 | BURGLARY-AUTO | | | Oakland | CA | POINT (-122.27307 37.80508) |
| 4502 | | | 07-011308 | GRAND THEFT | 04X | 1400 FRANKLIN ST | Oakland | CA | POINT (-122.26986 37.80401) |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1070662 | | | 06-095203 | ANNOYING TELEPHONE CALL:OBSCENE/THREATENING | | | Oakland | CA | POINT (-122.27307 37.80508) |
| 1073880 | | | 06-101255 | DISTURB THE PEACE | | | Oakland | CA | POINT (-122.27307 37.80508) |
| 1074496 | | | 06-068176 | BURGLARY-AUTO | | | Oakland | CA | POINT (-122.27307 37.80508) |
| 1074871 | | | 06-074887 | DISTURB THE PEACE | | | Oakland | CA | POINT (-122.27307 37.80508) |


In [6]:
from datetime import datetime
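def convert_date(row):
    """Return the date part of row.DateTime as a datetime, or None if it cannot be parsed."""
    # DateTime looks like "06/05/2007 09:30:00 PM".
    try:
        # Split on whitespace and keep the first item, which should be month/day/year.
        date_str = row.DateTime.split(' ')[0]
        return datetime.strptime(date_str, "%m/%d/%Y")
    except (AttributeError, ValueError):
        # AttributeError: DateTime is missing (NaN is a float and has no .split).
        # ValueError: the text does not match month/day/year.
        return None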
    
crime_watch['date_new'] = crime_watch.apply(convert_date,  axis=1)

crime_watch.date_new.notnull()
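Applying a Python function row by row over more than a million records is slow. A vectorized alternative, sketched below on a tiny hypothetical Series, is `pd.to_datetime` with an explicit format. Note that `%I` (12-hour clock) is the directive that pairs with `%p`; with `%H` the AM/PM marker does not affect the parsed hour. This is not the method used in this notebook, and it yields a `datetime64` column rather than Python `datetime` objects, so values outside the range pandas can represent may end up as missing depending on the pandas version.

```python
import pandas as pd

# Hypothetical sample values in the same layout as the DateTime column.
raw = pd.Series(["10/02/2021 10:30:00 PM", None, "not a date"])

# errors="coerce" turns anything unparseable into NaT instead of raising.
parsed = pd.to_datetime(raw, format="%m/%d/%Y %I:%M:%S %p", errors="coerce")

# Drop the time of day, keeping midnight of each date.
date_only = parsed.dt.normalize()

print(parsed)
print(date_only)
```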

In [7]:
non_null_row_filter = crime_watch.date_new.notnull()

In [8]:
clean_crime = crime_watch[non_null_row_filter]

In [9]:
clean_crime.groupby("date_new")

<pandas.core.groupby.generic.DataFrameGroupBy object at 0x7fdc07106a00>

### Data Quality Check

In [10]:
clean_crime.date_new.min()

datetime.datetime(1950, 1, 4, 0, 0)

In [11]:
clean_crime.date_new.max()

datetime.datetime(3013, 7, 4, 0, 0)
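A minimum of 1950 and a maximum of 3013 suggest data-entry errors. One way to see how many records are affected is to count records by year outside a plausible window. The sketch below uses a toy Series of Python `datetime` objects, as in this notebook; the window bounds (2000 to 2023) are an assumption chosen for illustration only.

```python
from datetime import datetime

import pandas as pd

# Toy stand-in for clean_crime.date_new; dtype=object mirrors the notebook,
# where the column holds Python datetime objects.
dates = pd.Series(
    [datetime(1950, 1, 4), datetime(2016, 3, 1),
     datetime(2021, 10, 2), datetime(3013, 7, 4)],
    dtype=object,
)

# Extract the year of each record.
years = dates.map(lambda d: d.year)

# Hypothetical plausibility window (inclusive on both ends).
plausible = years.between(2000, 2023)

# Number of records per implausible year.
suspicious_years = years[~plausible].value_counts().sort_index()
print(suspicious_years)
```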

### Select the Time Scope

In [12]:
end_date = clean_crime.date_new < datetime(2023, 1, 1)

In [13]:
start_date = clean_crime.date_new > datetime(2015, 12, 31)

In [14]:
crime_16_22 = clean_crime[start_date & end_date]
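The two boolean masks above select every record from January 1, 2016 through December 31, 2022. The same selection can be written as a half-open interval, start inclusive and end exclusive, which stays correct even if the dates carry a time of day. This is a minimal sketch on hypothetical dates:

```python
from datetime import datetime

import pandas as pd

# Hypothetical dates on both sides of each boundary.
dates = pd.Series([
    datetime(2015, 12, 31), datetime(2016, 1, 1),
    datetime(2022, 12, 31), datetime(2023, 1, 1),
])

start = datetime(2016, 1, 1)  # first day included
end = datetime(2023, 1, 1)    # first day excluded

# Element-wise comparisons produce boolean masks; & combines them.
in_scope = (dates >= start) & (dates < end)
selected = dates[in_scope]
print(selected)
```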

In [15]:
crime_16_22.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 430543 entries, 0 to 1076702
Data columns (total 10 columns):
 #   Column       Non-Null Count   Dtype 
---  ------       --------------   ----- 
 0   CrimeType    426734 non-null  object
 1   DateTime     430543 non-null  object
 2   CaseNumber   430543 non-null  object
 3   Description  430542 non-null  object
 4   PoliceBeat   429962 non-null  object
 5   Address      430414 non-null  object
 6   City         430543 non-null  object
 7   State        430543 non-null  object
 8   Location     419055 non-null  object
 9   date_new     430543 non-null  object
dtypes: object(10)
memory usage: 36.1+ MB


In [16]:
crime_16_22.date_new.min()

datetime.datetime(2016, 1, 1, 0, 0)

In [17]:
crime_16_22.date_new.max()

datetime.datetime(2022, 12, 31, 0, 0)

### Export Clean Dataframe

In [18]:
crime_16_22.to_csv('crime_16_22.csv')
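When the exported file is read back, the dates come in as plain text unless they are parsed again, and the saved index becomes an extra unnamed column. The sketch below shows one possible round trip on a toy DataFrame, using an in-memory buffer in place of a file. Passing `index=False` is a choice made for this example and differs from the export above.

```python
import io

import pandas as pd

# Toy frame standing in for crime_16_22.
df = pd.DataFrame({
    "CrimeType": ["VANDALISM", "BURG - AUTO"],
    "date_new": pd.to_datetime(["2021-10-02", "2021-10-11"]),
})

# Write without the index so no extra unnamed column appears on reload.
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Read back, asking pandas to parse date_new as datetimes again.
buffer.seek(0)
restored = pd.read_csv(buffer, parse_dates=["date_new"])
print(restored.dtypes)
```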

