# Analyzing hate crime trends for Austin against the USA as a whole, 2017 - Present

## Data Wrangling & Cleaning

I've been working on this project, off and on, since about January 2020. It is one-half practice and one-half an attempt to help make sense of the chaos that is our world right now. My aim is to analyze hate crime trends for Austin, TX against the USA as a whole from 2017 to the present, with a particular focus on the LGBT community.

I am using data provided by Austin PD in this notebook, and in the next two or three notebooks as well. For now, I am focusing solely on data for Austin; I will get into broader data for the USA later.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

aus_17 = pd.read_csv('https://data.austintexas.gov/resource/79qh-wdpx.csv')
display(aus_17.head())
print('----------------------------------')
display(aus_17.dtypes)

Unnamed: 0,month,incident_number,date_of_incident_day_of_week,number_of_vitims_under_18,number_of_victims_over_18,number_of_offenders_under_18,number_of_offenders_over_18,race_or_ethnic_of_offender,offense,offense_location,bias,victim_type
0,January,2017-241137,01/01/2017/Sun,0,1,0,1,White/Not Hispanic,Aggravated Assault,Park/Playground,Anti-Black or African American,Individual
1,February,2017-580344,02/01/2017/Wed,0,1,0,1,Black or African American/Not Hispanic,Aggravated Assault,Highway/Road/Alley/Street/Sidewalk,Anti-White,Individual
2,March,2017-800291,03/21/2017/Tues,0,0,0,0,Unknown,Destruction,Highway/Road/Alley/Street/Sidewalk,Anti-Jewish,Other
3,April,2017-1021534,04/12/2017/Wed,0,0,0,0,White/Unknown,Simple Assault,Air/Bus/Train Terminal,Anti-Jewish,Individual
4,May,2017-1351550,05/15/2017/Mon,1,0,1,2,White/Not Hispanic,Simple Assault,Residence/Home,Anti-Gay (Male),Individual


----------------------------------


month                           object
incident_number                 object
date_of_incident_day_of_week    object
number_of_vitims_under_18        int64
number_of_victims_over_18        int64
number_of_offenders_under_18     int64
number_of_offenders_over_18      int64
race_or_ethnic_of_offender      object
offense                         object
offense_location                object
bias                            object
victim_type                     object
dtype: object

I dislike the Socrata method because it imports the data in every column as objects. Reading the CSV directly from the URL keeps the numeric column types intact (note the int64 count columns in the dtypes above), which will make my job much easier down the road.
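If the data does arrive with every column typed as object, the count columns can still be converted afterwards. Here is a minimal sketch of that idea. It uses a small made-up frame, not a real API response, and it assumes the count columns can be recognized by the `number_of_` prefix seen in the column list above.

```python
import pandas as pd

# Toy records standing in for an all-strings import (values are illustrative only)
raw = pd.DataFrame({
    "month": ["January", "February"],
    "number_of_victims_over_18": ["1", "1"],
    "number_of_offenders_over_18": ["1", "2"],
})

# Assumption: every count column starts with "number_of_"
count_cols = [c for c in raw.columns if c.startswith("number_of_")]

typed = raw.copy()
for col in count_cols:
    # errors="coerce" turns unparseable values into NaN instead of raising
    typed[col] = pd.to_numeric(typed[col], errors="coerce")

print(typed.dtypes)
```

Coercing to NaN hides bad values rather than surfacing them, so it is worth checking `typed[count_cols].isnull().sum()` after the conversion.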

### First glance...
As I stated previously, my goal is to analyze trends over time, and in particular how hate crime affects the LGBT community. Most of these columns will be unnecessary for that purpose, so I expect to remove most of them.
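As a rough sketch of what that narrowing could look like, the snippet below keeps a few columns and filters on the `bias` label. The toy rows mirror values from the 2017 preview above. The keyword list is hypothetical and would need to be checked against the full set of bias labels in the real data.

```python
import pandas as pd

# Toy rows mirroring `bias` values shown in the 2017 preview
sample = pd.DataFrame({
    "month": ["January", "March", "May"],
    "offense": ["Aggravated Assault", "Destruction", "Simple Assault"],
    "bias": ["Anti-Black or African American", "Anti-Jewish", "Anti-Gay (Male)"],
})

# Hypothetical keyword list; verify against the actual labels before relying on it
lgbt_keywords = ["gay", "lesbian", "bisexual", "transgender", "lgbt"]
pattern = "|".join(lgbt_keywords)

# Case-insensitive substring match; na=False treats missing labels as non-matches
is_lgbt = sample["bias"].str.contains(pattern, case=False, na=False)
lgbt_only = sample.loc[is_lgbt, ["month", "offense", "bias"]]

print(lgbt_only)
```

A substring match is deliberately loose here, so a quick look at `sample["bias"].unique()` on the real data is a sensible check before trusting the filter.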

In [2]:
# Load the datasets for 2018, 2019, and 2020 (the current year at the time of writing)
aus_18 = pd.read_csv('https://data.austintexas.gov/resource/idj2-d9th.csv')
aus_19 = pd.read_csv('https://data.austintexas.gov/resource/e3qf-htd9.csv')
aus_20 = pd.read_csv('https://data.austintexas.gov/resource/vc9m-ha4y.csv')

In [3]:
# Concatenating the datasets
aus_final = pd.concat([aus_17, aus_18, aus_19, aus_20], sort=False, axis=0)
display(aus_final.head())
print('----------------------------------')
display(aus_final.tail())
print('----------------------------------')
display(aus_final.describe())
print('----------------------------------')
display(aus_final.info())
print("-------------------------------")
display(aus_final.isnull().sum())

Unnamed: 0,month,incident_number,date_of_incident_day_of_week,number_of_vitims_under_18,number_of_victims_over_18,number_of_offenders_under_18,number_of_offenders_over_18,race_or_ethnic_of_offender,offense,offense_location,...,victim_type,race_ethnic_of_offender_s,offense_s,date_of_incident,day_of_week,number_of_victims_under_18,number_of_offenders_under,number_of_offenders_over,race_ethnicity_of_offenders,notes
0,January,2017-241137,01/01/2017/Sun,0.0,1,0.0,1.0,White/Not Hispanic,Aggravated Assault,Park/Playground,...,Individual,,,,,,,,,
1,February,2017-580344,02/01/2017/Wed,0.0,1,0.0,1.0,Black or African American/Not Hispanic,Aggravated Assault,Highway/Road/Alley/Street/Sidewalk,...,Individual,,,,,,,,,
2,March,2017-800291,03/21/2017/Tues,0.0,0,0.0,0.0,Unknown,Destruction,Highway/Road/Alley/Street/Sidewalk,...,Other,,,,,,,,,
3,April,2017-1021534,04/12/2017/Wed,0.0,0,0.0,0.0,White/Unknown,Simple Assault,Air/Bus/Train Terminal,...,Individual,,,,,,,,,
4,May,2017-1351550,05/15/2017/Mon,1.0,0,1.0,2.0,White/Not Hispanic,Simple Assault,Residence/Home,...,Individual,,,,,,,,,


----------------------------------


Unnamed: 0,month,incident_number,date_of_incident_day_of_week,number_of_vitims_under_18,number_of_victims_over_18,number_of_offenders_under_18,number_of_offenders_over_18,race_or_ethnic_of_offender,offense,offense_location,...,victim_type,race_ethnic_of_offender_s,offense_s,date_of_incident,day_of_week,number_of_victims_under_18,number_of_offenders_under,number_of_offenders_over,race_ethnicity_of_offenders,notes
2,March,2020-5011788,,,1,,,,,Residence/Home,...,,,Criminal Mischief,2020-03-22T00:00:00.000,Sunday,0.0,0.0,0.0,Unknown,
3,April,2020-5015689,,,1,,,,,Church/Synagogue/Temple/Mosque,...,,,Criminal Mischief,2020-04-20T00:00:00.000,Monday,0.0,0.0,0.0,Unknown,
4,April,2020-5016804,,,1,,,,,Department/Discount Store,...,,,Assault by Threat,2020-04-29T00:00:00.000,Wednesday,0.0,0.0,1.0,Black/Non-Hispanic,
5,May,2020-1381131,,,1,,,,,Convenience Store,...,,,Assault by Contact,2020-05-17T00:00:00.000,Sunday,0.0,0.0,1.0,White/Non-Hispanic,
6,May,2020-1410411,,,1,,,,,Streets/Highway/Road/Alley,...,,,Assault with Injury,2020-05-20T00:00:00.000,Wednesday,0.0,0.0,1.0,White/Non-Hispanic,


----------------------------------


Unnamed: 0,number_of_vitims_under_18,number_of_victims_over_18,number_of_offenders_under_18,number_of_offenders_over_18,number_of_victims_under_18,number_of_offenders_under,number_of_offenders_over
count,36.0,55.0,36.0,36.0,19.0,19.0,19.0
mean,0.055556,0.909091,0.111111,1.0,0.052632,0.157895,1.105263
std,0.232311,0.397805,0.39841,0.676123,0.229416,0.688247,1.04853
min,0.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,0.0,1.0,0.0,1.0,0.0,0.0,0.5
50%,0.0,1.0,0.0,1.0,0.0,0.0,1.0
75%,0.0,1.0,0.0,1.0,0.0,0.0,1.0
max,1.0,2.0,2.0,4.0,1.0,3.0,4.0


----------------------------------
<class 'pandas.core.frame.DataFrame'>
Int64Index: 55 entries, 0 to 6
Data columns (total 21 columns):
month                           55 non-null object
incident_number                 55 non-null object
date_of_incident_day_of_week    36 non-null object
number_of_vitims_under_18       36 non-null float64
number_of_victims_over_18       55 non-null int64
number_of_offenders_under_18    36 non-null float64
number_of_offenders_over_18     36 non-null float64
race_or_ethnic_of_offender      17 non-null object
offense                         17 non-null object
offense_location                55 non-null object
bias                            55 non-null object
victim_type                     36 non-null object
race_ethnic_of_offender_s       19 non-null object
offense_s                       38 non-null object
date_of_incident                19 non-null object
day_of_week                     19 non-null object
number_of_victims_under_18      19 non-null f

None

-------------------------------


month                            0
incident_number                  0
date_of_incident_day_of_week    19
number_of_vitims_under_18       19
number_of_victims_over_18        0
number_of_offenders_under_18    19
number_of_offenders_over_18     19
race_or_ethnic_of_offender      38
offense                         38
offense_location                 0
bias                             0
victim_type                     19
race_ethnic_of_offender_s       36
offense_s                       17
date_of_incident                36
day_of_week                     36
number_of_victims_under_18      36
number_of_offenders_under       36
number_of_offenders_over        36
race_ethnicity_of_offenders     36
notes                           53
dtype: int64
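Most of the nulls above come from the yearly files naming the same field differently. For example, `offense` has 17 non-null rows and `offense_s` has 38, which together account for all 55 rows. One way to handle this is to rename the variants to a single spelling before concatenating. The sketch below uses toy frames, and the canonical names on the right-hand side of the mapping are my own choice, not something the source data dictates. The date columns are left out because they also differ in format, not just in name.

```python
import pandas as pd

# Map each variant spelling seen in the output above to one canonical name
# (the canonical names are an assumption made for this sketch)
RENAME = {
    "number_of_vitims_under_18": "number_of_victims_under_18",
    "number_of_offenders_under": "number_of_offenders_under_18",
    "number_of_offenders_over": "number_of_offenders_over_18",
    "race_or_ethnic_of_offender": "race_ethnicity_of_offenders",
    "race_ethnic_of_offender_s": "race_ethnicity_of_offenders",
    "offense_s": "offense",
}

# Toy frames standing in for two yearly files with mismatched headers
df_a = pd.DataFrame({"offense": ["Simple Assault"], "number_of_vitims_under_18": [1]})
df_b = pd.DataFrame({"offense_s": ["Criminal Mischief"], "number_of_victims_under_18": [0]})

# rename() ignores keys that are absent from a frame, so one map serves every year
frames = [df.rename(columns=RENAME) for df in (df_a, df_b)]

# ignore_index=True gives a fresh 0..n-1 index instead of repeating each file's index
combined = pd.concat(frames, ignore_index=True, sort=False)

print(combined)
print(combined.isnull().sum())
```

Applied to the real files, the same pattern would be `[df.rename(columns=RENAME) for df in (aus_17, aus_18, aus_19, aus_20)]`, followed by another look at `isnull().sum()` to see which gaps remain.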

In [4]:
aus_final.to_csv(r"C:\Users\Robert\OneDrive\Desktop\datasets\aus_final.csv")