# Analyzing hate crime trends for Austin against the USA as a whole, 2017 - Present

## Data Wrangling & Cleaning

I've been working on this project, off and on, since about January 2020. It is one-half practice and one-half an attempt to help make sense of the chaos that is our world right now. My aim is to analyze hate crime trends for Austin, TX, from 2017 to the present, with a particular focus on the LGBT community.

I am using data provided by Austin PD in this notebook and in the next two or three notebooks as well. For now, I am focusing solely on data for Austin; I will get into broader data for the USA later.

In [1]:
import pandas as pd

aus_17 = pd.read_csv('https://data.austintexas.gov/resource/79qh-wdpx.csv')
display(aus_17.head())
print('----------------------------------')
display(aus_17.dtypes)

Unnamed: 0,month,incident_number,date_of_incident_day_of_week,number_of_vitims_under_18,number_of_victims_over_18,number_of_offenders_under_18,number_of_offenders_over_18,race_or_ethnic_of_offender,offense,offense_location,bias,victim_type
0,January,2017-241137,01/01/2017/Sun,0,1,0,1,White/Not Hispanic,Aggravated Assault,Park/Playground,Anti-Black or African American,Individual
1,February,2017-580344,02/01/2017/Wed,0,1,0,1,Black or African American/Not Hispanic,Aggravated Assault,Highway/Road/Alley/Street/Sidewalk,Anti-White,Individual
2,March,2017-800291,03/21/2017/Tues,0,0,0,0,Unknown,Destruction,Highway/Road/Alley/Street/Sidewalk,Anti-Jewish,Other
3,April,2017-1021534,04/12/2017/Wed,0,0,0,0,White/Unknown,Simple Assault,Air/Bus/Train Terminal,Anti-Jewish,Individual
4,May,2017-1351550,05/15/2017/Mon,1,0,1,2,White/Not Hispanic,Simple Assault,Residence/Home,Anti-Gay (Male),Individual


----------------------------------


month                           object
incident_number                 object
date_of_incident_day_of_week    object
number_of_vitims_under_18        int64
number_of_victims_over_18        int64
number_of_offenders_under_18     int64
number_of_offenders_over_18      int64
race_or_ethnic_of_offender      object
offense                         object
offense_location                object
bias                            object
victim_type                     object
dtype: object

I dislike the Socrata method because it imports the data in every column as objects. Importing the data with the URL method (pd.read_csv, as above) leaves the column dtypes intact, which will make my job much easier down the road.
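To make the dtype difference concrete, here is a small sketch that does not touch the network. It parses two rows copied from the output above, once with pandas' default inference and once with `dtype=str`. The `dtype=str` call is only my stand-in for an import where every column arrives as text; it is not the Socrata client itself, and `sample_csv` is a made-up name for illustration.

```python
import io
import pandas as pd

# Two rows copied from the 2017 output above (a subset of the columns).
sample_csv = io.StringIO(
    "month,incident_number,number_of_victims_over_18,number_of_offenders_over_18\n"
    "January,2017-241137,1,1\n"
    "May,2017-1351550,0,2\n"
)

# Default behaviour: pandas infers numeric dtypes where it can.
inferred = pd.read_csv(sample_csv)

# Stand-in for an all-text import: force every column to be read as strings.
sample_csv.seek(0)
as_strings = pd.read_csv(sample_csv, dtype=str)

print(inferred.dtypes)
print(as_strings.dtypes)

# The text columns can be converted back, but it is an extra step per column.
as_strings["number_of_victims_over_18"] = pd.to_numeric(
    as_strings["number_of_victims_over_18"]
)
```

With the default inference the count columns come back as integers right away, which is the behaviour I want to keep.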

### First glance...
As I stated previously, my goal is to analyze trends over time, and in particular how hate crime affects the LGBT community. Most of these columns will be unnecessary for that purpose, so I expect we'll be removing most of them.
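As a rough idea of where this is heading, the sketch below keeps a handful of columns and filters on the bias column. The rows are copied from outputs in this notebook, but the keyword list and the choice of columns are assumptions on my part; the real list should come from inspecting the unique bias values in the full data.

```python
import pandas as pd

# Rows copied from the outputs in this notebook; only a few columns kept.
sample = pd.DataFrame({
    "incident_number": ["2017-241137", "2017-1351550", "2020-5011788", "2020-5016804"],
    "month": ["January", "May", "March", "April"],
    "bias": [
        "Anti-Black or African American",
        "Anti-Gay (Male)",
        "Anti-Gay (Male); Anti-Jewish",
        "Anti-Gay (Male); Anti-Transgender",
    ],
})

# Assumed keyword list; extend it after checking sample["bias"].unique().
lgbt_keywords = ["Gay", "Lesbian", "Bisexual", "Transgender"]
pattern = "|".join(lgbt_keywords)

# str.contains does a regex search, so the pattern matches any keyword,
# including inside multi-bias entries such as "Anti-Gay (Male); Anti-Jewish".
is_lgbt = sample["bias"].str.contains(pattern, case=False, na=False)
lgbt_incidents = sample.loc[is_lgbt, ["incident_number", "month", "bias"]]
print(lgbt_incidents)
```

Matching on keywords rather than exact labels matters here because some incidents list more than one bias in a single field.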

In [2]:
# Load the datasets for 2018, 2019, and 2020 (the current year)
aus_18 = pd.read_csv('https://data.austintexas.gov/resource/idj2-d9th.csv')
aus_19 = pd.read_csv('https://data.austintexas.gov/resource/e3qf-htd9.csv')
aus_20 = pd.read_csv('https://data.austintexas.gov/resource/vc9m-ha4y.csv')

In [3]:
aus_final = pd.concat([aus_17, aus_18, aus_19, aus_20])
display(aus_final.head())
print('----------------------------------')
display(aus_final.tail())
print('----------------------------------')
display(aus_final.info())

FutureWarning: Sorting because non-concatenation axis is not aligned. A future version
of pandas will change to not sort by default.

To accept the future behavior, pass 'sort=False'.


Unnamed: 0,bias,date_of_incident,date_of_incident_day_of_week,day_of_week,incident_number,month,notes,number_of_offenders_over,number_of_offenders_over_18,number_of_offenders_under,...,number_of_victims_over_18,number_of_victims_under_18,number_of_vitims_under_18,offense,offense_location,offense_s,race_ethnic_of_offender_s,race_ethnicity_of_offenders,race_or_ethnic_of_offender,victim_type
0,Anti-Black or African American,,01/01/2017/Sun,,2017-241137,January,,,1.0,,...,1,,0.0,Aggravated Assault,Park/Playground,,,,White/Not Hispanic,Individual
1,Anti-White,,02/01/2017/Wed,,2017-580344,February,,,1.0,,...,1,,0.0,Aggravated Assault,Highway/Road/Alley/Street/Sidewalk,,,,Black or African American/Not Hispanic,Individual
2,Anti-Jewish,,03/21/2017/Tues,,2017-800291,March,,,0.0,,...,0,,0.0,Destruction,Highway/Road/Alley/Street/Sidewalk,,,,Unknown,Other
3,Anti-Jewish,,04/12/2017/Wed,,2017-1021534,April,,,0.0,,...,0,,0.0,Simple Assault,Air/Bus/Train Terminal,,,,White/Unknown,Individual
4,Anti-Gay (Male),,05/15/2017/Mon,,2017-1351550,May,,,2.0,,...,0,,1.0,Simple Assault,Residence/Home,,,,White/Not Hispanic,Individual


----------------------------------


Unnamed: 0,bias,date_of_incident,date_of_incident_day_of_week,day_of_week,incident_number,month,notes,number_of_offenders_over,number_of_offenders_over_18,number_of_offenders_under,...,number_of_victims_over_18,number_of_victims_under_18,number_of_vitims_under_18,offense,offense_location,offense_s,race_ethnic_of_offender_s,race_ethnicity_of_offenders,race_or_ethnic_of_offender,victim_type
2,Anti-Gay (Male); Anti-Jewish,2020-03-22T00:00:00.000,,Sunday,2020-5011788,March,,0.0,,0.0,...,1,0.0,,,Residence/Home,Criminal Mischief,,Unknown,,
3,Anti-Buddhist,2020-04-20T00:00:00.000,,Monday,2020-5015689,April,,0.0,,0.0,...,1,0.0,,,Church/Synagogue/Temple/Mosque,Criminal Mischief,,Unknown,,
4,Anti-Gay (Male); Anti-Transgender,2020-04-29T00:00:00.000,,Wednesday,2020-5016804,April,,1.0,,0.0,...,1,0.0,,,Department/Discount Store,Assault by Threat,,Black/Non-Hispanic,,
5,Anti-Black or African American,2020-05-17T00:00:00.000,,Sunday,2020-1381131,May,,1.0,,0.0,...,1,0.0,,,Convenience Store,Assault by Contact,,White/Non-Hispanic,,
6,Anti-Hispanic or Latino,2020-05-20T00:00:00.000,,Wednesday,2020-1410411,May,,1.0,,0.0,...,1,0.0,,,Streets/Highway/Road/Alley,Assault with Injury,,White/Non-Hispanic,,


----------------------------------
<class 'pandas.core.frame.DataFrame'>
Int64Index: 55 entries, 0 to 6
Data columns (total 21 columns):
bias                            55 non-null object
date_of_incident                19 non-null object
date_of_incident_day_of_week    36 non-null object
day_of_week                     19 non-null object
incident_number                 55 non-null object
month                           55 non-null object
notes                           2 non-null object
number_of_offenders_over        19 non-null float64
number_of_offenders_over_18     36 non-null float64
number_of_offenders_under       19 non-null float64
number_of_offenders_under_18    36 non-null float64
number_of_victims_over_18       55 non-null int64
number_of_victims_under_18      19 non-null float64
number_of_vitims_under_18       36 non-null float64
offense                         17 non-null object
offense_location                55 non-null object
offense_s                       38 non-null

None
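The info() listing above shows why the concatenated frame has 21 columns: the yearly files spell the same fields differently (for example number_of_vitims_under_18 vs. number_of_victims_under_18, or offense vs. offense_s), so each spelling becomes its own half-empty column. One way to deal with that, sketched here under my own assumptions about which names belong together, is to rename the variants to one canonical name before concatenating. The `CANONICAL` mapping and the `harmonize` helper are hypothetical names, not something from the datasets.

```python
import pandas as pd

# Hypothetical mapping from variant column names to one canonical name.
# The pairings are assumptions based on the info() listing above.
CANONICAL = {
    "number_of_vitims_under_18": "number_of_victims_under_18",
    "number_of_offenders_over": "number_of_offenders_over_18",
    "number_of_offenders_under": "number_of_offenders_under_18",
    "offense_s": "offense",
    "race_ethnic_of_offender_s": "race_or_ethnic_of_offender",
    "race_ethnicity_of_offenders": "race_or_ethnic_of_offender",
}

def harmonize(frame):
    """Return a copy of frame with variant column names renamed."""
    return frame.rename(columns=CANONICAL)

# Tiny stand-ins for two yearly files, using values from the output above.
early = pd.DataFrame({
    "incident_number": ["2017-241137"],
    "number_of_vitims_under_18": [0],
    "offense": ["Aggravated Assault"],
})
late = pd.DataFrame({
    "incident_number": ["2020-5011788"],
    "number_of_victims_under_18": [0],
    "offense_s": ["Criminal Mischief"],
})

# ignore_index=True gives the combined rows fresh, unique row labels.
combined = pd.concat(
    [harmonize(early), harmonize(late)], ignore_index=True, sort=False
)
print(combined)
```

The date fields would need more than a rename, since the earlier files store the date and weekday together (01/01/2017/Sun) while the 2020 file splits them into date_of_incident and day_of_week.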

In [4]:
aus_final.to_csv(r"C:\Users\Robert\OneDrive\Desktop\aus_concat.csv")
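One thing to watch when saving: the concatenated frame keeps each file's own row labels (the info() output reports "55 entries, 0 to 6"), and to_csv writes those labels as an unnamed first column by default. Below is a minimal sketch of resetting the labels and leaving them out of the file. The temporary directory and the `out_path` name are placeholders so the example runs anywhere; they are not the path used above.

```python
from pathlib import Path
import tempfile
import pandas as pd

# Stand-in for aus_final: concatenated frames keep each file's own row labels.
part_a = pd.DataFrame({"incident_number": ["2017-241137", "2017-580344"]})
part_b = pd.DataFrame({"incident_number": ["2020-5011788"]})
combined = pd.concat([part_a, part_b])
print(combined.index.tolist())  # row labels repeat: [0, 1, 0]

# Give every row a unique label before saving.
combined = combined.reset_index(drop=True)

# Hypothetical output location; a temp dir keeps the sketch runnable anywhere.
out_path = Path(tempfile.mkdtemp()) / "aus_concat.csv"

# index=False leaves the row labels out of the file, so reloading it
# does not produce an extra "Unnamed: 0" column.
combined.to_csv(out_path, index=False)
reloaded = pd.read_csv(out_path)
print(reloaded.columns.tolist())
```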