# Project Title: Predicting Violent Crime By Geographical Location and Type
### Predicting Violent Crime Using Hate Crimes and Violent Crime Datasets, <br> Mental Illness and Psychological Datasets, <br> and Wage and Standard Living Conditions Datasets

## Topic
*What problem are you (or your stakeholder) trying to address?*
The goal of this project is to apply data analysis, machine learning, and data manipulation to determine where, when, why, and possibly how violent crime occurs. The focus is on hate crimes, violent crimes linked to mental illness, and the psychological profiles of individuals with antisocial tendencies and a high propensity to commit crimes. The analysis will draw on local, county, state, national, and global data from at least 3 different sources. The data must include these specific types so that they can be combined to support reliable conclusions. To start, I will attempt to develop models based on existing data. If that analysis is successful, I may later explore using real-time data.

## Project Question
*What specific question are you seeking to answer with this project?* 
*This is not the same as the questions you ask to limit the scope of the project.* 
The question this project is trying to answer is: Is it possible to predict whether a mentally ill person who has affiliations with known hate groups, a history of psychological problems, and a criminal record of some type will commit an atrocity, a mass murder, or some other mass public crime? Society has been looking for solutions to these social problems for a long time. Can data analytics make a significant difference? I believe it can, provided that we can get past the delays in obtaining approval to access the data and keep the data current enough to reflect present social conditions. If so, we might be able to avoid catastrophic events such as mass murders, as well as other criminal behavior. The key problem is relating regional geographic data to potential offenders who likely have a history of mental illness, hate group affiliations, or both.

## What would an answer look like?
*What is your hypothesized answer to your question?* 
The primary data is either categorical or statistical in nature, and for each dataset I must determine which field is the target and which attributes will serve as features. Statistical data might include the murder rate per 100,000 individuals per year. Categorical data includes fields such as nationality, sex (male or female), violent versus non-violent, or type of crime committed (white collar, financial, rape, murder, misdemeanor, etc.). Depending on these criteria, the outcome of this project should be a verifiable and reliable projection of the areas and individuals most prone to commit violence or hate crimes, and who have a history of mental illness or criminal behavior. With this information, it should be possible to monitor those areas and individuals at the relevant times for such behavior. Then we might be able to prevent criminal activity that would otherwise not be avoidable.
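
To make the "rate per 100,000" statistic concrete, here is a minimal sketch of the calculation. The function name `rate_per_100k` and the figures passed to it are hypothetical and are not taken from any of the project datasets.

```python
def rate_per_100k(incidents: int, population: int) -> float:
    """Return the number of incidents per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply before dividing to keep the arithmetic exact for round figures.
    return incidents * 100_000 / population


# Hypothetical figures, for illustration only (not real data).
example_rate = rate_per_100k(incidents=50, population=1_000_000)
print(example_rate)
```

Expressing counts as rates lets regions with very different populations be compared on the same scale.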


## Data Sources
*What 3 data sources have you identified for this project?*
*How are you going to relate these datasets?*
*How will you use this data to answer your project question?*

# Import For Project Code Cell

In [1]:
# imports for jupyter notebook final project 
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt  # pyplot is the plotting interface conventionally aliased as plt

%matplotlib inline

# DATASET KEY
1. Mental Illness and Fatal Shootings: Variable Term: df1
2. Bias Ethnic Groups and Number of Criminal Incidents: Variable Term: df2
3. Frequency of Mental Illness by Location: Variable Term: df3
4. Living Wage State Capitals: Variable Term: df4
5. Mental Health Data: Variable Term: df5
6. Predicting Hate Crimes: Variable Term: df6

In [2]:
# First Dataset: MentalIllnessAndFatalShootings.csv
# Variable Term = df1
df1 = pd.read_csv('data/MentalIllnessAndFatalShootings.csv')
df1.head()

Unnamed: 0,id,name,date,manner_of_death,armed,age,gender,race,city,state,signs_of_mental_illness,threat_level,flee,body_camera
0,3,Tim Elliot,1/2/2015,shot,gun,53.0,M,A,Shelton,WA,True,attack,Not fleeing,False
1,4,Lewis Lee Lembke,1/2/2015,shot,gun,47.0,M,W,Aloha,OR,False,attack,Not fleeing,False
2,5,John Paul Quintero,1/3/2015,shot and Tasered,unarmed,23.0,M,H,Wichita,KS,False,other,Not fleeing,False
3,8,Matthew Hoffman,1/4/2015,shot,toy weapon,32.0,M,W,San Francisco,CA,True,attack,Not fleeing,False
4,9,Michael Rodriguez,1/4/2015,shot,nail gun,39.0,M,H,Evans,CO,False,attack,Not fleeing,False
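
As a possible first look at this table, the sketch below parses the `date` column and computes the share of records with `signs_of_mental_illness`. It rebuilds only the five preview rows inline so it runs without the CSV; the variable names `shootings`, `overall_share`, and `share_by_state` are my own, and the same steps would apply to `df1` under the assumption that the full file has the same column types.

```python
import pandas as pd

# The five preview rows, restricted to the columns used here.
shootings = pd.DataFrame({
    "date": ["1/2/2015", "1/2/2015", "1/3/2015", "1/4/2015", "1/4/2015"],
    "state": ["WA", "OR", "KS", "CA", "CO"],
    "signs_of_mental_illness": [True, False, False, True, False],
})

# Convert month/day/year strings into real timestamps.
shootings["date"] = pd.to_datetime(shootings["date"], format="%m/%d/%Y")

# The mean of a boolean column is the fraction of True values.
overall_share = shootings["signs_of_mental_illness"].mean()
share_by_state = shootings.groupby("state")["signs_of_mental_illness"].mean()

print(overall_share)
print(share_by_state)
```

Five rows are far too few to say anything about a state; the point is only the shape of the calculation.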


In [3]:
# Second Dataset: Bias9_18_2022.csv
# Variable Term: df2
df2 = pd.read_csv('data/Bias9_18_2022.csv')
df2.head()

Unnamed: 0,Ethnic or Group Identification,Number of Incidents
0,Anti-Black or African American,2871
1,Anti-White,869
2,Anti-Jewish,683
3,Anti-Gay (Male),673
4,Anti-Hispanic or Latino,517
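
Raw incident counts are easier to compare as proportions. The sketch below computes each category's share of the rows listed in the preview. The column name `share_of_listed` is my own, and the shares are relative to these five rows only, not to the full file.

```python
import pandas as pd

# The five preview rows from the bias dataset.
bias = pd.DataFrame({
    "Ethnic or Group Identification": [
        "Anti-Black or African American",
        "Anti-White",
        "Anti-Jewish",
        "Anti-Gay (Male)",
        "Anti-Hispanic or Latino",
    ],
    "Number of Incidents": [2871, 869, 683, 673, 517],
})

# Share of each category among the listed rows only.
total_listed = bias["Number of Incidents"].sum()
bias["share_of_listed"] = bias["Number of Incidents"] / total_listed

print(bias)
```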


In [4]:
# Third Dataset: FrequencyOfMentalIllnessByLocation.csv
# Variable Term: df3
df3 = pd.read_csv('data/FrequencyOfMentalIllnessByLocation.csv')
df3.head()

Unnamed: 0,Timestamp,Age,Gender,Country,state,self_employed,family_history,treatment,work_interfere,no_employees,...,leave,mental_health_consequence,phys_health_consequence,coworkers,supervisor,mental_health_interview,phys_health_interview,mental_vs_physical,obs_consequence,comments
0,8/27/2014 11:29,37,Female,United States,IL,,No,Yes,Often,25-Jun,...,Somewhat easy,No,No,Some of them,Yes,No,Maybe,Yes,No,
1,8/27/2014 11:29,44,M,United States,IN,,No,No,Rarely,More than 1000,...,Don't know,Maybe,No,No,No,No,No,Don't know,No,
2,8/27/2014 11:29,32,Male,Canada,,,No,No,Rarely,25-Jun,...,Somewhat difficult,No,No,Yes,Yes,Yes,Yes,No,No,
3,8/27/2014 11:29,31,Male,United Kingdom,,,Yes,Yes,Often,26-100,...,Somewhat difficult,Yes,Yes,Some of them,No,Maybe,Maybe,No,Yes,
4,8/27/2014 11:30,31,Male,United States,TX,,No,No,Never,100-500,...,Don't know,No,No,Some of them,Yes,Yes,Yes,Don't know,No,
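
The `Gender` column mixes spellings such as `Female`, `M`, and `Male`, so it likely needs normalizing before it can be used as a categorical feature. The following is a minimal sketch of one way to do that. The mapping `GENDER_MAP`, the helper `normalize_gender`, and the fallback labels `"Other"` and `"Unknown"` are assumptions of mine, and the full file may contain spellings this mapping does not cover.

```python
import pandas as pd

# Assumed mapping from lowercase spellings to canonical labels.
GENDER_MAP = {"m": "Male", "male": "Male", "f": "Female", "female": "Female"}


def normalize_gender(value):
    """Map a raw survey answer to a canonical label."""
    if pd.isna(value):
        return "Unknown"
    key = str(value).strip().lower()
    # Anything not in the mapping is grouped under "Other" (an assumption).
    return GENDER_MAP.get(key, "Other")


# Gender values from the five preview rows.
raw_gender = pd.Series(["Female", "M", "Male", "Male", "Male"], name="Gender")
clean_gender = raw_gender.map(normalize_gender)

print(clean_gender.value_counts())
```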


In [5]:
# Fourth Dataset: LivingWageStateCapitals.csv
# Variable Term: df4
df4 = pd.read_csv('data/LivingWageStateCapitals.csv')
df4.head()

Unnamed: 0,state_territory,city,minimum_wage,one_adult_no_kids_living_wage,one_adult_one_kid_living_wage,one_adult_two_kids_living_wage,one_adult_three_kids_living_wage,two_adults_one_working_no_kids_living_wage,two_adults_one_working_one_kid_living_wage,two_adults_one_working_two_kids_living_wage,...,one_adult_two_kids_poverty_wage,one_adult_three_kids_poverty_wage,two_adults_one_working_no_kids_poverty_wage,two_adults_one_working_one_kid_poverty_wage,two_adults_one_working_two_kids_poverty_wage,two_adults_one_working_three_kids_poverty_wage,two_adults_both_working_no_kids_poverty_wage,two_adults_both_working_one_kid_poverty_wage,two_adults_both_working_two_kids_poverty_wage,two_adults_both_working_three_kids_poverty_wage
0,District of Columbia,Washington,13.25,19.97,38.95,48.99,63.96,29.61,34.55,38.32,...,10.44,12.6,8.29,10.44,12.6,14.75,4.14,5.22,6.3,7.38
1,Alabama,Montgomery,7.25,13.56,27.35,33.42,42.17,22.59,26.66,30.27,...,10.44,12.6,8.29,10.44,12.6,14.75,4.14,5.22,6.3,7.38
2,Alaska,Juneau,10.19,15.48,29.99,36.0,47.42,24.48,29.46,33.01,...,13.05,15.75,10.36,13.05,15.75,18.44,5.18,6.53,7.87,9.22
3,Arizona,Phoenix,12.0,15.41,29.44,35.4,46.01,24.85,29.25,32.98,...,10.44,12.6,8.29,10.44,12.6,14.75,4.14,5.22,6.3,7.38
4,Arkansas,Little Rock,10.0,13.97,28.81,35.49,45.33,23.21,27.66,31.36,...,10.44,12.6,8.29,10.44,12.6,14.75,4.14,5.22,6.3,7.38
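
One way the state-level tables might be related is through the state field: the shootings data uses two-letter abbreviations (`state`), while this table uses full names (`state_territory`). The sketch below shows a possible join under that assumption. The lookup `STATE_NAMES` is deliberately partial, and the `incident_count` values are placeholders, not real figures; only the wage rows are copied from the preview.

```python
import pandas as pd

# Partial, illustrative lookup; a real one would cover every state and territory.
STATE_NAMES = {"AL": "Alabama", "AK": "Alaska", "AZ": "Arizona"}

# Placeholder incident counts per state abbreviation (NOT real figures).
incidents = pd.DataFrame({
    "state": ["AL", "AK", "AZ", "WA"],
    "incident_count": [3, 1, 4, 2],
})

# Minimum wage values copied from the df4 preview rows.
wages = pd.DataFrame({
    "state_territory": ["Alabama", "Alaska", "Arizona"],
    "minimum_wage": [7.25, 10.19, 12.0],
})

# Translate abbreviations to full names, then left-join so no incident row is lost.
incidents["state_territory"] = incidents["state"].map(STATE_NAMES)
merged = incidents.merge(wages, on="state_territory", how="left", indicator=True)

# Rows flagged left_only found no wage record and need attention.
unmatched = merged.loc[merged["_merge"] == "left_only", "state"].tolist()
print(merged)
print("Unmatched states:", unmatched)
```

The `indicator=True` flag makes failed matches visible instead of silently leaving missing wage values in the result.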


In [6]:
# Fifth Dataset: MentalHealthData.csv
# Variable Term: df5
df5 = pd.read_csv('data/MentalHealthData.csv')
df5.head()

Unnamed: 0,Are you self-employed?,How many employees does your company or organization have?,Is your employer primarily a tech company/organization?,Is your primary role within your company related to tech/IT?,Does your employer provide mental health benefits as part of healthcare coverage?,Do you know the options for mental health care available under your employer-provided coverage?,"Has your employer ever formally discussed mental health (for example, as part of a wellness campaign or other official communication)?",Does your employer offer resources to learn more about mental health concerns and options for seeking help?,"If a mental health issue prompted you to request a medical leave from work, asking for that leave would be:",Would you feel comfortable discussing a mental health disorder with your coworkers?,...,"If you have a mental health issue, do you feel that it interferes with your work when being treated effectively?","If you have a mental health issue, do you feel that it interferes with your work when NOT being treated effectively?",What is your age?,What is your gender?,What country do you live in?,What US state or territory do you live in?,What country do you work in?,What US state or territory do you work in?,Which of the following best describes your work position?,Do you work remotely?
0,0,1 to 5,1.0,,Yes,Yes,No,No,Somewhat difficult,Maybe,...,Sometimes,Often,33,Male,Canada,,Canada,,Back-end Developer,Sometimes
1,0,1 to 5,1.0,,No,No,No,I don't know,Very easy,Yes,...,Not applicable to me,Not applicable to me,40,male,Netherlands,,Netherlands,,Front-end Developer|Back-end Developer,Sometimes
2,0,1 to 5,1.0,,Yes,Yes,No,I don't know,I don't know,Maybe,...,Not applicable to me,Not applicable to me,21,male,United Kingdom,,United Kingdom,,Back-end Developer|DevOps/SysAdmin,Never
3,0,1 to 5,1.0,,No,No,No,No,Very difficult,No,...,Often,Often,36,Male,Brazil,,Brazil,,Back-end Developer,Never
4,0,1 to 5,0.0,1.0,I don't know,No,Yes,No,Very difficult,Yes,...,Not applicable to me,Often,36,F,United States of America,Indiana,United States of America,Indiana,Other,Sometimes


In [7]:
# Sixth Dataset: PredictingHateCrimes.csv
# Variable Term: df6
df6 = pd.read_csv('data/PredictingHateCrimes.csv')
df6.head()

Unnamed: 0,median_household_income,share_unemployed_seasonal,share_population_in_metro_areas,share_population_with_high_school_degree,share_non_citizen,share_white_poverty,gini_index,share_non_white,share_voters_voted_trump,hate_crimes_per_100k_splc,avg_hatecrimes_per_100k_fbi
0,42278,0.06,0.64,0.821,0.02,0.12,0.472,0.35,0.63,0.125839,1.80641
1,67629,0.064,0.63,0.914,0.04,0.06,0.422,0.42,0.53,0.14374,1.6567
2,49254,0.063,0.9,0.842,0.1,0.09,0.455,0.49,0.5,0.22532,3.413928
3,44922,0.052,0.69,0.824,0.04,0.12,0.458,0.26,0.6,0.069061,0.869209
4,60487,0.059,0.97,0.806,0.13,0.09,0.471,0.61,0.33,0.255805,2.397986
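
A simple first check on this table is the correlation between a socioeconomic field and a hate crime rate. The sketch below computes the Pearson correlation between `median_household_income` and `avg_hatecrimes_per_100k_fbi` on the five preview rows only. The variable names `hate_sample` and `r` are my own, and a coefficient from five rows says nothing reliable about the full dataset.

```python
import pandas as pd

# Two columns from the five preview rows.
hate_sample = pd.DataFrame({
    "median_household_income": [42278, 67629, 49254, 44922, 60487],
    "avg_hatecrimes_per_100k_fbi": [1.80641, 1.6567, 3.413928, 0.869209, 2.397986],
})

# Series.corr uses the Pearson method by default.
r = hate_sample["median_household_income"].corr(
    hate_sample["avg_hatecrimes_per_100k_fbi"]
)
print(f"Pearson r on the five preview rows: {r:.3f}")
```

On the full file the same call could be applied to `df6`, ideally after checking for missing values in both columns.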


In [8]:
# 🦉: The following command converts this Jupyter notebook to a Python script.
!jupyter nbconvert --to python python-exercises.ipynb

'jupyter' is not recognized as an internal or external command,
operable program or batch file.
