# An In-Depth Analysis of Gun Violence in America

### Will M, Zichao L, Ethan B

# Part 1: Introduction

Gun violence has become a significant problem in America today. We are constantly reminded by news reports and social media that gun violence is a part of our lives, and as a result our lives are being disrupted by this threat. Schools are conducting shooting drills, products like bulletproof vests are becoming ever more common, and our politics are divided over what the right response is.

In 2020, gun violence was the most common cause of death among people younger than 19. Between 1968 and 2011, an estimated 1.4 million Americans died from gun violence. The gun-related homicide rate in the United States is 25 times higher than in other developed countries. Given these statistics, it is important that the general public be informed about this issue.

In this tutorial, we will do an in-depth analysis of the history, causes, and effects of gun violence. The data we will be using can be found <a href="https://github.com/jamesqo/gun-violence-data">here</a>. The ultimate goal is to understand which factors contribute the most to gun violence.

# Part 2: Data

We will start by importing the necessary packages.

In [2]:
import pandas as pd
import numpy as np

The first thing we need to do is to read in our data. This can be done with pandas, and here is the result:

In [3]:
data = pd.read_csv("/Users/mr8bit/Documents/Other/GitHub/Academic Repositories/CMSC320 Final Project/stage3.csv")
data

Unnamed: 0,incident_id,date,state,city_or_county,address,n_killed,n_injured,incident_url,source_url,incident_url_fields_missing,...,participant_age,participant_age_group,participant_gender,participant_name,participant_relationship,participant_status,participant_type,sources,state_house_district,state_senate_district
0,461105,2013-01-01,Pennsylvania,Mckeesport,1506 Versailles Avenue and Coursin Street,0,4,http://www.gunviolencearchive.org/incident/461105,http://www.post-gazette.com/local/south/2013/0...,False,...,0::20,0::Adult 18+||1::Adult 18+||2::Adult 18+||3::A...,0::Male||1::Male||3::Male||4::Female,0::Julian Sims,,0::Arrested||1::Injured||2::Injured||3::Injure...,0::Victim||1::Victim||2::Victim||3::Victim||4:...,http://pittsburgh.cbslocal.com/2013/01/01/4-pe...,,
1,460726,2013-01-01,California,Hawthorne,13500 block of Cerise Avenue,1,3,http://www.gunviolencearchive.org/incident/460726,http://www.dailybulletin.com/article/zz/201301...,False,...,0::20,0::Adult 18+||1::Adult 18+||2::Adult 18+||3::A...,0::Male,0::Bernard Gillis,,0::Killed||1::Injured||2::Injured||3::Injured,0::Victim||1::Victim||2::Victim||3::Victim||4:...,http://losangeles.cbslocal.com/2013/01/01/man-...,62.0,35.0
2,478855,2013-01-01,Ohio,Lorain,1776 East 28th Street,1,3,http://www.gunviolencearchive.org/incident/478855,http://chronicle.northcoastnow.com/2013/02/14/...,False,...,0::25||1::31||2::33||3::34||4::33,0::Adult 18+||1::Adult 18+||2::Adult 18+||3::A...,0::Male||1::Male||2::Male||3::Male||4::Male,0::Damien Bell||1::Desmen Noble||2::Herman Sea...,,"0::Injured, Unharmed, Arrested||1::Unharmed, A...",0::Subject-Suspect||1::Subject-Suspect||2::Vic...,http://www.morningjournal.com/general-news/201...,56.0,13.0
3,478925,2013-01-05,Colorado,Aurora,16000 block of East Ithaca Place,4,0,http://www.gunviolencearchive.org/incident/478925,http://www.dailydemocrat.com/20130106/aurora-s...,False,...,0::29||1::33||2::56||3::33,0::Adult 18+||1::Adult 18+||2::Adult 18+||3::A...,0::Female||1::Male||2::Male||3::Male,0::Stacie Philbrook||1::Christopher Ratliffe||...,,0::Killed||1::Killed||2::Killed||3::Killed,0::Victim||1::Victim||2::Victim||3::Subject-Su...,http://denver.cbslocal.com/2013/01/06/officer-...,40.0,28.0
4,478959,2013-01-07,North Carolina,Greensboro,307 Mourning Dove Terrace,2,2,http://www.gunviolencearchive.org/incident/478959,http://www.journalnow.com/news/local/article_d...,False,...,0::18||1::46||2::14||3::47,0::Adult 18+||1::Adult 18+||2::Teen 12-17||3::...,0::Female||1::Male||2::Male||3::Female,0::Danielle Imani Jameison||1::Maurice Eugene ...,3::Family,0::Injured||1::Injured||2::Killed||3::Killed,0::Victim||1::Victim||2::Victim||3::Subject-Su...,http://myfox8.com/2013/01/08/update-mother-sho...,62.0,27.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
239672,1083142,2018-03-31,Louisiana,Rayne,North Riceland Road and Highway 90,0,0,http://www.gunviolencearchive.org/incident/108...,http://www.klfy.com/news/local/rayne-woman-cha...,False,...,0::25,0::Adult 18+,0::Female,0::Jhkeya Tezeno,,"0::Unharmed, Arrested",0::Subject-Suspect,http://www.klfy.com/news/local/rayne-woman-cha...,,
239673,1083139,2018-03-31,Louisiana,Natchitoches,247 Keyser Ave,1,0,http://www.gunviolencearchive.org/incident/108...,http://www.ksla.com/story/37854648/man-wanted-...,False,...,1::21,0::Adult 18+||1::Adult 18+,0::Male||1::Male,0::Jamal Haskett||1::Jaquarious Tyjuan Ardison,,"0::Killed||1::Unharmed, Arrested",0::Victim||1::Subject-Suspect,http://www.ksla.com/story/37854648/man-wanted-...,23.0,31.0
239674,1083151,2018-03-31,Louisiana,Gretna,1300 block of Cook Street,0,1,http://www.gunviolencearchive.org/incident/108...,http://www.nola.com/crime/index.ssf/2018/04/sh...,False,...,0::21,0::Adult 18+,0::Male,,,0::Injured,0::Victim,http://www.nola.com/crime/index.ssf/2018/04/sh...,85.0,7.0
239675,1082514,2018-03-31,Texas,Houston,12630 Ashford Point Dr,1,0,http://www.gunviolencearchive.org/incident/108...,https://www.chron.com/news/houston-texas/houst...,False,...,0::42,0::Adult 18+,0::Male,0::Leroy Ellis,,0::Killed,0::Victim,http://www.khou.com/article/news/hpd-investiga...,149.0,17.0


This table is rather big, so we will need to do some cleaning and tidying before we can start our analysis. 

First, we won't need all the data in this table. According to the dataset, some of the columns are not required and thus may contain NaN values. We don't want this, as it will make our analysis more difficult than it needs to be. Out of the 29 columns, only 9 are required. That said, we don't want to remove every non-required column, as some of them contain valuable information we will need. The columns we will remove are those that are neither required nor necessary for this analysis.
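
Before deciding which columns to drop, it can help to see how much of each column is actually missing. The snippet below is a minimal sketch on a small made-up table (the `toy` DataFrame and its values are assumptions for illustration, not rows from the real dataset); the same `isna().mean()` call should work on `data` once it is loaded.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real table; the values are invented for illustration.
toy = pd.DataFrame({
    "incident_id": [1, 2, 3, 4],
    "n_killed": [0, 1, 0, 2],
    "notes": [np.nan, "text", np.nan, np.nan],
    "participant_age": ["0::20", np.nan, "0::31", "0::45"],
})

# Fraction of missing values in each column, largest first.
missing_share = toy.isna().mean().sort_values(ascending=False)
print(missing_share)
```

Columns near the top of this listing are the ones most likely to cost rows later if they are kept.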

The following columns will be removed:
- source_url
- congressional_district
- location_description
- notes
- participant_name
- sources
- state_house_district
- state_senate_district

Here is the result:

In [4]:
columns_to_remove = ["source_url", "congressional_district", "location_description",
                     "notes", "participant_name", "sources", "state_house_district",
                     "state_senate_district"]

# Drop all of the unneeded columns in a single call
data = data.drop(columns=columns_to_remove)
data

Unnamed: 0,incident_id,date,state,city_or_county,address,n_killed,n_injured,incident_url,incident_url_fields_missing,gun_stolen,...,incident_characteristics,latitude,longitude,n_guns_involved,participant_age,participant_age_group,participant_gender,participant_relationship,participant_status,participant_type
0,461105,2013-01-01,Pennsylvania,Mckeesport,1506 Versailles Avenue and Coursin Street,0,4,http://www.gunviolencearchive.org/incident/461105,False,,...,Shot - Wounded/Injured||Mass Shooting (4+ vict...,40.3467,-79.8559,,0::20,0::Adult 18+||1::Adult 18+||2::Adult 18+||3::A...,0::Male||1::Male||3::Male||4::Female,,0::Arrested||1::Injured||2::Injured||3::Injure...,0::Victim||1::Victim||2::Victim||3::Victim||4:...
1,460726,2013-01-01,California,Hawthorne,13500 block of Cerise Avenue,1,3,http://www.gunviolencearchive.org/incident/460726,False,,...,"Shot - Wounded/Injured||Shot - Dead (murder, a...",33.9090,-118.3330,,0::20,0::Adult 18+||1::Adult 18+||2::Adult 18+||3::A...,0::Male,,0::Killed||1::Injured||2::Injured||3::Injured,0::Victim||1::Victim||2::Victim||3::Victim||4:...
2,478855,2013-01-01,Ohio,Lorain,1776 East 28th Street,1,3,http://www.gunviolencearchive.org/incident/478855,False,0::Unknown||1::Unknown,...,"Shot - Wounded/Injured||Shot - Dead (murder, a...",41.4455,-82.1377,2.0,0::25||1::31||2::33||3::34||4::33,0::Adult 18+||1::Adult 18+||2::Adult 18+||3::A...,0::Male||1::Male||2::Male||3::Male||4::Male,,"0::Injured, Unharmed, Arrested||1::Unharmed, A...",0::Subject-Suspect||1::Subject-Suspect||2::Vic...
3,478925,2013-01-05,Colorado,Aurora,16000 block of East Ithaca Place,4,0,http://www.gunviolencearchive.org/incident/478925,False,,...,"Shot - Dead (murder, accidental, suicide)||Off...",39.6518,-104.8020,,0::29||1::33||2::56||3::33,0::Adult 18+||1::Adult 18+||2::Adult 18+||3::A...,0::Female||1::Male||2::Male||3::Male,,0::Killed||1::Killed||2::Killed||3::Killed,0::Victim||1::Victim||2::Victim||3::Subject-Su...
4,478959,2013-01-07,North Carolina,Greensboro,307 Mourning Dove Terrace,2,2,http://www.gunviolencearchive.org/incident/478959,False,0::Unknown||1::Unknown,...,"Shot - Wounded/Injured||Shot - Dead (murder, a...",36.1140,-79.9569,2.0,0::18||1::46||2::14||3::47,0::Adult 18+||1::Adult 18+||2::Teen 12-17||3::...,0::Female||1::Male||2::Male||3::Female,3::Family,0::Injured||1::Injured||2::Killed||3::Killed,0::Victim||1::Victim||2::Victim||3::Subject-Su...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
239672,1083142,2018-03-31,Louisiana,Rayne,North Riceland Road and Highway 90,0,0,http://www.gunviolencearchive.org/incident/108...,False,0::Unknown,...,Shots Fired - No Injuries,,,1.0,0::25,0::Adult 18+,0::Female,,"0::Unharmed, Arrested",0::Subject-Suspect
239673,1083139,2018-03-31,Louisiana,Natchitoches,247 Keyser Ave,1,0,http://www.gunviolencearchive.org/incident/108...,False,0::Unknown,...,"Shot - Dead (murder, accidental, suicide)||Ins...",31.7537,-93.0836,1.0,1::21,0::Adult 18+||1::Adult 18+,0::Male||1::Male,,"0::Killed||1::Unharmed, Arrested",0::Victim||1::Subject-Suspect
239674,1083151,2018-03-31,Louisiana,Gretna,1300 block of Cook Street,0,1,http://www.gunviolencearchive.org/incident/108...,False,0::Unknown,...,Shot - Wounded/Injured,29.9239,-90.0442,1.0,0::21,0::Adult 18+,0::Male,,0::Injured,0::Victim
239675,1082514,2018-03-31,Texas,Houston,12630 Ashford Point Dr,1,0,http://www.gunviolencearchive.org/incident/108...,False,0::Unknown,...,"Shot - Dead (murder, accidental, suicide)",29.7201,-95.6110,1.0,0::42,0::Adult 18+,0::Male,,0::Killed,0::Victim


Second, we need to deal with the NaN values. The solution we came up with here is simple: drop any row that contains a NaN value.

Here is the result:

In [6]:
data = data.dropna()
data

Unnamed: 0,incident_id,date,state,city_or_county,address,n_killed,n_injured,incident_url,incident_url_fields_missing,gun_stolen,...,incident_characteristics,latitude,longitude,n_guns_involved,participant_age,participant_age_group,participant_gender,participant_relationship,participant_status,participant_type
4,478959,2013-01-07,North Carolina,Greensboro,307 Mourning Dove Terrace,2,2,http://www.gunviolencearchive.org/incident/478959,False,0::Unknown||1::Unknown,...,"Shot - Wounded/Injured||Shot - Dead (murder, a...",36.1140,-79.9569,2.0,0::18||1::46||2::14||3::47,0::Adult 18+||1::Adult 18+||2::Teen 12-17||3::...,0::Female||1::Male||2::Male||3::Female,3::Family,0::Injured||1::Injured||2::Killed||3::Killed,0::Victim||1::Victim||2::Victim||3::Subject-Su...
6,479363,2013-01-19,New Mexico,Albuquerque,2806 Long Lane,5,0,http://www.gunviolencearchive.org/incident/479363,False,0::Unknown||1::Unknown,...,"Shot - Dead (murder, accidental, suicide)||Mas...",34.9791,-106.7160,2.0,0::51||1::40||2::9||3::5||4::2||5::15,0::Adult 18+||1::Adult 18+||2::Child 0-11||3::...,0::Male||1::Female||2::Male||3::Female||4::Fem...,5::Family,0::Killed||1::Killed||2::Killed||3::Killed||4:...,0::Victim||1::Victim||2::Victim||3::Victim||4:...
16,479580,2013-02-03,California,Yuba (county),5800 block of Poplar Avenue,1,3,http://www.gunviolencearchive.org/incident/479580,False,0::Unknown,...,"Shot - Wounded/Injured||Shot - Dead (murder, a...",39.1236,-121.5830,1.0,0::20||4::25||5::18||6::19,0::Adult 18+||1::Adult 18+||2::Adult 18+||3::A...,0::Male||1::Male||2::Female||4::Male||5::Male|...,4::Drive by - Random victims||5::Drive by - Ra...,0::Killed||1::Injured||2::Injured||3::Injured|...,0::Victim||1::Victim||2::Victim||3::Victim||4:...
36,482856,2013-03-13,New York,Mohawk,17 W Main St,6,2,http://www.gunviolencearchive.org/incident/482856,False,0::Unknown,...,"Shot - Wounded/Injured||Shot - Dead (murder, a...",43.0110,-75.0058,1.0,0::68||1::57||2::66||3::67||4::62||5::51||6::2...,0::Adult 18+||1::Adult 18+||2::Adult 18+||3::A...,0::Male||1::Male||2::Male||4::Male||5::Male||6...,7::Aquaintance,0::Killed||1::Killed||2::Injured||3::Injured||...,0::Victim||1::Victim||2::Victim||3::Victim||4:...
57,485811,2013-04-24,Illinois,Manchester,East Street,6,1,http://www.gunviolencearchive.org/incident/485811,False,0::Unknown||1::Unknown,...,"Shot - Wounded/Injured||Shot - Dead (murder, a...",39.5417,-90.3301,2.0,0::64||1::22||2::29||3::5||4::1,0::Adult 18+||1::Adult 18+||2::Adult 18+||3::C...,0::Female||1::Female||2::Male||3::Male||4::Mal...,6::Significant others - current or former,0::Killed||1::Killed||2::Killed||3::Killed||4:...,0::Victim||1::Victim||2::Victim||3::Victim||4:...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
239553,1083224,2018-03-31,Alabama,Eclectic,340 Boswell Road,2,0,http://www.gunviolencearchive.org/incident/108...,False,0::Unknown,...,"Shot - Dead (murder, accidental, suicide)||Sui...",32.7015,-86.0871,1.0,0::68||1::74,0::Adult 18+||1::Adult 18+,0::Male||1::Male,1::Friends,0::Killed||1::Killed,0::Victim||1::Subject-Suspect
239567,1081772,2018-03-31,Maryland,Crisfield,26440 Silver Ln,1,1,http://www.gunviolencearchive.org/incident/108...,False,0::Unknown,...,"Shot - Wounded/Injured||Shot - Dead (murder, a...",37.9962,-75.8392,1.0,0::44||1::57,0::Adult 18+||1::Adult 18+,0::Female||1::Male,1::Significant others - current or former,0::Injured||1::Killed,0::Victim||1::Subject-Suspect
239568,1083136,2018-03-31,California,Santa Cruz,600 block of Riverside Ave,0,1,http://www.gunviolencearchive.org/incident/108...,False,0::Unknown,...,Shot - Wounded/Injured||Gang involvement,36.9702,-122.0200,1.0,0::26,0::Adult 18+||1::Adult 18+,0::Male||1::Male,1::Gang vs Gang,0::Injured||1::Unharmed,0::Victim||1::Subject-Suspect
239617,1081647,2018-03-31,Florida,Miami,NW 65th St and NW 13th Ct,1,0,http://www.gunviolencearchive.org/incident/108...,False,0::Unknown,...,"Shot - Dead (murder, accidental, suicide)||Chi...",25.8343,-80.2195,1.0,0::4||1::24,0::Child 0-11||1::Adult 18+,0::Female||1::Male,1::Family,"0::Killed||1::Unharmed, Arrested",0::Victim||1::Subject-Suspect
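
Dropping every row that has any NaN is simple, but it can discard a lot of data; the row index in the output above already jumps from 4 to 6 to 16. The sketch below, on a made-up three-column table (the `toy` values are assumptions, not real rows), compares that strict approach with `dropna(subset=...)`, which only requires chosen columns to be present. It is shown as an alternative worth knowing about, not as the method used in this tutorial.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real table; the values are invented for illustration.
toy = pd.DataFrame({
    "n_killed": [0, 1, 2, 0],
    "gun_type": ["0::Handgun", np.nan, "0::9mm", "0::Unknown"],
    "participant_relationship": [np.nan, np.nan, "1::Family", np.nan],
})

rows_before = len(toy)

# Strict: drop a row if ANY column is NaN (what the tutorial does).
strict = toy.dropna()

# Relaxed: only drop rows where gun_type is NaN.
relaxed = toy.dropna(subset=["gun_type"])

print(rows_before, len(strict), len(relaxed))
```

Comparing the row counts before and after makes the cost of each choice visible.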


Lastly, we need to remove columns that are well-formed but are either unnecessary or contain sensitive information, such as an address. We want this analysis to remain as anonymous as possible, and we want to respect those who were affected by these incidents.

Here is the final result, and the data we will be using in the rest of the analysis:

In [7]:
# Drop the sensitive and unneeded columns in a single call
data = data.drop(columns=["address", "incident_url", "incident_url_fields_missing"])
data

Unnamed: 0,incident_id,date,state,city_or_county,n_killed,n_injured,gun_stolen,gun_type,incident_characteristics,latitude,longitude,n_guns_involved,participant_age,participant_age_group,participant_gender,participant_relationship,participant_status,participant_type
4,478959,2013-01-07,North Carolina,Greensboro,2,2,0::Unknown||1::Unknown,0::Handgun||1::Handgun,"Shot - Wounded/Injured||Shot - Dead (murder, a...",36.1140,-79.9569,2.0,0::18||1::46||2::14||3::47,0::Adult 18+||1::Adult 18+||2::Teen 12-17||3::...,0::Female||1::Male||2::Male||3::Female,3::Family,0::Injured||1::Injured||2::Killed||3::Killed,0::Victim||1::Victim||2::Victim||3::Subject-Su...
6,479363,2013-01-19,New Mexico,Albuquerque,5,0,0::Unknown||1::Unknown,0::22 LR||1::223 Rem [AR-15],"Shot - Dead (murder, accidental, suicide)||Mas...",34.9791,-106.7160,2.0,0::51||1::40||2::9||3::5||4::2||5::15,0::Adult 18+||1::Adult 18+||2::Child 0-11||3::...,0::Male||1::Female||2::Male||3::Female||4::Fem...,5::Family,0::Killed||1::Killed||2::Killed||3::Killed||4:...,0::Victim||1::Victim||2::Victim||3::Victim||4:...
16,479580,2013-02-03,California,Yuba (county),1,3,0::Unknown,0::9mm,"Shot - Wounded/Injured||Shot - Dead (murder, a...",39.1236,-121.5830,1.0,0::20||4::25||5::18||6::19,0::Adult 18+||1::Adult 18+||2::Adult 18+||3::A...,0::Male||1::Male||2::Female||4::Male||5::Male|...,4::Drive by - Random victims||5::Drive by - Ra...,0::Killed||1::Injured||2::Injured||3::Injured|...,0::Victim||1::Victim||2::Victim||3::Victim||4:...
36,482856,2013-03-13,New York,Mohawk,6,2,0::Unknown,0::Shotgun,"Shot - Wounded/Injured||Shot - Dead (murder, a...",43.0110,-75.0058,1.0,0::68||1::57||2::66||3::67||4::62||5::51||6::2...,0::Adult 18+||1::Adult 18+||2::Adult 18+||3::A...,0::Male||1::Male||2::Male||4::Male||5::Male||6...,7::Aquaintance,0::Killed||1::Killed||2::Injured||3::Injured||...,0::Victim||1::Victim||2::Victim||3::Victim||4:...
57,485811,2013-04-24,Illinois,Manchester,6,1,0::Unknown||1::Unknown,0::Unknown||1::Unknown,"Shot - Wounded/Injured||Shot - Dead (murder, a...",39.5417,-90.3301,2.0,0::64||1::22||2::29||3::5||4::1,0::Adult 18+||1::Adult 18+||2::Adult 18+||3::C...,0::Female||1::Female||2::Male||3::Male||4::Mal...,6::Significant others - current or former,0::Killed||1::Killed||2::Killed||3::Killed||4:...,0::Victim||1::Victim||2::Victim||3::Victim||4:...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
239553,1083224,2018-03-31,Alabama,Eclectic,2,0,0::Unknown,0::Unknown,"Shot - Dead (murder, accidental, suicide)||Sui...",32.7015,-86.0871,1.0,0::68||1::74,0::Adult 18+||1::Adult 18+,0::Male||1::Male,1::Friends,0::Killed||1::Killed,0::Victim||1::Subject-Suspect
239567,1081772,2018-03-31,Maryland,Crisfield,1,1,0::Unknown,0::Handgun,"Shot - Wounded/Injured||Shot - Dead (murder, a...",37.9962,-75.8392,1.0,0::44||1::57,0::Adult 18+||1::Adult 18+,0::Female||1::Male,1::Significant others - current or former,0::Injured||1::Killed,0::Victim||1::Subject-Suspect
239568,1083136,2018-03-31,California,Santa Cruz,0,1,0::Unknown,0::Unknown,Shot - Wounded/Injured||Gang involvement,36.9702,-122.0200,1.0,0::26,0::Adult 18+||1::Adult 18+,0::Male||1::Male,1::Gang vs Gang,0::Injured||1::Unharmed,0::Victim||1::Subject-Suspect
239617,1081647,2018-03-31,Florida,Miami,1,0,0::Unknown,0::Unknown,"Shot - Dead (murder, accidental, suicide)||Chi...",25.8343,-80.2195,1.0,0::4||1::24,0::Child 0-11||1::Adult 18+,0::Female||1::Male,1::Family,"0::Killed||1::Unharmed, Arrested",0::Victim||1::Subject-Suspect


Now that our data has been cleaned up, it's time to explain what we are looking at. This dataset tracks every single recorded incident of gun violence in the United States between early 2013 and early 2018. It contains all the critical information we need to understand each incident that occurred, such as where and when it happened, who was involved, and what the outcome was. Below is a summary of each column and what it tells us about the incident.

- date: when the incident occurred
- state: what state the incident occurred in
- city_or_county: what city or county the incident occurred in
- n_killed: how many people were killed in the incident
- n_injured: how many people were injured in the incident
- gun_stolen: whether or not the gun/guns used were stolen
- gun_type: what type of gun/guns were used
- incident_characteristics: specific details about the incident
- latitude: geographic latitude of the incident
- longitude: geographic longitude of the incident
- n_guns_involved: how many guns were involved in the incident
- participant_age: a breakdown of each participant's age
- participant_age_group: a breakdown of each participant's age group
- participant_gender: a breakdown of each participant's gender
- participant_relationship: a breakdown of each participant's relationship to other participants
- participant_status: a breakdown of the outcome of each participant
- participant_type: a breakdown of each participant's role in the incident
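
The participant columns pack several values into one string, such as `0::Female||1::Male`, where the number before `::` identifies the participant. A minimal sketch of how such a field could be unpacked is shown below. The helper name `parse_participants` is hypothetical, and it assumes every entry follows the `index::value` pattern visible in the rows above.

```python
def parse_participants(field):
    """Turn '0::Male||1::Female' into {0: 'Male', 1: 'Female'}."""
    result = {}
    for entry in field.split("||"):
        index, sep, value = entry.partition("::")
        if not sep:
            # Skip anything that does not match the assumed pattern.
            continue
        result[int(index)] = value
    return result


# Example strings copied from the table output above.
ages = parse_participants("0::18||1::46||2::14||3::47")
genders = parse_participants("0::Male||1::Male||3::Male||4::Female")

print(ages)
print(genders)
```

Keying the result by participant index rather than by position matters because some fields skip participants, as in the second example, where participant 2 has no gender recorded. The values stay as strings, so ages would still need converting with `int` before any numeric work.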