# Taser death statistics
Taser-related analysis based on the restraint-custody-death data project. It supports these stories:
- [Police withheld records of their son’s death. Now they know why.](http://www.mystatesman.com/news/texas-police-withheld-records-their-son-death-now-they-know-why/MHJC1hWAbPhcN6gOtqOkyM/)
- [‘48 marks on his back’ — How Tasers figure into police custodial deaths](http://www.mystatesman.com/news/marks-his-back-how-tasers-figure-into-deaths-police-custody/fVR3r93ypApkIi7dOtH6hP/)
- [3-minute Taser jolt, quick settlement: Was justice done in Burnet death?](http://www.mystatesman.com/news/crime--law/minute-taser-jolt-quick-settlement-was-justice-served-burnet-man-death/5aSntE4PMC8ZvWqmuNcF1J/)
- [Uncommon but ‘horrific’: When Tasers set their targets on fire](http://www.mystatesman.com/news/crime--law/uncommon-but-horrific-when-tasers-set-their-targets-fire/KxGJuOzq0luN3J6Wdfjy8L/)

## Findings
- [Number of taser deaths](#Number-of-taser-deaths)
- [Cases where shock is cause of death](#Number-of-cases-shock-is-cause-of-death)
- [Looking at times tased](#Times-tased)
- [Tased with toxicity](#Tased-with-toxicity)
- [Data integrity checks](#Data-integrity-checks)

In [1]:
import agate
import re
from IPython.display import display

import warnings
warnings.filterwarnings('ignore')

### Getting and setting up data

Starts with the `deaths_latest.csv` file that is created when the [File processing](./) notebook is run. 

In [2]:
# sets data types on fields agate got wrong
specified_data_types = {
    'tracked_cause': agate.Text(),
    'offense': agate.Text(),
    'case_study': agate.Text(),
    'official_discipline': agate.Text()
}

# this pulls the deaths file that is exported in the File processing notebook
deaths = agate.Table.from_csv('../exports/deaths_latest.csv', column_types=specified_data_types)

print(deaths)

| column               | data_type |
| -------------------- | --------- |
| id                   | Number    |
| ag_report_url        | Text      |
| first_name           | Text      |
| middle_name          | Text      |
| last_name            | Text      |
| suffix               | Text      |
| slug                 | Text      |
| race                 | Text      |
| gender               | Text      |
| date_of_birth        | Date      |
| date_of_death        | Date      |
| age                  | Number    |
| agency               | Number    |
| restrained           | Boolean   |
| tazed                | Boolean   |
| times_tazed          | Number    |
| pepper_sprayed       | Boolean   |
| official_discipline  | Text      |
| grand_jury_result    | Text      |
| mental_health_issues | Boolean   |
| manner_of_death      | Text      |
| drug_intoxication    | Boolean   |
| cause_of_death       | Text      |
| tracked_cause        | Text      |
| offense              | Text      |

## Taser deaths

Numbers and graphic data related to Taser use in the restraint-custody-death data.

Style note: The database fields are named `tazed` and `times_tazed`, which is wrong in several ways. The past tense would be "tased", but Taser is a brand name, so it is best avoided in favor of "shock weapon" or "shocked".


### Total number of deaths in our data


In [3]:
deaths_total_count = len(deaths)
print('Total deaths: {}'.format(deaths_total_count))

Total deaths: 289


### Number of taser deaths

Rows where `tazed` is `True` (marked "Yes" in the database).

In [4]:
# count the number of records with tazed
deaths_tased_count = deaths.aggregate(agate.Count('tazed', True))
print('Number of deaths where tased: {}'.format(deaths_tased_count))

# tazed / total
print('Percentage of deaths where deceased was tased: {:.1%}'.format(
        deaths_tased_count / deaths_total_count
    ))

Number of deaths where tased: 87
Percentage of deaths where deceased was tased: 30.1%


## Electric shock deaths
One way we wanted to cut the data was to count the people on whom a Taser was used and for whom electric shock was listed as one of the official causes of death.

In the database, `tracked_cause` was saved as a foreign key, so the Deaths table holds only the IDs, comma-separated when a death has more than one tracked cause.

`Electric shock` has an ID of 36.

The method to find this stat:
- pivot by `tazed`, then `tracked_cause`
- count the rows in each group, which gives the number of matching cases
- filter that table to rows where `tracked_cause` includes `36`, which is electric shock
- also filter to `tazed` is true, in case there is another type of electrocution
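
The matching step can also be sketched in plain Python, without agate. This is only an illustration, assuming `tracked_cause` arrives as a comma-separated string of IDs such as `'1,36'`. The `has_cause` helper is hypothetical, the sample rows are shaped like a pivot of `tazed` and `tracked_cause`, and the `136` row is invented to show an edge case. Splitting on commas and comparing whole IDs avoids a substring match, where a search for `36` would also hit an ID like `136`.

```python
def has_cause(tracked_cause, cause_id):
    """Return True if cause_id is one of the comma-separated IDs."""
    if not tracked_cause:
        return False
    ids = [part.strip() for part in str(tracked_cause).split(',')]
    return str(cause_id) in ids


# Illustrative rows: (tazed, tracked_cause, count)
sample_rows = [
    (True, '1,36', 2),
    (True, '36', 4),
    (True, '2,36', 2),
    (True, '136', 1),    # invented ID: must NOT count as electric shock
    (False, '1,2', 7),
    (True, None, 56),    # no tracked cause recorded
]

# Sum the counts for rows that are tazed AND list cause 36
shock_total = sum(
    count
    for tazed, cause, count in sample_rows
    if tazed and has_cause(cause, 36)
)
print(shock_total)
```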

In [5]:
# Count of tracked_cause of tased folks
cause_taser_types = deaths.pivot(['tazed', 'tracked_cause'])

cause_taser_types.print_table()

| tazed | tracked_cause | Count |
| ----- | ------------- | ----- |
| False |               |   139 |
| False | 1             |    29 |
| False | 2             |    27 |
| False | 1,2           |     7 |
|  True |               |    56 |
|  True | 2             |    11 |
|  True | 1             |     7 |
|  True | 1,2           |     5 |
|  True | 1,36          |     2 |
|  True | 36            |     4 |
|  True | 2,36          |     2 |


In [6]:
# this filters the cause_taser_types table to just those where
# the tracked_cause IDs include '36', which is Electric shock
# (\b keeps '36' from matching inside a longer ID such as '136')
shocked = cause_taser_types.where(
    lambda row: re.search(r'\b36\b', str(row['tracked_cause']))
)

# this further filters to tazed is true
tased_shocked = shocked.where(lambda row: row['tazed'] == True)

# prints it
tased_shocked.print_table()

| tazed | tracked_cause | Count |
| ----- | ------------- | ----- |
|  True | 1,36          |     2 |
|  True | 36            |     4 |
|  True | 2,36          |     2 |


### Number of cases shock is cause of death

In [7]:
# sums the Count column of those tazed with Electric shock as a tracked cause
# this is number of dead where electric shock is listed as a cause
cases_shocked_tased = tased_shocked.aggregate(agate.Sum('Count'))
print('Total cases where Electric shock is an official cause of death: {}'.format(cases_shocked_tased))

Total cases where Electric shock is an official cause of death: 8


## Taser deaths by race

Demographics of those marked as being tased.

In [8]:
# select just the tazed cases
tased_true = deaths.where(lambda row: row['tazed'] == True)

# pivot tased cases by race
tased_race_pivot = tased_true.pivot('race_name')

# print a table of the pivot
tased_race_pivot.print_table()

print("\n")

# print bars of the pivot
tased_race_pivot.print_bars('race_name', 'Count', width=60)

tased_race_pivot.to_csv('../exports/tased_race_pivot.csv')

| race_name       | Count |
| --------------- | ----- |
| Black           |    42 |
| Hispanic/Latino |    21 |
| Asian           |     1 |
| White           |    23 |


race_name       Count
Black              42 ▓░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░      
Hispanic/Latino    21 ▓░░░░░░░░░░░░░░░░                     
Asian               1 ▓░                                    
White              23 ▓░░░░░░░░░░░░░░░░░                    
                      +--------+--------+---------+--------+
                      0.0    12.5     25.0      37.5    50.0


## Times tased

In [9]:
# Create a table of those tased
tased = deaths.where(lambda row: row['tazed'] is True)

print('Number of people who were tased:\n{}'.format(len(tased)))

Number of people who were tased:
87


In [10]:
# histogram to show how many subjects have been
# tased a certain number of times
# the first number in each bin label is the times tazed,
# except for the last bin, which covers a range
tased_bins = tased.bins('times_tazed', 11, 0, 11)

tased_bins.to_csv('../exports/tased_bins.csv')

tased_bins.print_table()

tased_bins.print_bars('times_tazed', width=60)

| times_tazed | Count |
| ----------- | ----- |
| [1 - 2)     |    22 |
| [2 - 3)     |    27 |
| [3 - 4)     |    13 |
| [4 - 5)     |    10 |
| [5 - 6)     |     2 |
| [6 - 7)     |     2 |
| [7 - 8)     |     4 |
| [10 - 50]   |     7 |
times_tazed Count
[1 - 2)        22 ▓░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░           
[2 - 3)        27 ▓░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░    
[3 - 4)        13 ▓░░░░░░░░░░░░░░░░░░                       
[4 - 5)        10 ▓░░░░░░░░░░░░░░                           
[5 - 6)         2 ▓░░░                                      
[6 - 7)         2 ▓░░░                                      
[7 - 8)         4 ▓░░░░░                                    
[10 - 50]       7 ▓░░░░░░░░░░                               
                  +---------+---------+----------+---------+
                  0.0      7.5      15.0       22.5     30.0
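
To make the bin labels concrete: in the output above, each bin below 10 covers a single count and the last bin collects everything from 10 up. A rough plain-Python equivalent follows, assuming a list of per-person counts. The `bin_label` helper and the `sample_counts` values are made up for illustration and are not taken from the data.

```python
from collections import Counter


def bin_label(times, cap=10):
    """Label a count: one bin per value below cap, then a catch-all bin."""
    if times >= cap:
        return '{}+'.format(cap)
    return str(times)


# Hypothetical times-tased values, one per person
sample_counts = [1, 2, 2, 3, 7, 12, 50]

histogram = Counter(bin_label(n) for n in sample_counts)

# Print bins in numeric order, with the catch-all bin last
for label in sorted(histogram, key=lambda s: int(s.rstrip('+'))):
    print(label, histogram[label])
```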


In [11]:
# to check a specific number of times_tazed
tased.aggregate(agate.Count('times_tazed', 2))

27

## Tased with toxicity

This is a list of cases where the subject was tased and the `cause_of_death` field gives some indication that drugs were involved.
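
A minimal sketch of the keyword matching, assuming free-text cause strings. The pattern mirrors the drug terms used in this notebook, while the `mentions_drugs` helper and the sample strings are hypothetical. Passing `re.IGNORECASE` once covers every alternative in the pattern. Short stems such as `meth` can over-match, so hits should still be reviewed by hand.

```python
import re

# Drug-related terms; one flag makes the whole pattern case-insensitive
DRUG_TERMS = re.compile(
    r'meth|cocaine|toxic|overdose|phencyclidine|PCP|drug|ecstasy|MDMA',
    re.IGNORECASE,
)


def mentions_drugs(cause_of_death):
    """Return True if the cause text contains any drug-related term."""
    return bool(DRUG_TERMS.search(str(cause_of_death or '')))


# Hypothetical cause_of_death values, including a missing one
samples = [
    'Acute cocaine toxicity',
    'Blunt force trauma',
    None,
]

flagged = [s for s in samples if mentions_drugs(s)]
print(flagged)
```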

In [12]:
# regex search based on drug terms (case-insensitive)
# was double-checked by hand
tased_toxic = tased.where(lambda row: re.search(
        'meth|cocaine|toxic|overdose|phencyclidine|PCP|drug|ecstasy|MDMA',
        str(row['cause_of_death']),
        re.IGNORECASE
    ))

# print numbers and percentages
print('Those tased with toxicity: {}, or {:.1%} of the total {} tased.\n'.format(
        len(tased_toxic),
        len(tased_toxic) / deaths_tased_count,
        len(tased)
    ))

# print list of deceased tased with toxicity
print('Names of those in list:\n')
tased_toxic.select(['first_name',
                   'middle_name',
                   'last_name',
                   'reporter_assigned']
                  ).rename(column_names={
                          'reporter_assigned': 'reporter'
                          }).order_by('last_name').print_table(max_rows=None)


Those tased with toxicity: 59, or 67.8% of the total 87 tased.

Names of those in list:

| first_name | middle_name   | last_name        | reporter |
| ---------- | ------------- | ---------------- | -------- |
| Pierre     | Tourell       | Abernathy        | ed       |
| Raymond    | Luther        | Allen            | ed       |
| Ross       | Allen         | Anthony          | ed       |
| Manuel     | A.            | Baltazar         | ed       |
| Willie     | Ray           | Banks            | ed       |
| Richard    | Eduardo       | Battistata       | js       |
| Derrick    | Anthony       | Birdow           | ab       |
| Inocencio  | Juarez        | Cardenas         | js       |
| Ernesto    |               | Carraman         | ed       |
| Michael    | Shea          | Cassel           |          |
| Wilber     |               | Castillo-Gongora | ab       |
| Denis      | John          | Chabot           | ed       |
| Norman     | Lee           | Cooper           | ed       |

## Data integrity checks

Other cleanup or checks as necessary.
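
The checks in this section follow one pattern: filter for rows that should not exist and confirm the result is empty. A plain-Python sketch of that pattern follows, assuming rows are dicts with `id`, `tazed` and `times_tazed` keys. The `find_problems` helper and the sample rows are hypothetical.

```python
def find_problems(rows):
    """Return (id, reason) pairs for rows that fail the integrity checks."""
    problems = []
    for row in rows:
        if row['tazed'] is None:
            problems.append((row['id'], 'tazed is null'))
        elif row['tazed'] and row['times_tazed'] is None:
            problems.append((row['id'], 'tazed but times_tazed is null'))
    return problems


# Hypothetical records: only ids 2 and 3 should be flagged
sample_rows = [
    {'id': 1, 'tazed': True, 'times_tazed': 3},
    {'id': 2, 'tazed': True, 'times_tazed': None},
    {'id': 3, 'tazed': None, 'times_tazed': None},
    {'id': 4, 'tazed': False, 'times_tazed': None},
]

print(find_problems(sample_rows))
```

An empty list means the data passes both checks.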

### Records where tazed is null

In [13]:
# finds deaths rows where `tazed` is null
tazed_null = deaths.where(lambda row: row['tazed'] is None)

# print number of records
print('Number of cases where `tazed` is null: {}\n'.format(
      len(tazed_null)  
    ))

# selects the name columns from the tazed_null table and prints them
tazed_null_names = tazed_null.select([
        'first_name',
        'middle_name',
        'last_name',
        'reporter_assigned'
    ]).order_by('last_name')
tazed_null_names.print_table(max_rows=85)

Number of cases where `tazed` is null: 0

| first_name | middle_name | last_name | reporter_assigned |
| ---------- | ----------- | --------- | ----------------- |


### Tazed marked Yes, but times_tazed is null

In [14]:
# finds deaths rows where `tazed` is True but `times_tazed` is null
tazed_not_null = deaths.where(lambda row: row['tazed'] is True)
tazed_time_null = tazed_not_null.where(lambda row: row['times_tazed'] is None)

tazed_time_null.select([
        'first_name',
        'middle_name',
        'last_name',
        'reporter_assigned'
    ]).order_by('last_name').rename(column_names = {
        'reporter_assigned': 'reporter'
    }).print_table(max_rows=None)

| first_name | middle_name | last_name | reporter |
| ---------- | ----------- | --------- | -------- |
