# Welcome to our Combined Notebook!

In this project, we're looking for the BEST neighborhood to commit crime in. We'll be using Police Blotter, Wealth, and Non-Traffic Citation data to find the most crime-friendly neighborhood!

First, let's import some useful libraries.

In [7]:
# Combined Notebook

import pandas as pd
import numpy as np
import operator

%matplotlib inline

import matplotlib.pyplot as plt

pd.set_option('display.min_rows', 400) # So we see all the neighborhoods
pd.set_option('display.max_rows', 400)

# Zach's Data: Police Blotter Data

First up is the police blotter data, which contains crime statistics dating back to 2005, including the type of crime, location, time committed, and more!

Let's display the number of offenses per neighborhood.

In [18]:
# Zach's Data
policeData = pd.read_csv("archive-police-blotter.csv", parse_dates=True, low_memory=False) # Read File

# Drop unnecessary columns
policeData.drop(columns = ["PK", "CCR", "INCIDENTTIME", "INCIDENTLOCATION", "CLEAREDFLAG", "INCIDENTZONE", "HIERARCHYDESC", "INCIDENTTRACT", "X", "Y"], inplace = True)

# Drop the hierarchies we don't need: 8-29, plus 0, 1, 2, 4, and 99
unneededHierarchies = list(range(8, 30)) + [0, 1, 2, 4, 99]
policeData.drop(policeData[policeData.HIERARCHY.isin(unneededHierarchies)].index, inplace=True)

# Group by neighborhood
neighborhoods = policeData.groupby(["INCIDENTNEIGHBORHOOD"]).count()

# Remove hierarchy so we can combine all forms of stealing
neighborhoods.drop(columns = "HIERARCHY", inplace = True)

neighborhoods.sort_values("OFFENSES") # Sort values by # of offenses



INCIDENTNEIGHBORHOOD,OFFENSES
Outside County,10
Outside State,21
Mt. Oliver Boro,59
Chartiers City,97
New Homestead,120
Arlington Heights,202
Mt. Oliver Neighborhood,208
Glen Hazel,211
Hays,211
Summer Hill,216
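
The hierarchy filter in the cell above drops codes 8 through 29 plus 0, 1, 2, 4, and 99. As a quick sanity check, here is a minimal sketch in plain Python (no data file needed) that lists which codes below 30 survive the filter. The variable names are only for illustration and are not part of the notebook.

```python
# Codes removed by the notebook's filter: 8-29, plus 0, 1, 2, 4, and 99
dropped_codes = set(range(8, 30)) | {0, 1, 2, 4, 99}

# Codes below 30 that are NOT removed, i.e. the ones the analysis keeps
kept_low_codes = [code for code in range(30) if code not in dropped_codes]
print(kept_low_codes)
```

Any code of 30 or above other than 99 would also be kept, so this check only covers the range the filter was written for.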


# Renhan's Data: Non-Traffic Citations

In [12]:
# Renhan's Data

# Read the non-traffic citation data
non_traffic_citations = pd.read_csv("NonTrafficCitations.csv")

# Keep only the column that holds the neighborhood (column index 8)
neighborhoods = non_traffic_citations.iloc[:, 8]

# Count the citations per neighborhood
dictionary = {}
for row in neighborhoods:
    # Add 1 to the neighborhood's count, starting from 0 the first time it appears
    dictionary[row] = dictionary.get(row, 0) + 1

# Sort by citation count, lowest first (this will turn the dictionary into a list of tuples)
dictionary = sorted(dictionary.items(), key=operator.itemgetter(1))

# Print each neighborhood and its count
for key in dictionary:
    print(str(key[0]) + ": " + str(key[1]))

nan: 1
Ridgemont: 1
Mt. Oliver Boro: 1
Chartiers City: 1
Outside State: 2
Outside County: 2
St. Clair: 3
Mt. Oliver Neighborhood: 4
New Homestead: 5
Swisshelm Park: 5
Spring Garden: 7
Hays: 7
Regent Square: 7
Summer Hill: 7
Arlington Heights: 7
Unable To Retrieve Address: 7
Oakwood: 8
Fairywood: 9
Outside City: 9
Duquesne Heights: 13
Friendship: 14
Windgap: 14
Allegheny West: 16
East Carnegie: 17
Polish Hill: 17
Lower Lawrenceville: 19
Glen Hazel: 19
Morningside: 22
Bon Air: 22
Upper Hill: 23
Esplen: 25
California-Kirkbride: 26
Westwood: 26
Terrace Village: 26
West End: 27
Stanton Heights: 27
Mount Oliver: 28
Upper Lawrenceville: 31
Fineview: 31
Point Breeze North: 32
Manchester: 33
Banksville: 36
Homewood West: 38
Lincoln Place: 38
Troy Hill: 38
Perry North: 39
Golden Triangle/Civic Arena: 42
Highland Park: 42
Spring Hill-City View: 45
Overbrook: 46
South Shore: 46
Point Breeze: 47
Bedford Dwellings: 51
Crafton Heights: 54
Northview Heights: 55
West Oakland: 56
Beltzhoover: 56
Chateau
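
The tally above can also be sketched with `collections.Counter` from the standard library. This is only an illustration of the counting idea, not the notebook's actual cell: the list `toy_neighborhoods` is made up and is not taken from `NonTrafficCitations.csv`.

```python
from collections import Counter

# Hypothetical toy values, NOT rows from NonTrafficCitations.csv
toy_neighborhoods = ["Hays", "Oakwood", "Hays", "Fairywood", "Oakwood", "Hays"]

# Counter tallies how many times each value appears
toy_counts = Counter(toy_neighborhoods)

# Sort by count, lowest first, to match the ordering of the printed list
toy_sorted = sorted(toy_counts.items(), key=lambda item: item[1])
for name, count in toy_sorted:
    print(f"{name}: {count}")
```

On a pandas column, `value_counts()` does the same job in one call.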

# Rafael's Data: Total Income Estimate

Lastly, we have the total estimated annual income for each neighborhood, sorted from greatest to least.

In [17]:
# Rafael's Data

# Data (uploading, finding total estimate data, sorting)
wages = pd.read_csv("wage-or-salary-income-in-the-past-12-months-for-households.csv")
totalEstimates = wages.iloc[:, [0,2,3]]
greatestEstimates = totalEstimates.sort_values(by=['Estimate; Total:'], ascending = False)
greatestEstimates

Unnamed: 0,Neighborhood,Estimate; Total:,Margin of Error; Total:
68,Shadyside,7484.0,360.919659
77,Squirrel Hill South,7211.0,332.601864
13,Brookline,5843.0,327.69498
9,Bloomfield,4571.0,205.494525
53,Mount Washington,4385.0,249.523546
15,Carrick,4301.0,277.61304
7,Beechview,3567.0,258.02713
36,Greenfield,3438.0,251.708164
76,Squirrel Hill North,3370.0,187.890926
72,South Side Flats,3311.0,264.567572


# Data Analysis

For our project, we decided to grade neighborhoods for two types of criminals: those who are confident in their skills, and those who are relatively new and inexperienced.

For the first type of criminal, we'll assume that the best neighborhoods to rob are those with a low crime rate, since people there won't be expecting crime as much. Obviously, crime could be low because the police are really good or because security is tight, but we'll assume that a good criminal will either not care or be skilled enough to bypass these obstacles. This person would have to be a CRIMINAL MASTERMIND, as depicted below.

![Criminal Mastermind](https://cdn.discordapp.com/attachments/272551180415270912/963926885425872928/unknown.png)


For the second type of criminal, we'll assume they're inexperienced and would like to commit crime in a neighborhood where it is easier to succeed without being caught. This would be a neighborhood with a high crime rate. See picture below for more information.

![Robber](https://cdn.discordapp.com/attachments/272551180415270912/963925109029085245/unknown.png)

For the high-class criminal, wealthy neighborhoods with low crime and citation rates would be best.
For the low-class criminal, somewhat wealthy neighborhoods with high crime and citation rates would be best.

For the Police Blotter Data, the neighborhoods with the highest # of offenses were Golden Triangle, South Side Flats, and Shadyside, each with over 6000 offenses. The neighborhoods with the lowest # of offenses were Mt. Oliver Borough, Chartiers City, and New Homestead, all of which have fewer than 200 offenses.

For the Non-Traffic Citations, the neighborhoods with the highest # of citations were Central Oakland, Central Business District, and South Side Flats. The neighborhoods with the lowest # of citations were Ridgemont, Mt. Oliver Borough, and Chartiers City.

The wealthiest neighborhoods are Shadyside, Squirrel Hill South, Brookline, Bloomfield, and Mount Washington.
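
To look at the three data sets side by side, one option is to line them up by neighborhood name. The sketch below is an assumption about how that could be done, not a cell from the notebook. It uses a handful of figures quoted in this notebook, and the column names `offenses`, `citations`, and `wage_estimate` are made up for the example.

```python
import pandas as pd

# Figures quoted in this notebook for three neighborhoods
offenses = pd.Series({"South Side Flats": 8855, "Shadyside": 6770, "Central Oakland": 4035})
citations = pd.Series({"South Side Flats": 3082, "Shadyside": 285, "Central Oakland": 622})
# The wage table output only shows its top rows, so Central Oakland is left out here
wage_estimate = pd.Series({"Shadyside": 7484.0, "South Side Flats": 3311.0})

# Building a DataFrame from Series aligns them on the neighborhood name;
# a neighborhood missing from one Series gets NaN in that column
combined = pd.DataFrame({
    "offenses": offenses,
    "citations": citations,
    "wage_estimate": wage_estimate,
})
print(combined.sort_values("offenses", ascending=False))
```

This only works when the data sets spell a neighborhood's name the same way, so names would need to be checked before relying on the combined table.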

For our metric, we will rate the Police Blotter and Non-Traffic Citation categories out of ten each and add the ratings together. Wealth is rated out of 5 instead, since we consider a wealthy neighborhood an "added bonus" for the criminal. The neighborhood with the best total score for each type of criminal will be the best. Rather than rating each neighborhood (as there are WAY too many), we'll take the top 4 from the Police Blotter and Non-Traffic Citation data for each type of criminal and rate those.

## First Type of Criminal

### Mt. Oliver Borough:
    Police Blotter Incidents: 59 offenses, 10/10
    Non-Traffic Citations: 1 citation, 10/10
    Wealth: 1/5
    
    Total: 21/25
### Chartiers City:
    Police Blotter Incidents: 97 offenses, 9.5/10
    Non-Traffic Citations: 1 citation, 10/10
    Wealth: 1/5
    
    Total: 20.5/25
### New Homestead:
    Police Blotter Incidents: 120 offenses, 9/10
    Non-Traffic Citations: 5 citations, 9/10
    Wealth: 1.25/5
    
    Total: 19.25/25
### Ridgemont:
    Police Blotter Incidents: 307 offenses, 7.5/10
    Non-Traffic Citations: 1 citation, 10/10
    Wealth: 1.5/5
    
    Total: 19/25
    
## Second Type of Criminal

### Golden Triangle:
    Police Blotter Incidents: 11113 offenses, 10/10
    Non-Traffic Citations: 4/10
    Wealth: N/A
    
    Total: 14/20
### South Side Flats:
    Police Blotter Incidents: 8855 offenses, 9.5/10
    Non-Traffic Citations: 3082 citations, 10/10
    Wealth: 4.25/5
    
    Total: 23.75/25
### Shadyside:
    Police Blotter Incidents: 6770 offenses, 8/10
    Non-Traffic Citations: 285 citations, 7/10
    Wealth: 5/5
    
    Total: 20/25
### Central Oakland:
    Police Blotter Incidents: 4035 offenses, 6.5/10
    Non-Traffic Citations: 622 citations, 8/10
    Wealth: 4/5
    
    Total: 18.5/25
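
The totals in the scorecards above are simple sums of the individual ratings. As a minimal sketch, the helper below (a hypothetical function named `total_score`, not something from the notebook) adds up one scorecard and skips wealth when it is listed as N/A, which is why Golden Triangle is scored out of 20 rather than 25.

```python
def total_score(blotter, citations, wealth=None):
    """Return (total, maximum) for one neighborhood's scorecard."""
    total = blotter + citations
    maximum = 20  # two categories rated out of 10 each
    if wealth is not None:
        total += wealth
        maximum += 5  # wealth is rated out of 5
    return total, maximum

# Ratings copied from three of the scorecards above
scorecards = {
    "Mt. Oliver Borough": (10, 10, 1),
    "Golden Triangle": (10, 4, None),  # wealth is N/A
    "South Side Flats": (9.5, 10, 4.25),
}

for name, (blotter, citations, wealth) in scorecards.items():
    total, maximum = total_score(blotter, citations, wealth)
    print(f"{name}: {total}/{maximum}")
```

The ratings themselves were assigned by hand, so the function only checks the arithmetic, not how each rating was chosen.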
    
# Conclusions

To conclude, for the high-class (first type of) criminal, Mt. Oliver Borough is the best neighborhood! It has the lowest crime rate, although its wealth rating is not very high. Despite that, there should be at least a few houses with some valuable loot waiting to be plundered.

For the low-class (second type of) criminal, South Side Flats would be the best. It is very wealthy and has an incredibly high number of incidents. It is the perfect neighborhood.