# Capstone Project - The Battle of the Neighborhoods

## Table of contents
* [A. Introduction](#introduction)
* [B. Data](#data)
* [C. Methodology](#methodology)
* [D. Results](#results)
* [E. Discussion](#discussion)
* [F. Conclusion](#conclusion)

## A. Introduction  <a name="introduction"></a>

Allocating police resources is challenging. Certain types of crime likely occur more often in certain areas and near certain types of venues. If police had a better idea of where specific crimes, and crime in general, occur, they would be able to distribute their resources (manpower, equipment, etc.) more efficiently and implement preventative measures.

In [37]:
from geopy.geocoders import Nominatim
import pandas as pd
import requests
import json

In [38]:
address = 'Chicago, IL'
geolocator = Nominatim(user_agent="city_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude

print('The geographical coordinates of {} are {}, {}.'.format(address, latitude, longitude))

The geographical coordinates of Chicago, IL are 41.8755616, -87.6244212.


In [39]:
# Load the Foursquare API credentials from a local JSON file kept outside the repository.
local_filepath = '..\\foursquare_credentials.txt'
with open(local_filepath, "r") as f:
    credentials = json.load(f)

CLIENT_ID = credentials['CLIENT_ID']
CLIENT_SECRET = credentials['CLIENT_SECRET']
VERSION = credentials['VERSION']

In [40]:
LIMIT = 1000
def getNearbyVenues(latitudes, longitudes, radius=500):
    
    venues_list=[]
    for lat, lng in zip(latitudes, longitudes):
        
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
         
        results = requests.get(url).json()["response"]["groups"][0]["items"]
        venues_list.append([(
            v['venue']['categories'][0]['name'],
            v['venue']['location']['lat'], 
            v['venue']['location']['lng']) for v in results])

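    # Flatten the per-location lists into one table. Naming the columns in the
    # constructor also works when no venues were returned.
    nearby_venues = pd.DataFrame(
        [item for venue_list in venues_list for item in venue_list],
        columns=['Venue Category', 'Venue Latitude', 'Venue Longitude'])
    return nearby_venues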

In [41]:
df = getNearbyVenues([latitude], [longitude])

In [42]:
df.shape

(100, 3)

In [43]:
df.head()

| | Venue Category | Venue Latitude | Venue Longitude |
|---|---|---|---|
| 0 | Theater | 41.876058 | -87.625303 |
| 1 | Cuban Restaurant | 41.875724 | -87.626386 |
| 2 | Sushi Restaurant | 41.876969 | -87.624534 |
| 3 | Hostel | 41.875757 | -87.626537 |
| 4 | Donut Shop | 41.876768 | -87.624575 |


In [52]:
# One-hot encode the venue categories (one indicator column per category).
onehot = pd.get_dummies(df[['Venue Category']], prefix="", prefix_sep="")

In [53]:
onehot.shape

(100, 63)

In [54]:
onehot.head(3)

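| | American Restaurant | Arepa Restaurant | Art Museum | Arts & Crafts Store | Asian Restaurant | Bakery | Bookstore | Boutique | Bubble Tea Shop | Building | ... | Sandwich Place | Snack Place | Spanish Restaurant | Speakeasy | Sushi Restaurant | Tapas Restaurant | Thai Restaurant | Theater | Trail | Whisky Bar |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |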


In [61]:
venue_categories = list(onehot.columns)
venue_categories[:5]

['American Restaurant',
 'Arepa Restaurant',
 'Art Museum',
 'Arts & Crafts Store',
 'Asian Restaurant']

## B. Data <a name="data"></a>

The venue category data for Chicago, IL will be combined with crime data for the past year, downloaded from the Chicago municipal website, to find which venue types are most likely to have each type of crime occur there. In addition, for each venue category, the types of crime most likely to occur there will be determined.

https://data.cityofchicago.org/Public-Safety/Crimes-Map/dfnk-7re6
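Once crimes have been counted per venue category, both questions above come down to normalizing a count table along one axis or the other. The following is a minimal sketch of that idea; the counts are hypothetical values chosen for illustration, not results from this analysis, and the variable names are assumptions.

```python
import pandas as pd

# Hypothetical counts for illustration only; rows are crime types,
# columns are venue categories.
counts = pd.DataFrame(
    {"Theater": [4, 1], "Donut Shop": [1, 3]},
    index=["THEFT", "BATTERY"],
)

# For each crime type (row): share of its incidents falling in each venue category.
venue_share_by_crime = counts.div(counts.sum(axis=1), axis=0)

# For each venue category (column): share of its incidents by crime type.
crime_share_by_venue = counts.div(counts.sum(axis=0), axis=1)

# Most likely venue category per crime type, and most likely crime per venue category.
top_venue_per_crime = venue_share_by_crime.idxmax(axis=1)
top_crime_per_venue = crime_share_by_venue.idxmax(axis=0)

print(top_venue_per_crime)
print(top_crime_per_venue)
```

A row or column that sums to zero produces NaN shares, so empty crime types or venue categories would need to be dropped or handled before calling `idxmax`.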

In [67]:
file_path = 'chicago_crime.csv'
crime_df = pd.read_csv(file_path)

In [68]:
# Drop incomplete records and renumber the rows.
crime_df = crime_df.dropna().reset_index(drop=True)

In [69]:
print(crime_df.shape)
crime_df.head()

(254366, 5)


| | DATE OF OCCURRENCE | PRIMARY DESCRIPTION | LOCATION DESCRIPTION | LATITUDE | LONGITUDE |
|---|---|---|---|---|---|
| 0 | 6/24/2019 18:24 | BATTERY | SIDEWALK | 41.753506 | -87.665947 |
| 1 | 12/5/2019 18:43 | NARCOTICS | SIDEWALK | 41.862559 | -87.721771 |
| 2 | 6/24/2019 11:00 | THEFT | STREET | 41.992936 | -87.700697 |
| 3 | 11/19/2019 19:20 | THEFT | CTA BUS | 41.778768 | -87.683628 |
| 4 | 11/19/2019 0:10 | BATTERY | APARTMENT | 41.883109 | -87.760218 |


In [75]:
crime_categories = list(crime_df['PRIMARY DESCRIPTION'].unique())
crime_categories

['BATTERY',
 'NARCOTICS',
 'THEFT',
 'CRIMINAL DAMAGE',
 'KIDNAPPING',
 'DECEPTIVE PRACTICE',
 'WEAPONS VIOLATION',
 'CRIMINAL TRESPASS',
 'ASSAULT',
 'OTHER OFFENSE',
 'ROBBERY',
 'MOTOR VEHICLE THEFT',
 'BURGLARY',
 'OFFENSE INVOLVING CHILDREN',
 'PUBLIC PEACE VIOLATION',
 'SEX OFFENSE',
 'CONCEALED CARRY LICENSE VIOLATION',
 'INTERFERENCE WITH PUBLIC OFFICER',
 'CRIM SEXUAL ASSAULT',
 'STALKING',
 'PROSTITUTION',
 'GAMBLING',
 'INTIMIDATION',
 'ARSON',
 'HOMICIDE',
 'LIQUOR LAW VIOLATION',
 'NON-CRIMINAL',
 'PUBLIC INDECENCY',
 'OBSCENITY',
 'HUMAN TRAFFICKING',
 'OTHER NARCOTIC VIOLATION']

## C. Methodology <a name="methodology"></a>

A natural question is what it means for a crime to have occurred near a type of venue. In this analysis, a crime is considered near a venue if the two are less than 0.1 miles (528 feet) apart.
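One way to apply the 0.1 mile threshold to latitude/longitude pairs is the haversine (great-circle) distance. The sketch below is an assumption about how the check could be done, not the notebook's final method; the function names, the constants, and the use of a mean Earth radius of 3958.8 miles are all illustrative choices.

```python
import math

EARTH_RADIUS_MILES = 3958.8   # mean Earth radius (approximation)
NEAR_THRESHOLD_MILES = 0.1    # "near" threshold from the text (528 feet)


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def is_near(lat1, lon1, lat2, lon2, threshold=NEAR_THRESHOLD_MILES):
    """True when the two points are closer than the threshold (in miles)."""
    return haversine_miles(lat1, lon1, lat2, lon2) < threshold


# Two venues from the sample above (Theater and Cuban Restaurant) are near each other.
print(is_near(41.876058, -87.625303, 41.875724, -87.626386))
# The first crime record in the sample is not near the Theater.
print(is_near(41.753506, -87.665947, 41.876058, -87.625303))
```

At city scale a flat-earth approximation would give nearly the same answer; haversine is used here only because it needs no projection step.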

To be added in week 5

In [93]:
# Count table: one row per crime type, one column per venue category, initialized to 0.
crime_by_venue = pd.DataFrame(0, index=crime_categories, columns=venue_categories)

In [94]:
crime_by_venue

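| | American Restaurant | Arepa Restaurant | Art Museum | Arts & Crafts Store | Asian Restaurant | Bakery | Bookstore | Boutique | Bubble Tea Shop | Building | ... | Sandwich Place | Snack Place | Spanish Restaurant | Speakeasy | Sushi Restaurant | Tapas Restaurant | Thai Restaurant | Theater | Trail | Whisky Bar |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BATTERY | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| NARCOTICS | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| THEFT | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| CRIMINAL DAMAGE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| KIDNAPPING | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| DECEPTIVE PRACTICE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| WEAPONS VIOLATION | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| CRIMINAL TRESPASS | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| ASSAULT | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| OTHER OFFENSE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |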


In [91]:
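# Test write to a single cell. A single .loc call avoids chained indexing,
# which may assign to a temporary copy instead of the table itself.
crime_by_venue.loc[crime_by_venue.index[0], 'American Restaurant'] = 1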

In [137]:
# Prototype of the nested iteration: the breaks stop after the first crime
# incident and the first venue, to check that the fields are read correctly.
for crime_type in crime_categories:
    crimes_of_this_type = crime_df[crime_df['PRIMARY DESCRIPTION'] == crime_type]
    for _, incident in crimes_of_this_type.iterrows():
        # Read fields by column name rather than by position.
        crime = incident['PRIMARY DESCRIPTION']
        crime_latitude = incident['LATITUDE']
        crime_longitude = incident['LONGITUDE']
        for venue_type in venue_categories:
            venues_of_this_type = df[df['Venue Category'] == venue_type]
            for _, venue_row in venues_of_this_type.iterrows():
                venue = venue_row['Venue Category']
                venue_latitude = venue_row['Venue Latitude']
                venue_longitude = venue_row['Venue Longitude']
                print(venue, venue_longitude, venue_latitude)
                break
            break
        break
    break

American Restaurant -87.62942015891375 41.87554611111292
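The nested loops above visit every crime and venue pair one row at a time, which is slow for roughly 250,000 crime records. A possible alternative, sketched here under assumptions, computes all pairwise distances at once with NumPy broadcasting and tallies the pairs within 0.1 miles. The function names and the tiny input tables are hypothetical; only the column names are taken from the tables above. Each crime and venue pair within range counts once, so a crime near two venues of the same category is counted twice, which is one possible counting rule among several.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius (approximation)
NEAR_MILES = 0.1             # "near" threshold used in this analysis


def haversine_miles(lat1, lon1, lat2, lon2):
    """Vectorized great-circle distance in miles; inputs are in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


def count_crimes_near_venues(crimes, venues, near_miles=NEAR_MILES):
    """Cross-tabulate crime type by venue category for all pairs within range."""
    # Broadcasting a column of crimes against a row of venues gives an
    # (n_crimes, n_venues) distance matrix.
    dist = haversine_miles(
        crimes["LATITUDE"].to_numpy()[:, None],
        crimes["LONGITUDE"].to_numpy()[:, None],
        venues["Venue Latitude"].to_numpy()[None, :],
        venues["Venue Longitude"].to_numpy()[None, :],
    )
    crime_idx, venue_idx = np.nonzero(dist < near_miles)
    pairs = pd.DataFrame({
        "crime": crimes["PRIMARY DESCRIPTION"].to_numpy()[crime_idx],
        "venue": venues["Venue Category"].to_numpy()[venue_idx],
    })
    return pd.crosstab(pairs["crime"], pairs["venue"])


# Hypothetical inputs for illustration: one crime placed next to two venues, one far away.
venues = pd.DataFrame({
    "Venue Category": ["Theater", "Donut Shop"],
    "Venue Latitude": [41.876058, 41.876768],
    "Venue Longitude": [-87.625303, -87.624575],
})
crimes = pd.DataFrame({
    "PRIMARY DESCRIPTION": ["THEFT", "BATTERY"],
    "LATITUDE": [41.876100, 41.753506],
    "LONGITUDE": [-87.625300, -87.665947],
})
table = count_crimes_near_venues(crimes, venues)
print(table)
```

For the full crime table the distance matrix has tens of millions of entries, so processing the crimes in chunks (or pre-filtering to a bounding box around the venues) may be needed to keep memory use reasonable.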


## D. Results <a name="results"></a>

To be added in week 5

## E. Discussion <a name="discussion"></a>

To be added in week 5

## F. Conclusion <a name="conclusion"></a>

To be added in week 5