# Capstone Project - The Battle of the Neighborhoods

## Table of contents
* [A. Introduction](#introduction)
* [B. Data](#data)
* [C. Methodology](#methodology)
* [D. Results](#results)
* [E. Discussion](#discussion)
* [F. Conclusion](#conclusion)

## A. Introduction  <a name="introduction"></a>

Allocating police resources is a challenging endeavor. Certain types of crime are likely to be concentrated in certain areas and around certain types of venues. If police had a better idea of where specific crimes, and crime in general, occur, they could distribute their resources (manpower, equipment, etc.) more efficiently and implement preventative measures.

In [20]:
from geopy.geocoders import Nominatim
import pandas as pd
import requests
import json

In [14]:
address = 'Chicago, IL'
geolocator = Nominatim(user_agent="city_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude

print('The geographical coordinates of {} are {}, {}.'.format(address, latitude, longitude))

The geographical coordinates of Chicago, IL are 41.8755616, -87.6244212.


In [15]:
# Load the Foursquare API credentials from a local JSON file
local_filepath = '..\\foursquare_credentials.txt'
with open(local_filepath, "r") as f:
    credentials = json.load(f)

CLIENT_ID = credentials['CLIENT_ID']
CLIENT_SECRET = credentials['CLIENT_SECRET']
VERSION = credentials['VERSION']

In [16]:
LIMIT = 1000  # upper bound on venues requested per call

def getNearbyVenues(latitudes, longitudes, radius=500):
    """Return the category and coordinates of venues within `radius` meters of each point."""
    venues_list = []
    for lat, lng in zip(latitudes, longitudes):
        # Build the Foursquare "explore" request for this point
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, CLIENT_SECRET, VERSION, lat, lng, radius, LIMIT)

        results = requests.get(url).json()["response"]["groups"][0]["items"]

        # Keep the first listed category and the coordinates of each venue
        venues_list.append([(
            v['venue']['categories'][0]['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng']) for v in results])

    # Flatten the per-point lists into a single DataFrame
    nearby_venues = pd.DataFrame(
        [item for venue_list in venues_list for item in venue_list],
        columns=['Venue Category', 'Venue Latitude', 'Venue Longitude'])
    return nearby_venues
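
The list comprehension in `getNearbyVenues` assumes that every returned venue has at least one category and a full set of coordinates. A minimal sketch of a more defensive extraction step is shown here; `extract_venues` and `sample_items` are hypothetical names, and the payload is made up to mirror the shape of `response["groups"][0]["items"]` rather than copied from real API output.

```python
import pandas as pd


def extract_venues(items):
    """Flatten explore-style items into (category, lat, lng) rows."""
    rows = []
    for item in items:
        venue = item.get("venue", {})
        categories = venue.get("categories") or []
        location = venue.get("location", {})
        # Skip venues with no category or missing coordinates instead of raising.
        if not categories or "lat" not in location or "lng" not in location:
            continue
        rows.append((categories[0]["name"], location["lat"], location["lng"]))
    return pd.DataFrame(
        rows, columns=["Venue Category", "Venue Latitude", "Venue Longitude"]
    )


# Hypothetical payload for illustration only; the second venue has no category.
sample_items = [
    {"venue": {"categories": [{"name": "Theater"}],
               "location": {"lat": 41.876058, "lng": -87.625303}}},
    {"venue": {"categories": [],
               "location": {"lat": 41.875724, "lng": -87.626386}}},
]
venues = extract_venues(sample_items)
print(venues)
```

Skipping incomplete venues keeps one malformed record from aborting the whole request loop, at the cost of silently dropping those venues.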

In [17]:
df = getNearbyVenues([latitude], [longitude])

In [18]:
df.head()

|   | Venue Category | Venue Latitude | Venue Longitude |
|---|---|---|---|
| 0 | Theater | 41.876058 | -87.625303 |
| 1 | Cuban Restaurant | 41.875724 | -87.626386 |
| 2 | Sushi Restaurant | 41.876969 | -87.624534 |
| 3 | Hostel | 41.875757 | -87.626537 |
| 4 | Donut Shop | 41.876768 | -87.624575 |


In [7]:
# One-hot encode the venue categories; the empty prefix keeps the raw category names as column names
onehot = pd.get_dummies(df[['Venue Category']], prefix="", prefix_sep="")
fixed_columns = list(onehot.columns)
onehot = onehot[fixed_columns]
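
To make the one-hot step concrete, here is a small sketch on made-up rows. The names `toy`, `toy_onehot`, and `category_counts` are illustrative and not part of the notebook; the summing step is one possible follow-up, not something the notebook itself does.

```python
import pandas as pd

# Toy venue list (made-up rows) standing in for the venue DataFrame.
toy = pd.DataFrame(
    {"Venue Category": ["Theater", "Coffee Shop", "Coffee Shop", "Hostel"]}
)

# One indicator column per category, one row per venue.
toy_onehot = pd.get_dummies(toy[["Venue Category"]], prefix="", prefix_sep="")

# Summing the indicator columns gives the number of venues in each category.
category_counts = toy_onehot.sum().sort_values(ascending=False)
print(category_counts)
```

Each row of the encoded frame has exactly one active column, so column sums are venue counts per category.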

In [8]:
onehot.head(3)

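|   | American Restaurant | Antique Shop | Baby Store | Bakery | Bar | Bookstore | Boxing Gym | Breakfast Spot | Bubble Tea Shop | Building | ... | Speakeasy | Sporting Goods Shop | Strip Club | Sushi Restaurant | Taco Place | Vegetarian / Vegan Restaurant | Wine Bar | Wine Shop | Women's Store | Yoga Studio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |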


In [9]:
onehot.columns

Index(['American Restaurant', 'Antique Shop', 'Baby Store', 'Bakery', 'Bar',
       'Bookstore', 'Boxing Gym', 'Breakfast Spot', 'Bubble Tea Shop',
       'Building', 'Burger Joint', 'Burrito Place', 'Café', 'Cocktail Bar',
       'Coffee Shop', 'Comic Shop', 'Cosmetics Shop', 'Coworking Space',
       'Cuban Restaurant', 'Dance Studio', 'Discount Store',
       'Electronics Store', 'Event Space', 'Falafel Restaurant',
       'Fast Food Restaurant', 'French Restaurant', 'Furniture / Home Store',
       'Greek Restaurant', 'Gym', 'Gym / Fitness Center', 'Hotel', 'Hotel Bar',
       'Indian Restaurant', 'Italian Restaurant', 'Japanese Curry Restaurant',
       'Japanese Restaurant', 'Juice Bar', 'Laundry Service', 'Liquor Store',
       'Martial Arts Dojo', 'Middle Eastern Restaurant',
       'Molecular Gastronomy Restaurant', 'Monument / Landmark', 'Nail Salon',
       'Park', 'Pizza Place', 'Plaza', 'Restaurant', 'Sandwich Place',
       'Shopping Mall', 'Spa', 'Speakeasy', 'Sporting G

## B. Data <a name="data"></a>

Venue category data for Chicago, IL will be combined with crime data downloaded from the Chicago municipal website to determine which venue types are most likely to see each type of crime. Conversely, for each venue category, the types of crime most likely to occur there will also be determined.

https://data.cityofchicago.org/Public-Safety/Crimes-Map/dfnk-7re6
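
The two questions above can be sketched as a pair of normalized cross-tabulations. This is only one possible approach, not the project's stated method: the rows below are made up, and the sketch assumes that each crime has already been labeled with a `Venue Category` (how that labeling is done is a separate step).

```python
import pandas as pd

# Made-up rows; "Venue Category" is assumed to be attached to each crime already.
matched = pd.DataFrame({
    "PRIMARY DESCRIPTION": ["THEFT", "THEFT", "THEFT", "BATTERY", "BATTERY", "NARCOTICS"],
    "Venue Category": ["Bar", "Coffee Shop", "Coffee Shop", "Bar", "Bar", "Park"],
})

# Raw counts: one row per crime type, one column per venue category.
counts = pd.crosstab(matched["PRIMARY DESCRIPTION"], matched["Venue Category"])

# Share of each crime type that falls in each venue category (rows sum to 1).
venue_share_by_crime = counts.div(counts.sum(axis=1), axis=0)

# Share of each venue category's crimes that belong to each crime type (columns sum to 1).
crime_share_by_venue = counts.div(counts.sum(axis=0), axis=1)

print(venue_share_by_crime.idxmax(axis=1))  # most common venue category per crime type
print(crime_share_by_venue.idxmax(axis=0))  # most common crime type per venue category
```

These shares are observed frequencies in the sample, so they are only a rough proxy for likelihood and are sensitive to how many venues of each category exist.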

In [11]:
file_path = 'chicago_crime.csv'
crime_df = pd.read_csv(file_path)

In [12]:
# Drop rows with any missing value, then renumber the index from 0
crime_df = crime_df.dropna().reset_index(drop=True)

In [13]:
print(crime_df.shape[0])
crime_df.head()

254366


|   | DATE OF OCCURRENCE | PRIMARY DESCRIPTION | LOCATION DESCRIPTION | LATITUDE | LONGITUDE |
|---|---|---|---|---|---|
| 0 | 6/24/2019 18:24 | BATTERY | SIDEWALK | 41.753506 | -87.665947 |
| 1 | 12/5/2019 18:43 | NARCOTICS | SIDEWALK | 41.862559 | -87.721771 |
| 2 | 6/24/2019 11:00 | THEFT | STREET | 41.992936 | -87.700697 |
| 3 | 11/19/2019 19:20 | THEFT | CTA BUS | 41.778768 | -87.683628 |
| 4 | 11/19/2019 0:10 | BATTERY | APARTMENT | 41.883109 | -87.760218 |
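
Since both datasets carry coordinates, one way they might be linked is by labeling each crime with the category of its nearest venue. The sketch below is an assumption about how that could work, not the methodology of this project: `haversine_m`, `attach_nearest_venue`, and the 100 m cutoff are hypothetical, and the crime rows are toy data.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters; inputs in degrees, NumPy-broadcastable."""
    lat1, lng1, lat2, lng2 = map(np.radians, (lat1, lng1, lat2, lng2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lng2 - lng1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(a))


def attach_nearest_venue(crimes, venues, max_distance_m=100):
    """Label each crime with its nearest venue's category, or None if too far away."""
    # Distance matrix of shape (n_crimes, n_venues) built by broadcasting.
    dist = haversine_m(
        crimes["LATITUDE"].to_numpy()[:, None],
        crimes["LONGITUDE"].to_numpy()[:, None],
        venues["Venue Latitude"].to_numpy()[None, :],
        venues["Venue Longitude"].to_numpy()[None, :],
    )
    nearest = dist.argmin(axis=1)
    nearest_dist = dist[np.arange(len(crimes)), nearest]
    categories = venues["Venue Category"].to_numpy()[nearest].astype(object)
    categories[nearest_dist > max_distance_m] = None  # hypothetical cutoff
    return crimes.assign(
        **{"Venue Category": categories, "Venue Distance (m)": nearest_dist}
    )


# Two venues taken from the venue table shown earlier in this notebook.
toy_venues = pd.DataFrame({
    "Venue Category": ["Theater", "Cuban Restaurant"],
    "Venue Latitude": [41.876058, 41.875724],
    "Venue Longitude": [-87.625303, -87.626386],
})
# Toy crimes: the first location is made up, the second is far from both venues.
toy_crimes = pd.DataFrame({
    "PRIMARY DESCRIPTION": ["THEFT", "BATTERY"],
    "LATITUDE": [41.876100, 41.753506],
    "LONGITUDE": [-87.625300, -87.665947],
})
labeled = attach_nearest_venue(toy_crimes, toy_venues)
print(labeled)
```

The full distance matrix grows with the product of the two row counts, so for the complete crime dataset it would likely need to be computed in chunks or replaced by a spatial index.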


## C. Methodology <a name="methodology"></a>

To be added in week 5

## D. Results <a name="results"></a>

To be added in week 5

## E. Discussion <a name="discussion"></a>

To be added in week 5

## F. Conclusion <a name="conclusion"></a>

To be added in week 5