# Gregory Pollard Battle of the Neighbourhoods Project

## Introduction/Business Problem:

> A restaurant chain with locations in many boroughs of New York has tasked me with finding an optimal location for a new venue in southern Manhattan. The chain primarily serves pizza, so it would like recommendations on where to place the new venue based on the density of competing pizza places in the area and on the ratings of those venues. The company is well known in New York and has recently acquired the funds to set up this new location, in the hope of expanding into Manhattan, with more locations further north to follow. I intend to analyse the data by clustering it and finding a position where there are few competitors and where the nearest pizza places have low ratings, so that they are easier to compete with.

## Data

> I will use the Foursquare API to acquire location data on existing pizza places in Manhattan. I will then cluster these venues and decide where the optimal position for a new venue could be. To decide on this location, I will assess the positions and ratings of the existing venues and choose a spot with a small number of competitors that have low ratings. I will therefore use the “Search” and “Likes” endpoints to acquire the data from Foursquare, and the K-means clustering algorithm to group the venues.<br><br>
The venues themselves will be local pizza places such as “Prince Street Pizza” and “Joe’s Pizza”. I will only use venues in the southern part of Manhattan, as this is the region in which the restaurant chain wishes to set up a new location. The centre point for retrieving the data will be “The Sheen Center for Thought & Culture”, a Catholic-affiliated performing arts complex with two theaters, rehearsal studios and an art gallery. From this location I will set a large radius for pizza places in Manhattan and call the data with the Foursquare API.
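
The clustering step described above could be sketched roughly as follows. This is a minimal pure-Python K-means with a deterministic start (the first `k` points), written only to illustrate the idea; it is not the notebook's actual implementation, and in practice a library such as scikit-learn would be the usual choice. The function name `kmeans`, the choice of `k = 2`, and the treatment of latitude and longitude as plain Euclidean coordinates are all assumptions. The sample coordinates are taken from the venue table in this notebook.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)

def kmeans(points, k, iterations=100):
    """Cluster 2-D points into k groups; returns (centroids, labels)."""
    # Assumption: deterministic initialisation from the first k points.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                      for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_labels == labels and new_centroids == centroids:
            break  # converged: nothing changed in this pass
        labels, centroids = new_labels, new_centroids
    return centroids, labels

# (lat, lng) pairs for five venues from the table in this notebook
venues = [
    (40.723093, -73.994527),  # Prince Street Pizza
    (40.721636, -73.995635),  # Lombardi's Coal Oven Pizza
    (40.721638, -73.997470),  # Champion Pizza
    (40.735398, -73.992790),  # Taco Bell/Pizza Hut
    (40.734198, -73.989269),  # Bravo Pizza
]
centroids, labels = kmeans(venues, k=2)
print(labels)
```

On this small sample the three venues around Prince Street and Spring Street end up in one cluster and the two on 14th Street in the other, which is the kind of grouping the analysis relies on when looking for gaps between clusters.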


### Import Necessary Packages

In [3]:
import requests # library to handle requests
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation


!pip install geopy
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image 
from IPython.display import HTML 
    
# transforming JSON data into a pandas dataframe
from pandas import json_normalize # pandas >= 1.0; the pandas.io.json import path is deprecated


! pip install folium==0.5.0
import folium # plotting library

print('Folium installed')
print('Libraries imported.')

Folium installed
Libraries imported.


### Prepare Foursquare Details

In [4]:
import os # read credentials from the environment instead of hard-coding them

# the environment variable names below are placeholders; use whatever names you export
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID'] # your Foursquare ID
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET'] # your Foursquare Secret
ACCESS_TOKEN = os.environ['FOURSQUARE_ACCESS_TOKEN'] # your Foursquare Access Token
VERSION = '20180604' # Foursquare API version date
print('Credentials loaded.')

Credentials loaded.


### Define Coordinates of the Center of the Call and the Query

In [10]:
geolocator = Nominatim(user_agent="foursquare_agent")
latitude = 40.72537645209457 # The Sheen Center for Thought & Culture
longitude = -73.9935543780856
search_query = 'Pizza'
radius = 10000 # search radius in metres
print(latitude, longitude)

40.72537645209457 -73.9935543780856


### Define the URL and Make the "Search" Call

In [15]:
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&oauth_token={}&v={}&query={}&radius={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude,ACCESS_TOKEN, VERSION, search_query, radius)
results = requests.get(url).json()
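
The search call above could also be assembled from a parameter dictionary, which keeps the URL readable and lets the query string be encoded automatically. The sketch below uses only the standard library to build the URL; the helper name `build_search_url` and the placeholder credentials are assumptions for illustration, and the resulting URL would be passed to `requests.get` in the same way as before.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

SEARCH_ENDPOINT = 'https://api.foursquare.com/v2/venues/search'

def build_search_url(client_id, client_secret, access_token, version,
                     lat, lng, query, radius):
    """Assemble a venue-search URL from its parts (hypothetical helper)."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'oauth_token': access_token,
        'v': version,
        'll': '{},{}'.format(lat, lng),  # centre of the search as "lat,lng"
        'query': query,
        'radius': radius,                # metres
    }
    # urlencode percent-escapes every value, including the comma in 'll'
    return SEARCH_ENDPOINT + '?' + urlencode(params)

# Placeholder credentials only; real values should never be committed to a notebook.
example_url = build_search_url(
    'PLACEHOLDER_ID', 'PLACEHOLDER_SECRET', 'PLACEHOLDER_TOKEN', '20180604',
    40.72537645209457, -73.9935543780856, 'Pizza', 10000)

# Parse the URL back to confirm the parameters survived encoding.
decoded = parse_qs(urlsplit(example_url).query)
print(decoded['ll'], decoded['radius'])
```

Building the query string this way avoids one long `format` call and makes it harder to pass the arguments in the wrong order.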

### Transform the Data into a Pandas Dataframe and Trim out Unnecessary Data

In [31]:
# assign relevant part of JSON to venues
venues = results['response']['venues']

# transform venues into a dataframe
dataframe = json_normalize(venues)

# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
dataframe_filtered = dataframe.loc[:, filtered_columns].copy() # copy so later edits do not touch the original dataframe

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError: # fall back to the nested column name used by other endpoints
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
dataframe_filtered['categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]

dataframe_filtered



| | name | categories | address | crossStreet | lat | lng | labeledLatLngs | distance | postalCode | cc | city | state | country | formattedAddress | id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Prince Street Pizza | Pizza Place | 27 Prince St | btwn Mott & Elizabeth St | 40.723093 | -73.994527 | [{'label': 'display', 'lat': 40.72309326145674... | 267 | 10012.0 | US | New York | NY | United States | [27 Prince St (btwn Mott & Elizabeth St), New ... | 4f045eeb00399761c77301e3 |
| 1 | Joe's Pizza | Pizza Place | 7 Carmine St | at 6th Ave | 40.730461 | -74.001972 | [{'label': 'display', 'lat': 40.73046111519949... | 908 | 10014.0 | US | New York | NY | United States | [7 Carmine St (at 6th Ave), New York, NY 10014... | 45ebc982f964a52091431fe3 |
| 2 | Lombardi's Coal Oven Pizza | Pizza Place | 32 Spring St | at Mott St | 40.721636 | -73.995635 | [{'label': 'display', 'lat': 40.72163625443080... | 451 | 10012.0 | US | New York | NY | United States | [32 Spring St (at Mott St), New York, NY 10012... | 3fd66200f964a52062e61ee3 |
| 3 | 99¢ Fresh Pizza | Pizza Place | 71 2nd Ave | Between E 4th And E 5th St | 40.726376 | -73.989338 | [{'label': 'display', 'lat': 40.72637573745666... | 372 | 10003.0 | US | New York | NY | United States | [71 2nd Ave (Between E 4th And E 5th St), New ... | 53b5f4b0498e7253b4854831 |
| 4 | Famous Ben's Pizza of SoHo | Pizza Place | 177 Spring St | at Thompson St | 40.72484 | -74.002551 | [{'label': 'display', 'lat': 40.72483950842974... | 761 | 10012.0 | US | New York | NY | United States | [177 Spring St (at Thompson St), New York, NY ... | 44acde31f964a52013351fe3 |
| 5 | Champion Pizza | Pizza Place | 17 Cleveland Pl | | 40.721638 | -73.99747 | [{'label': 'display', 'lat': 40.72163754224819... | 531 | 10012.0 | US | New York | NY | United States | [17 Cleveland Pl, New York, NY 10012, United S... | 55ea9f4d498ed46db0383483 |
| 6 | East Village Pizza | Pizza Place | 145 1st Ave | at E 9th St | 40.728272 | -73.985006 | [{'label': 'display', 'lat': 40.72827235649891... | 789 | 10003.0 | US | New York | NY | United States | [145 1st Ave (at E 9th St), New York, NY 10003... | 4a99d4a3f964a520b93020e3 |
| 7 | Pizza Mercato | Pizza Place | 11 Waverly Pl | Mercer St | 40.73005 | -73.993999 | [{'label': 'display', 'lat': 40.7300498236796,... | 521 | 10003.0 | US | New York | NY | United States | [11 Waverly Pl (Mercer St), New York, NY 10003... | 4ae9081af964a52016b421e3 |
| 8 | Taco Bell/Pizza Hut | Pizza Place | 18 E. 14th Street | btwn 5th & University | 40.735398 | -73.99279 | [{'label': 'entrance', 'lat': 40.735468, 'lng'... | 1117 | 10003.0 | US | New York | NY | United States | [18 E. 14th Street (btwn 5th & University), Ne... | 4a2d48adf964a5209e971fe3 |
| 9 | Bravo Pizza | Pizza Place | 115 E 14th St | Irving Pl | 40.734198 | -73.989269 | [{'label': 'display', 'lat': 40.73419779708055... | 1046 | 10003.0 | US | New York | NY | United States | [115 E 14th St (Irving Pl), New York, NY 10003... | 4bd4eeff29eb9c7497b092e1 |
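
The `distance` column in the output above can be sanity-checked against the centre point with the haversine formula. The sketch below is an independent cross-check rather than anything Foursquare documents about how it computes distance; the function name `haversine_m` and the mean Earth radius of 6,371,000 m (a spherical-Earth approximation) are assumptions.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_M = 6371000  # assumed mean Earth radius in metres (spherical model)

def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two (lat, lng) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))

centre = (40.72537645209457, -73.9935543780856)  # centre of the search call
prince_street = (40.723093, -73.994527)          # row 0 of the venue output

d = haversine_m(*centre, *prince_street)
print(round(d))
```

For Prince Street Pizza the result should land within a few metres of the 267 reported in the `distance` column, which suggests the column is measured in metres from the centre of the call.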


### Create the Map of Venues

In [36]:
venues_map = folium.Map(location=[latitude, longitude], zoom_start=13) # generate map centred around The Sheen Center

# add a red circle marker to represent The Sheen Center
folium.CircleMarker(
    [latitude, longitude],
    radius=10,
    color='red',
    popup='The Sheen Center for Thought & Culture',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(venues_map)

# add the pizza places as blue circle markers
for lat, lng, label in zip(dataframe_filtered.lat, dataframe_filtered.lng, dataframe_filtered.name):
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=folium.Popup(label, parse_html=True),
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(venues_map)

# display map
display(venues_map)