# The Battle of Neighborhoods: Where to Open a Coffee Shop?

### Introduction

A coffeehouse, coffee shop, or café is an establishment that primarily serves coffee of various types, such as espresso, latte, and cappuccino. Some coffeehouses also serve cold drinks such as iced coffee and iced tea; in continental Europe, cafés also serve alcoholic drinks. A coffeehouse may also serve food such as light snacks, sandwiches, muffins, or pastries. Coffeehouses range from owner-operated small businesses to large multinational corporations. Some coffeehouse chains operate on a franchise business model, with numerous branches in countries around the world. The choice of location is closely tied to the success of a coffee shop, so it is important for owners to study neighborhood data before deciding where to open.
In this project, we use data science techniques to explore the neighborhoods of Buffalo, NY and recommend a location for a coffee shop.

### Data Description

To solve this problem, we use the Foursquare API to explore neighborhoods in Buffalo, NY. We use its explore function to get the most common venue categories in each neighborhood, and then use these features to group the neighborhoods into clusters with the k-means clustering algorithm. Finally, we use the Folium library to visualize the neighborhoods in Buffalo, NY and the emerging clusters.
In summary, the data used in this project are:

<div class="alert alert-block alert-info" style="margin-top: 20px">

<font size = 3>

1. Buffalo Geospatial Neighborhood Data (Source: https://data.buffalony.gov/Economic-Neighborhood-Development/Neighborhoods/q9bk-zu3p)

2. Foursquare API
 
</font>
</div>

### Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">

<font size = 3>

1. <a href="#item1">Download and Explore Dataset</a>

2. <a href="#item2">Explore Neighborhoods in Buffalo, NY</a>

3. <a href="#item3">Analyze Each Neighborhood</a>

4. <a href="#item4">Cluster Neighborhoods</a>

5. <a href="#item5">Results</a>    

6. <a href="#item6">Discussion</a>
    
7. <a href="#item7">Conclusion</a>
</font>
</div>

In [104]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


## 1. Download and Explore Dataset

In [105]:
# The code was removed by Watson Studio for sharing.

Unnamed: 0,the_geom,OBJECTID,Shape.STArea(),Shape.STLength(),NbhdName,PlaceName,OBJECTID_1,NbhdNum,CalcAcres,SqMiles,Shape_Leng,AREA
0,MULTIPOLYGON (((-78.803962665535 42.8986567774...,23,41248620.0,28981.72532,Genesee-Moselle,Buffalo,31,23,946.941874,1.479597,28981.725098,41248620.0
1,MULTIPOLYGON (((-78.86897100867 42.90219696255...,3,9298607.0,12763.118891,Allentown,Buffalo,24,3,213.467508,0.333543,12763.11898,9298607.0
2,MULTIPOLYGON (((-78.878376553824 42.9586334048...,11,23664610.0,20858.13257,West Hertel,Buffalo,1,11,543.266941,0.848855,20858.132954,23664610.0
3,MULTIPOLYGON (((-78.902088872515 42.9020386999...,1,64185170.0,78080.952053,Central,Buffalo,26,1,1473.494627,2.302335,78080.953511,64185170.0
4,MULTIPOLYGON (((-78.870236491887 42.9154064085...,5,23666490.0,20649.219019,Elmwood Bryant,Buffalo,22,5,543.310108,0.848922,20649.21872,23666490.0


In [106]:
df_data_1.shape

(35, 12)

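Clean the data: build a table with one row per neighborhood, holding its name and an approximate center point.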

In [179]:
buf_neighborhoods = pd.DataFrame()

In [180]:

body = client_00d69d839eeb454e99bfff58e0dd1162.get_object(Bucket='applieddatascience-donotdelete-pr-avk1fmqnkju01t',Key='lat_long_buf_neighbor.xlsx')['Body']
# add missing __iter__ method, so pandas accepts body as file-like object
if not hasattr(body, "__iter__"): body.__iter__ = types.MethodType( __iter__, body )

df_data_0 = pd.read_excel(body)
df_data_0.head()


Unnamed: 0,A,B,C,A.1,B.1,C.1,A.2,B.2,C.2,A.3,B.3,C.3,A.4,B.4,C.4,A.5,B.5,C.5,A.6,B.6,C.6,A.7,B.7,C.7,A.8,B.8,C.8,A.9,B.9,C.9,A.10,B.10,C.10,A.11,B.11,C.11,A.12,B.12,C.12,A.13,B.13,C.13,A.14,B.14,C.14,A.15,B.15,C.15,A.16,B.16,C.16,A.17,B.17,C.17,A.18,B.18,C.18,A.19,B.19,C.19,A.20,B.20,C.20,A.21,B.21,C.21,A.22,B.22,C.22,A.23,B.23,C.23,A.24,B.24,C.24,A.25,B.25,C.25,A.26,B.26,C.26,A.27,B.27,C.27,A.28,B.28,C.28,A.29,B.29,C.29,A.30,B.30,C.30,A.31,B.31,C.31,A.32,B.32,C.32,A.33,B.33,C.33,A.34,B.34,C.34,A.35,B.35,C.35,A.36,B.36,C.36,A.37,B.37,C.37,A.38,B.38,C.38,A.39,B.39,C.39,A.40,B.40,C.40,A.41,B.41,C.41,A.42,B.42,C.42,A.43,B.43,C.43,A.44,B.44,C.44,A.45,B.45,C.45,A.46,B.46,C.46,A.47,B.47,C.47,A.48,B.48,C.48,A.49,B.49,C.49,A.50,B.50,C.50,A.51,B.51,C.51,A.52,B.52,C.52,A.53,B.53,C.53,A.54,B.54,C.54,A.55,B.55,C.55,A.56,B.56,C.56,A.57,B.57,C.57,A.58,B.58,C.58,A.59,B.59,C.59,A.60,B.60,C.60,A.61,B.61,C.61,A.62,B.62,C.62,A.63,B.63,C.63,A.64,B.64,C.64,A.65,B.65,C.65,A.66,B.66,C.66,A.67,B.67,C.67,A.68,B.68,C.68,A.69,B.69,C.69,A.70,B.70,C.70,A.71,B.71,C.71,A.72,B.72,C.72,A.73,B.73,C.73,A.74,B.74,C.74,A.75,B.75,C.75,A.76,B.76,C.76,A.77,B.77,C.77,A.78,B.78,C.78,A.79,B.79,C.79,A.80,B.80,C.80,A.81,B.81,C.81,A.82,B.82,C.82,A.83,B.83,C.83,A.84,B.84,C.84,A.85,B.85,C.85,A.86,B.86,C.86,A.87,B.87,C.87,A.88,B.88,C.88,A.89,B.89,C.89,A.90,B.90,C.90,A.91,B.91,C.91,A.92,B.92,C.92,A.93,B.93,C.93,A.94,B.94,C.94,A.95,B.95,C.95,A.96,B.96,C.96,A.97,B.97,C.97,A.98,B.98,C.98,A.99,B.99,C.99,A.100,B.100,C.100,A.101,B.101,C.101,A.102,B.102,C.102,A.103,B.103,C.103,A.104,B.104,C.104,A.105,B.105,C.105,A.106,B.106,C.106,A.107,B.107,C.107,A.108,B.108,C.108,A.109,B.109,C.109,A.110,B.110,C.110,A.111,B.111,C.111,A.112,B.112,C.112,A.113,B.113,C.113,A.114,B.114,C.114,A.115,B.115,C.115,A.116,B.116,C.116,A.117,B.117,C.117,A.118,B.118,C.118,A.119,B.119,C.119,A.120,B.120,C.120,A.121,B.121,C.121,A.122,B.122,C.122,A.123,B.123,C.123,A.124,B.124,C.124,A.125,B.125,C.125,A.126,B.126,C.126,A.127,B.127,C.127,A.128,B.128,C.128,A.129,B.129,C.129,A.130,B.130,C.130,A.131,B.131,C.131,A.132,B.132,C.132,A.133,B.133,C.133,A.134,B.134,C.134,A.135,B.135,C.135,A.136,B.136,C.136,A.137,B.137,C.137,A.138,B.138,C.138,A.139,B.139,C.139,A.140,B.140,C.140,A.141,B.141,C.141,A.142,B.142,C.142,A.143,B.143,C.143,A.144,B.144,C.144,A.145,B.145,C.145,A.146,B.146,C.146,A.147,B.147,C.147,A.148,B.148,C.148,A.149,B.149,C.149,A.150,B.150,C.150,A.151,B.151,C.151,A.152,B.152,C.152,A.153,B.153,C.153,A.154,B.154,C.154,A.155,B.155,C.155,A.156,B.156,C.156,A.157,B.157,C.157,A.158,B.158,C.158,A.159,B.159,C.159,A.160,B.160,C.160,A.161,B.161
0,-78.803963,42.898657,-78.80596,42.897773,-78.806386,42.897816,-78.806631,42.897823,-78.80874,42.897615,-78.809437,42.897524,-78.811806,42.897215,-78.813993,42.896927,-78.815212,42.896768,-78.815577,42.896724,-78.815907,42.896681,-78.816666,42.896582,-78.816911,42.896551,-78.817687,42.896452,-78.818965,42.896287,-78.819911,42.896166,-78.820054,42.896147,-78.820854,42.896044,-78.821118,42.89601,-78.821767,42.895928,-78.822103,42.895882,-78.822675,42.895805,-78.8236,42.895681,-78.824574,42.895548,-78.824736,42.895524,-78.825465,42.895418,-78.825937,42.895348,-78.82635,42.895287,-78.827245,42.895153,-78.828109,42.895023,-78.828255,42.895001,-78.82872,42.894931,-78.828722,42.894971,-78.828958,42.900899,-78.828972,42.903533,-78.828982,42.906516,-78.828332,42.906784,-78.824233,42.908426,-78.824262,42.910319,-78.824273,42.91445,-78.824413,42.918878,-78.82429,42.922567,-78.823739,42.92258,-78.821529,42.922599,-78.819523,42.922613,-78.817182,42.922633,-78.815989,42.920378,-78.812107,42.913283,-78.807543,42.905698,-78.807543,42.905698,-78.805244,42.901355,-78.803963,42.898657,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1,-78.868971,42.902197,-78.870728,42.896804,-78.871256,42.895186,-78.871874,42.893324,-78.873049,42.893525,-78.877264,42.894248,-78.87823,42.894413,-78.878919,42.894532,-78.880118,42.895772,-78.881614,42.897335,-78.880946,42.897676,-78.883404,42.898735,-78.883066,42.898907,-78.884556,42.900473,-78.883528,42.901001,-78.882062,42.901757,-78.881686,42.902343,-78.88032,42.902277,-78.877137,42.902255,-78.873022,42.902232,-78.868971,42.902197,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2,-78.878377,42.958633,-78.878419,42.944066,-78.896006,42.943805,-78.895372,42.948185,-78.892622,42.961293,-78.89236,42.961056,-78.891416,42.959908,-78.890804,42.958965,-78.890543,42.958562,-78.889127,42.958561,-78.88506,42.958591,-78.881397,42.958613,-78.880945,42.958615,-78.878377,42.958633,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
3,-78.902089,42.902039,-78.902075,42.902134,-78.901862,42.901936,-78.901621,42.902071,-78.901755,42.902331,-78.901683,42.90234,-78.900748,42.902504,-78.900015,42.901247,-78.898518,42.899039,-78.897792,42.898257,-78.896389,42.896798,-78.896379,42.896787,-78.895332,42.895676,-78.89331,42.893623,-78.89257,42.892909,-78.890334,42.890551,-78.889089,42.889501,-78.883544,42.883921,-78.880181,42.885245,-78.878331,42.890894,-78.877264,42.894248,-78.873049,42.893525,-78.8697,42.892944,-78.865595,42.89239,-78.866704,42.888996,-78.867398,42.886855,-78.869107,42.88157,-78.869884,42.878825,-78.870706,42.876289,-78.871444,42.873996,-78.871783,42.873007,-78.871817,42.872911,-78.871874,42.872728,-78.871915,42.872605,-78.871987,42.872385,-78.872076,42.872134,-78.872108,42.872085,-78.872194,42.871951,-78.872279,42.871847,-78.872345,42.871795,-78.872347,42.871794,-78.872372,42.871693,-78.87225,42.871608,-78.872233,42.871601,-78.87219,42.871612,-78.872155,42.871608,-78.872104,42.871585,-78.87196,42.871448,-78.871768,42.871333,-78.87152,42.871122,-78.87132,42.870988,-78.871254,42.870883,-78.871147,42.870785,-78.871019,42.870632,-78.87085,42.870382,-78.870735,42.870246,-78.870698,42.870149,-78.870682,42.870051,-78.870624,42.869925,-78.870567,42.869839,-78.870448,42.869659,-78.87023,42.869237,-78.869863,42.868648,-78.869453,42.867771,-78.869441,42.867746,-78.869438,42.867738,-78.86941,42.867678,-78.868967,42.866731,-78.868859,42.866555,-78.868819,42.866512,-78.868742,42.866462,-78.868684,42.866424,-78.868644,42.866388,-78.868613,42.866346,-78.867995,42.866451,-78.867923,42.866368,-78.868646,42.86602,-78.86862,42.865688,-78.868542,42.864876,-78.868587,42.864571,-78.868573,42.864348,-78.868534,42.863742,-78.868452,42.862874,-78.868068,42.862554,-78.867638,42.862195,-78.867601,42.862176,-78.867588,42.862172,-78.867422,42.862175,-78.867389,42.862182,-78.866795,42.862242,-78.866642,42.862264,-78.866558,42.862306,-78.865942,42.862701,-78.865745,42.862847,-78.864367,42.86375,-78.864174,42.863856,-78.863105,42.864078,-78.862723,42.864133,-78.862292,42.864129,-78.861745,42.864158,-78.861578,42.864154,-78.8611,42.864096,-78.860579,42.864009,-78.860182,42.863956,-78.860079,42.863998,-78.859923,42.863974,-78.859831,42.860571,-78.8601,42.860569,-78.860102,42.860676,-78.860195,42.860694,-78.860246,42.860744,-78.860331,42.861537,-78.860336,42.861736,-78.860409,42.861703,-78.860721,42.861476,-78.860988,42.861222,-78.861166,42.860939,-78.861308,42.860617,-78.86142,42.8603,-78.861486,42.860032,-78.861576,42.859709,-78.861605,42.859424,-78.861617,42.85899,-78.86161,42.858876,-78.861573,42.85854,-78.861428,42.85805,-78.861272,42.857691,-78.861091,42.857151,-78.861028,42.856975,-78.860997,42.856706,-78.860933,42.85654,-78.866392,42.853128,-78.866539,42.853085,-78.87166,42.85172,-78.871753,42.851696,-78.87177,42.851689,-78.872061,42.852162,-78.872287,42.85253,-78.872152,42.852565,-78.871778,42.852665,-78.869401,42.853362,-78.869223,42.853595,-78.869334,42.853703,-78.869522,42.854058,-78.869578,42.854163,-78.869646,42.854143,-78.872742,42.853251,-78.87283,42.853226,-78.872911,42.853343,-78.872934,42.853376,-78.87475,42.855979,-78.874953,42.856272,-78.875073,42.856445,-78.875208,42.856691,-78.875422,42.857096,-78.875681,42.857661,-78.87579,42.858018,-78.875966,42.858409,-78.87611,42.858702,-78.876097,42.858955,-78.875938,42.859068,-78.875185,42.859338,-78.873453,42.859992,-78.873602,42.860066,-78.873777,42.860161,-78.873835,42.860249,-78.873714,42.860319,-78.873813,42.86054,-78.874015,42.860548,-78.874177,42.860693,-78.874391,42.860979,-78.874486,42.861168,-78.874642,42.861192,-78.874711,42.861116,-78.874794,42.860991,-78.874873,42.860802,-78.874907,42.860707,-78.874924,42.860592,-78.874854,42.860404,-78.874917,42.86037,-78.87527,42.860174,-78.875714,42.860014,-78.876211,42.859861,-78.876382,42.859864,-78.876656,42.85996,-78.876933,42.860276,-78.877484,42.861185,-78.877651,42.861493,-78.877672,42.861635,-78.877721,42.861759,-78.878105,42.862442,-78.878529,42.862998,-78.878785,42.86329,-78.879015,42.863563,-78.879661,42.864272,-78.880265,42.865098,-78.880859,42.86579,-78.88128,42.866192,-78.881643,42.866453,-78.881944,42.866682,-78.882228,42.86696,-78.882365,42.86717,-78.882497,42.867314,-78.882645,42.867365,-78.882716,42.867531,-78.879258,42.86869,-78.879995,42.869549,-78.883659,42.868404,-78.884006,42.86888,-78.880427,42.870061,-78.880865,42.87058,-78.881132,42.870491,-78.88121,42.870522,-78.881487,42.870633,-78.881601,42.8706,-78.884621,42.869738,-78.884802,42.869993,-78.884792,42.870114,-78.884744,42.870217,-78.88463,42.870315,-78.884487,42.870362,-78.884307,42.87042,-78.883879,42.870498,-78.883624,42.870571,-78.883488,42.87063,-78.883456,42.870739,-78.883539,42.871056,-78.883823,42.871649,-78.884287,42.871424,-78.885273,42.870948,-78.885547,42.870956,-78.885624,42.871071,-78.886347,42.871863,-78.886628,42.872312,-78.887224,42.873024,-78.888845,42.874837,-78.88941,42.875467,-78.889885,42.875646,-78.890387,42.875974,-78.888373,42.876916,-78.888119,42.876922,-78.888077,42.877201,-78.889
4,-78.870236,42.915406,-78.869974,42.915399,-78.864482,42.915349,-78.86426,42.915339,-78.864638,42.914967,-78.865913,42.911249,-78.867589,42.905812,-78.868375,42.904019,-78.868971,42.902197,-78.873022,42.902232,-78.877137,42.902255,-78.88032,42.902277,-78.881686,42.902343,-78.881691,42.902335,-78.882018,42.902996,-78.881903,42.905422,-78.88267,42.905432,-78.883045,42.905884,-78.884495,42.907466,-78.885971,42.909045,-78.88543,42.909322,-78.886937,42.91092,-78.8861,42.911324,-78.887482,42.912981,-78.887293,42.913076,-78.888742,42.914611,-78.888151,42.914947,-78.888154,42.914966,-78.888151,42.915348,-78.886797,42.915381,-78.885666,42.915411,-78.882601,42.915489,-78.88083,42.915496,-78.877047,42.915465,-78.873515,42.915429,-78.870236,42.915406,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,


In [181]:
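# columns alternate longitude, latitude, ...; build the masks before adding new columns
arr = np.arange(len(df_data_0.columns)) % 2
longitude = df_data_0.iloc[:, arr == 0].mean(axis=1)
latitude = df_data_0.iloc[:, arr == 1].mean(axis=1)

# mean of the boundary vertices, used as an approximate neighborhood center
df_data_0['longitude'] = longitude
df_data_0['latitude'] = latitude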
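Averaging the boundary vertices, as the cell above does, gives only an approximate center: the result is pulled toward stretches of boundary that happen to be sampled densely. As a hedged alternative, the sketch below computes the area-weighted centroid of a closed ring with the shoelace formula. The names `ring_centroid` and `vertex_mean` and the toy rectangle are illustrative assumptions, not part of the original notebook, and the calculation treats longitude and latitude as planar coordinates, which is reasonable only at neighborhood scale.

```python
def ring_centroid(points):
    """Area-weighted centroid of a closed polygon ring (shoelace formula).

    `points` is a list of (x, y) tuples whose last point repeats the first,
    as in the coordinate rows shown above.
    """
    area2 = 0.0  # twice the signed area
    cx = 0.0
    cy = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate ring: zero area")
    return cx / (3.0 * area2), cy / (3.0 * area2)


def vertex_mean(points):
    """Plain average of the vertices (what the notebook cell computes)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return sum(xs) / len(xs), sum(ys) / len(ys)


# Toy 4 x 2 rectangle with extra vertices crowded along its left edge.
ring = [(0, 0), (4, 0), (4, 2), (0, 2), (0, 1.5), (0, 1), (0, 0.5), (0, 0)]
centroid = ring_centroid(ring)
average = vertex_mean(ring)
print(centroid)  # geometric center of the rectangle
print(average)   # pulled toward the densely sampled left edge
```

For roughly convex neighborhoods with evenly spaced vertices the two results are close, so the simpler vertex mean may be good enough for placing map markers.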

In [182]:
buf_neighborhoods['neighborhood'] = df_data_1['NbhdName']
buf_neighborhoods['longitude'] = df_data_0['longitude'] 
buf_neighborhoods['latitude'] = df_data_0['latitude'] 

In [183]:
buf_neighborhoods.shape

(35, 3)

In [184]:
address = 'Buffalo, NY'

geolocator = Nominatim(user_agent="buf_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Buffalo are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Buffalo are 42.8867166, -78.8783922.


### Create a map of Buffalo with neighborhoods superimposed on top.

In [185]:
# create map of Buffalo using latitude and longitude values
map_buf = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(buf_neighborhoods['latitude'], buf_neighborhoods['longitude'], buf_neighborhoods['neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_buf)  
    
map_buf

Next, we are going to start utilizing the Foursquare API to explore the neighborhoods and segment them.

### Define Foursquare Credentials and Version

In [186]:
CLIENT_ID = 'VZQBUPXJSE1TOH5JQCHAAXSXSTHIFNJ3IESMRXN0BY3VTQ40' # your Foursquare ID
CLIENT_SECRET = 'UJGUZBODHXXYGS2N1G1OJQJ1P13Q4H0EBP5ATMFJPBKJJ1FD' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: VZQBUPXJSE1TOH5JQCHAAXSXSTHIFNJ3IESMRXN0BY3VTQ40
CLIENT_SECRET:UJGUZBODHXXYGS2N1G1OJQJ1P13Q4H0EBP5ATMFJPBKJJ1FD


### Explore the neighborhoods in Buffalo

In [187]:
def getNearbyVenues(names, latitudes, longitudes, radius=1000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
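The function above assumes that every response contains `response.groups[0].items` and that every venue has at least one category; if either is missing, the list comprehension raises a `KeyError` or `IndexError`. The sketch below shows one way to parse a response more defensively. The helper `extract_rows`, the `"Unknown"` fallback label, and the hand-written `sample` payload are assumptions for illustration; the nesting is inferred only from the keys that `getNearbyVenues` reads.

```python
def extract_rows(payload, name, lat, lng, default_category="Unknown"):
    """Turn one explore response into rows, tolerating missing keys."""
    groups = payload.get("response", {}).get("groups") or []
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        categories = venue.get("categories") or []
        # Fall back to a placeholder when a venue has no category.
        category = categories[0].get("name", default_category) if categories else default_category
        rows.append((name, lat, lng, venue.get("name"),
                     location.get("lat"), location.get("lng"), category))
    return rows


# Hand-written payload with the same nesting the function above reads.
sample = {"response": {"groups": [{"items": [
    {"venue": {"name": "Cafe A", "location": {"lat": 42.9, "lng": -78.8},
               "categories": [{"name": "Coffee Shop"}]}},
    {"venue": {"name": "Mystery Spot", "location": {"lat": 42.91, "lng": -78.81},
               "categories": []}},
]}]}}

rows = extract_rows(sample, "Toy Hood", 42.9, -78.8)
empty = extract_rows({"response": {}}, "Toy Hood", 42.9, -78.8)
print(rows)
print(empty)
```

A neighborhood with no venues then contributes zero rows instead of stopping the whole loop.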


In [188]:
radius = 1000 # define radius
LIMIT = 200 # limit of number of venues returned by Foursquare API
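Foursquare's `radius` parameter is expressed in meters, so `radius = 1000` asks for venues within about one kilometer of each neighborhood center. As a rough check of what that means on the ground, the sketch below computes the great-circle (haversine) distance between the Genesee-Moselle center and one of its venues, using coordinates that appear in this notebook's results. The function name `haversine_m` is a hypothetical helper, and the Earth radius is the usual mean value.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    r = 6371000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


center = (42.902022, -78.819049)  # Genesee-Moselle center
venue = (42.902753, -78.822332)   # Almin Liquor Store
distance = haversine_m(center[0], center[1], venue[0], venue[1])
print(round(distance), distance <= 1000)
```

The venue lies well inside the 1000 m search radius, at a few hundred meters from the center.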

In [189]:
Buffalo_venues = getNearbyVenues(names=buf_neighborhoods['neighborhood'],
                                   latitudes=buf_neighborhoods['latitude'],
                                   longitudes=buf_neighborhoods['longitude']
                                  )

Genesee-Moselle
Allentown
West Hertel
Central
Elmwood Bryant
Fillmore-Leroy
Delavan Grider
Lovejoy
Black Rock
Hopkins-Tifft
Lower West Side
Schiller Park
Kensington-Bailey
Ellicott
Kenfield
Masten Park
Grant-Amherst
Riverside
Seneca-Cazenovia
First Ward
Seneca Babcock
Pratt-Willert
Broadway Fillmore
Central Park
North Park
Upper West Side
Elmwood Bidwell
Parkside
West Side
University Heights
Hamlin Park
Fruit Belt
South Park
Kaisertown
MLK Park


ConnectionError: HTTPSConnectionPool(host='api.foursquare.com', port=443): Max retries exceeded with url: /v2/venues/explore?&client_id=VZQBUPXJSE1TOH5JQCHAAXSXSTHIFNJ3IESMRXN0BY3VTQ40&client_secret=UJGUZBODHXXYGS2N1G1OJQJ1P13Q4H0EBP5ATMFJPBKJJ1FD&v=20180605&ll=42.908313724726135,-78.83642300032338&radius=1000&limit=200 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f6036e3e5c0>: Failed to establish a new connection: [Errno -2] Name or service not known',))
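The traceback above ("Name or service not known") points to a failed DNS lookup rather than a problem with the request itself, so repeating the call after a short pause is a reasonable first response. The sketch below shows retry with exponential backoff using only the standard library. `call_with_retry` and its parameters are hypothetical names; in the notebook, the wrapped function would be the `requests.get(...)` call, and `exceptions` would need to be `requests.exceptions.ConnectionError`, which is not a subclass of the built-in `ConnectionError` used in this demo.

```python
import time


def call_with_retry(func, attempts=3, base_delay=1.0,
                    exceptions=(ConnectionError,), sleep=time.sleep):
    """Call func(); on a listed exception, wait and try again.

    The delay doubles after each failure. The last failure is re-raised.
    """
    for attempt in range(attempts):
        try:
            return func()
        except exceptions:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# Demo: a function that fails twice, then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated network failure")
    return "ok"


delays = []  # record the delays instead of actually sleeping
result = call_with_retry(flaky, attempts=4, base_delay=0.5, sleep=delays.append)
print(result, calls["n"], delays)
```

Passing a `timeout` to `requests.get` alongside a wrapper like this keeps a stalled connection from blocking the loop indefinitely.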

In [190]:
print(Buffalo_venues.shape)
Buffalo_venues.head()

(306, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Genesee-Moselle,42.902022,-78.819049,Almin Liquor Store,42.902753,-78.822332,Liquor Store
1,Genesee-Moselle,42.902022,-78.819049,Community Food & Meat Market,42.90405,-78.816482,Convenience Store
2,Genesee-Moselle,42.902022,-78.819049,Ms. Goodies,42.904347,-78.813976,Diner
3,Genesee-Moselle,42.902022,-78.819049,Signature fashion,42.904554,-78.81414,Clothing Store
4,Allentown,42.898438,-78.877653,Allen Burger Venture,42.899567,-78.876494,Burger Joint


In [191]:
Buffalo_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Allentown,42,42,42,42,42,42
Black Rock,11,11,11,11,11,11
Broadway Fillmore,15,15,15,15,15,15
Central,12,12,12,12,12,12
Delavan Grider,8,8,8,8,8,8
Ellicott,7,7,7,7,7,7
Elmwood Bidwell,18,18,18,18,18,18
Elmwood Bryant,23,23,23,23,23,23
Fillmore-Leroy,4,4,4,4,4,4
First Ward,9,9,9,9,9,9


In [192]:
print('There are {} unique categories.'.format(len(Buffalo_venues['Venue Category'].unique())))

There are 119 unique categories.


### Analyze Each Neighborhood

In [193]:
# one hot encoding
Buffalo_onehot = pd.get_dummies(Buffalo_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
Buffalo_onehot['Neighborhood'] = Buffalo_venues['Neighborhood'] 


In [194]:

# move neighborhood column to the first column
fixed_columns = [Buffalo_onehot.columns[-1]] + list(Buffalo_onehot.columns[:-1])
Buffalo_onehot = Buffalo_onehot[fixed_columns]

Buffalo_onehot.head()

Unnamed: 0,Neighborhood,African Restaurant,American Restaurant,Antique Shop,Art Gallery,Art Museum,Athletics & Sports,Auto Garage,BBQ Joint,Bakery,Bank,Bar,Baseball Field,Beer Bar,Beer Store,Boat Rental,Boat or Ferry,Bookstore,Boutique,Bowling Alley,Breakfast Spot,Brewery,Burger Joint,Bus Station,Bus Stop,Business Service,Cafeteria,Café,Canal Lock,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Concert Hall,Convenience Store,Cuban Restaurant,Deli / Bodega,Department Store,Diner,Discount Store,Dive Bar,Donut Shop,Dry Cleaner,Duty-free Shop,Electronics Store,Ethiopian Restaurant,Event Space,Farmers Market,Fast Food Restaurant,Food,Food Court,Food Truck,Furniture / Home Store,Garden,Gay Bar,General Entertainment,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Harbor / Marina,Health & Beauty Service,Home Service,Hot Dog Joint,Hotel,Ice Cream Shop,Indian Restaurant,Intersection,Italian Restaurant,Japanese Restaurant,Juice Bar,Karaoke Bar,Lake,Latin American Restaurant,Liquor Store,Market,Martial Arts Dojo,Massage Studio,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Movie Theater,Music Venue,New American Restaurant,Park,Performing Arts Venue,Pharmacy,Piercing Parlor,Pizza Place,Playground,Poutine Place,Pub,Recording Studio,Rental Car Location,Restaurant,Salad Place,Sandwich Place,Sausage Shop,Seafood Restaurant,Shopping Plaza,Skating Rink,Snack Place,Soccer Field,Southern / Soul Food Restaurant,Spa,Sporting Goods Shop,Sports Bar,Supermarket,Sushi Restaurant,Thai Restaurant,Theater,Thrift / Vintage Store,Trattoria/Osteria,Vegetarian / Vegan Restaurant,Video Store,Vietnamese Restaurant,Waterfront,Women's Store,Yoga Studio
0,Genesee-Moselle,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Genesee-Moselle,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Genesee-Moselle,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Genesee-Moselle,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Allentown,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


Next, let's group rows by neighborhood and take the mean of the frequency of occurrence of each category.

In [195]:
Buffalo_grouped = Buffalo_onehot.groupby('Neighborhood').mean().reset_index()
Buffalo_grouped.shape

(33, 120)
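Because each one-hot row contains a single 1, the group mean of a category column equals the share of a neighborhood's venues that fall in that category. The sketch below reproduces that quantity with `collections.Counter` on a tiny hand-made list; the neighborhood labels and venues in `toy_venues` are made up for illustration, and `category_frequencies` is a hypothetical helper.

```python
from collections import Counter, defaultdict

# Made-up (neighborhood, category) pairs, one per venue.
toy_venues = [
    ("A", "Bar"), ("A", "Bar"), ("A", "Coffee Shop"), ("A", "Park"),
    ("B", "Coffee Shop"), ("B", "Diner"),
]


def category_frequencies(venues):
    """Share of each category among a neighborhood's venues."""
    counts = defaultdict(Counter)
    for hood, category in venues:
        counts[hood][category] += 1
    return {
        hood: {cat: n / sum(c.values()) for cat, n in c.items()}
        for hood, c in counts.items()
    }


freqs = category_frequencies(toy_venues)
print(freqs["A"])
print(freqs["B"])
```

One consequence is that a neighborhood with very few venues can show a high frequency for a category on the strength of a single venue, which is worth keeping in mind when reading the tables that follow.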

In [196]:
num_top_venues = 5

for hood in Buffalo_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = Buffalo_grouped[Buffalo_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Allentown----
                venue  freq
0                 Bar  0.14
1  Mexican Restaurant  0.05
2         Coffee Shop  0.05
3      Sandwich Place  0.05
4            Dive Bar  0.05


----Black Rock----
             venue  freq
0    Deli / Bodega  0.18
1      Pizza Place  0.09
2      Music Venue  0.09
3  Thai Restaurant  0.09
4             Park  0.09


----Broadway Fillmore----
               venue  freq
0       Intersection  0.13
1  Convenience Store  0.07
2               Bank  0.07
3      Bowling Alley  0.07
4  Electronics Store  0.07


----Central----
             venue  freq
0    Boat or Ferry  0.17
1  Harbor / Marina  0.17
2             Park  0.17
3       Food Truck  0.08
4              Bar  0.08


----Delavan Grider----
            venue  freq
0     Coffee Shop  0.12
1      Food Court  0.12
2  Discount Store  0.12
3  Sandwich Place  0.12
4     Dry Cleaner  0.12


----Ellicott----
                       venue  freq
0                 Food Truck  0.43
1                Pizza Plac

In [197]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [198]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = Buffalo_grouped['Neighborhood']

for ind in np.arange(Buffalo_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(Buffalo_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Allentown,Bar,Coffee Shop,Dive Bar,Sandwich Place,Hotel,Italian Restaurant,Mexican Restaurant,American Restaurant,Cuban Restaurant,New American Restaurant
1,Black Rock,Deli / Bodega,Thai Restaurant,Music Venue,Antique Shop,Pizza Place,Art Museum,Harbor / Marina,Canal Lock,Park,Martial Arts Dojo
2,Broadway Fillmore,Intersection,Electronics Store,Bank,Discount Store,Park,Market,Indian Restaurant,Bar,Bowling Alley,Theater
3,Central,Park,Boat or Ferry,Harbor / Marina,Bar,Brewery,Event Space,Food Truck,General Entertainment,Waterfront,Cuban Restaurant
4,Delavan Grider,Dry Cleaner,Coffee Shop,Convenience Store,Food Court,Intersection,Sandwich Place,Fast Food Restaurant,Discount Store,Yoga Studio,Concert Hall
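The column labels above come from a three-element suffix list plus a "th" fallback. That is enough for ranks 1 through 10, but it would label rank 21 as "21th". If more columns are ever needed, a general helper along the following lines could build the labels; `ordinal` is a hypothetical name, not part of the original notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 12 -> '12th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return "{}{}".format(n, suffix)


labels = ["{} Most Common Venue".format(ordinal(i)) for i in range(1, 11)]
print(labels[0], "|", labels[9])
print(ordinal(21), ordinal(112))
```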


In [199]:
neighborhoods_venues_sorted.shape

(33, 11)

### Using KMeans to cluster the neighborhoods

In [200]:
# set number of clusters
kclusters = 4

Buffalo_grouped_clustering = Buffalo_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(Buffalo_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_

array([2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 1, 2, 0, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 3, 2, 1, 1, 2, 2, 2], dtype=int32)
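The label array above is heavily unbalanced, and counting the labels makes that explicit. The list below is copied from the output above; the snippet itself is only an illustrative check and uses nothing beyond `collections.Counter`.

```python
from collections import Counter

# Cluster labels copied from the kmeans.labels_ output above.
labels = [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 1, 2, 0, 2, 2, 2, 2, 2, 2,
          2, 2, 2, 2, 3, 2, 1, 1, 2, 2, 2]

sizes = Counter(labels)
for cluster in sorted(sizes):
    print(cluster, sizes[cluster])
# 0 1
# 1 4
# 2 27
# 3 1
```

With 27 of 33 neighborhoods in a single cluster and two clusters holding one neighborhood each, it may be worth trying other values of `kclusters`, for example by comparing `kmeans.inertia_` across several runs.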

In [201]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

In [202]:
neighborhoods_venues_sorted.shape

(33, 12)

In [203]:
Buffalo_merged = buf_neighborhoods

In [204]:
# merge buf_neighborhoods with neighborhoods_venues_sorted to add latitude/longitude for each neighborhood
Buffalo_merged.rename(columns = {'neighborhood':'Neighborhood'},inplace = True)
Buffalo_merged = Buffalo_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

In [205]:
Buffalo_merged = Buffalo_merged.dropna()

In [206]:
Buffalo_merged['Cluster Labels'] = Buffalo_merged['Cluster Labels'].astype(int)

In [207]:
Buffalo_merged.dtypes

Neighborhood               object
longitude                 float64
latitude                  float64
Cluster Labels              int64
1st Most Common Venue      object
2nd Most Common Venue      object
3rd Most Common Venue      object
4th Most Common Venue      object
5th Most Common Venue      object
6th Most Common Venue      object
7th Most Common Venue      object
8th Most Common Venue      object
9th Most Common Venue      object
10th Most Common Venue     object
dtype: object

In [208]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(Buffalo_merged['latitude'], Buffalo_merged['longitude'], Buffalo_merged['Neighborhood'], Buffalo_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
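
In the color scheme above, the `ys` list only contributes its length, so the palette is simply `kclusters` evenly spaced colors from the rainbow colormap. A minimal sketch of that idea, assuming `kclusters = 4` (inferred from the labels 0 to 3 seen earlier, not stated in this section) and indexing the palette directly by cluster label as one possible simplification:

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 4  # assumption: labels 0..3 appear in the k-means output

# Sample kclusters evenly spaced RGBA colors from the rainbow colormap
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))

# Convert each RGBA row to a hex string of the form '#rrggbb'
rainbow = [colors.rgb2hex(c) for c in colors_array]

# Look up one color per cluster label by direct indexing
for cluster in range(kclusters):
    print(cluster, rainbow[cluster])
```

Each cluster label then maps to a distinct marker color without relying on negative-index wraparound for cluster 0.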

## Examine Clusters

In [209]:
Buffalo_merged.loc[Buffalo_merged['Cluster Labels'] == 0, ]

Unnamed: 0,Neighborhood,longitude,latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
14,Kenfield,-78.80592,42.927176,0,Department Store,Snack Place,Yoga Studio,Clothing Store,Coffee Shop,Concert Hall,Convenience Store,Cuban Restaurant,Deli / Bodega,Diner


In [210]:
Buffalo_merged.loc[Buffalo_merged['Cluster Labels'] == 1, ]

Unnamed: 0,Neighborhood,longitude,latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
29,University Heights,-78.815385,42.952562,1,Coffee Shop,Pharmacy,Convenience Store,Athletics & Sports,Bus Stop,Electronics Store,Concert Hall,Cuban Restaurant,Deli / Bodega,Department Store
30,Hamlin Park,-78.849797,42.924102,1,Coffee Shop,Cafeteria,Yoga Studio,Clothing Store,Concert Hall,Convenience Store,Cuban Restaurant,Deli / Bodega,Department Store,Diner
31,Fruit Belt,-78.86035,42.8994,1,Hotel,Coffee Shop,Cafeteria,Donut Shop,Electronics Store,Concert Hall,Convenience Store,Cuban Restaurant,Deli / Bodega,Department Store
32,South Park,-78.807858,42.844979,1,Coffee Shop,Pharmacy,Bookstore,Golf Course,Sandwich Place,Farmers Market,Cocktail Bar,Concert Hall,Convenience Store,Cuban Restaurant


We noticed that in all four neighborhoods of cluster 1 (University Heights, Hamlin Park, Fruit Belt and South Park), Coffee Shop is the first or second most common venue category. This suggests that coffee shops are already well established in these neighborhoods, which may indicate strong local demand for coffee.
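
The observation about cluster 1 can be checked programmatically by finding the rank at which Coffee Shop appears for each neighborhood. This is a minimal sketch using only the first three venue columns copied from the cluster 1 table above; the frame `cluster1`, the helper `coffee_rank`, and the column `Coffee Shop Rank` are names introduced here for illustration.

```python
import pandas as pd

# Top-3 venue columns for cluster 1, copied from the table above
cluster1 = pd.DataFrame({
    "Neighborhood": ["University Heights", "Hamlin Park", "Fruit Belt", "South Park"],
    "1st Most Common Venue": ["Coffee Shop", "Coffee Shop", "Hotel", "Coffee Shop"],
    "2nd Most Common Venue": ["Pharmacy", "Cafeteria", "Coffee Shop", "Pharmacy"],
    "3rd Most Common Venue": ["Convenience Store", "Yoga Studio", "Cafeteria", "Bookstore"],
})

# Venue columns in rank order (1st, 2nd, 3rd, ...)
venue_cols = [c for c in cluster1.columns if c.endswith("Most Common Venue")]

def coffee_rank(row):
    """Return the 1-based rank of 'Coffee Shop' in the row, or None if absent."""
    for rank, col in enumerate(venue_cols, start=1):
        if row[col] == "Coffee Shop":
            return rank
    return None

cluster1["Coffee Shop Rank"] = cluster1.apply(coffee_rank, axis=1)
print(cluster1[["Neighborhood", "Coffee Shop Rank"]])
```

Applied to the full `Buffalo_merged` frame, the same helper would give a rank for every neighborhood, which makes it easier to compare cluster 1 against the other clusters.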

In [211]:
Buffalo_merged.loc[Buffalo_merged['Cluster Labels'] == 2, ]

Unnamed: 0,Neighborhood,longitude,latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Genesee-Moselle,-78.819049,42.902022,2,Clothing Store,Convenience Store,Diner,Liquor Store,Grocery Store,Coffee Shop,Concert Hall,Cuban Restaurant,Deli / Bodega,Department Store
1,Allentown,-78.877653,42.898438,2,Bar,Coffee Shop,Dive Bar,Sandwich Place,Hotel,Italian Restaurant,Mexican Restaurant,American Restaurant,Cuban Restaurant,New American Restaurant
2,West Hertel,-78.887202,42.956249,2,Diner,Pizza Place,Furniture / Home Store,Bus Station,Bar,Duty-free Shop,Coffee Shop,Concert Hall,Convenience Store,Cuban Restaurant
3,Central,-78.874639,42.867918,2,Park,Boat or Ferry,Harbor / Marina,Bar,Brewery,Event Space,Food Truck,General Entertainment,Waterfront,Cuban Restaurant
4,Elmwood Bryant,-78.878727,42.910361,2,Gourmet Shop,Yoga Studio,New American Restaurant,Restaurant,Beer Store,Greek Restaurant,Health & Beauty Service,Dry Cleaner,Japanese Restaurant,Dive Bar
5,Fillmore-Leroy,-78.840822,42.931245,2,Music Venue,Auto Garage,Sandwich Place,Discount Store,Yoga Studio,Duty-free Shop,Coffee Shop,Concert Hall,Convenience Store,Cuban Restaurant
6,Delavan Grider,-78.83063,42.922558,2,Dry Cleaner,Coffee Shop,Convenience Store,Food Court,Intersection,Sandwich Place,Fast Food Restaurant,Discount Store,Yoga Studio,Concert Hall
7,Lovejoy,-78.812847,42.890601,2,Pizza Place,Convenience Store,Diner,Liquor Store,Greek Restaurant,Cocktail Bar,Coffee Shop,Concert Hall,Cuban Restaurant,Deli / Bodega
8,Black Rock,-78.904081,42.935479,2,Deli / Bodega,Thai Restaurant,Music Venue,Antique Shop,Pizza Place,Art Museum,Harbor / Marina,Canal Lock,Park,Martial Arts Dojo
10,Lower West Side,-78.886988,42.894446,2,Latin American Restaurant,Liquor Store,Gym,Coffee Shop,Rental Car Location,Discount Store,Pharmacy,Food Court,Food,Cocktail Bar


In [212]:
Buffalo_merged.loc[Buffalo_merged['Cluster Labels'] == 3,]

Unnamed: 0,Neighborhood,longitude,latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
20,Seneca Babcock,-78.836897,42.864293,3,Bakery,Yoga Studio,Electronics Store,Coffee Shop,Concert Hall,Convenience Store,Cuban Restaurant,Deli / Bodega,Department Store,Diner


## Conclusion