<h2><center>Analyzing Location Data to Suggest a New Restaurant - Python Data Science</center></h2>

---

<div class="section-inner sectionLayout--insetColumn">
<h3 name="fdbd" class="graf graf--h3 graf--leading">
<strong class="markup--strong markup--h3-strong">The Business Plan: Short Summary</strong></h3>
<p name="ca01" class="graf graf--p graf-after--h3">

In this project, I use the Foursquare API to retrieve
<strong class="markup--strong markup--p-strong">location data for the Boston, MA</strong> area.
This data lets us report on the city's neighborhoods and advise interested parties
<strong class="markup--strong markup--p-strong">looking to open a restaurant</strong>.</p>


### Getting Started

In [1]:
# import the proper libraries
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analysis

import json # library to handle JSON files
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
import requests # library to handle requests

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans
import folium # map rendering library

print('Libraries imported.')


Libraries imported.


#### Download and Explore Dataset

In [2]:
import wget

## Neighborhood boundary data (Esri JSON returned by the ArcGIS REST query)

wget.download(
    'https://mapservices.bostonredevelopmentauthority.org/arcproxy/arcgis/rest/services/Maps/Bos_Neighborhoods_2018/mapserver/0/query?where=1%3D1&outFields=OBJECTID,Name,Neighborhood_ID,SqMiles&outSR=4326&f=json',
    'boston_data.json'
)

'boston_data.json'
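
The query string in the download URL is easier to read when it is built from a dictionary. The sketch below rebuilds the same URL with the standard library; the names `base_url`, `query_params`, and `query_url` are illustrative and not part of the original notebook.

```python
from urllib.parse import urlencode

# Endpoint taken from the download cell above
base_url = ('https://mapservices.bostonredevelopmentauthority.org/arcproxy/arcgis/'
            'rest/services/Maps/Bos_Neighborhoods_2018/mapserver/0/query')

query_params = {
    'where': '1=1',  # condition that is always true, so every record is returned
    'outFields': 'OBJECTID,Name,Neighborhood_ID,SqMiles',  # attributes to include
    'outSR': 4326,   # output spatial reference: WGS 84 longitude/latitude
    'f': 'json',     # response format
}

# safe=',' keeps the commas in outFields unescaped, matching the original URL
query_url = base_url + '?' + urlencode(query_params, safe=',')
print(query_url)
```

The resulting `query_url` can be passed to `wget.download` in place of the literal string.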

In [3]:
with open('boston_data.json') as json_data:
    boston_data = json.load(json_data)

Looking at the raw data, the **features** key holds the relevant neighborhood data.

In [4]:
boston_neighborhoods = boston_data['features']
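
To make the shape of each entry concrete, here is a minimal sketch using a hand-made record. The keys follow the ones accessed in the loop further down, and the values are copied from outputs shown in this notebook; the record is trimmed, and the value types (for example, whether `Neighborhood_ID` is a number or a string) are assumptions.

```python
# Hand-made stand-in for one element of boston_data['features'] (trimmed)
sample_feature = {
    'attributes': {'Name': 'Roslindale', 'Neighborhood_ID': 15, 'SqMiles': 2.51},
    'geometry': {
        # 'rings' is a list of polygon rings; each ring is a list of vertices
        'rings': [[
            [-71.12583076676759, 42.272212845889705],
            [-71.1257672032289, 42.27231595853639],
            [-71.12588130976181, 42.27211361071484],
            [-71.12583076676759, 42.272212845889705],
        ]],
    },
}

first_ring = sample_feature['geometry']['rings'][0]

# Each vertex is ordered [longitude, latitude], not [latitude, longitude]
lon, lat = first_ring[0]
print(sample_feature['attributes']['Name'], len(first_ring), lat, lon)
```

Two details matter for the later steps: a ring holds many vertices rather than one point, and longitude comes first in each pair.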

#### Transform the data into a *pandas* dataframe

In [5]:
# define the dataframe columns
column_names = ['Neighborhood', 'Neighborhood_ID', 'SqMiles','Latitude', 'Longitude']

# instantiate the dataframe
neighborhoods = pd.DataFrame(columns=column_names)

Let's loop through the data and fill the dataframe one row at a time.

In [16]:
for data in boston_neighborhoods:
    neighborhood_name = data['attributes']['Name']
    neighborhood_id = data['attributes']['Neighborhood_ID']
    neighborhood_size = data['attributes']['SqMiles']
    # keep only the first polygon ring: a list of [longitude, latitude] vertices
    neighborhood_latlon = data['geometry']['rings'][:1]
    # both columns receive the full ring, not a single coordinate
    neighborhood_lat = neighborhood_latlon[0]
    neighborhood_lon = neighborhood_latlon[0]

    # DataFrame.append was removed in pandas 2.0; pd.concat adds the row instead
    new_row = pd.DataFrame([{'Neighborhood': neighborhood_name,
                             'Neighborhood_ID': neighborhood_id,
                             'SqMiles': neighborhood_size,
                             'Latitude': neighborhood_lat,
                             'Longitude': neighborhood_lon}])
    neighborhoods = pd.concat([neighborhoods, new_row], ignore_index=True)

Quickly examine the resulting dataframe.

In [17]:
neighborhoods.head(10)


| | Neighborhood | Neighborhood_ID | SqMiles | Latitude | Longitude |
|---|---|---|---|---|---|
| 0 | Roslindale | 15 | 2.51 | [[-71.12583076676759, 42.272212845889705], [-7... | [[-71.12592717485386, 42.272013107957406], [-7... |
| 1 | Roslindale | 15 | 2.51 | [[-71.12583076676759, 42.272212845889705], [-7... | [[-71.12592717485386, 42.272013107957406], [-7... |
| 2 | Roslindale | 15 | 2.51 | [-71.1257672032289, 42.27231595853639] | [-71.12583076676759, 42.272212845889705] |
| 3 | Roslindale | 15 | 2.51 | [[-71.12592717485386, 42.272013107957406], [-7... | [[-71.12592717485386, 42.272013107957406], [-7... |
| 4 | Jamaica Plain | 11 | 3.94 | [[-71.10499218689807, 42.326101682808066], [-7... | [[-71.10499218689807, 42.326101682808066], [-7... |
| 5 | Mission Hill | 13 | 0.55 | [[-71.0904343142608, 42.33576996328494], [-71.... | [[-71.0904343142608, 42.33576996328494], [-71.... |
| 6 | Longwood | 28 | 0.29 | [[-71.09810894210769, 42.33673037764089], [-71... | [[-71.09810894210769, 42.33673037764089], [-71... |
| 7 | Bay Village | 33 | 0.04 | [[-71.06662924918761, 42.34878268384542], [-71... | [[-71.06662924918761, 42.34878268384542], [-71... |
| 8 | Leather District | 27 | 0.02 | [[-71.05837839326242, 42.349831092881075], [-7... | [[-71.05837839326242, 42.349831092881075], [-7... |
| 9 | Chinatown | 26 | 0.12 | [[-71.0579055147603, 42.35237863170756], [-71.... | [[-71.0579055147603, 42.35237863170756], [-71.... |


Count the distinct neighborhoods in Boston. Because the dataframe can contain repeated rows, we count unique names rather than rows.

In [18]:
print('Boston has a total of {} neighborhoods.'.format(
        len(neighborhoods['Neighborhood'].unique())
))

Boston has a total of 26 neighborhoods.


#### Using the geopy library to get the latitude and longitude values of Boston.

To create a geocoder instance, we must supply a user_agent string. We
name our agent <em>boston_explores</em>, as shown below.

In [19]:
address = 'Boston, MA'

geolocator = Nominatim(user_agent="boston_explores")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude

print('The geographical coordinates of Boston are {}, {}.'.format(latitude, longitude))


The geographical coordinates of Boston are 42.3602534, -71.0582912.
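
The geocoder's `geocode` method returns `None` when it cannot resolve an address, in which case `location.latitude` would raise an `AttributeError`. The sketch below guards against that. The helper `geocode_or_default` and the stand-in `fake_geocode` are hypothetical names introduced only for illustration; the stand-in lets the example run without network access.

```python
from types import SimpleNamespace


def geocode_or_default(geocode, address, default):
    """Return (latitude, longitude) for address, or default when nothing is found."""
    location = geocode(address)
    if location is None:
        return default
    return location.latitude, location.longitude


def fake_geocode(address):
    """Offline stand-in for a geocoder: knows a single address, else returns None."""
    known = {
        'Boston, MA': SimpleNamespace(latitude=42.3602534, longitude=-71.0582912),
    }
    return known.get(address)


boston = geocode_or_default(fake_geocode, 'Boston, MA', default=None)
missing = geocode_or_default(fake_geocode, 'Nowhere', default=(0.0, 0.0))
print(boston, missing)
```

In the notebook, `geolocator.geocode` would be passed in place of `fake_geocode`.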


#### Create a map of Boston with neighborhoods superimposed on top.

In [20]:
# create map of Boston using latitude and longitude values
map_boston = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, neighborhood_name, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'],
                                                     neighborhoods['Neighborhood'], neighborhoods['Neighborhood_ID']):
    label = '{}, {}'.format(neighborhood, neighborhood_name)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_boston)

map_boston


ValueError: Location should consist of two numerical values, but [[-71.12583076676759, 42.272212845889705], [-71.1257672032289, 42.27231595853639], [-71.12588130976181, 42.27211361071484], [-71.12583076676759, 42.272212845889705]] of type <class 'list'> is not convertible to float.
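
The error arises because the `Latitude` and `Longitude` columns each hold an entire polygon ring, while a map marker needs one latitude and one longitude. One possible remedy, sketched below, is to reduce each ring to a single representative point by averaging its vertices. This is an assumption about how to proceed, not the notebook's own method, and the vertex mean is only a rough stand-in for a true area centroid. The function name `ring_center` is hypothetical; the ring is the one printed in the error message.

```python
def ring_center(ring):
    """Average the vertices of one ring given as [longitude, latitude] pairs.

    Returns (latitude, longitude), the order a map marker expects.
    """
    # A closed ring repeats its first vertex at the end; skip the duplicate
    if len(ring) > 1 and ring[0] == ring[-1]:
        points = ring[:-1]
    else:
        points = ring
    lon = sum(point[0] for point in points) / len(points)
    lat = sum(point[1] for point in points) / len(points)
    return lat, lon


# Ring copied from the ValueError message above
ring = [
    [-71.12583076676759, 42.272212845889705],
    [-71.1257672032289, 42.27231595853639],
    [-71.12588130976181, 42.27211361071484],
    [-71.12583076676759, 42.272212845889705],
]

center_lat, center_lon = ring_center(ring)
print(center_lat, center_lon)
```

Storing `center_lat` and `center_lon` in the dataframe, instead of the ring itself, would give the marker loop two plain numbers per neighborhood.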