I.	Introduction
    New York City, "the city that never sleeps," often simply called New York, is the most populous and most densely populated major city in the United States.  Situated on one of the world's largest natural harbors, it is composed of five boroughs: Brooklyn, Queens, Manhattan, the Bronx, and Staten Island.  According to the record, 62.7 million tourists visited New York City in 2017, and Manhattan in particular attracts a large volume of visitors from all over the world every year.  Besides the best-known places such as Times Square, the Broadway Theater District, Rockefeller Center, and Wall Street, there are many other places to visit and things to do in New York City. First-time visitors often need guidance in choosing a place to stay, places to visit, places to shop, and so on.


II.	Business Problem
    In response to this demand and the growth of travel accommodation, a travel agency is starting a project to prepare a list of hotels, shopping centers, places to visit, and restaurants, along with ratings, addresses, and a map, for its clients.  With this list, clients can prepare in advance and know what to do and where to go when they arrive in New York City, the city that never sleeps.
This project is particularly useful for visitors who are coming to New York City for the first time or returning after many years.


III.	Target Audience

This project targets tourists who are not familiar with New York City or have never been there before.


IV.	Data
To get all this information, we will need the following data:
-	New York City data containing the neighborhoods and boroughs along with their latitude and longitude coordinates.  This is required to plot the map and get the venue data.
-	Venue data for hotels, restaurants, coffee shops, and shopping centers, so that we can analyze and explore New York City.
The following data sources are used to extract this information:
-	The New York City data with latitude and longitude coordinates comes from the source used in module 3 of this course:
https://cocl.us/new_york_dataset (newyork_data.json)
-	The Foursquare API is used to get the venue data for the neighborhoods.  It provides many of the venue categories needed for this project, such as hotels, food, and places to visit.
-	This project requires web scraping (open-source dataset), working with the Foursquare API, data cleaning, data wrangling, map visualization with Folium, and plotting with matplotlib.
-	The detailed data analysis is defined in the next section, Methodology.


V.   Methodology

- Data will be collected from https://cocl.us/new_york_dataset, then cleaned and processed into a dataframe.

- Foursquare will be used to locate all venues, which are then filtered by restaurants, hotels, shopping centers, and coffee shops; likes by users will be counted and added to the dataframe.

- Data will be sorted based on rankings.

- Finally, the data will be visually assessed using graphs from Python libraries.

Let's start by importing all the required libraries

In [1]:
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analysis

pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes
import folium # map rendering library

print('Libraries imported.')


Load and explore the data
For convenience, I am using the JSON file that was downloaded in the week 3 lab exercise.

In [3]:
# loading the data
with open('newyork_data.json') as json_data:
    newyork_data = json.load(json_data)
    

Transform the data into a Pandas Dataframe
And then loop through the data and fill the data frame one row at a time

In [4]:
neighborhoods_data = newyork_data['features'] # define a new variable that includes this data
neighborhoods_data[0] # take a look at the first item in this list

#define the dataframe columns
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude']
#instantiate the dataframe
neighborhoods = pd.DataFrame(columns=column_names)

#loop through the data and fill the dataframe one row at a time
for data in neighborhoods_data:
    borough = data['properties']['borough']
    neighborhood_name = data['properties']['name']
    
    neighborhood_latlon = data['geometry']['coordinates']
    neighborhood_lat = neighborhood_latlon[1]
    neighborhood_lon = neighborhood_latlon[0]
    
    neighborhoods = neighborhoods.append({'Borough': borough,
                                         'Neighborhood': neighborhood_name,
                                         'Latitude': neighborhood_lat,
                                         'Longitude': neighborhood_lon}, ignore_index=True)
 
# examine the resulting dataframe
neighborhoods.head(10)

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Bronx,Wakefield,40.894705,-73.847201
1,Bronx,Co-op City,40.874294,-73.829939
2,Bronx,Eastchester,40.887556,-73.827806
3,Bronx,Fieldston,40.895437,-73.905643
4,Bronx,Riverdale,40.890834,-73.912585
5,Bronx,Kingsbridge,40.881687,-73.902818
6,Manhattan,Marble Hill,40.876551,-73.91066
7,Bronx,Woodlawn,40.898273,-73.867315
8,Bronx,Norwood,40.877224,-73.879391
9,Bronx,Williamsbridge,40.881039,-73.857446
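
Appending to a dataframe inside a loop copies the whole frame on every iteration, and `DataFrame.append` was removed in pandas 2.0. A minimal sketch of an alternative, assuming every feature has the same `properties` and `geometry` layout used above: collect plain dictionaries first, then build the dataframe once. The `feature_to_row` helper and the `sample_features` list are illustrative names that are not part of the original notebook; the two coordinate pairs are copied from the output above.

```python
# Hypothetical helper: turn one GeoJSON feature into a flat row.
def feature_to_row(feature):
    # GeoJSON stores point coordinates as [longitude, latitude]
    lon, lat = feature['geometry']['coordinates'][:2]
    return {
        'Borough': feature['properties']['borough'],
        'Neighborhood': feature['properties']['name'],
        'Latitude': lat,
        'Longitude': lon,
    }

# Illustrative stand-in for newyork_data['features'] (two entries only)
sample_features = [
    {'properties': {'borough': 'Bronx', 'name': 'Wakefield'},
     'geometry': {'coordinates': [-73.847201, 40.894705]}},
    {'properties': {'borough': 'Manhattan', 'name': 'Marble Hill'},
     'geometry': {'coordinates': [-73.91066, 40.876551]}},
]

rows = [feature_to_row(f) for f in sample_features]
print(rows[0])

# With pandas available, the dataframe can then be built in one step:
# neighborhoods = pd.DataFrame(rows, columns=['Borough', 'Neighborhood', 'Latitude', 'Longitude'])
```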


In [5]:
# make sure the dataset has all 5 boroughs and 306 neighborhoods
print('The dataframe has {} boroughs and {} neighborhoods.'.format(
    len(neighborhoods['Borough'].unique()),
    neighborhoods.shape[0]
))

The dataframe has 5 boroughs and 306 neighborhoods.
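
The totals confirm that nothing is missing; a per-borough breakdown is a natural follow-up check. A small sketch using only the ten rows displayed earlier, so the counts describe that preview and not the full 306-row dataset. `preview_boroughs` is an illustrative name.

```python
from collections import Counter

# Borough column of the ten preview rows shown by neighborhoods.head(10)
preview_boroughs = ['Bronx', 'Bronx', 'Bronx', 'Bronx', 'Bronx', 'Bronx',
                    'Manhattan', 'Bronx', 'Bronx', 'Bronx']

borough_counts = Counter(preview_boroughs)
for borough, count in borough_counts.most_common():
    print('{}: {} neighborhoods'.format(borough, count))

# On the full dataframe, the pandas equivalent would be:
# neighborhoods['Borough'].value_counts()
```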


Use geopy library to get the latitude and longitude values of New York City

In [6]:
address = 'New York City, NY'

geolocator = Nominatim(user_agent='ny_explorer')
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of New York City are {}, {}.'.format(latitude, longitude))

The geographical coordinates of New York City are 40.7127281, -74.0060152.


Create a New York City map with neighborhoods superimposed on top

In [7]:
# create map  of New York using latitude and longitude values
map_newyork = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['Borough'], neighborhoods['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
          [lat, lng],
          radius=5,
          popup=label,
          color='blue',
          fill=True,
          fill_color='#3186cc',
          fill_opacity=0.7,
          parse_html=False).add_to(map_newyork)
    
map_newyork

For this project, we concentrate on the borough of Manhattan for tourists, so let's slice the original dataframe and create a new dataframe with the Manhattan data.

In [8]:
manhattan_data = neighborhoods[neighborhoods['Borough'] == 'Manhattan'].reset_index(drop=True)
manhattan_data.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Manhattan,Marble Hill,40.876551,-73.91066
1,Manhattan,Chinatown,40.715618,-73.994279
2,Manhattan,Washington Heights,40.851903,-73.9369
3,Manhattan,Inwood,40.867684,-73.92121
4,Manhattan,Hamilton Heights,40.823604,-73.949688


Get the geographical coordinates of Manhattan

In [18]:
address = 'Manhattan, NY'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinate of Manhattan are {}, {}.'.format(latitude, longitude))

The geographical coordinate of Manhattan are 40.7896239, -73.9598939.
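
`geolocator.geocode` returns `None` when it cannot resolve an address, in which case reading `location.latitude` raises an `AttributeError`. A defensive sketch follows. `resolve_coordinates` and the stub geocoder are hypothetical names, used so the example runs without network access; in the notebook, `geolocator.geocode` would be passed in place of the stub.

```python
from collections import namedtuple

# Minimal stand-in for the object returned by a geocoder
Location = namedtuple('Location', ['latitude', 'longitude'])

def resolve_coordinates(geocode, address):
    """Return (latitude, longitude) or raise a clear error if the lookup fails."""
    location = geocode(address)
    if location is None:
        raise ValueError('Could not geocode address: {!r}'.format(address))
    return location.latitude, location.longitude

# Stub geocoder for illustration; it only knows the address used above
def stub_geocode(address):
    known = {'Manhattan, NY': Location(40.7896239, -73.9598939)}
    return known.get(address)

demo_lat, demo_lng = resolve_coordinates(stub_geocode, 'Manhattan, NY')
print(demo_lat, demo_lng)
```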


Now we can visualize the Manhattan neighborhoods on the map below:

In [19]:
# create map of Manhattan using latitude and longitude values
map_manhattan = folium.Map(location=[latitude, longitude], zoom_start=11)

for lat, lng, label in zip(manhattan_data['Latitude'], manhattan_data['Longitude'], manhattan_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_manhattan)

map_manhattan

Now that we have the Manhattan map, we can start using the Foursquare API to explore the neighborhoods.
Define Foursquare Credentials and Version

In [1]:
CLIENT_ID = 'HCKAPL0F5HQZMWMDE3ZY52QVJCLKD5K2AI4BQPKUH2W0P0GC' # input my Foursquare ID
CLIENT_SECRET = 'RX0PUSJJIBKDDP5TU0HIJW105XBDLWIJSVEZHY11SBPSE4OU' # input my Foursquare secret
VERSION = '20180605' # Foursquare API version
LIMIT=100 # a default Foursquare API limit value

print('My credentials:')
print('My CLIENT_ID: ' + CLIENT_ID)
print('VERSION:' + VERSION)

My credentials:
My CLIENT_ID: HCKAPL0F5HQZMWMDE3ZY52QVJCLKD5K2AI4BQPKUH2W0P0GC
VERSION:20180605
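
Hard-coding credentials in a notebook exposes them to anyone who can read the file or its output. One common precaution, sketched here under the assumption that the keys are exported as environment variables, is to read them at run time. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET`, and the helpers `load_credentials` and `mask`, are hypothetical choices for this example.

```python
import os

def load_credentials(env=os.environ):
    """Read the Foursquare credentials from environment variables."""
    client_id = env.get('FOURSQUARE_CLIENT_ID')
    client_secret = env.get('FOURSQUARE_CLIENT_SECRET')
    if not client_id or not client_secret:
        raise RuntimeError('Foursquare credentials are not set in the environment')
    return client_id, client_secret

def mask(value, visible=4):
    """Show only the first few characters when printing a credential."""
    return value[:visible] + '*' * (len(value) - visible)

# Example with an explicit mapping instead of the real environment
demo_env = {'FOURSQUARE_CLIENT_ID': 'EXAMPLE_ID_123',
            'FOURSQUARE_CLIENT_SECRET': 'EXAMPLE_SECRET_456'}
demo_id, demo_secret = load_credentials(demo_env)
print('My CLIENT_ID: ' + mask(demo_id))
```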


Search for all hotels within a 1000-meter radius of the Manhattan coordinates

In [3]:
search_query = 'Hotel'
radius = 1000
print('Searching for ' + search_query)

Searching for Hotel


Define the corresponding URL for the hotel search

In [31]:
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
url

'https://api.foursquare.com/v2/venues/search?client_id=HCKAPL0F5HQZMWMDE3ZY52QVJCLKD5K2AI4BQPKUH2W0P0GC&client_secret=RX0PUSJJIBKDDP5TU0HIJW105XBDLWIJSVEZHY11SBPSE4OU&ll=40.7896239,-73.9598939&v=20180605&query=Hotel&radius=1000&limit=100'
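
Formatting the query string by hand works for simple terms but does not escape spaces or special characters. A sketch of the same URL built with the standard library's `urlencode`, assuming the parameter names shown in the URL above. `build_search_url` is an illustrative helper, and the credentials in the example call are placeholders.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = 'https://api.foursquare.com/v2/venues/search'

def build_search_url(client_id, client_secret, lat, lng, query,
                     radius=1000, limit=100, version='20180605'):
    """Assemble a venue-search URL with properly escaped parameters."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'll': '{},{}'.format(lat, lng),
        'v': version,
        'query': query,
        'radius': radius,
        'limit': limit,
    }
    # safe=',' keeps the comma in the ll parameter readable
    return SEARCH_ENDPOINT + '?' + urlencode(params, safe=',')

example_url = build_search_url('PLACEHOLDER_ID', 'PLACEHOLDER_SECRET',
                               40.7896239, -73.9598939, 'Coffee Shop')
print(example_url)
```

`requests.get` also accepts a `params` dictionary and encodes it itself, which avoids building the string at all.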

Send the GET request and examine the results of the hotel search

In [32]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5fcbfefe2aa0530e5a44b963'},
 'response': {'venues': [{'id': '4ad78cbff964a520140c21e3',
    'name': 'Hotel Wales',
    'location': {'address': '1295 Madison Ave',
     'crossStreet': '92nd St',
     'lat': 40.7847375,
     'lng': -73.9557131,
     'labeledLatLngs': [{'label': 'display',
       'lat': 40.7847375,
       'lng': -73.9557131}],
     'distance': 648,
     'postalCode': '10128',
     'cc': 'US',
     'city': 'New York',
     'state': 'NY',
     'country': 'United States',
     'formattedAddress': ['1295 Madison Ave (92nd St)',
      'New York, NY 10128',
      'United States']},
    'categories': [{'id': '4bf58dd8d48988d1fa931735',
      'name': 'Hotel',
      'pluralName': 'Hotels',
      'shortName': 'Hotel',
      'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/travel/hotel_',
       'suffix': '.png'},
      'primary': True}],
    'referralId': 'v-1607204606',
    'hasPerk': False},
   {'id': '4bc3a05adce4eee125af719d',
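
The response carries a `meta.code` field (200 in the output above), so it is worth checking it before reading `response.venues`. A hedged sketch, assuming the payload layout shown in that output; `extract_venues` is a hypothetical helper and `sample_payload` is a trimmed copy of the printed response.

```python
def extract_venues(payload):
    """Return the venue list from a search response, or raise if the call failed."""
    code = payload.get('meta', {}).get('code')
    if code != 200:
        raise RuntimeError('Foursquare request failed with code {}'.format(code))
    return payload.get('response', {}).get('venues', [])

# Trimmed copy of the response printed above
sample_payload = {
    'meta': {'code': 200, 'requestId': '5fcbfefe2aa0530e5a44b963'},
    'response': {'venues': [{'id': '4ad78cbff964a520140c21e3',
                             'name': 'Hotel Wales'}]},
}

sample_venues = extract_venues(sample_payload)
print([venue['name'] for venue in sample_venues])
```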
    

Get relevant part of JSON and transform it into a pandas dataframe

In [33]:
# assign relevant part of JSON to venues
venues = results['response']['venues']

# tranform venues into a dataframe
df_hotel = json_normalize(venues)
df_hotel.head()


Unnamed: 0,id,name,categories,referralId,hasPerk,location.address,location.crossStreet,location.lat,location.lng,location.labeledLatLngs,location.distance,location.postalCode,location.cc,location.city,location.state,location.country,location.formattedAddress,venuePage.id
0,4ad78cbff964a520140c21e3,Hotel Wales,"[{'id': '4bf58dd8d48988d1fa931735', 'name': 'H...",v-1607204606,False,1295 Madison Ave,92nd St,40.784737,-73.955713,"[{'label': 'display', 'lat': 40.7847375, 'lng'...",648,10128.0,US,New York,NY,United States,"[1295 Madison Ave (92nd St), New York, NY 1012...",
1,4bc3a05adce4eee125af719d,Hotel 99 Llc,"[{'id': '4bf58dd8d48988d1fa931735', 'name': 'H...",v-1607204606,False,244 W 99th St,,40.79669,-73.970555,"[{'label': 'display', 'lat': 40.79669018312864...",1194,10025.0,US,New York,NY,United States,"[244 W 99th St, New York, NY 10025, United Sta...",
2,514300d3e4b0ed42766e4049,Swimming Pool @ ONE UN Plaza Hotel,"[{'id': '4bf58dd8d48988d105941735', 'name': 'G...",v-1607204606,False,"Manhattan, NY",,40.790278,-73.959722,"[{'label': 'display', 'lat': 40.7902778, 'lng'...",74,,US,New York,NY,United States,"[Manhattan, NY, New York, NY, United States]",
3,57ad7e7b498e76de45bfc3a6,Hotel Bark Ave,"[{'id': '5032897c91d4c4b30a586d69', 'name': 'P...",v-1607204606,False,143 East 103rd,Lexington Ave.,40.79042,-73.948191,"[{'label': 'display', 'lat': 40.79041954961772...",990,10029.0,US,New,NY,United States,"[143 East 103rd (Lexington Ave.), New, NY 1002...",
4,58020e9e38fae0dde3c73496,Hotel Berkers,"[{'id': '50327c8591d4c4b30a586d5d', 'name': 'B...",v-1607204606,False,,,40.78771,-73.95272,"[{'label': 'display', 'lat': 40.78771, 'lng': ...",641,10029.0,US,New York,NY,United States,"[New York, NY 10029, United States]",
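
`json_normalize` flattens nested dictionaries into dotted column names, which is why the table above has columns such as `location.address` and `location.lat`. The pure-Python sketch below approximates that flattening for dictionaries only (lists are left untouched, as in the `categories` column). `flatten` is an illustrative function, not the pandas implementation. Newer pandas releases expose the function as `pd.json_normalize`, and the `pandas.io.json` import path used in this notebook is deprecated.

```python
def flatten(record, parent_key='', sep='.'):
    """Flatten nested dicts into a single dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = parent_key + sep + key if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            # lists and scalars are kept as-is
            flat[full_key] = value
    return flat

# Trimmed venue taken from the response above
venue = {
    'id': '4ad78cbff964a520140c21e3',
    'name': 'Hotel Wales',
    'location': {'address': '1295 Madison Ave', 'lat': 40.7847375, 'lng': -73.9557131},
    'categories': [{'name': 'Hotel'}],
}

flat_venue = flatten(venue)
print(sorted(flat_venue))
```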


Define information of interest and filter dataframe for hotel information

In [35]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in df_hotel.columns if col.startswith('location.')] + ['id']
df1_filtered = df_hotel.loc[:, filtered_columns].copy()  # explicit copy avoids SettingWithCopyWarning on the assignment below

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
df1_filtered['categories'] = df1_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
df1_filtered.columns = [column.split('.')[-1] for column in df1_filtered.columns]

df1_filtered.head()

Unnamed: 0,name,categories,address,crossStreet,lat,lng,labeledLatLngs,distance,postalCode,cc,city,state,country,formattedAddress,id
0,Hotel Wales,Hotel,1295 Madison Ave,92nd St,40.784737,-73.955713,"[{'label': 'display', 'lat': 40.7847375, 'lng'...",648,10128.0,US,New York,NY,United States,"[1295 Madison Ave (92nd St), New York, NY 1012...",4ad78cbff964a520140c21e3
1,Hotel 99 Llc,Hotel,244 W 99th St,,40.79669,-73.970555,"[{'label': 'display', 'lat': 40.79669018312864...",1194,10025.0,US,New York,NY,United States,"[244 W 99th St, New York, NY 10025, United Sta...",4bc3a05adce4eee125af719d
2,Swimming Pool @ ONE UN Plaza Hotel,Gym Pool,"Manhattan, NY",,40.790278,-73.959722,"[{'label': 'display', 'lat': 40.7902778, 'lng'...",74,,US,New York,NY,United States,"[Manhattan, NY, New York, NY, United States]",514300d3e4b0ed42766e4049
3,Hotel Bark Ave,Pet Service,143 East 103rd,Lexington Ave.,40.79042,-73.948191,"[{'label': 'display', 'lat': 40.79041954961772...",990,10029.0,US,New,NY,United States,"[143 East 103rd (Lexington Ave.), New, NY 1002...",57ad7e7b498e76de45bfc3a6
4,Hotel Berkers,Brewery,,,40.78771,-73.95272,"[{'label': 'display', 'lat': 40.78771, 'lng': ...",641,10029.0,US,New York,NY,United States,"[New York, NY 10029, United States]",58020e9e38fae0dde3c73496
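
The keyword search appears to match on venue names, so several results are not hotels at all (a gym pool, a pet service, a brewery). A sketch of filtering on the extracted category, using the five rows displayed above; `preview_rows` is an illustrative name.

```python
# (name, category) pairs copied from the first five rows of df1_filtered
preview_rows = [
    ('Hotel Wales', 'Hotel'),
    ('Hotel 99 Llc', 'Hotel'),
    ('Swimming Pool @ ONE UN Plaza Hotel', 'Gym Pool'),
    ('Hotel Bark Ave', 'Pet Service'),
    ('Hotel Berkers', 'Brewery'),
]

hotels_only = [name for name, category in preview_rows if category == 'Hotel']
print(hotels_only)

# On the dataframe itself, the equivalent filter would be:
# df1_filtered[df1_filtered['categories'] == 'Hotel']
```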


Visualize the hotels near the Manhattan coordinates

In [36]:
df1_filtered.name

0                                          Hotel Wales
1                                         Hotel 99 Llc
2                   Swimming Pool @ ONE UN Plaza Hotel
3                                       Hotel Bark Ave
4                                        Hotel Berkers
5                                      Hotel Wales Gym
6                                      West Park Hotel
7          The Juicy Naam Manhattan at The Hotel Wales
8                                             Hotel 89
9                             Helmsley Park Lane hotel
10                                     New Ebony Hotel
11                                       The Greystone
12                                             Karaoke
13                               Lido Hall Corporation
14                               The Greystone Rooftop
15    Nyinns – Extended Stay Hotels Manhattan New York
Name: name, dtype: object

In [37]:
# generate hotel map centred around the central Manhattan area
hotelvenues_map = folium.Map(location=[latitude, longitude], zoom_start=13)

# add all the hotels in as blue circle markers
for lat, lng, label in zip(df1_filtered.lat, df1_filtered.lng, df1_filtered.categories):
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(hotelvenues_map)

# display the hotel map
hotelvenues_map

Search for restaurants within a 1000-meter radius of the Manhattan coordinates

In [42]:
searchr_query = 'Restaurant'
radius=1000
print('Searching for...' + searchr_query)

Searching for...Restaurant


Define the corresponding URL for the search of restaurants

In [43]:
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, searchr_query, radius, LIMIT)
url

'https://api.foursquare.com/v2/venues/search?client_id=HCKAPL0F5HQZMWMDE3ZY52QVJCLKD5K2AI4BQPKUH2W0P0GC&client_secret=RX0PUSJJIBKDDP5TU0HIJW105XBDLWIJSVEZHY11SBPSE4OU&ll=40.7896239,-73.9598939&v=20180605&query=Restaurant&radius=1000&limit=100'

Send the GET Request and examine the results for the search of restaurants

In [44]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5fcc093f037f4c3938b8a6a8'},
 'response': {'venues': [{'id': '4a897cb1f964a5201f0820e3',
    'name': '3 Guys Restaurant',
    'location': {'address': '49 E 96th St',
     'crossStreet': 'Madison Ave',
     'lat': 40.787442622504265,
     'lng': -73.95403610873488,
     'labeledLatLngs': [{'label': 'display',
       'lat': 40.787442622504265,
       'lng': -73.95403610873488},
      {'label': 'entrance', 'lat': 40.787227, 'lng': -73.953794}],
     'distance': 550,
     'postalCode': '10128',
     'cc': 'US',
     'city': 'New York',
     'state': 'NY',
     'country': 'United States',
     'formattedAddress': ['49 E 96th St (Madison Ave)',
      'New York, NY 10128',
      'United States']},
    'categories': [{'id': '4bf58dd8d48988d147941735',
      'name': 'Diner',
      'pluralName': 'Diners',
      'shortName': 'Diner',
      'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/diner_',
       'suffix': '.png'},
      'primary': True}],

Get relevant part of JSON and transform it into a dataframe

In [45]:
# assign relevant part of JSON to venues
rest_venues = results['response']['venues']

# tranform venues into a dataframe
df_rest = json_normalize(rest_venues)
df_rest.head()



Unnamed: 0,id,name,categories,referralId,hasPerk,location.address,location.crossStreet,location.lat,location.lng,location.labeledLatLngs,location.distance,location.postalCode,location.cc,location.city,location.state,location.country,location.formattedAddress,delivery.id,delivery.url,delivery.provider.name,delivery.provider.icon.prefix,delivery.provider.icon.sizes,delivery.provider.icon.name,venuePage.id,location.neighborhood
0,4a897cb1f964a5201f0820e3,3 Guys Restaurant,"[{'id': '4bf58dd8d48988d147941735', 'name': 'D...",v-1607207231,False,49 E 96th St,Madison Ave,40.787443,-73.954036,"[{'label': 'display', 'lat': 40.78744262250426...",550,10128,US,New York,NY,United States,"[49 E 96th St (Madison Ave), New York, NY 1012...",278300.0,https://www.seamless.com/menu/3-guys-96th-1381...,seamless,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",/delivery_provider_seamless_20180129.png,,
1,4a7778a1f964a5209be41fe3,Carmine's Italian Restaurant,"[{'id': '4bf58dd8d48988d110941735', 'name': 'I...",v-1607207231,False,2450 Broadway,btwn W 90th & W 91st,40.791096,-73.973991,"[{'label': 'display', 'lat': 40.7910963, 'lng'...",1199,10024,US,New York,NY,United States,"[2450 Broadway (btwn W 90th & W 91st), New Yor...",294727.0,https://www.seamless.com/menu/carmines-upper-w...,seamless,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",/delivery_provider_seamless_20180129.png,,
2,4abc2282f964a5208a8620e3,Gennaro Restaurant,"[{'id': '4bf58dd8d48988d110941735', 'name': 'I...",v-1607207231,False,665 Amsterdam Ave,W 93rd St,40.791932,-73.971931,"[{'label': 'display', 'lat': 40.79193160418364...",1046,10025,US,New York,NY,United States,"[665 Amsterdam Ave (W 93rd St), New York, NY 1...",,,,,,,,
3,4a2eb2b0f964a52036981fe3,Malecon Restaurant II,"[{'id': '4bf58dd8d48988d1be941735', 'name': 'L...",v-1607207231,False,764 Amsterdam Ave,btw 97th St & 98th St,40.794932,-73.969648,"[{'label': 'display', 'lat': 40.79493159833159...",1012,10025,US,New York,NY,United States,"[764 Amsterdam Ave (btw 97th St & 98th St), Ne...",1334178.0,https://www.seamless.com/menu/malecon-764-amst...,seamless,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",/delivery_provider_seamless_20180129.png,,
4,4b21779bf964a5204e3c24e3,Nick's Restaurant & Pizzeria,"[{'id': '4bf58dd8d48988d1ca941735', 'name': 'P...",v-1607207231,False,1814 2nd Ave,at E 94th St,40.782923,-73.948014,"[{'label': 'display', 'lat': 40.78292250725332...",1248,10128,US,New York,NY,United States,"[1814 2nd Ave (at E 94th St), New York, NY 101...",1085037.0,https://www.seamless.com/menu/nicks-pizzeria-1...,seamless,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",/delivery_provider_seamless_20180129.png,,


Define information of interest and filter dataframe

In [46]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in df_rest.columns if col.startswith('location.')] + ['id']
df2_filtered = df_rest.loc[:, filtered_columns].copy()  # explicit copy avoids SettingWithCopyWarning on the assignment below

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
df2_filtered['categories'] = df2_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
df2_filtered.columns = [column.split('.')[-1] for column in df2_filtered.columns]

df2_filtered

Unnamed: 0,name,categories,address,crossStreet,lat,lng,labeledLatLngs,distance,postalCode,cc,city,state,country,formattedAddress,neighborhood,id
0,3 Guys Restaurant,Diner,49 E 96th St,Madison Ave,40.787443,-73.954036,"[{'label': 'display', 'lat': 40.78744262250426...",550,10128.0,US,New York,NY,United States,"[49 E 96th St (Madison Ave), New York, NY 1012...",,4a897cb1f964a5201f0820e3
1,Carmine's Italian Restaurant,Italian Restaurant,2450 Broadway,btwn W 90th & W 91st,40.791096,-73.973991,"[{'label': 'display', 'lat': 40.7910963, 'lng'...",1199,10024.0,US,New York,NY,United States,"[2450 Broadway (btwn W 90th & W 91st), New Yor...",,4a7778a1f964a5209be41fe3
2,Gennaro Restaurant,Italian Restaurant,665 Amsterdam Ave,W 93rd St,40.791932,-73.971931,"[{'label': 'display', 'lat': 40.79193160418364...",1046,10025.0,US,New York,NY,United States,"[665 Amsterdam Ave (W 93rd St), New York, NY 1...",,4abc2282f964a5208a8620e3
3,Malecon Restaurant II,Latin American Restaurant,764 Amsterdam Ave,btw 97th St & 98th St,40.794932,-73.969648,"[{'label': 'display', 'lat': 40.79493159833159...",1012,10025.0,US,New York,NY,United States,"[764 Amsterdam Ave (btw 97th St & 98th St), Ne...",,4a2eb2b0f964a52036981fe3
4,Nick's Restaurant & Pizzeria,Pizza Place,1814 2nd Ave,at E 94th St,40.782923,-73.948014,"[{'label': 'display', 'lat': 40.78292250725332...",1248,10128.0,US,New York,NY,United States,"[1814 2nd Ave (at E 94th St), New York, NY 101...",,4b21779bf964a5204e3c24e3
5,Kouzan Japanese Restaurant,Sushi Restaurant,685 Amsterdam Ave,West 93rd St,40.792213,-73.971664,"[{'label': 'display', 'lat': 40.79221265537916...",1032,10025.0,US,New York,NY,United States,"[685 Amsterdam Ave (West 93rd St), New York, N...",,49e61c9df964a52009641fe3
6,Lex Restaurant,Italian Restaurant,1370 Lexington Ave,btwn E 90th & E 91st St,40.78253,-73.9537,"[{'label': 'display', 'lat': 40.78253, 'lng': ...",946,10128.0,US,New York,NY,United States,"[1370 Lexington Ave (btwn E 90th & E 91st St),...",,4aad53a2f964a520b75f20e3
7,The New Amity Restaurant,Diner,1134 Madison Ave,84th St.,40.779838,-73.959832,"[{'label': 'display', 'lat': 40.7798381, 'lng'...",1089,10028.0,US,New York,NY,United States,"[1134 Madison Ave (84th St.), New York, NY 100...",,4b282b9af964a520309024e3
8,Judy's Restaurant,Latin American Restaurant,1505 Lexington Ave,,40.786681,-73.950389,"[{'label': 'display', 'lat': 40.78668052588666...",865,10029.0,US,New York,NY,United States,"[1505 Lexington Ave, New York, NY 10029, Unite...",,4e248a5fe4cdf68591a40adf
9,Giovanna's Restaurant,Italian Restaurant,1567 Lexington Ave,E. 100th Street,40.788523,-73.949043,"[{'label': 'display', 'lat': 40.78852325970528...",922,10029.0,US,New York,NY,United States,"[1567 Lexington Ave (E. 100th Street), New Yor...",,4a7b0914f964a520d7e91fe3
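
Several rows above report a `distance` larger than the requested 1000-meter radius (1199 and 1248, for example), so the radius cannot be treated as a hard cutoff in these results. If a strict cutoff matters, the rows can be filtered afterwards. A sketch using the ten rows displayed above; `within_radius` is a hypothetical helper.

```python
RADIUS_METERS = 1000

# (name, distance in meters) copied from the df2_filtered output above
restaurants = [
    ('3 Guys Restaurant', 550),
    ("Carmine's Italian Restaurant", 1199),
    ('Gennaro Restaurant', 1046),
    ('Malecon Restaurant II', 1012),
    ("Nick's Restaurant & Pizzeria", 1248),
    ('Kouzan Japanese Restaurant', 1032),
    ('Lex Restaurant', 946),
    ('The New Amity Restaurant', 1089),
    ("Judy's Restaurant", 865),
    ("Giovanna's Restaurant", 922),
]

def within_radius(rows, radius):
    """Keep rows whose reported distance does not exceed the radius, nearest first."""
    kept = [row for row in rows if row[1] <= radius]
    return sorted(kept, key=lambda row: row[1])

nearby = within_radius(restaurants, RADIUS_METERS)
for name, distance in nearby:
    print('{:>5} m  {}'.format(distance, name))
```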


Visualize the nearby restaurants

In [47]:
df2_filtered.name

0                                 3 Guys Restaurant
1                      Carmine's Italian Restaurant
2                                Gennaro Restaurant
3                             Malecon Restaurant II
4                      Nick's Restaurant & Pizzeria
5                        Kouzan Japanese Restaurant
6                                    Lex Restaurant
7                          The New Amity Restaurant
8                                 Judy's Restaurant
9                             Giovanna's Restaurant
10               El Internacional Cafe & Restaurant
11                             Chinatown Restaurant
12                                Island Restaurant
13                               Paola's Restaurant
14                  Bodrum Mediterranean Restaurant
15                                  Mole Restaurant
16                                 Telio Restaurant
17                               Polonia Restaurant
18                            Hanratty's Restaurant
19          

In [48]:
# generate a restaurant map centred around Manhattan area
restvenues_map = folium.Map(location=[latitude, longitude], zoom_start=13)

# add the restaurants as blue circle markers
for lat, lng, label in zip(df2_filtered.lat, df2_filtered.lng, df2_filtered.categories):
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(restvenues_map)

# display map
restvenues_map

Search for coffee shops within a 1000-meter radius of the Manhattan coordinates

In [4]:
searchco_query = 'Coffee'
radius = 1000
print('searching for...' + searchco_query)

searching for...Coffee


Define the corresponding URL for the search of coffee shops

In [59]:
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, searchco_query, radius, LIMIT)
url

'https://api.foursquare.com/v2/venues/search?client_id=HCKAPL0F5HQZMWMDE3ZY52QVJCLKD5K2AI4BQPKUH2W0P0GC&client_secret=RX0PUSJJIBKDDP5TU0HIJW105XBDLWIJSVEZHY11SBPSE4OU&ll=40.7896239,-73.9598939&v=20180605&query=Coffee&radius=1000&limit=100'
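
The hotel, restaurant, and coffee searches repeat the same steps with a different query term. A sketch of running them in one loop, assuming the same endpoint and parameters as above. `search_all` is a hypothetical helper, and the `fetch` argument stands in for a function such as `lambda url: requests.get(url).json()` so the example can run without network access.

```python
URL_TEMPLATE = ('https://api.foursquare.com/v2/venues/search'
                '?client_id={}&client_secret={}&ll={},{}&v={}'
                '&query={}&radius={}&limit={}')

def search_all(queries, fetch, client_id, client_secret, lat, lng,
               version='20180605', radius=1000, limit=100):
    """Run one venue search per query term and collect the venue lists."""
    results_by_query = {}
    for query in queries:
        url = URL_TEMPLATE.format(client_id, client_secret, lat, lng,
                                  version, query, radius, limit)
        payload = fetch(url)
        results_by_query[query] = payload['response']['venues']
    return results_by_query

# Stub fetch for illustration: echoes the query term back as a fake venue
def stub_fetch(url):
    query = url.split('&query=')[1].split('&')[0]
    return {'response': {'venues': [{'name': 'Example ' + query}]}}

found = search_all(['Hotel', 'Restaurant', 'Coffee'], stub_fetch,
                   'PLACEHOLDER_ID', 'PLACEHOLDER_SECRET',
                   40.7896239, -73.9598939)
print({query: len(venues) for query, venues in found.items()})
```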

Send the GET Request and examine the results for the search of coffee shops

In [60]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5fcc0de91d30726847eca353'},
 'response': {'venues': [{'id': '5177d30c498e9b657328e30f',
    'name': 'Coffee Cart - 97th & Columbus',
    'location': {'address': 'W 97th Street',
     'crossStreet': 'Columbus Ave',
     'lat': 40.79462883157771,
     'lng': -73.96660262569878,
     'labeledLatLngs': [{'label': 'display',
       'lat': 40.79462883157771,
       'lng': -73.96660262569878}],
     'distance': 793,
     'postalCode': '10025',
     'cc': 'US',
     'city': 'New York',
     'state': 'NY',
     'country': 'United States',
     'formattedAddress': ['W 97th Street (Columbus Ave)',
      'New York, NY 10025',
      'United States']},
    'categories': [{'id': '4bf58dd8d48988d1cb941735',
      'name': 'Food Truck',
      'pluralName': 'Food Trucks',
      'shortName': 'Food Truck',
      'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/streetfood_',
       'suffix': '.png'},
      'primary': True}],
    'referralId': 'v-1607208425

Get relevant part of JSON and transform it into a pandas dataframe

In [61]:
# assign relevant part of JSON to venues
cof_venues = results['response']['venues']

# tranform venues into a dataframe
df_coff = json_normalize(cof_venues)
df_coff.head()


Unnamed: 0,id,name,categories,referralId,hasPerk,location.address,location.crossStreet,location.lat,location.lng,location.labeledLatLngs,location.distance,location.postalCode,location.cc,location.city,location.state,location.country,location.formattedAddress,delivery.id,delivery.url,delivery.provider.name,delivery.provider.icon.prefix,delivery.provider.icon.sizes,delivery.provider.icon.name,venuePage.id
0,5177d30c498e9b657328e30f,Coffee Cart - 97th & Columbus,"[{'id': '4bf58dd8d48988d1cb941735', 'name': 'F...",v-1607208425,False,W 97th Street,Columbus Ave,40.794629,-73.966603,"[{'label': 'display', 'lat': 40.79462883157771...",793,10025.0,US,New York,NY,United States,"[W 97th Street (Columbus Ave), New York, NY 10...",,,,,,,
1,4d2fb1575acfa35d7065f2cb,Coffee Cart - 96th St,"[{'id': '4bf58dd8d48988d1cb941735', 'name': 'F...",v-1607208425,False,96th St.,Central Park West,40.791754,-73.964864,"[{'label': 'display', 'lat': 40.79175386492058...",481,10025.0,US,New York,NY,United States,"[96th St. (Central Park West), New York, NY 10...",,,,,,,
2,58a5d1d426a95370d3355e0d,WFM Coffee Bar,"[{'id': '4bf58dd8d48988d1e0931735', 'name': 'C...",v-1607208425,False,808 Columbus Ave,,40.795145,-73.965855,"[{'label': 'display', 'lat': 40.795145, 'lng':...",793,10025.0,US,New York,NY,United States,"[808 Columbus Ave, New York, NY 10025, United ...",,,,,,,
3,5037e21fe4b0b68ee3d6546f,Birch Coffee,"[{'id': '4bf58dd8d48988d1e0931735', 'name': 'C...",v-1607208425,False,750 Columbus Ave,btwn W 96th & W 97th St,40.793131,-73.967241,"[{'label': 'display', 'lat': 40.79313058780598...",731,10025.0,US,New York,NY,United States,"[750 Columbus Ave (btwn W 96th & W 97th St), N...",,,,,,,
4,4d53548e2f638cfa759e616a,Coffee,"[{'id': '4bf58dd8d48988d1cb941735', 'name': 'F...",v-1607208425,False,43rd And 11th Ave,,40.792582,-73.965156,"[{'label': 'display', 'lat': 40.79258171403207...",552,,US,New York,NY,United States,"[43rd And 11th Ave, New York, NY, United States]",,,,,,,


Define information of interest and filter dataframe

In [63]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in df_coff.columns if col.startswith('location.')] + ['id']
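# take an explicit copy so that the column assignment below does not write to a view of df_coff
df3_filtered = df_coff.loc[:, filtered_columns].copy()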

# function that extracts the category of the venue
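def get_category_type(row):
    # fall back to 'venue.categories' when the row has no 'categories' key
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    # a venue may have no category at all
    if len(categories_list) == 0:
        return None
    # the first entry in the list is used as the venue's category
    return categories_list[0]['name']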

# filter the category for each row
df3_filtered['categories'] = df3_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
df3_filtered.columns = [column.split('.')[-1] for column in df3_filtered.columns]

df3_filtered

Unnamed: 0,name,categories,address,crossStreet,lat,lng,labeledLatLngs,distance,postalCode,cc,city,state,country,formattedAddress,id
0,Coffee Cart - 97th & Columbus,Food Truck,W 97th Street,Columbus Ave,40.794629,-73.966603,"[{'label': 'display', 'lat': 40.79462883157771...",793,10025,US,New York,NY,United States,"[W 97th Street (Columbus Ave), New York, NY 10...",5177d30c498e9b657328e30f
1,Coffee Cart - 96th St,Food Truck,96th St.,Central Park West,40.791754,-73.964864,"[{'label': 'display', 'lat': 40.79175386492058...",481,10025,US,New York,NY,United States,"[96th St. (Central Park West), New York, NY 10...",4d2fb1575acfa35d7065f2cb
2,WFM Coffee Bar,Coffee Shop,808 Columbus Ave,,40.795145,-73.965855,"[{'label': 'display', 'lat': 40.795145, 'lng':...",793,10025,US,New York,NY,United States,"[808 Columbus Ave, New York, NY 10025, United ...",58a5d1d426a95370d3355e0d
3,Birch Coffee,Coffee Shop,750 Columbus Ave,btwn W 96th & W 97th St,40.793131,-73.967241,"[{'label': 'display', 'lat': 40.79313058780598...",731,10025,US,New York,NY,United States,"[750 Columbus Ave (btwn W 96th & W 97th St), N...",5037e21fe4b0b68ee3d6546f
4,Coffee,Food Truck,43rd And 11th Ave,,40.792582,-73.965156,"[{'label': 'display', 'lat': 40.79258171403207...",552,,US,New York,NY,United States,"[43rd And 11th Ave, New York, NY, United States]",4d53548e2f638cfa759e616a
5,Coffee Cart,Food Truck,85th and Lexington,,40.779683,-73.957511,"[{'label': 'display', 'lat': 40.77968253446027...",1124,,US,New York,NY,United States,"[85th and Lexington, New York, NY, United States]",4dca7fde1f6e28126777d908
6,Coffee Cart - 103rd St,Food Truck,103rd Street,CPW,40.796066,-73.961586,"[{'label': 'display', 'lat': 40.79606561476228...",731,,US,New York,NY,United States,"[103rd Street (CPW), New York, NY, United States]",4f6243bae4b0a20e1e91aabf
7,Frenchy Coffee,Café,129 E 102nd St,,40.789873,-73.948341,"[{'label': 'display', 'lat': 40.7898734, 'lng'...",974,10029,US,New York,NY,United States,"[129 E 102nd St, New York, NY 10029, United St...",5a772c1625ecca5ea8cd01cd
8,Coffee & Donuts Cart,Food Truck,92nd Street & Columbus,,40.790062,-73.969147,"[{'label': 'display', 'lat': 40.79006195068359...",781,10025,US,New York,NY,United States,"[92nd Street & Columbus, New York, NY 10025, U...",4fdb2abfe4b0cbb58909cb7b
9,Coffee Cart - 98th & Madison,Food Truck,98th Street,Madison Avenue (South East Corner),40.788527,-73.953016,"[{'label': 'display', 'lat': 40.78852683326035...",592,10029,US,New York,NY,United States,[98th Street (Madison Avenue (South East Corne...,4bdb92b7383276b0c8b67369
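
Keeping only the last term of each column name is convenient, but it can silently produce duplicate column names if two kept columns share the same last term. The sketch below is one possible sanity check, written in plain Python. The helper names `short_names` and `find_collisions` are hypothetical, and the column list is a shortened version of the one used above.

```python
from collections import Counter


def short_names(columns):
    """Keep only the last dotted term of each column name."""
    return [column.split(".")[-1] for column in columns]


def find_collisions(columns):
    """Return the short names that would appear more than once."""
    counts = Counter(short_names(columns))
    return sorted(name for name, count in counts.items() if count > 1)


# Shortened version of the filtered column list
kept_columns = ["name", "categories", "location.address",
                "location.lat", "location.lng", "id"]

print(find_collisions(kept_columns))                    # []
print(find_collisions(kept_columns + ["delivery.id"]))  # ['id']
```

With the columns selected in this notebook there is no collision, because delivery.id and venuePage.id are filtered out before the rename.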


Visualizing the nearby coffee shops

In [64]:
df3_filtered.name

0               Coffee Cart - 97th & Columbus
1                       Coffee Cart - 96th St
2                              WFM Coffee Bar
3                                Birch Coffee
4                                      Coffee
5                                 Coffee Cart
6                      Coffee Cart - 103rd St
7                              Frenchy Coffee
8                        Coffee & Donuts Cart
9                Coffee Cart - 98th & Madison
10                    Jack’s Stir Brew Coffee
11                                Coffee Cart
12                             The Coffee Man
13    Mount Sinai Medical Center Coffee Stand
14                   102nd Street Coffee Cart
15                          Coffee and Canvas
16    Carnegie Hill Restaurant & Coffee House
17                 The Burger One Coffee Shop
18                          Street Coffee Guy
19                 Louie - Corner Coffee Cart
20                             Bluestone Lane
21                        Pete's C

In [65]:
# generate map centred around the Manhattan area
coffvenues_map = folium.Map(location=[latitude, longitude], zoom_start=13)

# add the coffee shops as blue circle markers; each popup shows the venue category
for lat, lng, label in zip(df3_filtered.lat, df3_filtered.lng, df3_filtered.categories):
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill=True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(coffvenues_map)

# display map
coffvenues_map
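
The popups in the map above show only the venue category, so several markers read "Food Truck" with no way to tell them apart. One option, sketched below under the assumption that both the name and the category are available per row, is to build a combined label first. The `make_label` helper is hypothetical and is not part of folium.

```python
def make_label(name, category):
    """Build popup text from a venue name and an optional category."""
    # the category is None when the venue has no categories listed
    if category is None:
        return str(name)
    return f"{name} ({category})"


# Sample rows taken from the filtered table above
samples = [
    ("Birch Coffee", "Coffee Shop"),
    ("Coffee Cart - 96th St", "Food Truck"),
    ("Coffee", None),  # illustrative case with a missing category
]

labels = [make_label(name, category) for name, category in samples]
for label in labels:
    print(label)
```

In the marker loop, df3_filtered.name could be added to the zip and the result of make_label(name, category) passed as the popup argument in place of the category alone.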