# IBM Data Science Capstone Project
## Planning Travel and Tours in New York City's Broadway Theatre District
### Comparison of Hotels, Restaurants, and Theaters in Manhattan, NY for a Broadway Theater Trip
### Brian Vineyard 

### Business Problem


**Business Problem**: A travel service plans to offer Broadway trip packages in New York City, NY, and wants to include the best available options. Each package should offer a choice of hotels and a list of recommended restaurants.

Their clients are mostly interested in Broadway musicals and plays, so this project will focus on finding the best hotels and restaurants near the Broadway theater district.

![Broadway](77.jpg)

**Key Questions to Answer**

- What are the best theaters to catch the top Broadway shows?
- What hotels close to the Theater District have the best ratings and room rates?
- What are the best nearby restaurants?


**Foursquare API study of the following city**:

- Broadway Theatre District, New York City, NY

Example search criteria: Hotels, Restaurants, Museums, Live Shows

**Project Details**: 

This project uses the following tools and technology:

- [Foursquare Places API](https://enterprise.foursquare.com/products/places)
- [2014 New York City Neighborhood Names](https://geo.nyu.edu/catalog/nyu_2451_34572)
- [BeautifulSoup](https://beautiful-soup-4.readthedocs.io/en/latest/)


The goal is to find the best area of the city to hold tours, based on our findings.

**Data Sources and Description**:

This project will use latitude and longitude data for the neighborhoods located in the following city:

- New York, NY


The coordinates will be passed to the Foursquare Places API to query New York City neighborhoods for venues such as:

- **Restaurants**
    - Liebman's Deli, La Morada, Royal 35 Steakhouse, Le Bernardin
- **Hotels**
    - Gramercy Park, Life Hotel Nomad, Mansfield Hotel
- **Fun**
    - Museums: Metropolitan Museum of Art, 9/11 Memorial, American Museum of Natural History
    - Top Tourist Destinations: Empire State Building, Madison Square Garden, Statue of Liberty, Times Square
    - Broadway Productions: Hamilton, The Lion King, Wicked, Moulin Rouge, Phantom of the Opera
    - Sightseeing Cruises: Circle Line Full Island, Classic Harbor Lines, Zephyr Yacht, Liberty Cruises
    - Parks: Central Park, Washington Square Park, Fort Tryon Park, Bronx Zoo
    
 ### Table of Contents
 1. [Import Libraries, Foursquare Credentials](#import-1)
 2. [New York's Manhattan Borough Neighborhood Data](#borough-2)
   - [Map of Neighborhoods in New York City's Five Boroughs](#borough-map)
   - [Map of Neighborhoods in NYC's Manhattan Borough](#neighbor-map)
 3. [Broadway Theatres and Current Productions](#broadway-3)
   - [Map of Broadway Theatre Locations](#broadway-map)
 4. [Hotels Near the Broadway Theatre District](#hotels-4)
   - [Map of Hotels Near Broadway Theatres](#hotels-map)
 5. [Restaurants Near the Broadway Theatre District](#restaurants-5)
   - [Map of Restaurants Near Broadway Theatres](#rest_map)
 6. [Conclusions and Findings](#conclusions)

## 1. Import Libraries, Foursquare Credentials, and Neighborhood Data <a name="import-1"></a>

### Import Libraries ###

In [49]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files
import pickle

from geopy.geocoders import Nominatim # Convert addresses into GPS coordinates
from bs4 import BeautifulSoup

import requests # library to handle requests

from pandas import json_normalize # transform JSON file into a Pandas dataframe (top-level import, available since pandas 1.0)
# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors
# import k-means from clustering stage
from sklearn.cluster import KMeans

import folium # map rendering library
print('Libraries imported.')

Libraries imported.


### Load Foursquare Parameters ###

In [50]:
# Loads the Foursquare credentials for logging in to the API

with open('credentials2.p', 'rb') as file:
    credentials = pickle.load(file)
CLIENT_ID = credentials['CLIENT_ID']
CLIENT_SECRET = credentials['CLIENT_SECRET']
VERSION = '20180604'
LIMIT = 30
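
The cell above appears to assume that `credentials2.p` is a pickled dictionary with the keys `CLIENT_ID` and `CLIENT_SECRET`. As a minimal sketch under that assumption, the following shows one way such a file could be written and read back. The file name `example_credentials.p` and the placeholder values are hypothetical and are not the author's actual setup.

```python
import os
import pickle
import tempfile

# Placeholder values for illustration only (assumption: real keys are issued by Foursquare)
example_credentials = {
    'CLIENT_ID': 'YOUR_CLIENT_ID',
    'CLIENT_SECRET': 'YOUR_CLIENT_SECRET',
}

# Write the dictionary to a pickle file (hypothetical file name, stored in a temp directory)
credentials_path = os.path.join(tempfile.gettempdir(), 'example_credentials.p')
with open(credentials_path, 'wb') as file:
    pickle.dump(example_credentials, file)

# Read it back the same way the notebook cell does
with open(credentials_path, 'rb') as file:
    loaded_credentials = pickle.load(file)

print(sorted(loaded_credentials))
```

Keeping the keys in a separate file keeps them out of the notebook source. Pickle files should only be loaded from trusted locations, since unpickling can execute arbitrary code.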

## 2. New York's Manhattan Borough Neighborhood Data <a name="borough-2"></a>

### Import Neighborhood Data ###

New York City has a total of 5 boroughs and 306 neighborhoods. We will be concentrating on the Manhattan borough, where Broadway and the Theatre District are located. We will start with a larger dataset, however, with the latitude and longitude of each of NYC's neighborhoods.

We will pull in the data as a JSON file, then load that into a pandas dataframe for analysis.

#### Open and explore the JSON file for New York City Neighborhood Names ####

In [51]:
with open('nyu-2451-34572-geojson.json') as json_data:
    newyork_data = json.load(json_data)

In [53]:
# View the data from the JSON file

newyork_data

{'type': 'FeatureCollection',
 'totalFeatures': 306,
 'features': [{'type': 'Feature',
   'id': 'nyu_2451_34572.1',
   'geometry': {'type': 'Point',
    'coordinates': [-73.84720052054902, 40.89470517661]},
   'geometry_name': 'geom',
   'properties': {'name': 'Wakefield',
    'stacked': 1,
    'annoline1': 'Wakefield',
    'annoline2': None,
    'annoline3': None,
    'annoangle': 0.0,
    'borough': 'Bronx',
    'bbox': [-73.84720052054902,
     40.89470517661,
     -73.84720052054902,
     40.89470517661]}},
  {'type': 'Feature',
   'id': 'nyu_2451_34572.2',
   'geometry': {'type': 'Point',
    'coordinates': [-73.82993910812398, 40.87429419303012]},
   'geometry_name': 'geom',
   'properties': {'name': 'Co-op City',
    'stacked': 2,
    'annoline1': 'Co-op',
    'annoline2': 'City',
    'annoline3': None,
    'annoangle': 0.0,
    'borough': 'Bronx',
    'bbox': [-73.82993910812398,
     40.87429419303012,
     -73.82993910812398,
     40.87429419303012]}},
  {'type': 'Feature',
 

Notice that all the relevant data is under the `features` key, which holds a list of the neighborhoods. Let's define a new variable that contains this data.

In [54]:
neighborhoods_data = newyork_data['features']

Let's take a look at the first item in this list.

In [55]:
neighborhoods_data[0]

{'type': 'Feature',
 'id': 'nyu_2451_34572.1',
 'geometry': {'type': 'Point',
  'coordinates': [-73.84720052054902, 40.89470517661]},
 'geometry_name': 'geom',
 'properties': {'name': 'Wakefield',
  'stacked': 1,
  'annoline1': 'Wakefield',
  'annoline2': None,
  'annoline3': None,
  'annoangle': 0.0,
  'borough': 'Bronx',
  'bbox': [-73.84720052054902,
   40.89470517661,
   -73.84720052054902,
   40.89470517661]}}

#### Transform the data into a pandas dataframe ####

In [56]:
# define the dataframe columns
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude'] 

# instantiate the dataframe
neighborhoods = pd.DataFrame(columns=column_names)

Loop through the data, collect one row per neighborhood, and load the rows into the dataframe.

In [57]:
rows = []
for data in neighborhoods_data:
    borough = data['properties']['borough']
    neighborhood_name = data['properties']['name']

    # GeoJSON stores point coordinates as [longitude, latitude]
    neighborhood_lon, neighborhood_lat = data['geometry']['coordinates']

    rows.append({'Borough': borough,
                 'Neighborhood': neighborhood_name,
                 'Latitude': neighborhood_lat,
                 'Longitude': neighborhood_lon})

# DataFrame.append was removed in pandas 2.0, so build the dataframe from the collected rows
neighborhoods = pd.DataFrame(rows, columns=column_names)

In [58]:
# Look at the first five rows of data
neighborhoods.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Bronx,Wakefield,40.894705,-73.847201
1,Bronx,Co-op City,40.874294,-73.829939
2,Bronx,Eastchester,40.887556,-73.827806
3,Bronx,Fieldston,40.895437,-73.905643
4,Bronx,Riverdale,40.890834,-73.912585


In [59]:
print('The dataframe has {} boroughs and {} neighborhoods.'.format(
        len(neighborhoods['Borough'].unique()),
        neighborhoods.shape[0]
    )
)

The dataframe has 5 boroughs and 306 neighborhoods.
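
As an alternative to looping over the features by hand, the nested GeoJSON records can be flattened with `pd.json_normalize`. The sketch below is not the notebook's method. It uses two features copied from the output above, trimmed to the fields needed, and the variable names `sample_features` and `sample_neighborhoods` are illustrative.

```python
import pandas as pd

# Two features copied from the notebook output above, trimmed to the fields used here
sample_features = [
    {'geometry': {'type': 'Point',
                  'coordinates': [-73.84720052054902, 40.89470517661]},
     'properties': {'name': 'Wakefield', 'borough': 'Bronx'}},
    {'geometry': {'type': 'Point',
                  'coordinates': [-73.82993910812398, 40.87429419303012]},
     'properties': {'name': 'Co-op City', 'borough': 'Bronx'}},
]

# Flatten nested keys into dotted column names such as 'properties.name'
flat = pd.json_normalize(sample_features)

# GeoJSON points are ordered [longitude, latitude]
flat['Longitude'] = [coords[0] for coords in flat['geometry.coordinates']]
flat['Latitude'] = [coords[1] for coords in flat['geometry.coordinates']]

sample_neighborhoods = flat.rename(columns={
    'properties.borough': 'Borough',
    'properties.name': 'Neighborhood',
})[['Borough', 'Neighborhood', 'Latitude', 'Longitude']]

print(sample_neighborhoods)
```

The same steps should apply to the full `features` list, although the loop shown earlier may be easier to follow for a first pass through the data.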


### Map of Neighborhoods in New York City's Five Boroughs <a name="borough-map"></a>

#### Get the New York City latitude and longitude coordinates from geopy ####

In [60]:
address = 'New York City, NY'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of New York City are {}, {}.'.format(latitude, longitude))

The geographical coordinates of New York City are 40.7127281, -74.0060152.


**Build map of New York City in Folium, with neighborhoods in the five boroughs indicated with blue markers**

In [61]:
# create map of New York using latitude and longitude values
map_newyork = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['Borough'], neighborhoods['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_newyork)  
    
map_newyork

**Map of New York City Neighborhoods in Five Boroughs: Manhattan, The Bronx, Brooklyn, Queens, and Staten Island**

### Map of Neighborhoods in NYC's Manhattan Borough <a name="neighbor-map"></a>

New York City has five boroughs:

- Manhattan
- The Bronx
- Brooklyn
- Queens
- Staten Island
    
We will be looking more closely at neighborhoods in Manhattan, as this is where Broadway's Theatre District is located.

Let's take the Manhattan data and build that into a new dataframe for that borough.

In [62]:
manhattan_data = neighborhoods[neighborhoods['Borough'] == 'Manhattan'].reset_index(drop=True)
manhattan_data.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Manhattan,Marble Hill,40.876551,-73.91066
1,Manhattan,Chinatown,40.715618,-73.994279
2,Manhattan,Washington Heights,40.851903,-73.9369
3,Manhattan,Inwood,40.867684,-73.92121
4,Manhattan,Hamilton Heights,40.823604,-73.949688


Let's get the size of the Manhattan dataframe:

In [63]:
manhattan_data.shape

(40, 4)

From our results, we can see there are 40 neighborhoods within Manhattan.

Let's take a look at all the values in the dataframe:

In [64]:
manhattan_data

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Manhattan,Marble Hill,40.876551,-73.91066
1,Manhattan,Chinatown,40.715618,-73.994279
2,Manhattan,Washington Heights,40.851903,-73.9369
3,Manhattan,Inwood,40.867684,-73.92121
4,Manhattan,Hamilton Heights,40.823604,-73.949688
5,Manhattan,Manhattanville,40.816934,-73.957385
6,Manhattan,Central Harlem,40.815976,-73.943211
7,Manhattan,East Harlem,40.792249,-73.944182
8,Manhattan,Upper East Side,40.775639,-73.960508
9,Manhattan,Yorkville,40.77593,-73.947118


Close to Times Square, there is a TKTS ticket booth that sells tickets to Broadway shows.

Let's set the address of the TKTS ticket booth in Nominatim and get its coordinates.

In [65]:
address = 'Broadway at, W 47th St, New York, NY 10036'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
# Note: the geocoder result above is not used; the booth coordinates are set manually
latitude = 40.7591855
longitude = -73.9848361
print('The geographical coordinates of the TKTS booth near Times Square are {}, {}.'.format(latitude, longitude))

The geographical coordinates of the TKTS booth near Times Square are 40.7591855, -73.9848361.
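
The cell above sets the booth coordinates by hand after calling the geocoder. One way to combine the two steps is to use the geocoded result when there is one and fall back to known coordinates otherwise. The sketch below assumes a geocoder that, like geopy's `Nominatim.geocode`, returns `None` when nothing matches. The helper `resolve_coordinates` and the stand-in `no_match_geocoder` are hypothetical names introduced for illustration.

```python
def resolve_coordinates(geocode, address, fallback):
    """Return (latitude, longitude) for address, or fallback if geocoding fails.

    geocode is any callable that returns an object with .latitude and
    .longitude attributes, or None when no match is found.
    """
    location = geocode(address)
    if location is None:
        return fallback
    return (location.latitude, location.longitude)


# Stand-in geocoder for demonstration; it simulates a lookup that finds nothing
def no_match_geocoder(address):
    return None


tkts_fallback = (40.7591855, -73.9848361)  # coordinates used in the cell above
tkts_coordinates = resolve_coordinates(
    no_match_geocoder, 'Broadway at, W 47th St, New York, NY 10036', tkts_fallback)
print(tkts_coordinates)  # prints (40.7591855, -73.9848361)
```

In the notebook, the call would look like `resolve_coordinates(geolocator.geocode, address, tkts_fallback)`, which requires network access to the Nominatim service.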


### Build map of Manhattan neighborhoods from latitude and longitude coordinates ###

Let's draw a map using Folium, marking the TKTS booth near Times Square as a central location in the Theatre District, with green markers for the neighborhoods in Manhattan.

In [66]:
# create map of the Theatre District in Manhattan using latitude and longitude values
map_manhattan = folium.Map(location=[latitude, longitude], zoom_start=11)

# add a red circle marker to represent the TKTS ticket booth
folium.features.CircleMarker(
    [latitude, longitude],
    radius=5,
    color='red',
    popup='TKTS ticket booth in Times Square',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(map_manhattan)

# add markers to map
for lat, lng, label in zip(manhattan_data['Latitude'], manhattan_data['Longitude'], manhattan_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='green',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_manhattan)  
    
map_manhattan

**Manhattan, New York with markers for the 40 neighborhoods, and the TKTS booth near Times Square marked in red.**

## 3. Broadway Theatres and Current Productions <a name="broadway-3"></a>

## Theatres on Broadway - Maps and Analysis ##

Let's look at the 40 theatres currently showing Broadway productions.

**Some of the most popular shows at the time of this project are**:
![Shows](shows.jpg)

Let's start by using BeautifulSoup to parse a list of the theatres and currently running shows from the following Wikipedia page:
https://en.wikipedia.org/wiki/Broadway_theatre

### Build pandas dataframe with Theatre Data scraped from a Broadway Wiki Page using BeautifulSoup

In [67]:
# Import Pandas and Beautiful Soup libraries
import pandas as pd
import urllib.request
from bs4 import BeautifulSoup
import requests

In [68]:
# Use Beautiful Soup to get the theatre data from the wiki page
sourcelink = 'https://en.wikipedia.org/wiki/Broadway_theatre'
source = requests.get(sourcelink).text
soup = BeautifulSoup(source, 'html.parser')
print(soup.prettify())

<!DOCTYPE html>
<html class="client-nojs" dir="ltr" lang="en">
 <head>
  <meta charset="utf-8"/>
  <title>
   Broadway theatre - Wikipedia
  </title>
  <script>
   document.documentElement.className="client-js";RLCONF={"wgBreakFrames":!1,"wgSeparatorTransformTable":["",""],"wgDigitTransformTable":["",""],"wgDefaultDateFormat":"dmy","wgMonthNames":["","January","February","March","April","May","June","July","August","September","October","November","December"],"wgMonthNamesShort":["","Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"],"wgRequestId":"XjrEPgpAADoAAF-vfJ0AAAEG","wgCSPNonce":!1,"wgCanonicalNamespace":"","wgCanonicalSpecialPageName":!1,"wgNamespaceNumber":0,"wgPageName":"Broadway_theatre","wgTitle":"Broadway theatre","wgCurRevisionId":938745947,"wgRevisionId":938745947,"wgArticleId":725252,"wgIsArticle":!0,"wgIsRedirect":!1,"wgAction":"view","wgUserName":null,"wgUserGroups":["*"],"wgCategories":["Webarchive template wayback links","Articles with short de

In [69]:
# Build dataframe with theatre data

Theatre_Data_df = pd.DataFrame({'Theatre':'',
                          'Address':'',
                          'City':'',
                          'State':'', 
                          'Capacity':int(),
                          'OwnerOperator':'',
                          'CurrentProduction':'',
                          'Type':'',
                          'Opening':'',
                          'Closing':'',
}, 
             index=[1])

In [None]:
# Build theatres table from matching table on the theatres wiki page and view the HTML code
theatres_table = soup.find('table', {'class':'wikitable sortable'})
theatres_table

### Loop through the theatres table and load our dataframe row by row.

In [70]:
# Column order of the cells in each row of the Wikipedia table
theatre_columns = ['Theatre', 'Address', 'Capacity', 'OwnerOperator',
                   'CurrentProduction', 'Type', 'Opening', 'Closing']

rows = []
for tr in theatres_table.find_all('tr'):
    cells = [td.text for td in tr.find_all('td')]
    # skip header rows and rows with fewer cells than expected
    if len(cells) < len(theatre_columns):
        continue
    row = dict(zip(theatre_columns, cells))
    row['Closing'] = row['Closing'].strip('\n').replace(']', '')
    rows.append(row)

# DataFrame.append was removed in pandas 2.0, so add all rows with a single concat
Theatre_Data_df = pd.concat([Theatre_Data_df, pd.DataFrame(rows)], ignore_index=True)

In [71]:
# Let's look at the data in our theatres dataframe
Theatre_Data_df

Unnamed: 0,Theatre,Address,City,State,Capacity,OwnerOperator,CurrentProduction,Type,Opening,Closing
0,,,,,0,,,,,
1,Al Hirschfeld Theatre,W. 45th St. (No. 302),,,1424,Jujamcyn Theaters,Moulin Rouge![46],Musical,"2019-07-25July 25, 2019",Open-ended
2,Ambassador Theatre,W. 49th St. (No. 219),,,1125,Shubert Organization,Chicago,Musical,"1996-11-14November 14, 1996",Open-ended
3,American Airlines Theatre,W. 42nd St. (No. 227),,,740,Roundabout Theatre Company,A Soldier's Play[47],Play,"2020-01-21January 21, 2020","2020-03-15March 15, 2020*"
4,August Wilson Theatre,W. 52nd St. (No. 245),,,1228,Jujamcyn Theaters,Mean Girls,Musical,"2018-04-08April 8, 2018",Open-ended
5,Belasco Theatre,W. 44th St. (No. 111),,,1018,Shubert Organization,Girl from the North Country[48],Musical,"2020-03-05March 5, 2020*",Open-ended
6,Bernard B. Jacobs Theatre,W. 45th St. (No. 242),,,1078,Shubert Organization,Company[49],Musical,"2020-03-22March 22, 2020*",Open-ended
7,Booth Theatre,W. 45th St. (No. 222),,,766,Shubert Organization,Who's Afraid of Virginia Woolf?[50],Play,"2020-04-09April 9, 2020*","2020-08-02August 2, 2020"
8,Broadhurst Theatre,W. 44th St. (No. 235),,,1186,Shubert Organization,Jagged Little Pill[51],Musical,"2019-12-05December 5, 2019",Open-ended
9,Broadway Theatre,W. 53rd St & Broadway (No. 1681),,,1761,Shubert Organization,West Side Story[52],Musical,"2020-02-20February 20, 2020*",Open-ended


In [72]:
# Let's preview the data without the empty first row
# (drop returns a new dataframe; Theatre_Data_df itself is unchanged)
Theatre_Data_df.drop([0])

Unnamed: 0,Theatre,Address,City,State,Capacity,OwnerOperator,CurrentProduction,Type,Opening,Closing
1,Al Hirschfeld Theatre,W. 45th St. (No. 302),,,1424,Jujamcyn Theaters,Moulin Rouge![46],Musical,"2019-07-25July 25, 2019",Open-ended
2,Ambassador Theatre,W. 49th St. (No. 219),,,1125,Shubert Organization,Chicago,Musical,"1996-11-14November 14, 1996",Open-ended
3,American Airlines Theatre,W. 42nd St. (No. 227),,,740,Roundabout Theatre Company,A Soldier's Play[47],Play,"2020-01-21January 21, 2020","2020-03-15March 15, 2020*"
4,August Wilson Theatre,W. 52nd St. (No. 245),,,1228,Jujamcyn Theaters,Mean Girls,Musical,"2018-04-08April 8, 2018",Open-ended
5,Belasco Theatre,W. 44th St. (No. 111),,,1018,Shubert Organization,Girl from the North Country[48],Musical,"2020-03-05March 5, 2020*",Open-ended
6,Bernard B. Jacobs Theatre,W. 45th St. (No. 242),,,1078,Shubert Organization,Company[49],Musical,"2020-03-22March 22, 2020*",Open-ended
7,Booth Theatre,W. 45th St. (No. 222),,,766,Shubert Organization,Who's Afraid of Virginia Woolf?[50],Play,"2020-04-09April 9, 2020*","2020-08-02August 2, 2020"
8,Broadhurst Theatre,W. 44th St. (No. 235),,,1186,Shubert Organization,Jagged Little Pill[51],Musical,"2019-12-05December 5, 2019",Open-ended
9,Broadway Theatre,W. 53rd St & Broadway (No. 1681),,,1761,Shubert Organization,West Side Story[52],Musical,"2020-02-20February 20, 2020*",Open-ended
10,Brooks Atkinson Theatre,W. 47th St. (No. 256),,,1094,Nederlander Organization,Six[53],Musical,"2020-03-12March 12, 2020*",Open-ended


In [73]:
# Let's preview the data without some of the columns that aren't needed
# (drop returns a new dataframe; Theatre_Data_df itself is unchanged)
Theatre_Data_df.drop(['OwnerOperator','Type','Opening','Closing'],axis=1)

Unnamed: 0,Theatre,Address,City,State,Capacity,CurrentProduction
0,,,,,0,
1,Al Hirschfeld Theatre,W. 45th St. (No. 302),,,1424,Moulin Rouge![46]
2,Ambassador Theatre,W. 49th St. (No. 219),,,1125,Chicago
3,American Airlines Theatre,W. 42nd St. (No. 227),,,740,A Soldier's Play[47]
4,August Wilson Theatre,W. 52nd St. (No. 245),,,1228,Mean Girls
5,Belasco Theatre,W. 44th St. (No. 111),,,1018,Girl from the North Country[48]
6,Bernard B. Jacobs Theatre,W. 45th St. (No. 242),,,1078,Company[49]
7,Booth Theatre,W. 45th St. (No. 222),,,766,Who's Afraid of Virginia Woolf?[50]
8,Broadhurst Theatre,W. 44th St. (No. 235),,,1186,Jagged Little Pill[51]
9,Broadway Theatre,W. 53rd St & Broadway (No. 1681),,,1761,West Side Story[52]


Let's export the data to a .csv file for manual processing in Excel, where we will clean up the format of the address data.

In [74]:
Theatre_Data_df.to_csv(r'C:\Users\brian\BroadwayTheatres.csv')
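
The notebook cleans the exported file by hand in Excel. For comparison, the sketch below shows how some of that cleanup could be done in Python with regular expressions. It is not the author's method. The helper names `clean_production`, `clean_address`, and `iso_date` are hypothetical, and the patterns only cover the formats visible in the table above.

```python
import re

def clean_production(text):
    """Remove Wikipedia footnote markers such as '[46]' from a production title."""
    return re.sub(r'\[\d+\]', '', text).strip()

def clean_address(text):
    """Turn 'W. 45th St. (No. 302)' into '302 West 45th St.'."""
    match = re.match(r'^(?P<street>.*?)\s*\(No\.\s*(?P<number>\d+)\)\s*$', text)
    if match is None:
        return text.strip()  # leave unrecognized formats untouched
    street = re.sub(r'^W\.\s*', 'West ', match.group('street'))
    return '{} {}'.format(match.group('number'), street)

def iso_date(text):
    """Keep the leading ISO date from values like '2019-07-25July 25, 2019'."""
    match = re.match(r'\d{4}-\d{2}-\d{2}', text)
    return match.group(0) if match else text.strip()

print(clean_production('Moulin Rouge![46]'))      # prints Moulin Rouge!
print(clean_address('W. 45th St. (No. 302)'))     # prints 302 West 45th St.
print(iso_date('2020-03-15March 15, 2020*'))      # prints 2020-03-15
```

These helpers could be applied column by column with `Series.apply`. City, state, latitude, and longitude would still need to be added separately, as they are in the cleaned file loaded in the next cells.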

In [75]:
# Build a new dataframe with cleaner theatre data
Column_Names = ['Theatre','Address','City','State','ZipCode','Latitude','Longitude','CurrentProduction','Type']
Theatre_Data_df4 = pd.DataFrame(columns=Column_Names)

In [76]:
Theatre_Data_df4 = pd.read_csv('BroadwayTheatres2.csv',dtype={'Latitude': float,'Longitude': float})
Theatre_Data_df4

Unnamed: 0,Theatre,Address,City,State,Capacity,CurrentProduction,Type,Latitude,Longitude
0,Al Hirschfeld Theatre,302 West 45th St.,New York,NY,1424,Moulin Rouge![46],Musical,40.759253,-73.989211
1,Ambassador Theatre,219 West 49th St.,New York,NY,1125,Chicago,Musical,40.761236,-73.98499
2,American Airlines Theatre,227 West 42nd St.,New York,NY,740,A Soldier's Play[47],Play,40.757156,-73.988119
3,August Wilson Theatre,245 West 52nd St.,New York,NY,1228,Mean Girls,Musical,40.763373,-73.984193
4,Belasco Theatre,111 West 44th St.,New York,NY,1018,Girl from the North Country[48],Musical,40.756644,-73.983801
5,Bernard B. Jacobs Theatre,242 West 45th St.,New York,NY,1078,Company[49],Musical,40.758608,-73.987741
6,Booth Theatre,222 West 45th St.,New York,NY,766,Who's Afraid of Virginia Woolf?[50],Play,40.758373,-73.98709
7,Broadhurst Theatre,235 West 44th St.,New York,NY,1186,Jagged Little Pill[51],Musical,40.758269,-73.987617
8,Broadway Theatre,1681 West 53rd St & Broadway,New York,NY,1761,West Side Story[52],Musical,40.764457,-73.985746
9,Brooks Atkinson Theatre,256 West 47th St.,New York,NY,1094,Six[53],Musical,40.759975,-73.986966


### Map of Broadway Theatre Locations <a name="broadway-map"></a>

In [77]:
# Setting the latitude and longitude of Broadway

broadway_latitude = 40.75659 # neighborhood latitude value
broadway_longitude = -73.98626 # neighborhood longitude value

neighborhood_name = 'Broadway'

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               broadway_latitude, 
                                                               broadway_longitude))

Latitude and longitude values of Broadway are 40.75659, -73.98626.


In [78]:
# create map of the Broadway theatre district using latitude and longitude values
map_broadway = folium.Map(location=[broadway_latitude, broadway_longitude], zoom_start=14)

# add markers to map
for lat, lng, label in zip(Theatre_Data_df4['Latitude'], Theatre_Data_df4['Longitude'], Theatre_Data_df4['Theatre']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='black',
        fill=True,
        fill_color='purple',
        fill_opacity=0.7,
        parse_html=False).add_to(map_broadway) 

map_broadway

**Map Showing Locations of 40 Broadway Theatres in Manhattan, New York City**

#### Theatre Location Analysis ####

Looking at the mapped results, we see that the greatest concentration of theatres appears clustered near 44th and 45th streets between 7th Avenue and 8th Avenue.

This area looks like a good place to search for nearby restaurants and hotels.

Let's set our starting point at the Richard Rodgers Theatre where *Hamilton* is currently playing.

This theatre is also located near the center of our cluster:

![Richard Rodgers Theatre](RichardRodgersMap.jpg)
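
As a rough numerical check on the claim that the Richard Rodgers Theatre sits near the center of the cluster, the sketch below compares its location with the average position of the theatres. It uses only the ten theatres shown in the table above, not the full list, and the coordinates geocoded for 226 West 46th Street later in this notebook. The helper `haversine_m` is an illustrative name.

```python
from math import asin, cos, radians, sin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    earth_radius_m = 6371000  # mean Earth radius
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(a))

# (latitude, longitude) of the ten theatres shown in the table above (not the full list)
theatre_coordinates = [
    (40.759253, -73.989211), (40.761236, -73.98499),
    (40.757156, -73.988119), (40.763373, -73.984193),
    (40.756644, -73.983801), (40.758608, -73.987741),
    (40.758373, -73.98709), (40.758269, -73.987617),
    (40.764457, -73.985746), (40.759975, -73.986966),
]

# Simple average position of the sample; adequate over an area this small
centroid_lat = sum(lat for lat, _ in theatre_coordinates) / len(theatre_coordinates)
centroid_lon = sum(lon for _, lon in theatre_coordinates) / len(theatre_coordinates)

# Richard Rodgers Theatre, as geocoded from '226 West 46th Street, New York, NY'
rodgers = (40.7590309, -73.98674788284762)

distance_to_centroid = haversine_m(rodgers[0], rodgers[1], centroid_lat, centroid_lon)
print('Distance from the Richard Rodgers Theatre to the sample centroid: '
      '{:.0f} m'.format(distance_to_centroid))
```

A small distance here supports using the theatre as a search center, though the result depends on which theatres are included in the sample.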


**Google Maps street level view of the intersection of West 47th Street and 7th Avenue from the TKTS booth location**:
![TKTS Booth View](TKTSView.jpg)

## 4. Hotels Near the Broadway Theatre District <a name="hotels-4"></a>

### Search with the Foursquare API for Hotels near the Richard Rodgers Theatre

We will use the address of Richard Rodgers Theatre as our starting point as we look for nearby hotels using the Foursquare API.

Let's get the theatre's latitude and longitude values.

In [155]:
# Set address of Richard Rodgers Theatre as our starting point
address = '226 West 46th Street, New York, NY'
geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)
    

40.7590309 -73.98674788284762


We can use Foursquare to view hotels within a given radius of the Richard Rodgers Theatre.

In [156]:
search_query = 'Hotel'
radius = 500
print(search_query + ' .... OK!')

Hotel .... OK!


In [157]:
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
url

'https://api.foursquare.com/v2/venues/search?client_id=<CLIENT_ID>&client_secret=<CLIENT_SECRET>&ll=40.7590309,-73.98674788284762&v=20180604&query=Hotel&radius=500&limit=30'
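
The request URL above is a base endpoint followed by a query string. As a sketch of the same construction, the parameters can be held in a dictionary and encoded with the standard library, which also escapes any special characters in the values. The credential values below are placeholders, and `example_url` is an illustrative name.

```python
from urllib.parse import urlencode

# Placeholder credentials for illustration; the notebook loads the real ones from a pickle file
params = {
    'client_id': 'YOUR_CLIENT_ID',
    'client_secret': 'YOUR_CLIENT_SECRET',
    'll': '40.7590309,-73.98674788284762',  # latitude,longitude of the search center
    'v': '20180604',                         # API version date
    'query': 'Hotel',
    'radius': 500,                           # meters
    'limit': 30,
}

# safe=',' keeps the comma in the ll value readable instead of percent-encoding it
example_url = 'https://api.foursquare.com/v2/venues/search?' + urlencode(params, safe=',')
print(example_url)
```

With the `requests` library, the same dictionary can be passed directly as `requests.get(endpoint, params=params)`, which avoids building the string by hand and makes it less likely that credentials are echoed in a cell output.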

In [158]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5e3b3514c94979001be70f99'},
 'response': {'venues': [{'id': '4b4bbe3ff964a52016a626e3',
    'name': 'Hotel Edison',
    'location': {'address': '228 W 47th St',
     'crossStreet': 'at 8th Ave',
     'lat': 40.75966639038906,
     'lng': -73.98608770066558,
     'labeledLatLngs': [{'label': 'display',
       'lat': 40.75966639038906,
       'lng': -73.98608770066558}],
     'distance': 90,
     'postalCode': '10036',
     'cc': 'US',
     'city': 'New York',
     'state': 'NY',
     'country': 'United States',
     'formattedAddress': ['228 W 47th St (at 8th Ave)',
      'New York, NY 10036',
      'United States']},
    'categories': [{'id': '4bf58dd8d48988d1fa931735',
      'name': 'Hotel',
      'pluralName': 'Hotels',
      'shortName': 'Hotel',
      'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/travel/hotel_',
       'suffix': '.png'},
      'primary': True}],
    'venuePage': {'id': '501560524'},
    'referralId': 'v-1580938530',

In [159]:
# assign relevant part of JSON to venues
venues = results['response']['venues']

# transform venues into a dataframe
dataframe = json_normalize(venues)
dataframe.head()

Unnamed: 0,categories,delivery.id,delivery.provider.icon.name,delivery.provider.icon.prefix,delivery.provider.icon.sizes,delivery.provider.name,delivery.url,hasPerk,id,location.address,location.cc,location.city,location.country,location.crossStreet,location.distance,location.formattedAddress,location.labeledLatLngs,location.lat,location.lng,location.neighborhood,location.postalCode,location.state,name,referralId,venuePage.id
0,"[{'id': '4bf58dd8d48988d1fa931735', 'name': 'Hotel', 'pluralName': 'Hotels', 'shortName': 'Hotel', 'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/travel/hotel_', 'suffix': '.png'}, 'primary': True}]",,,,,,,False,4b4bbe3ff964a52016a626e3,228 W 47th St,US,New York,United States,at 8th Ave,90,"[228 W 47th St (at 8th Ave), New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.75966639038906, 'lng': -73.98608770066558}]",40.759666,-73.986088,,10036,NY,Hotel Edison,v-1580938530,501560524.0
1,"[{'id': '4bf58dd8d48988d1fa931735', 'name': 'Hotel', 'pluralName': 'Hotels', 'shortName': 'Hotel', 'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/travel/hotel_', 'suffix': '.png'}, 'primary': True}]",,,,,,,False,4d7ad1263fbf6dcbb60b6423,132 W 47th St,US,New York,United States,btwn 6th & 7th Ave.,303,"[132 W 47th St (btwn 6th & 7th Ave.), New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.7583581, 'lng': -73.9832623}]",40.758358,-73.983262,,10036,NY,Sanctuary Hotel New York,v-1580938530,
2,"[{'id': '4bf58dd8d48988d1fa931735', 'name': 'Hotel', 'pluralName': 'Hotels', 'shortName': 'Hotel', 'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/travel/hotel_', 'suffix': '.png'}, 'primary': True}]",,,,,,,False,4adbaf34f964a520012a21e3,235 W 46th St,US,New York,United States,,66,"[235 W 46th St, New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.759568887065484, 'lng': -73.98709530548786}]",40.759569,-73.987095,,10036,NY,Paramount Hotel,v-1580938530,33865901.0
3,"[{'id': '4bf58dd8d48988d1fa931735', 'name': 'Hotel', 'pluralName': 'Hotels', 'shortName': 'Hotel', 'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/travel/hotel_', 'suffix': '.png'}, 'primary': True}]",,,,,,,False,5bc65d8e1822230025fffcb1,310 W 40th St,US,New York,United States,,514,"[310 W 40th St, New York, NY 10018, United States]","[{'label': 'display', 'lat': 40.75620715052, 'lng': -73.99157595205548}]",40.756207,-73.991576,,10018,NY,Aliz Hotel,v-1580938530,
4,"[{'id': '4bf58dd8d48988d1fa931735', 'name': 'Hotel', 'pluralName': 'Hotels', 'shortName': 'Hotel', 'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/travel/hotel_', 'suffix': '.png'}, 'primary': True}]",,,,,,,False,5093c236830214706abb75db,218 W 50th St,US,New York,United States,50th & B'way,332,"[218 W 50th St (50th & B'way), New York, NY 10019, United States]","[{'label': 'display', 'lat': 40.761691026346256, 'lng': -73.98495316332877}]",40.761691,-73.984953,,10019,NY,citizenM Hotel New York Times Square,v-1580938530,


#### Let's look at data for Hotels within 500 meters of the Richard Rodgers Theatre

In [160]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns2 = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
# copy the selection so the assignment below does not modify a view of the original dataframe
dataframe_filtered2 = dataframe.loc[:, filtered_columns2].copy()

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    if len(categories_list) == 0:
        return None
    return categories_list[0]['name']

# filter the category for each row
dataframe_filtered2['categories'] = dataframe_filtered2.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered2.columns = [column.split('.')[-1] for column in dataframe_filtered2.columns]

dataframe_filtered2

Unnamed: 0,name,categories,address,cc,city,country,crossStreet,distance,formattedAddress,labeledLatLngs,lat,lng,neighborhood,postalCode,state,id
0,Hotel Edison,Hotel,228 W 47th St,US,New York,United States,at 8th Ave,90,"[228 W 47th St (at 8th Ave), New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.75966639038906, 'lng': -73.98608770066558}]",40.759666,-73.986088,,10036.0,NY,4b4bbe3ff964a52016a626e3
1,Sanctuary Hotel New York,Hotel,132 W 47th St,US,New York,United States,btwn 6th & 7th Ave.,303,"[132 W 47th St (btwn 6th & 7th Ave.), New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.7583581, 'lng': -73.9832623}]",40.758358,-73.983262,,10036.0,NY,4d7ad1263fbf6dcbb60b6423
2,Paramount Hotel,Hotel,235 W 46th St,US,New York,United States,,66,"[235 W 46th St, New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.759568887065484, 'lng': -73.98709530548786}]",40.759569,-73.987095,,10036.0,NY,4adbaf34f964a520012a21e3
3,Aliz Hotel,Hotel,310 W 40th St,US,New York,United States,,514,"[310 W 40th St, New York, NY 10018, United States]","[{'label': 'display', 'lat': 40.75620715052, 'lng': -73.99157595205548}]",40.756207,-73.991576,,10018.0,NY,5bc65d8e1822230025fffcb1
4,citizenM Hotel New York Times Square,Hotel,218 W 50th St,US,New York,United States,50th & B'way,332,"[218 W 50th St (50th & B'way), New York, NY 10019, United States]","[{'label': 'display', 'lat': 40.761691026346256, 'lng': -73.98495316332877}]",40.761691,-73.984953,,10019.0,NY,5093c236830214706abb75db
5,"The Algonquin Hotel, Autograph Collection",Hotel,59 W 44th St,US,New York,United States,,503,"[59 W 44th St, New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.7559927, 'lng': -73.9823172}]",40.755993,-73.982317,,10036.0,NY,45f8e590f964a5203f441fe3
6,Night Hotel Times Square,Hotel,157 W 47th St,US,New York,United States,7th Avenue,256,"[157 W 47th St (7th Avenue), New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.75908099999999, 'lng': -73.983701}]",40.759081,-73.983701,,10036.0,NY,5145c277e4b0cec0809bad8c
7,Millennium Broadway Hotel,Hotel,145 W 44th St,US,New York,United States,at Broadway,272,"[145 W 44th St (at Broadway), New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.757073, 'lng': -73.9848}]",40.757073,-73.9848,,10036.0,NY,4a0215c6f964a5202a711fe3
8,The Manhattan at Times Square Hotel,Hotel,790 7th Ave,US,New York,United States,at W 51st St,472,"[790 7th Ave (at W 51st St), New York, NY 10019, United States]","[{'label': 'display', 'lat': 40.76208099999999, 'lng': -73.9828534}]",40.762081,-73.982853,,10019.0,NY,4bc4f73c0191c9b6d2e8eab1
9,The Belvedere Hotel,Hotel,319 W 48th St,US,New York,United States,btw 8th Ave & 9th Ave,306,"[319 W 48th St (btw 8th Ave & 9th Ave), New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.76159900821748, 'lng': -73.9880682833074}]",40.761599,-73.988068,,10036.0,NY,4af0ae0df964a52037de21e3
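To make the category clean-up step easier to follow, here is a minimal sketch that runs the same extraction logic on a tiny hand-built dataframe. The second row is a hypothetical venue with an empty category list, added only to show that such rows come back as `None`.

```python
import pandas as pd


def get_category_type(row):
    """Return the name of the first category in a row, or None if the list is empty."""
    try:
        categories_list = row['categories']
    except KeyError:
        # responses from the explore endpoint prefix the column with 'venue.'
        categories_list = row['venue.categories']
    return categories_list[0]['name'] if categories_list else None


# Two illustrative rows: one real category entry, one empty list (hypothetical venue)
sample = pd.DataFrame({
    'name': ['Hotel Edison', 'Uncategorized venue (hypothetical)'],
    'categories': [[{'id': '4bf58dd8d48988d1fa931735', 'name': 'Hotel'}], []],
})

# replace each list of category dictionaries with a single category name
sample['categories'] = sample.apply(get_category_type, axis=1)
print(sample)
```

Keeping only the first category is a simplification: a venue can carry several categories, and this approach reports just the first one in the list.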


In [161]:
dataframe_filtered2.name

0     Hotel Edison                                                      
1     Sanctuary Hotel New York                                          
2     Paramount Hotel                                                   
3     Aliz Hotel                                                        
4     citizenM Hotel New York Times Square                              
5     The Algonquin Hotel, Autograph Collection                         
6     Night Hotel Times Square                                          
7     Millennium Broadway Hotel                                         
8     The Manhattan at Times Square Hotel                               
9     The Belvedere Hotel                                               
10    Renaissance New York Times Square Hotel                           
11    Royalton Hotel                                                    
12    The Michelangelo Hotel                                            
13    Kimpton Muse Hotel                           

Let's map out the hotels returned in our Foursquare search.

### Map of Hotels Near Broadway Theatres <a name="hotels-map"></a>

The map shows where the hotels are in relation to the theatre.

In [162]:
# create map of Manhattan using latitude and longitude values
map_hotels = folium.Map(location=[latitude, longitude], zoom_start=15)

# add a red circle marker to represent the Richard Rodgers Theatre
folium.features.CircleMarker(
    [latitude, longitude],
    radius=5,
    color='red',
    popup='Richard Rodgers Theatre - Home of Hamilton',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(map_hotels)

# add markers to map
for lat, lng, label in zip(dataframe_filtered2['lat'], dataframe_filtered2['lng'], dataframe_filtered2['name']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='purple',
        fill=True,
        fill_color='purple',
        fill_opacity=0.5,
        parse_html=False).add_to(map_hotels)  
    
map_hotels

**Map of Hotels Near the Richard Rodgers Theatre returned by searching for Hotels within a 500 meter radius**

Let's sort the hotels by distance to view the ones closest to the theater first, with the others listed in ascending order.

In [163]:
dataframe_filtered2.sort_values(by=['distance'])

Unnamed: 0,name,categories,address,cc,city,country,crossStreet,distance,formattedAddress,labeledLatLngs,lat,lng,neighborhood,postalCode,state,id
2,Paramount Hotel,Hotel,235 W 46th St,US,New York,United States,,66,"[235 W 46th St, New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.759568887065484, 'lng': -73.98709530548786}]",40.759569,-73.987095,,10036.0,NY,4adbaf34f964a520012a21e3
0,Hotel Edison,Hotel,228 W 47th St,US,New York,United States,at 8th Ave,90,"[228 W 47th St (at 8th Ave), New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.75966639038906, 'lng': -73.98608770066558}]",40.759666,-73.986088,,10036.0,NY,4b4bbe3ff964a52016a626e3
29,Riu Plaza Times Square Hotel,Resort,W 46th Street,US,New York,United States,,167,"[W 46th Street, New York, NY, United States]","[{'label': 'display', 'lat': 40.76017910501307, 'lng': -73.98802929364761}]",40.760179,-73.988029,Hell's Kitchen,,NY,56d61e11498e4c3c305da704
10,Renaissance New York Times Square Hotel,Hotel,"Two Times Square, 714 Seventh Avenue At W. 48th Street",US,New York,United States,at W 48th St,192,"[Two Times Square, 714 Seventh Avenue At W. 48th Street (at W 48th St), New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.75971, 'lng': -73.984643}]",40.75971,-73.984643,,10036.0,NY,4a7d0598f964a5205aee1fe3
6,Night Hotel Times Square,Hotel,157 W 47th St,US,New York,United States,7th Avenue,256,"[157 W 47th St (7th Avenue), New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.75908099999999, 'lng': -73.983701}]",40.759081,-73.983701,,10036.0,NY,5145c277e4b0cec0809bad8c
7,Millennium Broadway Hotel,Hotel,145 W 44th St,US,New York,United States,at Broadway,272,"[145 W 44th St (at Broadway), New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.757073, 'lng': -73.9848}]",40.757073,-73.9848,,10036.0,NY,4a0215c6f964a5202a711fe3
14,Serafina Time Hotel,Italian Restaurant,224 W 49th St,US,New York,United States,8th Avenue,277,"[224 W 49th St (8th Avenue), New York, NY 10019, United States]","[{'label': 'display', 'lat': 40.76130213956465, 'lng': -73.98537892620858}]",40.761302,-73.985379,,10019.0,NY,4b63389ef964a520986b2ae3
26,Mayfair Hotel,Hotel,242 W 49th St,US,New York,United States,,284,"[242 W 49th St, New York, NY 10019, United States]","[{'label': 'display', 'lat': 40.761485994830515, 'lng': -73.98579163446011}]",40.761486,-73.985792,,10019.0,NY,4afadedbf964a520351922e3
16,The Pearl Hotel,Hotel,233 W 49th St,US,New York,United States,btwn Broadway & 8th Ave.,284,"[233 W 49th St (btwn Broadway & 8th Ave.), New York, NY 10019, United States]","[{'label': 'display', 'lat': 40.76143050528299, 'lng': -73.98557585645281}]",40.761431,-73.985576,,10019.0,NY,4cc5e3de06c254813e2e9f47
13,Kimpton Muse Hotel,Hotel,130 W 46th St,US,New York,United States,btwn 6th & 7th Ave,286,"[130 W 46th St (btwn 6th & 7th Ave), New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.757808, 'lng': -73.983764}]",40.757808,-73.983764,,10036.0,NY,4a9f2f6ff964a520d93c20e3
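The sorted results include venues that are not categorized as hotels (Serafina Time Hotel is listed as an Italian Restaurant), and the unsorted results include the Aliz Hotel at 514 meters, slightly beyond the requested radius. A minimal sketch of how the list could be narrowed down is shown here, using a few rows copied from the results. The `RADIUS_M` name is an assumption for illustration, not a variable from the notebook.

```python
import pandas as pd

# A few rows copied from the search results above
hotels = pd.DataFrame({
    'name': ['Hotel Edison', 'Paramount Hotel', 'Aliz Hotel', 'Serafina Time Hotel'],
    'categories': ['Hotel', 'Hotel', 'Hotel', 'Italian Restaurant'],
    'distance': [90, 66, 514, 277],
})

RADIUS_M = 500  # same radius as the search (illustrative variable name)

# keep venues categorized as hotels that are inside the radius, closest first
nearby_hotels = (
    hotels[(hotels['categories'] == 'Hotel') & (hotels['distance'] <= RADIUS_M)]
    .sort_values(by='distance')
    .reset_index(drop=True)
)
print(nearby_hotels)
```

A strict category filter would also drop the Riu Plaza Times Square Hotel, which Foursquare lists as a Resort, so the accepted category names may need to be widened.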


Let's save the query results to a local .CSV file.
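As a minimal sketch of the save step, the example here writes a small sample of the results to a CSV file and reads it back. The file name and the temporary directory are assumptions for illustration; any local path would work.

```python
import os
import tempfile

import pandas as pd

# A small sample of the query results shown above
results_df = pd.DataFrame({
    'name': ['Paramount Hotel', 'Hotel Edison'],
    'distance': [66, 90],
    'id': ['4adbaf34f964a520012a21e3', '4b4bbe3ff964a52016a626e3'],
})

# Hypothetical output location; substitute any local path
csv_path = os.path.join(tempfile.gettempdir(), 'broadway_hotels_sample.csv')

# index=False avoids writing the row index as an extra "Unnamed: 0" column
results_df.to_csv(csv_path, index=False)

# read the file back to confirm the round trip
reloaded = pd.read_csv(csv_path)
print(reloaded)
```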

Let's check the ratings of some of the hotels in our dataset, starting with the closest one, the Paramount.

In [164]:
venue_id = '4adbaf34f964a520012a21e3' # Paramount Hotel ID
url = 'https://api.foursquare.com/v2/venues/{}?client_id={}&client_secret={}&v={}'.format(venue_id, CLIENT_ID, CLIENT_SECRET, VERSION) 

In [165]:
result = requests.get(url).json()
print(result['response']['venue'].keys())
result['response']['venue']

dict_keys(['id', 'name', 'contact', 'location', 'canonicalUrl', 'categories', 'verified', 'stats', 'url', 'hasMenu', 'likes', 'dislike', 'ok', 'rating', 'ratingColor', 'ratingSignals', 'menu', 'allowMenuUrlEdit', 'beenHere', 'specials', 'photos', 'venuePage', 'reasons', 'description', 'storeId', 'page', 'hereNow', 'createdAt', 'tips', 'shortUrl', 'timeZone', 'listed', 'hours', 'pageUpdates', 'inbox', 'attributes', 'bestPhoto', 'colors'])


{'id': '4adbaf34f964a520012a21e3',
 'name': 'Paramount Hotel',
 'contact': {'phone': '2127645500',
  'formattedPhone': '(212) 764-5500',
  'twitter': 'nycparamount',
  'facebook': '101786016079',
  'facebookUsername': 'NYCParamount',
  'facebookName': 'Paramount Hotel Times Square New York'},
 'location': {'address': '235 W 46th St',
  'lat': 40.759568887065484,
  'lng': -73.98709530548786,
  'labeledLatLngs': [{'label': 'display',
    'lat': 40.759568887065484,
    'lng': -73.98709530548786}],
  'postalCode': '10036',
  'cc': 'US',
  'city': 'New York',
  'state': 'NY',
  'country': 'United States',
  'formattedAddress': ['235 W 46th St',
   'New York, NY 10036',
   'United States']},
 'canonicalUrl': 'https://foursquare.com/v/paramount-hotel/4adbaf34f964a520012a21e3',
 'categories': [{'id': '4bf58dd8d48988d1fa931735',
   'name': 'Hotel',
   'pluralName': 'Hotels',
   'shortName': 'Hotel',
   'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/travel/hotel_',
    'suffix': '.pn

In [167]:
try:
    print(result['response']['venue']['rating'])
except KeyError:
    print('This venue has not been rated yet.')

6.1


That's not a very good rating, so let's try the citizenM Hotel, which is only 332 meters away.

In [168]:
venue_id = '5093c236830214706abb75db' # citizenM Hotel ID
url = 'https://api.foursquare.com/v2/venues/{}?client_id={}&client_secret={}&v={}'.format(venue_id, CLIENT_ID, CLIENT_SECRET, VERSION) 

In [169]:
result = requests.get(url).json()
print(result['response']['venue'].keys())
result['response']['venue']

dict_keys(['id', 'name', 'contact', 'location', 'canonicalUrl', 'categories', 'verified', 'stats', 'url', 'likes', 'dislike', 'ok', 'rating', 'ratingColor', 'ratingSignals', 'allowMenuUrlEdit', 'beenHere', 'specials', 'photos', 'reasons', 'storeId', 'page', 'hereNow', 'createdAt', 'tips', 'shortUrl', 'timeZone', 'listed', 'pageUpdates', 'inbox', 'attributes', 'bestPhoto', 'colors'])


{'id': '5093c236830214706abb75db',
 'name': 'citizenM Hotel New York Times Square',
 'contact': {'phone': '2124613638',
  'formattedPhone': '(212) 461-3638',
  'twitter': 'citizenm',
  'facebook': '173364980699',
  'facebookUsername': 'citizenMhotels',
  'facebookName': 'citizenM hotels'},
 'location': {'address': '218 W 50th St',
  'crossStreet': "50th & B'way",
  'lat': 40.761691026346256,
  'lng': -73.98495316332877,
  'labeledLatLngs': [{'label': 'display',
    'lat': 40.761691026346256,
    'lng': -73.98495316332877}],
  'postalCode': '10019',
  'cc': 'US',
  'city': 'New York',
  'state': 'NY',
  'country': 'United States',
  'formattedAddress': ["218 W 50th St (50th & B'way)",
   'New York, NY 10019',
   'United States']},
 'canonicalUrl': 'https://foursquare.com/v/citizenm-hotel-new-york-times-square/5093c236830214706abb75db',
 'categories': [{'id': '4bf58dd8d48988d1fa931735',
   'name': 'Hotel',
   'pluralName': 'Hotels',
   'shortName': 'Hotel',
   'icon': {'prefix': 'https:/

In [170]:
try:
    print(result['response']['venue']['rating'])
except KeyError:
    print('This venue has not been rated yet.')

9.2


Wow, this looks like a great choice for our customers. Let's get the number of ratings for this hotel.

In [121]:
result['response']['venue']['ratingSignals']

412

That's a lot of ratings, so this seems like a good hotel for our customers. Now let's get the number of tips for this hotel from Foursquare.

In [120]:
result['response']['venue']['tips']['count']

83
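Since the same three lookups (rating, number of ratings, number of tips) are repeated for each hotel, they could be collected in one small helper. This is a sketch only: `summarize_venue` is a hypothetical name, and the dictionaries are trimmed-down stand-ins for `result['response']['venue']` using the values shown above.

```python
def summarize_venue(venue):
    """Collect rating details from a venue-details dictionary, tolerating missing keys."""
    return {
        'name': venue.get('name'),
        'rating': venue.get('rating'),                      # None if the venue is unrated
        'ratingSignals': venue.get('ratingSignals', 0),     # number of ratings
        'tipCount': venue.get('tips', {}).get('count', 0),  # number of tips
    }


# Stand-in for result['response']['venue'] with the citizenM values shown above
citizenm = {
    'name': 'citizenM Hotel New York Times Square',
    'rating': 9.2,
    'ratingSignals': 412,
    'tips': {'count': 83},
}

# Hypothetical venue with no rating information at all
unrated = {'name': 'Hypothetical unrated venue'}

print(summarize_venue(citizenm))
print(summarize_venue(unrated))
```

Using `dict.get` instead of `try`/`except` keeps the unrated case explicit: a missing rating comes back as `None` rather than raising an error.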

In [128]:
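# `results` is assumed to hold the response from the venue tips endpoint; that request cell is not shown here
tips = results['response']['tips']['items']

tip = tips[0]  # first tip in the list
tip.keys()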

dict_keys(['id', 'createdAt', 'text', 'type', 'canonicalUrl', 'photo', 'photourl', 'lang', 'likes', 'logView', 'agreeCount', 'disagreeCount', 'lastVoteText', 'lastUpvoteTimestamp', 'todo', 'user', 'authorInteractionType'])

In [129]:
pd.set_option('display.max_colwidth', None)  # None shows the full text; -1 is deprecated in pandas >= 1.0

tips_df = json_normalize(tips) # json normalize tips

# columns to keep
filtered_columns = ['text', 'agreeCount', 'disagreeCount', 'id', 'user.firstName', 'user.lastName', 'user.gender', 'user.id']
tips_filtered = tips_df.loc[:, filtered_columns]

# display tips
tips_filtered

Unnamed: 0,text,agreeCount,disagreeCount,id,user.firstName,user.lastName,user.gender,user.id
0,"The people are great, the rooms are high tech and kept very clean. Right on Times Square so shopping heaven. Can't recommend CitizenM enough.",9,0,57bdc1f8cd1020c89b827702,Nick,J,,331804026
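The tip shown above has no value for `user.gender`. If a field is missing from every tip in a response, selecting it with `.loc` raises a `KeyError` in current pandas versions. A minimal sketch of a more tolerant selection is shown here, assuming pandas 1.0 or later (for `pd.json_normalize`) and using the single tip from the output above as sample data.

```python
import pandas as pd

# The single tip shown above, rebuilt as the nested dictionary the API would return
tips = [{
    'id': '57bdc1f8cd1020c89b827702',
    'text': ("The people are great, the rooms are high tech and kept very clean. "
             "Right on Times Square so shopping heaven. Can't recommend CitizenM enough."),
    'agreeCount': 9,
    'disagreeCount': 0,
    'user': {'id': '331804026', 'firstName': 'Nick', 'lastName': 'J'},
}]

tips_df = pd.json_normalize(tips)  # nested 'user' keys become 'user.id', 'user.firstName', ...

wanted = ['text', 'agreeCount', 'disagreeCount', 'id',
          'user.firstName', 'user.lastName', 'user.gender', 'user.id']

# reindex keeps the requested column order and fills absent columns with NaN
tips_filtered = tips_df.reindex(columns=wanted)
print(tips_filtered)
```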


Let's select the citizenM Hotel as our preferred hotel, and search for the best restaurants nearby.

## 5. Restaurants Near the Broadway Theatre District and citizenM Hotel <a name="restaurants-5"></a>

In [172]:
# Latitude and Longitude of citizenM Hotel
latitude = 40.761691
longitude = -73.984953

In [175]:
search_query = 'Restaurant'
radius = 500
print(search_query + ' .....OK!')

Restaurant .....OK!


In [176]:
# Set up the search url to query Foursquare
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
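String formatting works for a one-word query such as `Restaurant`, but a query containing spaces or special characters needs URL encoding. A minimal sketch using the standard library is shown here. The credential, version, and limit values are placeholders (assumptions); the real values are defined earlier in the notebook.

```python
from urllib.parse import urlencode

# Placeholder settings (assumptions for illustration only)
CLIENT_ID = 'YOUR_CLIENT_ID'
CLIENT_SECRET = 'YOUR_CLIENT_SECRET'
VERSION = '20180604'
LIMIT = 30

# Latitude and longitude of the citizenM Hotel, as set above
latitude = 40.761691
longitude = -73.984953

search_query = 'Italian Restaurant'  # a query with a space, to show the encoding
radius = 500

params = {
    'client_id': CLIENT_ID,
    'client_secret': CLIENT_SECRET,
    'll': '{},{}'.format(latitude, longitude),
    'v': VERSION,
    'query': search_query,
    'radius': radius,
    'limit': LIMIT,
}

# safe=',' keeps the comma in the ll parameter readable; everything else is encoded
url = 'https://api.foursquare.com/v2/venues/search?' + urlencode(params, safe=',')
print(url)
```

Passing the same `params` dictionary to `requests.get(base_url, params=params)` would let `requests` do the encoding instead.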

In [177]:
# Send the request and parse the JSON response into a Python dictionary
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5e3b35862115366da62d842c'},
 'response': {'venues': [{'id': '4c1d4eff63750f47bc12b867',
    'name': "L'ybane Restaurant",
    'location': {'address': '709 8th Ave',
     'crossStreet': 'Btwn 44th and 45th St',
     'lat': 40.7591018676758,
     'lng': -73.9888381958008,
     'labeledLatLngs': [{'label': 'display',
       'lat': 40.7591018676758,
       'lng': -73.9888381958008}],
     'distance': 436,
     'postalCode': '10036',
     'cc': 'US',
     'city': 'New York',
     'state': 'NY',
     'country': 'United States',
     'formattedAddress': ['709 8th Ave (Btwn 44th and 45th St)',
      'New York, NY 10036',
      'United States']},
    'categories': [{'id': '4bf58dd8d48988d11e941735',
      'name': 'Cocktail Bar',
      'pluralName': 'Cocktail Bars',
      'shortName': 'Cocktail',
      'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/nightlife/cocktails_',
       'suffix': '.png'},
      'primary': True}],
    'delivery': {'id': '27

In [178]:
# assign relevant part of JSON to venues
venues = results['response']['venues']

# transform venues into a dataframe
dataframe = json_normalize(venues)
dataframe.head()

Unnamed: 0,categories,delivery.id,delivery.provider.icon.name,delivery.provider.icon.prefix,delivery.provider.icon.sizes,delivery.provider.name,delivery.url,hasPerk,id,location.address,location.cc,location.city,location.country,location.crossStreet,location.distance,location.formattedAddress,location.labeledLatLngs,location.lat,location.lng,location.neighborhood,location.postalCode,location.state,name,referralId,venuePage.id
0,"[{'id': '4bf58dd8d48988d11e941735', 'name': 'Cocktail Bar', 'pluralName': 'Cocktail Bars', 'shortName': 'Cocktail', 'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/nightlife/cocktails_', 'suffix': '.png'}, 'primary': True}]",277694.0,/delivery_provider_seamless_20180129.png,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",seamless,https://www.seamless.com/menu/lybane-709-8th-ave-new-york/277694?affiliate=1131&utm_source=foursquare-affiliate-network&utm_medium=affiliate&utm_campaign=1131&utm_content=277694,False,4c1d4eff63750f47bc12b867,709 8th Ave,US,New York,United States,Btwn 44th and 45th St,436,"[709 8th Ave (Btwn 44th and 45th St), New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.7591018676758, 'lng': -73.9888381958008}]",40.759102,-73.988838,,10036,NY,L'ybane Restaurant,v-1580938776,77030291.0
1,"[{'id': '4bf58dd8d48988d14e941735', 'name': 'American Restaurant', 'pluralName': 'American Restaurants', 'shortName': 'American', 'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/default_', 'suffix': '.png'}, 'primary': True}]",,,,,,,False,462a6065f964a520d9451fe3,1515 Broadway,US,New York,United States,at W 45th St,394,"[1515 Broadway (at W 45th St), New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.758349343546215, 'lng': -73.98651265011574}]",40.758349,-73.986513,Theater District,10036,NY,Junior's Restaurant & Bakery,v-1580938776,77465459.0
2,"[{'id': '4bf58dd8d48988d110941735', 'name': 'Italian Restaurant', 'pluralName': 'Italian Restaurants', 'shortName': 'Italian', 'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/italian_', 'suffix': '.png'}, 'primary': True}]",,,,,,,False,3fd66200f964a5209ee81ee3,200 W 44th St,US,New York,United States,btwn Broadway & 8th Ave,491,"[200 W 44th St (btwn Broadway & 8th Ave), New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.7574973, 'lng': -73.9867788}]",40.757497,-73.986779,,10036,NY,Carmine’s Italian Restaurant,v-1580938776,
3,"[{'id': '4bf58dd8d48988d1c4941735', 'name': 'Restaurant', 'pluralName': 'Restaurants', 'shortName': 'Restaurant', 'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/default_', 'suffix': '.png'}, 'primary': True}]",,,,,,,False,3fd66200f964a520bfea1ee3,1535 Broadway,US,New York,United States,47th & 48th Fl,352,"[1535 Broadway (47th & 48th Fl), New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.75869, 'lng': -73.986285}]",40.75869,-73.986285,,10036,NY,The View Restaurant & Lounge,v-1580938776,60159912.0
4,"[{'id': '52e81612bcbc57f1066b7a06', 'name': 'Irish Pub', 'pluralName': 'Irish Pubs', 'shortName': 'Irish', 'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/nightlife/pub_', 'suffix': '.png'}, 'primary': True}]",801912.0,/delivery_provider_seamless_20180129.png,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",seamless,https://www.seamless.com/menu/connollys-irish-pub--restaurant-121-w-45th-st-new-york/801912?affiliate=1131&utm_source=foursquare-affiliate-network&utm_medium=affiliate&utm_campaign=1131&utm_content=801912,False,4a6a25c0f964a520b7cc1fe3,121 W 45th St,US,New York,United States,at 6th Ave,494,"[121 W 45th St (at 6th Ave), New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.7573681, 'lng': -73.9835798}]",40.757368,-73.98358,,10036,NY,Connolly's Pub & Restaurant,v-1580938776,


In [179]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
dataframe_filtered = dataframe.loc[:, filtered_columns].copy()  # copy so the column assignment below does not trigger SettingWithCopyWarning

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
dataframe_filtered['categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]

dataframe_filtered

Unnamed: 0,name,categories,address,cc,city,country,crossStreet,distance,formattedAddress,labeledLatLngs,lat,lng,neighborhood,postalCode,state,id
0,L'ybane Restaurant,Cocktail Bar,709 8th Ave,US,New York,United States,Btwn 44th and 45th St,436,"[709 8th Ave (Btwn 44th and 45th St), New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.7591018676758, 'lng': -73.9888381958008}]",40.759102,-73.988838,,10036,NY,4c1d4eff63750f47bc12b867
1,Junior's Restaurant & Bakery,American Restaurant,1515 Broadway,US,New York,United States,at W 45th St,394,"[1515 Broadway (at W 45th St), New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.758349343546215, 'lng': -73.98651265011574}]",40.758349,-73.986513,Theater District,10036,NY,462a6065f964a520d9451fe3
2,Carmine’s Italian Restaurant,Italian Restaurant,200 W 44th St,US,New York,United States,btwn Broadway & 8th Ave,491,"[200 W 44th St (btwn Broadway & 8th Ave), New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.7574973, 'lng': -73.9867788}]",40.757497,-73.986779,,10036,NY,3fd66200f964a5209ee81ee3
3,The View Restaurant & Lounge,Restaurant,1535 Broadway,US,New York,United States,47th & 48th Fl,352,"[1535 Broadway (47th & 48th Fl), New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.75869, 'lng': -73.986285}]",40.75869,-73.986285,,10036,NY,3fd66200f964a520bfea1ee3
4,Connolly's Pub & Restaurant,Irish Pub,121 W 45th St,US,New York,United States,at 6th Ave,494,"[121 W 45th St (at 6th Ave), New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.7573681, 'lng': -73.9835798}]",40.757368,-73.98358,,10036,NY,4a6a25c0f964a520b7cc1fe3
5,O'Donoghues Pub & Restaurant,Bar,156 W 44th St,US,New York,United States,at 7th Ave,520,"[156 W 44th St (at 7th Ave), New York, NY 10036, United States]","[{'label': 'display', 'lat': 40.75701783963928, 'lng': -73.98519861512638}]",40.757018,-73.985199,,10036,NY,4e8cd404775bde318a4d6454
6,Astro Restaurant,Diner,1361 6th Ave,US,New York,United States,at W 55th St,573,"[1361 6th Ave (at W 55th St), New York, NY 10019, United States]","[{'label': 'display', 'lat': 40.762936219081475, 'lng': -73.97835009430995}]",40.762936,-73.97835,,10019,NY,4a9819d5f964a520782a20e3
7,Junior's Restaurant,American Restaurant,1626 Broadway,US,New York,United States,,105,"[1626 Broadway, New York, NY 10019, United States]","[{'label': 'display', 'lat': 40.760829115440394, 'lng': -73.98442916213794}]",40.760829,-73.984429,,10019,NY,5941b55f3ba7674a46081efd
8,Patsy's Italian Restaurant,Italian Restaurant,236 W 56th St,US,New York,United States,Broadway,487,"[236 W 56th St (Broadway), New York, NY 10019, United States]","[{'label': 'display', 'lat': 40.76576339691694, 'lng': -73.98281550732071}]",40.765763,-73.982816,,10019,NY,4af0e7f3f964a52009e021e3
9,Connolly's Pub & Restaurant,Pub,43 W 54th St,US,New York,United States,btwn 5th & 6th Ave,626,"[43 W 54th St (btwn 5th & 6th Ave), New York, NY 10019, United States]","[{'label': 'display', 'lat': 40.76214148855147, 'lng': -73.97754851921036}]",40.762141,-73.977549,,10019,NY,4b1c488ff964a520820524e3


In [180]:
dataframe_filtered.name

0     L'ybane Restaurant                    
1     Junior's Restaurant & Bakery          
2     Carmine’s Italian Restaurant          
3     The View Restaurant & Lounge          
4     Connolly's Pub & Restaurant           
5     O'Donoghues Pub & Restaurant          
6     Astro Restaurant                      
7     Junior's Restaurant                   
8     Patsy's Italian Restaurant            
9     Connolly's Pub & Restaurant           
10    Ding BBQ and Hot Pot Restaurant       
11    Utsav Restaurant                      
12    Cancun Mexican Restaurant             
13    Restaurant Row                        
14    Zheng's Lucky Sunday Garden Restaurant
15    Smith's Bar & Restaurant              
16    lillie's victorian bar & restaurant   
17    Wolf Restaurant                       
18    Da Marino Restaurant                  
19    Patrick’s Restaurant                  
20    Remi Restaurant                       
21    The Restaurant I'm Not Allowed To Name
22    Subw

### Map of Restaurants Near the Broadway Theatre District <a name="rest-map"></a>

Let's map out the locations of the 29 restaurants near the citizenM Hotel returned by our Foursquare query.

In [186]:
restaurants_map = folium.Map(location=[latitude, longitude], zoom_start=15) # generate map centred around the citizenM Hotel

# add a red circle marker to represent the citizenM Hotel
folium.features.CircleMarker(
    [latitude, longitude],
    radius=5,
    color='red',
    popup='citizenM Hotel',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(restaurants_map)

# add the restaurants as circle markers (yellow outline, red fill)
for lat, lng, label in zip(dataframe_filtered.lat, dataframe_filtered.lng, dataframe_filtered.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        color='yellow',
        popup=label,
        fill = True,
        fill_color='red',
        fill_opacity=0.6
    ).add_to(restaurants_map)

# display map
restaurants_map

**Restaurants within 500 meters of the citizenM Hotel, marked in red, returned by our Foursquare API query**
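Some of the returned restaurants report a distance above 500 meters (for example, Astro Restaurant at 573), so the search radius does not appear to act as a strict cut-off. As a sanity check, the distances can be recomputed from the coordinates with the haversine formula. This is a sketch under the assumption of a spherical Earth with a radius of 6,371 km; Foursquare's own distance calculation may differ by a few meters.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371000.0  # mean Earth radius (assumption: spherical Earth)


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


# citizenM Hotel coordinates used for the search
hotel_lat, hotel_lng = 40.761691, -73.984953

# Two restaurants from the results above, with their lat/lng columns
restaurants = {
    "Junior's Restaurant": (40.760829, -73.984429),  # reported distance: 105
    'Astro Restaurant': (40.762936, -73.97835),      # reported distance: 573
}

distances = {
    name: haversine_m(hotel_lat, hotel_lng, lat, lng)
    for name, (lat, lng) in restaurants.items()
}

# keep only the restaurants that are actually inside the 500 meter radius
within_radius = [name for name, d in distances.items() if d <= 500]

for name, d in distances.items():
    print('{}: {:.0f} m'.format(name, d))
print(within_radius)
```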

After some data cleaning and processing in Excel, let's load our data back into a pandas dataframe.

In [None]:
# Build a new, empty dataframe with cleaner hotel data columns
Column_Names = ['Name','Category','Address','City','State','PostalCode','CrossStreet','Distance','Lat','Lng','id','Stars','HighRate','LowRate','AvgRate']
Broadway_Hotels_df = pd.DataFrame(columns=Column_Names)
Broadway_Hotels_df

In [85]:
dataframe_filtered2.to_csv(r'C:\Users\brian\broadwayhotels4square.csv')

In [None]:
Broadway_Hotels_df

In [None]:
# create map of Manhattan using latitude and longitude values
map_hotels = folium.Map(location=[latitude, longitude], zoom_start=14)

# add a red circle marker to represent the Richard Rodgers Theatre
folium.features.CircleMarker(
    [latitude, longitude],
    radius=5,
    color='red',
    popup='Richard Rodgers Theatre - Home of Hamilton',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(map_hotels)

# add markers to map
for lat, lng, label in zip(dataframe_filtered2['lat'], dataframe_filtered2['lng'], dataframe_filtered2['name']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='purple',
        fill=True,
        fill_color='purple',
        fill_opacity=0.5,
        parse_html=False).add_to(map_hotels)  
    
map_hotels

Now let's load in some data for hotels rated 3 stars and above in Manhattan.

This data was downloaded from a dataset on Kaggle at: https://www.kaggle.com/gdberrio/new-york-hotels

The data was then cleaned in Excel and filtered to remove hotels rated under 3 stars.


In [None]:
# Build an empty dataframe with the hotel data columns
# (columns takes a list of names; a list containing a dictionary raises a TypeError)
Hotels_df = pd.DataFrame(columns=['Name', 'Address', 'City', 'State', 'PostalCode',
                                  'Latitude', 'Longitude', 'Stars',
                                  'HighRate', 'LowRate', 'AvgRate'])

In [None]:
hotels_df = pd.read_csv('C:/users/brian/hotels.csv', encoding='cp1252')

Let's look at the first five rows of hotel data.

In [None]:
hotels_df.head()

We can see there are some very expensive hotels in the 5 star range. Let's look at the size of our dataset.

In [None]:
hotels_df.shape

We can see there are 355 hotels in our data, with 11 columns, currently sorted in descending order by Avg Rate, with the most expensive 5 star hotel listed first. 

So we can see, just from the first five rows of data, that staying at Safehouse Suites is going to cost almost $6,000 a night on the low end. That's way beyond the price point we want to set for our travel packages, so let's analyze our data and see if we can find the best combination of rating and affordability.
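One simple way to look for a combination of rating and affordability is to set a minimum star rating and a maximum average rate, then sort what remains. The sketch here uses made-up hotels and made-up thresholds purely for illustration; only the column names (`Name`, `Stars`, `AvgRate`) come from our dataset.

```python
import pandas as pd

# Made-up sample rows (not taken from the real dataset)
sample_hotels = pd.DataFrame({
    'Name': ['Hotel A', 'Hotel B', 'Hotel C', 'Hotel D'],
    'Stars': [5.0, 4.0, 3.0, 4.5],
    'AvgRate': [5900.0, 260.0, 150.0, 290.0],
})

MIN_STARS = 4.0       # hypothetical quality floor
MAX_AVG_RATE = 300.0  # hypothetical nightly budget in dollars

# keep hotels that meet both conditions, best rated first, cheapest first within a rating
shortlist = (
    sample_hotels[(sample_hotels['Stars'] >= MIN_STARS)
                  & (sample_hotels['AvgRate'] <= MAX_AVG_RATE)]
    .sort_values(by=['Stars', 'AvgRate'], ascending=[False, True])
    .reset_index(drop=True)
)
print(shortlist)
```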

Let's check the datatypes for our columns:

In [None]:
hotels_df.dtypes

**Map showing locations of 355 hotels rated 3 stars and above in Manhattan, New York, with red dot for TKTS booth location.**

**Count the hotels by star rating and save the results to a dataframe**

In [None]:
hotels_df['Stars'].value_counts().to_frame()

**Find the correlation between the data columns and the average rate for a room**

In [None]:
hotels_df.corr(numeric_only=True)['AvgRate'].sort_values()  # numeric_only (pandas >= 1.5) skips text columns such as Name

We can see that, as expected, the star rating for the hotel has a fair amount of correlation with the average rate.
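To illustrate what the correlation step measures, here is a minimal sketch on made-up numbers (the values are not from our dataset). It selects the numeric columns explicitly, which also works on pandas versions where `corr` does not skip text columns on its own. The result is the Pearson correlation, where values near 1 mean the two columns rise together.

```python
import pandas as pd

# Made-up sample rows; only the column names match the hotel dataset
sample = pd.DataFrame({
    'Name': ['Hotel A', 'Hotel B', 'Hotel C', 'Hotel D', 'Hotel E'],
    'Stars': [3.0, 3.5, 4.0, 4.5, 5.0],
    'LowRate': [120.0, 150.0, 200.0, 320.0, 700.0],
    'AvgRate': [150.0, 180.0, 260.0, 400.0, 900.0],
})

# keep numeric columns only, then correlate every column with AvgRate
numeric = sample.select_dtypes(include='number')
corr_with_avg = numeric.corr()['AvgRate'].sort_values()
print(corr_with_avg)
```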

We currently have 355 hotels in our dataset, located all over Manhattan, so let's clean this up a bit more by concentrating on those closer to the Theatre District.


### Map Hotel Data 

Let's show a map of the hotel locations, so we can get a view of where they are in Manhattan.

In [None]:
# create map of Manhattan using latitude and longitude values
map_hotels = folium.Map(location=[hotel_map_latitude, hotel_map_longitude], zoom_start=11)

# add a red circle marker to represent the TKTS booth
folium.features.CircleMarker(
    [latitude, longitude],
    radius=5,
    color='red',
    popup='TKTS booth',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(map_hotels)

# add markers to map
for lat, lng, label in zip(hotels_df['Latitude'], hotels_df['Longitude'], hotels_df['Name']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='purple',
        fill=True,
        fill_color='purple',
        fill_opacity=0.5,
        parse_html=False).add_to(map_hotels)  
    
map_hotels

**Map showing location of hotels in Manhattan from our dataset**

This data is sorted by the High Rate, with the most expensive 5 star hotel listed first. As noted earlier, a stay at Safehouse Suites costs almost $6,000 a night on the low end, which is way beyond the price point we want to set for our travel packages, so let's look at a scatter plot of the data.


Let's analyze our data and see if we can find the best combination of rating and affordability.

Data was manually cleaned in Excel, with columns and data added for geographical coordinates.

Let's upload the modified .csv file so we can view the mapped data.

In [None]:
results = requests.get(url).json()
results

In [None]:
# assign relevant part of JSON to venues
venues = results['response']['venues']

# transform venues into a dataframe
dataframe = json_normalize(venues)
dataframe.head()

In [None]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
dataframe_filtered = dataframe.loc[:, filtered_columns].copy()  # copy so the column assignment below does not trigger SettingWithCopyWarning

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
dataframe_filtered['categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]

dataframe_filtered

Now let's search for restaurants nearby.

In [None]:
search_query = 'restaurant'
radius = 1000
print(search_query + ' .... OK!')

In [None]:
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
url

In [None]:
# send the request built above, then assign relevant part of JSON to venues
results = requests.get(url).json()
venues = results['response']['venues']

# transform venues into a dataframe
dataframe = json_normalize(venues)
dataframe.head()

Let's save our results to a .csv file.

In [None]:
dataframe_filtered.to_csv(r'C:\Users\brian\broadwayrest4square.csv')  # restaurant results (dataframe_filtered2 holds the hotels)

Data was manually cleaned, with some columns dropped and renamed, in Excel, then saved as 'BroadwayRestClean.csv'

In [None]:
Restaurant_Data_df = pd.read_csv('BroadwayRestClean.csv',dtype={'Latitude': float,'Longitude': float})
Restaurant_Data_df

Let's mark the location of the returned restaurants on our map.

In [None]:
# create map of the Broadway theatre district using latitude and longitude values
map_restaurants = folium.Map(location=[broadway_latitude, broadway_longitude], zoom_start=14)

# add markers to map
for lat, lng, label in zip(Restaurant_Data_df['Latitude'], Restaurant_Data_df['Longitude'], Restaurant_Data_df['Restaurant']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='red',
        fill=True,
        fill_color='yellow',
        fill_opacity=0.7,
        parse_html=False).add_to(map_restaurants) 

map_restaurants
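To check how far a mapped restaurant is from the district center (for example, against the 1000 m search radius), a great-circle distance can be computed from the two coordinate pairs. This is a minimal sketch using the haversine formula with a mean Earth radius of 6,371 km. The function name `haversine_m` and the sample coordinates are assumptions for illustration, not values from the project data.

```python
from math import radians, sin, cos, atan2, sqrt

EARTH_RADIUS_M = 6371000  # mean Earth radius in meters


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two (lat, lng) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * atan2(sqrt(a), sqrt(1 - a))


# illustrative coordinates only
center = (40.7590, -73.9845)
venue = (40.7614, -73.9776)
distance = haversine_m(center[0], center[1], venue[0], venue[1])
print('within 1000 m:', distance <= 1000)
```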

**Map of 29 Restaurants Found by the Foursquare API Near the Broadway Theatre District**

In [None]:
LIMIT = 100 # limit of number of venues returned by Foursquare API

radius = 2000 # define radius

# create URL
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)
url # display URL

In [None]:
results = requests.get(url).json()
results

In [None]:
# assign relevant part of JSON to venues (the explore endpoint nests its results under 'groups')
venues = results['response']['groups'][0]['items']

# transform venues into a dataframe
dataframe = json_normalize(venues)
dataframe.head()

In [None]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

Now let's clean the JSON and structure it into a pandas dataframe.

In [None]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns].copy()

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()
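The flattening step is easier to follow on a small, hand-made input. The sketch below assumes items shaped like `results['response']['groups'][0]['items']` and uses `pd.json_normalize` (available in pandas 1.0 and later). The venue names and coordinates are invented for illustration.

```python
import pandas as pd

# hand-made sample in the shape of the explore endpoint's items (illustrative data only)
sample_items = [
    {'venue': {'name': 'Sample Theater',
               'categories': [{'name': 'Theater'}],
               'location': {'lat': 40.7590, 'lng': -73.9845}}},
    {'venue': {'name': 'Sample Diner',
               'categories': [],
               'location': {'lat': 40.7600, 'lng': -73.9850}}},
]

flat = pd.json_normalize(sample_items)  # nested keys become dotted column names
flat = flat[['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']].copy()

# keep the first category name, or None when the category list is empty
flat['venue.categories'] = flat['venue.categories'].apply(
    lambda cats: cats[0]['name'] if cats else None)

# keep only the last term of each dotted column name
flat.columns = [col.split('.')[-1] for col in flat.columns]
print(flat)
```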

In [None]:
nearby_venues

Let's see how many venues were returned by Foursquare:

In [None]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

## 6. Conclusions and Findings <a name="conclusions"></a>

Now we are ready to clean the json and structure it into a *pandas* dataframe.

**Let's create a function to repeat the same process for all the neighborhoods in Manhattan**

In [None]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
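One caveat with the function above: indexing `categories[0]` raises an `IndexError` for any venue whose category list is empty. A minimal sketch of a more defensive row extractor follows. The helper name `extract_venue_rows` and the sample items are hypothetical, and the item layout is assumed to match the explore response used above.

```python
def extract_venue_rows(name, lat, lng, items):
    """Turn the 'items' list of one explore response into flat tuples."""
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories', [])
        # fall back to None instead of failing on an empty category list
        category = categories[0]['name'] if categories else None
        rows.append((name, lat, lng, venue['name'],
                     venue['location']['lat'], venue['location']['lng'], category))
    return rows


# illustrative items only
demo_items = [
    {'venue': {'name': 'Sample Theater', 'categories': [{'name': 'Theater'}],
               'location': {'lat': 40.759, 'lng': -73.9845}}},
    {'venue': {'name': 'Uncategorized Spot', 'categories': [],
               'location': {'lat': 40.760, 'lng': -73.9850}}},
]
demo_rows = extract_venue_rows('Midtown', 40.7549, -73.9840, demo_items)
print(demo_rows)
```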

**Now write the code to run the above function on each neighborhood and create a new dataframe called *manhattan_venues*.**

In [None]:
manhattan_venues = getNearbyVenues(names=manhattan_data['Neighborhood'],
                                   latitudes=manhattan_data['Latitude'],
                                   longitudes=manhattan_data['Longitude']
                                  )


#### Let's check the size of the resulting dataframe ####

In [None]:
print(manhattan_venues.shape)
manhattan_venues.head()

Let's check how many venues were returned for each neighborhood

In [None]:
manhattan_venues.groupby('Neighborhood').count()

#### Let's find out how many unique categories can be curated from all the returned venues. ####

In [None]:
print('There are {} unique categories.'.format(manhattan_venues['Venue Category'].nunique()))

## 3. Analyze Each Neighborhood

In [None]:
# one hot encoding
manhattan_onehot = pd.get_dummies(manhattan_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
manhattan_onehot['Neighborhood'] = manhattan_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [manhattan_onehot.columns[-1]] + list(manhattan_onehot.columns[:-1])
manhattan_onehot = manhattan_onehot[fixed_columns]

manhattan_onehot.head()

In [None]:
manhattan_onehot.shape

#### Next, let's group rows by neighborhood and take the mean of the frequency of occurrence of each category ####

In [None]:
manhattan_grouped = manhattan_onehot.groupby('Neighborhood').mean().reset_index()
manhattan_grouped
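A toy example may help show what the grouped means represent: each value is the share of a neighborhood's venues that fall into a category. The data below is invented for illustration, and the `toy_` names are not part of the project.

```python
import pandas as pd

# invented venues: neighborhood A has 4 venues, B has 2
toy_venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Theater', 'Theater', 'Hotel', 'Pizza Place', 'Hotel', 'Hotel'],
})

# one indicator column per category, cast to float so the mean is a frequency
toy_onehot = pd.get_dummies(toy_venues[['Venue Category']], prefix="", prefix_sep="").astype(float)
toy_onehot.insert(0, 'Neighborhood', toy_venues['Neighborhood'])

# mean of the indicators = fraction of venues per category in each neighborhood
toy_grouped = toy_onehot.groupby('Neighborhood').mean().reset_index()
print(toy_grouped)
```

In this toy data, half of neighborhood A's venues are theaters, while every venue in B is a hotel.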

#### Let's confirm the new size ####

In [None]:
manhattan_grouped.shape

#### Let's print each neighborhood along with the top 5 most common venues

In [None]:
num_top_venues = 5

for hood in manhattan_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = manhattan_grouped[manhattan_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

#### Let's put that into a *pandas* dataframe

First, let's write a function to sort the venues in descending order.

In [None]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

Now let's create the new dataframe and display the top 10 venues for each neighborhood.

In [None]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = manhattan_grouped['Neighborhood']

for ind in np.arange(manhattan_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(manhattan_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()
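The column labels above rely on an exception to fall back to the "th" suffix. As an alternative sketch, a small helper can compute the ordinal suffix directly, which also handles values such as 11, 12, and 21 if `num_top_venues` is ever increased. The function name `ordinal` is hypothetical.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)


top_n = 10
label_columns = ['Neighborhood'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, top_n + 1)]
print(label_columns)
```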

## Building our Travel Packages

### Selecting Hotels for our Travel Packages ##

We can see from the neighborhood results for Midtown Manhattan that hotels are common there, so let's search for some good hotel choices to include in our travel packages.

In [None]:
# Searching Midtown for hotels
search_query = 'Hotels'
radius = 500
print(search_query + ' .... OK!')

## 4. Cluster Neighborhoods

Run k-means to group the neighborhoods into 5 clusters.

In [None]:
# set number of clusters
kclusters = 5

manhattan_grouped_clustering = manhattan_grouped.drop(columns='Neighborhood')  # keyword form; the positional axis argument was removed in pandas 2.0

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(manhattan_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [None]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

manhattan_merged = manhattan_data

# join the sorted venue table onto manhattan_data so each neighborhood keeps its latitude/longitude
manhattan_merged = manhattan_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

manhattan_merged.head() # check the last columns!

In [None]:
manhattan_merged
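Because this is a left join, any neighborhood for which no venues were returned ends up with `NaN` in `Cluster Labels`, and the column is converted to floats. Float labels cannot be used as list indices when coloring map markers. The sketch below shows one way to handle that, assuming it is acceptable to drop such neighborhoods. The `toy_` frames are invented for illustration.

```python
import pandas as pd

# invented data: neighborhood 'C' has coordinates but no venue results
toy_data = pd.DataFrame({'Neighborhood': ['A', 'B', 'C'],
                         'Latitude': [40.75, 40.76, 40.77],
                         'Longitude': [-73.98, -73.99, -73.97]})
toy_sorted = pd.DataFrame({'Cluster Labels': [0, 1],
                           'Neighborhood': ['A', 'B'],
                           '1st Most Common Venue': ['Theater', 'Hotel']})

toy_merged = toy_data.join(toy_sorted.set_index('Neighborhood'), on='Neighborhood')

# drop neighborhoods without a cluster, then restore integer labels
toy_merged = toy_merged.dropna(subset=['Cluster Labels']).copy()
toy_merged['Cluster Labels'] = toy_merged['Cluster Labels'].astype(int)
print(toy_merged)
```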

Finally, let's visualize the resulting clusters

In [None]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(manhattan_merged['Latitude'], manhattan_merged['Longitude'], manhattan_merged['Neighborhood'], manhattan_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

## 5. Examine Clusters

Looking at each cluster, we can label it according to the most common venue categories.

**Cluster 1** - Koreatown - Korean restaurants are the most common venue.

In [None]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 0, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

**Cluster 2** - Little Italy - Italian restaurants are the most common venue.

In [None]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 1, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Let's name this cluster Little Italy.

**Cluster 3**

In [None]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 2, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

**Cluster 4**

In [None]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 3, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

**Cluster 5**

In [None]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 4, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

## Find Existing Italian Restaurants in Selected Neighborhoods in Austin, TX ##

### Search for Italian restaurants using Foursquare API starting from Downtown, Austin  ###

Searching for Italian restaurants in Austin, Texas starting from zip code 78701, which includes Downtown Austin, the Red River District, and the Warehouse District.

In [None]:
# Convert the address for Downtown Austin (zip code 78701) to latitude and longitude
address = 'Downtown Austin, Austin TX'
geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

In [None]:
search_query = 'Italian'
radius = 1000
LIMIT = 50
print(search_query + ' .....OK!')

## Search for a specific venue category
> `https://api.foursquare.com/v2/venues/`**search**`?client_id=`**CLIENT_ID**`&client_secret=`**CLIENT_SECRET**`&ll=`**LATITUDE**`,`**LONGITUDE**`&v=`**VERSION**`&query=`**QUERY**`&radius=`**RADIUS**`&limit=`**LIMIT**

In [None]:

# Set up the search url to query Foursquare
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
url

In [None]:
# Get the results as a JSON file
results = requests.get(url).json()
results

In [None]:
# assign relevant part of JSON to venues
venues = results['response']['venues']

# transform venues into a dataframe
dataframe = json_normalize(venues)
dataframe.head()

In [None]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
dataframe_filtered = dataframe.loc[:, filtered_columns].copy()  # copy so the column assignment below works on an independent frame

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
dataframe_filtered['categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]

dataframe_filtered 

### View list of matches for 'Italian' in the filtered dataframe ###

In [None]:
dataframe_filtered.name

In [None]:
venues_map = folium.Map(location=[latitude, longitude], zoom_start=13) # generate map centered on Downtown Austin

# add a red circle marker to represent the search center
folium.CircleMarker(
    [latitude, longitude],
    radius=10,
    color='red',
    popup='Downtown Austin',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(venues_map)

# add the venues as blue circle markers
for lat, lng, label in zip(dataframe_filtered.lat, dataframe_filtered.lng, dataframe_filtered.categories):
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(venues_map)

# display map
venues_map

Our band members have heard of Stubb's BBQ and love to eat that type of food, so let's look at the rating:

In [None]:
venue_id = '40fb0f00f964a520fc0a1fe3' # ID of Stubb's BBQ
url = 'https://api.foursquare.com/v2/venues/{}?client_id={}&client_secret={}&v={}'.format(venue_id, CLIENT_ID, CLIENT_SECRET, VERSION) 

In [None]:

result = requests.get(url).json()
print(result['response']['venue'].keys())
result['response']['venue']

In [None]:
try:
    print(result['response']['venue']['rating'])
except KeyError:
    print('This venue has not been rated yet.')
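The same check can be written without exception handling by chaining `dict.get`. This is a minimal sketch that assumes the venue-details response keeps the `response` → `venue` → `rating` layout used above. The helper name `get_rating` and the sample dictionaries are hypothetical.

```python
def get_rating(result):
    """Return the venue rating from a venue-details response, or None if absent."""
    return result.get('response', {}).get('venue', {}).get('rating')


# hand-made responses for illustration only
rated = {'response': {'venue': {'name': 'Sample Venue', 'rating': 8.7}}}
unrated = {'response': {'venue': {'name': 'Sample Venue'}}}

for sample in (rated, unrated):
    rating = get_rating(sample)
    print(rating if rating is not None else 'This venue has not been rated yet.')
```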

That is a great rating, but let's keep checking to see if we can find a better venue. We probably can't find any better location than Moody Theater, which hosts Austin City Limits Live, the home for the long running PBS concert series.

In [None]:
venue_id = '4c77cbe5947ca1cd90694837' # ID of Austin City Limits Live
url = 'https://api.foursquare.com/v2/venues/{}?client_id={}&client_secret={}&v={}'.format(venue_id, CLIENT_ID, CLIENT_SECRET, VERSION)

result = requests.get(url).json()
try:
    print(result['response']['venue']['rating'])
except KeyError:
    print('This venue has not been rated yet.')

Wow, that's a great rating. Let's explore the venue's tips:

In [None]:
## Austin City Limits Live Tips
limit = 15 # set limit to be greater than or equal to the total number of tips
url = 'https://api.foursquare.com/v2/venues/{}/tips?client_id={}&client_secret={}&v={}&limit={}'.format(venue_id, CLIENT_ID, CLIENT_SECRET, VERSION, limit)

results = requests.get(url).json()
results

In [None]:
tips = results['response']['tips']['items']

tip = results['response']['tips']['items'][0]
tip.keys()

In [None]:
pd.set_option('display.max_colwidth', None)  # None shows full text; -1 is no longer accepted by recent pandas

tips_df = json_normalize(tips) # json normalize tips

# columns to keep
filtered_columns = ['text', 'agreeCount', 'disagreeCount', 'id', 'user.firstName', 'user.lastName', 'user.gender', 'user.id']
tips_filtered = tips_df.loc[:, filtered_columns]

# display tips
tips_filtered
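Selecting with `.loc[:, filtered_columns]` raises a `KeyError` if any listed field is missing from the returned tips. As a hedged alternative, `reindex(columns=...)` keeps the requested columns and fills absent ones with `NaN`. The sample tips below are invented, and only a subset of the fields above is used.

```python
import pandas as pd

# invented tips; note that 'disagreeCount' and one 'lastName' are missing
sample_tips = [
    {'id': 't1', 'text': 'Great sound.', 'agreeCount': 3,
     'user': {'firstName': 'Ann'}},
    {'id': 't2', 'text': 'Arrive early.', 'agreeCount': 1,
     'user': {'firstName': 'Bo', 'lastName': 'Lee'}},
]

tips_sample_df = pd.json_normalize(sample_tips)

wanted = ['text', 'agreeCount', 'disagreeCount', 'id', 'user.firstName', 'user.lastName']
# reindex tolerates missing columns and fills them with NaN
tips_sample_filtered = tips_sample_df.reindex(columns=wanted)
print(tips_sample_filtered)
```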

## Music Festivals in Austin, Texas ##

Our band loves to play outdoors to large crowds, so we want to find out about local music festivals, which draw crowds from around the world.

In [None]:
search_query = 'music festival'
radius = 2000
print(search_query + ' .....OK!')

In [None]:
# Set up the search url to query Foursquare
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
url

In [None]:
# Get the results as a JSON file
results = requests.get(url).json()
results

In [None]:
# assign relevant part of JSON to venues
venues = results['response']['venues']

# transform venues into a dataframe
dataframe = json_normalize(venues)
dataframe.head()

In [None]:
# Foursquare categories endpoint: http://api.foursquare.com/v2/categories