## IBM CAPSTONE PROJECT
### THE BATTLE OF THE NEIGHBORHOODS (WEEK 1 &2)

### Background and Introduction

#### Problem Description

In today’s society there are many ways to shop for everything.  These ways include online shopping, in person brick and mortar, and there are services that businesses provide where they will go and shop for the specific items their customers want.  Often times it is difficult for a personal shopper or an individual to buy things if they do not know what they specifically want.  Even with today’s recommender systems that can make recommendations based upon a profile, these systems have their weaknesses.  Part of the problem is that there are so many choices and the overwhelming number of choices make it difficult for people to decide.  According to this NYTimes article https://www.nytimes.com/2010/02/27/your-money/27shortcuts.html "Research also shows that an excess of choices often leads us to be less, not more, satisfied once we actually decide. There’s often that nagging feeling we could have done better."

A city like New York is an enormous city and many times when US domestic visitors or foreigners are looking for a hotel in NYC the choices can be overwhelming.  It is stated that there are anywhere from 20,000 to 30,000 licensed realtors in NYC.  So not only is the problem of choosing a hotel extremely difficult but finding the right area to stay in adds to the complexity.  Often times many tourists that visit NYC don’t know anything about the city, but they can often, and without hesitation tell you what it is they like they do.  They can very easiy share the types of experiences they like to have, the venues they visit the most, etc.

What I am proposing is to use Foursquare data and code a solution that pulls venue data from Fousquare and builds a dataframe that gives you the top ten venues by neighborhood.  From there we can pull the foursquare data that has all of the hotels for that area.  This would simplify the arduous task of picking a hotel from the 669 hotels that are currently available in NYC.

#### Data Description

The data driven solution will incorporate the use of a Jupyter Notebook as the developer environment that I will work from.  While utilizing Python as the scripting language, I will leverage the many different Python libraries that handle JSON data, Dataframes, and mapping objects. The Dataframes will mostly contain location data that is extracted through the Foursquare API.  Foursquare is a repository of location data that it obtains through interacting with its partners like Snap, Twitter, Google etc.  The Foursquare database is rich with comprehensive location data and it will provide all of the data that we need for this evaluation.

#### Data Science Methodology

1.  Collect the data
2.  Inspect/Review/Understand the data
3.  Prepare the data
4.  Model the solution

### Import the necessary libraries

In [3]:
import requests # library to handle requests
import pandas as pd # library for data analsysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation

!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 
    
# tranforming json file into a pandas dataframe library
from pandas.io.json import json_normalize

!conda install -c conda-forge folium=0.5.0 --yes
import folium # plotting library

print('Folium installed')
print('Libraries imported.')

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

Folium installed
Libraries imported.


### Define the Foursquare Credentials

In [4]:
CLIENT_ID = 'UE5KS4XQBNFQPRLSRJVHJ3SST4HC5XU04Y1QNOKYL1MTWMPH' # your Foursquare ID
CLIENT_SECRET = 'PJ3FNRYDZJ1ZH5D3NRL4L2K1GXFSFEXOIFQGWCXDGUZBK532' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 30
print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentails:
CLIENT_ID: UE5KS4XQBNFQPRLSRJVHJ3SST4HC5XU04Y1QNOKYL1MTWMPH
CLIENT_SECRET:PJ3FNRYDZJ1ZH5D3NRL4L2K1GXFSFEXOIFQGWCXDGUZBK532


In [None]:
address = '129 dawns edge drive, Montgomery, TX'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

### Import the geo json data from https://geo.nyu.edu/catalog/nyu_2451_34572 to my local machine

In [5]:
import json

with open('/Users/isaacshareef/Desktop/Python_Code/Jupyter_Notebook/Notebooks/Learning Notebooks/Visualizations & Mapping/nyu-geojson.json')as f:
  data = json.load(f)

In [6]:
data

{'type': 'FeatureCollection',
 'totalFeatures': 306,
 'features': [{'type': 'Feature',
   'id': 'nyu_2451_34572.1',
   'geometry': {'type': 'Point',
    'coordinates': [-73.84720052054902, 40.89470517661]},
   'geometry_name': 'geom',
   'properties': {'name': 'Wakefield',
    'stacked': 1,
    'annoline1': 'Wakefield',
    'annoline2': None,
    'annoline3': None,
    'annoangle': 0.0,
    'borough': 'Bronx',
    'bbox': [-73.84720052054902,
     40.89470517661,
     -73.84720052054902,
     40.89470517661]}},
  {'type': 'Feature',
   'id': 'nyu_2451_34572.2',
   'geometry': {'type': 'Point',
    'coordinates': [-73.82993910812398, 40.87429419303012]},
   'geometry_name': 'geom',
   'properties': {'name': 'Co-op City',
    'stacked': 2,
    'annoline1': 'Co-op',
    'annoline2': 'City',
    'annoline3': None,
    'annoangle': 0.0,
    'borough': 'Bronx',
    'bbox': [-73.82993910812398,
     40.87429419303012,
     -73.82993910812398,
     40.87429419303012]}},
  {'type': 'Feature',
 

In [15]:
neighborhoods_info = data['features']

In [16]:
neighborhoods_info[0]

{'type': 'Feature',
 'id': 'nyu_2451_34572.1',
 'geometry': {'type': 'Point',
  'coordinates': [-73.84720052054902, 40.89470517661]},
 'geometry_name': 'geom',
 'properties': {'name': 'Wakefield',
  'stacked': 1,
  'annoline1': 'Wakefield',
  'annoline2': None,
  'annoline3': None,
  'annoangle': 0.0,
  'borough': 'Bronx',
  'bbox': [-73.84720052054902,
   40.89470517661,
   -73.84720052054902,
   40.89470517661]}}

In [12]:
# define the dataframe columns
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude'] 

# instantiate the dataframe
neighborhoods = pd.DataFrame(columns=column_names)

In [27]:
for data in neighborhoods_info:
    borough = neighborhood_name = data['properties']['borough'] 
    neighborhood_name = data['properties']['name']
        
    neighborhood_latlon = data['geometry']['coordinates']
    neighborhood_lat = neighborhood_latlon[1]
    neighborhood_lon = neighborhood_latlon[0]
    
    neighborhoods = neighborhoods.append({'Borough': borough,
                                          'Neighborhood': neighborhood_name,
                                          'Latitude': neighborhood_lat,
                                          'Longitude': neighborhood_lon}, ignore_index=True)

In [18]:
neighborhoods.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Bronx,Wakefield,40.894705,-73.847201
1,Bronx,Co-op City,40.874294,-73.829939
2,Bronx,Eastchester,40.887556,-73.827806
3,Bronx,Fieldston,40.895437,-73.905643
4,Bronx,Riverdale,40.890834,-73.912585


In [28]:
neighborhood_latitude = neighborhoods.loc[0, 'Latitude'] # neighborhood latitude value
neighborhood_longitude = neighborhoods.loc[0, 'Longitude'] # neighborhood longitude value

neighborhood_name = neighborhoods.loc[0, 'Neighborhood']  # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Wakefield are 40.89470517661, -73.84720052054902.


In [29]:
manhattan = neighborhoods[neighborhoods['Borough'] == 'Manhattan'].reset_index(drop=True)
manhattan.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Manhattan,Marble Hill,40.876551,-73.91066
1,Manhattan,Chinatown,40.715618,-73.994279
2,Manhattan,Washington Heights,40.851903,-73.9369
3,Manhattan,Inwood,40.867684,-73.92121
4,Manhattan,Hamilton Heights,40.823604,-73.949688


In [30]:
address = 'Manhattan, NY'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of Manhattan are {}, {}.'.format(latitude, longitude))

The geograpical coordinate of Manhattan are 40.7896239, -73.9598939.


In [31]:
# create map of Manhattan using latitude and longitude values
map_manhattan = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(manhattan['Latitude'], manhattan['Longitude'], manhattan['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_manhattan)  
    
map_manhattan

In [32]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [34]:
manhattan_venues = getNearbyVenues(names=manhattan['Neighborhood'],
                                   latitudes=manhattan['Latitude'],
                                   longitudes=manhattan['Longitude']
                                  )

Marble Hill
Chinatown
Washington Heights
Inwood
Hamilton Heights
Manhattanville
Central Harlem
East Harlem
Upper East Side
Yorkville
Lenox Hill
Roosevelt Island
Upper West Side
Lincoln Square
Clinton
Midtown
Murray Hill
Chelsea
Greenwich Village
East Village
Lower East Side
Tribeca
Little Italy
Soho
West Village
Manhattan Valley
Morningside Heights
Gramercy
Battery Park City
Financial District
Carnegie Hill
Noho
Civic Center
Midtown South
Sutton Place
Turtle Bay
Tudor City
Stuyvesant Town
Flatiron
Hudson Yards
Marble Hill
Chinatown
Washington Heights
Inwood
Hamilton Heights
Manhattanville
Central Harlem
East Harlem
Upper East Side
Yorkville
Lenox Hill
Roosevelt Island
Upper West Side
Lincoln Square
Clinton
Midtown
Murray Hill
Chelsea
Greenwich Village
East Village
Lower East Side
Tribeca
Little Italy
Soho
West Village
Manhattan Valley
Morningside Heights
Gramercy
Battery Park City
Financial District
Carnegie Hill
Noho
Civic Center
Midtown South
Sutton Place
Turtle Bay
Tudor City
Stuyve

In [35]:
print(manhattan_venues.shape)
manhattan_venues.head()

(2354, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Marble Hill,40.876551,-73.91066,Arturo's,40.874412,-73.910271,Pizza Place
1,Marble Hill,40.876551,-73.91066,Bikram Yoga,40.876844,-73.906204,Yoga Studio
2,Marble Hill,40.876551,-73.91066,Tibbett Diner,40.880404,-73.908937,Diner
3,Marble Hill,40.876551,-73.91066,Dunkin',40.877136,-73.906666,Donut Shop
4,Marble Hill,40.876551,-73.91066,Starbucks,40.877531,-73.905582,Coffee Shop


In [36]:
manhattan_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Battery Park City,60,60,60,60,60,60
Carnegie Hill,60,60,60,60,60,60
Central Harlem,60,60,60,60,60,60
Chelsea,60,60,60,60,60,60
Chinatown,60,60,60,60,60,60
Civic Center,60,60,60,60,60,60
Clinton,60,60,60,60,60,60
East Harlem,60,60,60,60,60,60
East Village,60,60,60,60,60,60
Financial District,60,60,60,60,60,60


In [38]:
print('There are {} uniques categories.'.format(len(manhattan_venues['Venue Category'].unique())))

There are 238 uniques categories.


In [39]:
# one hot encoding
manhattan_onehot = pd.get_dummies(manhattan_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
manhattan_onehot['Neighborhood'] = manhattan_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [manhattan_onehot.columns[-1]] + list(manhattan_onehot.columns[:-1])
manhattan_onehot = manhattan_onehot[fixed_columns]

manhattan_onehot.head()

Unnamed: 0,Neighborhood,Adult Boutique,Afghan Restaurant,African Restaurant,American Restaurant,Antique Shop,Argentinian Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,...,Vegetarian / Vegan Restaurant,Veterinarian,Video Game Store,Video Store,Vietnamese Restaurant,Waterfront,Wine Bar,Wine Shop,Women's Store,Yoga Studio
0,Marble Hill,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Marble Hill,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,1
2,Marble Hill,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Marble Hill,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Marble Hill,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [40]:
manhattan_onehot.shape

(2354, 239)

In [41]:
manhattan_grouped = manhattan_onehot.groupby('Neighborhood').mean().reset_index()
manhattan_grouped

Unnamed: 0,Neighborhood,Adult Boutique,Afghan Restaurant,African Restaurant,American Restaurant,Antique Shop,Argentinian Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,...,Vegetarian / Vegan Restaurant,Veterinarian,Video Game Store,Video Store,Vietnamese Restaurant,Waterfront,Wine Bar,Wine Shop,Women's Store,Yoga Studio
0,Battery Park City,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Carnegie Hill,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.033333
2,Central Harlem,0.0,0.0,0.1,0.066667,0.0,0.0,0.033333,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Chelsea,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.033333,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0
4,Chinatown,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.033333,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333
5,Civic Center,0.0,0.0,0.0,0.033333,0.033333,0.0,0.0,0.0,0.033333,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333
6,Clinton,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0
7,East Harlem,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,East Village,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,...,0.033333,0.0,0.0,0.0,0.066667,0.0,0.066667,0.033333,0.0,0.0
9,Financial District,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0


In [42]:
manhattan_grouped.shape

(40, 239)

In [43]:
num_top_venues = 5

for hood in manhattan_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = manhattan_grouped[manhattan_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Battery Park City----
           venue  freq
0           Park  0.10
1  Memorial Site  0.10
2     Food Court  0.07
3  Movie Theater  0.03
4  Shopping Mall  0.03


----Carnegie Hill----
                  venue  freq
0           Pizza Place  0.07
1    Italian Restaurant  0.07
2                  Café  0.07
3  Gym / Fitness Center  0.07
4                   Gym  0.07


----Central Harlem----
                  venue  freq
0    African Restaurant  0.10
1                   Bar  0.07
2   American Restaurant  0.07
3     French Restaurant  0.07
4  Caribbean Restaurant  0.03


----Chelsea----
                venue  freq
0               Hotel  0.07
1  Seafood Restaurant  0.07
2         Coffee Shop  0.07
3      Ice Cream Shop  0.07
4   French Restaurant  0.07


----Chinatown----
                venue  freq
0  Chinese Restaurant  0.10
1                 Spa  0.07
2      Sandwich Place  0.07
3           Roof Deck  0.03
4  Spanish Restaurant  0.03


----Civic Center----
                  venue  freq


In [44]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [47]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = manhattan_grouped['Neighborhood']

for ind in np.arange(manhattan_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(manhattan_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Battery Park City,Park,Memorial Site,Food Court,Building,Monument / Landmark,Burrito Place,Smoke Shop,Shopping Mall,Scenic Lookout,Sandwich Place
1,Carnegie Hill,Pizza Place,Gym,Café,Coffee Shop,Bookstore,Gym / Fitness Center,Italian Restaurant,Karaoke Bar,Gourmet Shop,French Restaurant
2,Central Harlem,African Restaurant,Bar,American Restaurant,French Restaurant,Dessert Shop,Cafeteria,Caribbean Restaurant,Fried Chicken Joint,Spa,Boutique
3,Chelsea,Ice Cream Shop,Hotel,French Restaurant,Coffee Shop,Seafood Restaurant,Café,Liquor Store,Speakeasy,Fish Market,Butcher
4,Chinatown,Chinese Restaurant,Spa,Sandwich Place,Hotel,Salon / Barbershop,Roof Deck,Pizza Place,Noodle House,New American Restaurant,Museum
5,Civic Center,French Restaurant,Cocktail Bar,Bakery,Spa,Gym / Fitness Center,Park,Falafel Restaurant,Monument / Landmark,Molecular Gastronomy Restaurant,Burrito Place
6,Clinton,Theater,Gym / Fitness Center,Hotel,Pizza Place,Building,Café,Mediterranean Restaurant,Peruvian Restaurant,Sporting Goods Shop,Sports Bar
7,East Harlem,Mexican Restaurant,Thai Restaurant,Latin American Restaurant,Bakery,Taco Place,French Restaurant,Steakhouse,Café,Seafood Restaurant,Donut Shop
8,East Village,Dessert Shop,Wine Bar,Vietnamese Restaurant,Park,Speakeasy,Burger Joint,Scandinavian Restaurant,Caribbean Restaurant,Cheese Shop,Coffee Shop
9,Financial District,Coffee Shop,Gym / Fitness Center,Pizza Place,New American Restaurant,Restaurant,Roof Deck,Salad Place,Museum,French Restaurant,Steakhouse


In [48]:
hotel_address = 'Lincoln Square, NY'
geolocator = Nominatim(user_agent = 'foursquare_agent')
location = geolocator.geocode(hotel_address)
latitiude = location.latitude
longitude = location.longitude
print(latitude, longitude)

40.7896239 -73.9844012


In [71]:
search_query = 'hotel'
radius2 = 750
LIMIT2 = 10
print(search_query + ' .... OK!')

hotel .... OK!


In [72]:
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(
    CLIENT_ID,
    CLIENT_SECRET,
    latitude,
    longitude,
    VERSION,
    search_query,
    radius2,
    LIMIT2)
url

'https://api.foursquare.com/v2/venues/search?client_id=UE5KS4XQBNFQPRLSRJVHJ3SST4HC5XU04Y1QNOKYL1MTWMPH&client_secret=PJ3FNRYDZJ1ZH5D3NRL4L2K1GXFSFEXOIFQGWCXDGUZBK532&ll=40.7896239,-73.9844012&v=20180604&query=hotel&radius=750&limit=10'

In [73]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5f1e3195195236547005bfe9'},
 'response': {'venues': [{'id': '4b1c3322f964a520210424e3',
    'name': 'Belnord Hotel',
    'location': {'address': '209 W 87th St',
     'crossStreet': 'Broadway',
     'lat': 40.7889054,
     'lng': -73.9750543,
     'labeledLatLngs': [{'label': 'display',
       'lat': 40.7889054,
       'lng': -73.9750543},
      {'label': 'entrance', 'lat': 40.788769, 'lng': -73.975268}],
     'distance': 791,
     'postalCode': '10024',
     'cc': 'US',
     'city': 'New York',
     'state': 'NY',
     'country': 'United States',
     'formattedAddress': ['209 W 87th St (Broadway)',
      'New York, NY 10024',
      'United States']},
    'categories': [{'id': '4bf58dd8d48988d1fa931735',
      'name': 'Hotel',
      'pluralName': 'Hotels',
      'shortName': 'Hotel',
      'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/travel/hotel_',
       'suffix': '.png'},
      'primary': True}],
    'referralId': 'v-1595814327',
 

In [74]:
# assign relevant part of JSON to venues
venues = results['response']['venues']

# tranform venues into a dataframe
dataframe = pd.json_normalize(venues)
dataframe.head()

Unnamed: 0,id,name,categories,referralId,hasPerk,location.address,location.crossStreet,location.lat,location.lng,location.labeledLatLngs,location.distance,location.postalCode,location.cc,location.city,location.state,location.country,location.formattedAddress,venuePage.id,location.neighborhood
0,4b1c3322f964a520210424e3,Belnord Hotel,"[{'id': '4bf58dd8d48988d1fa931735', 'name': 'H...",v-1595814327,False,209 W 87th St,Broadway,40.788905,-73.975054,"[{'label': 'display', 'lat': 40.7889054, 'lng'...",791,10024,US,New York,NY,United States,"[209 W 87th St (Broadway), New York, NY 10024,...",,
1,4ba06349f964a5208b6b37e3,Riverside Tower Hotel,"[{'id': '4bf58dd8d48988d1fa931735', 'name': 'H...",v-1595814327,False,80 Riverside Dr,W. 80th,40.785606,-73.98197,"[{'label': 'display', 'lat': 40.78560594397916...",491,10024,US,New York,NY,United States,"[80 Riverside Dr (W. 80th), New York, NY 10024...",,
2,4b4f866bf964a520020a27e3,Hotel Belleclaire,"[{'id': '4bf58dd8d48988d1fa931735', 'name': 'H...",v-1595814327,False,2175 Broadway,Broadway,40.782452,-73.981237,"[{'label': 'display', 'lat': 40.7824522, 'lng'...",841,10024,US,New York,NY,United States,"[2175 Broadway (Broadway), New York, NY 10024,...",77479228.0,
3,52d946ed11d29cb294a4449a,Hotel Belleclaire Gym,"[{'id': '4bf58dd8d48988d176941735', 'name': 'G...",v-1595814327,False,250 W 77th St,Broadway,40.782501,-73.981331,"[{'label': 'display', 'lat': 40.78250122070312...",834,10024,US,New York,NY,United States,"[250 W 77th St (Broadway), New York, NY 10024,...",,
4,50e30eece4b0f6dd2432b172,Hotel Bellclaire,"[{'id': '4bf58dd8d48988d1fa931735', 'name': 'H...",v-1595814327,False,250 W 77th St,,40.782442,-73.981342,"[{'label': 'display', 'lat': 40.78244199999999...",840,10024,US,New York,NY,United States,"[250 W 77th St, New York, NY 10024, United Sta...",,


In [75]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
dataframe_filtered = dataframe.loc[:, filtered_columns]

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
dataframe_filtered['categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]

dataframe_filtered

Unnamed: 0,name,categories,address,crossStreet,lat,lng,labeledLatLngs,distance,postalCode,cc,city,state,country,formattedAddress,neighborhood,id
0,Belnord Hotel,Hotel,209 W 87th St,Broadway,40.788905,-73.975054,"[{'label': 'display', 'lat': 40.7889054, 'lng'...",791,10024.0,US,New York,NY,United States,"[209 W 87th St (Broadway), New York, NY 10024,...",,4b1c3322f964a520210424e3
1,Riverside Tower Hotel,Hotel,80 Riverside Dr,W. 80th,40.785606,-73.98197,"[{'label': 'display', 'lat': 40.78560594397916...",491,10024.0,US,New York,NY,United States,"[80 Riverside Dr (W. 80th), New York, NY 10024...",,4ba06349f964a5208b6b37e3
2,Hotel Belleclaire,Hotel,2175 Broadway,Broadway,40.782452,-73.981237,"[{'label': 'display', 'lat': 40.7824522, 'lng'...",841,10024.0,US,New York,NY,United States,"[2175 Broadway (Broadway), New York, NY 10024,...",,4b4f866bf964a520020a27e3
3,Hotel Belleclaire Gym,Gym,250 W 77th St,Broadway,40.782501,-73.981331,"[{'label': 'display', 'lat': 40.78250122070312...",834,10024.0,US,New York,NY,United States,"[250 W 77th St (Broadway), New York, NY 10024,...",,52d946ed11d29cb294a4449a
4,Hotel Bellclaire,Hotel,250 W 77th St,,40.782442,-73.981342,"[{'label': 'display', 'lat': 40.78244199999999...",840,10024.0,US,New York,NY,United States,"[250 W 77th St, New York, NY 10024, United Sta...",,50e30eece4b0f6dd2432b172
5,Imperial Court Hotel,Hotel,307 W 79th St,,40.784643,-73.981728,"[{'label': 'display', 'lat': 40.78464298463871...",598,10024.0,US,New York,NY,United States,"[307 W 79th St, New York, NY 10024, United Sta...",,4bc05567920eb713dacb182c
6,Arthouse Hotel NYC,Hotel,2178 Broadway,77th Street,40.782088,-73.98042,"[{'label': 'display', 'lat': 40.782088, 'lng':...",903,10024.0,US,New York,NY,United States,"[2178 Broadway (77th Street), New York, NY 100...",Upper West Side,5c62c5bfc876c8002cacaf52
7,Central Park West Hotel (Central Park West Hos...,Motel,201 West 87th Street,,40.788535,-73.974809,"[{'label': 'display', 'lat': 40.78853529710084...",817,,US,New York,NY,United States,"[201 West 87th Street, New York, NY, United St...",,56f0b474498e0a04d09f877e
8,Paris Hotel Regina,Motel,2 Rue Da Royal,,40.786621,-73.976087,"[{'label': 'display', 'lat': 40.786621, 'lng':...",776,10024.0,US,Paris,NY,United States,"[2 Rue Da Royal, Paris, NY 10024, United States]",,4fb44cd2e4b008453ed0f9b6
9,Lucerne Hotel Fitness Center,Gym / Fitness Center,201 W 79th St,Amsterdam Ave.,40.783446,-73.978596,"[{'label': 'display', 'lat': 40.783446, 'lng':...",844,10024.0,US,New York,NY,United States,"[201 W 79th St (Amsterdam Ave.), New York, NY ...",,54bd2927498e7bffede7c884


In [76]:
dataframe_filtered.name

0                                        Belnord Hotel
1                                Riverside Tower Hotel
2                                    Hotel Belleclaire
3                                Hotel Belleclaire Gym
4                                     Hotel Bellclaire
5                                 Imperial Court Hotel
6                                   Arthouse Hotel NYC
7    Central Park West Hotel (Central Park West Hos...
8                                   Paris Hotel Regina
9                         Lucerne Hotel Fitness Center
Name: name, dtype: object

In [78]:
venues_map = folium.Map(location=[latitude, longitude], zoom_start=13) # generate map centred around the Conrad Hotel

# add a red circle marker to represent the Conrad Hotel
folium.features.CircleMarker(
    [latitude, longitude],
    radius=10,
    color='red',
    popup='Desired Destination',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(venues_map)

# add the Italian restaurants as blue circle markers
for lat, lng, label in zip(dataframe_filtered.lat, dataframe_filtered.lng, dataframe_filtered.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(venues_map)

# display map
venues_map