## IBM CAPSTONE PROJECT
### THE BATTLE OF THE NEIGHBORHOODS (WEEK 1 & 2)

### Background and Introduction

#### Problem Description

In today’s society there are many ways to shop for just about everything: online, in person at brick-and-mortar stores, and through services whose staff go and shop for the specific items their customers want.  It is often difficult for a personal shopper or an individual to buy something when they do not know specifically what they want, and an enormous number of choices only escalates the challenge.  Even today’s computerized recommender systems, which make recommendations based on a profile, have their weaknesses.  Simply put, the problem is that there are too many choices, and the overwhelming number of them makes it very difficult for people and smart machines to decide, and to decide well.  According to this NYTimes article, https://www.nytimes.com/2010/02/27/your-money/27shortcuts.html, "Research also shows that an excess of choices often leads us to be less, not more, satisfied once we actually decide. There’s often that nagging feeling we could have done better."

New York is an enormous city, and visitors looking for a hotel in NYC can find the choices overwhelming.  Choosing a hotel is hard enough, and finding the right area to stay in adds to the complexity.  Many tourists who visit NYC don’t know anything about the city, but they can tell you without hesitation what they like to do.  They can very easily share the types of experiences they like to have, the venues they visit the most, etc.  I believe that when a customer tells you what they like to do, therein lies the solution to part of the problem of having too many choices.  If you can tell me what you like to do, I can help you determine where you would most like to stay and give you options that target the types of venues you frequent most when traveling.  I feel that this solution leaves the buyer with less remorse in the long run.

What I am proposing is to use publicly available location data and Foursquare data to code a solution that pulls venue data from Foursquare and builds a dataframe of the top ten venue categories by neighborhood.  From there we can pull the Foursquare data listing the hotels in the area where you want to stay, based on your tastes in venues.  This would simplify the arduous task of picking a hotel from the 669 hotels that are currently available in NYC.

The target audience for this type of solution is the traveler who is easily overwhelmed by the number of lodging options in NYC, or in any large destination city for that matter.  It gives the traveler a more focused approach that starts with the types of venues they are attracted to and offers the areas, and their hotels, with the highest concentrations of those venues.  The traveler then selects the area they find most attractive, and the solution returns a list of nearby hotels.  The ultimate goal is to simplify the process of choosing a hotel, lower the traveler’s stress, and make the trip more enjoyable.

#### Data Science Methodology

1.  Collect the data
2.  Inspect the data / Review the data / Understand the data
3.  Prepare the data
4.  Model the solution and visualize

## 1. Collect the Data

### Importing the necessary Python libraries

In [1]:
import requests # library to handle requests
import pandas as pd # library for data analytics
import numpy as np # library for numerical computations
import random # library for random number generation

!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 
    
# transforming a JSON file into a pandas dataframe
from pandas import json_normalize

!conda install -c conda-forge folium=0.5.0 --yes
import folium # plotting library


Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.



### Define the Foursquare Credentials and Parameters for API requests

In [16]:
CLIENT_ID = 'PRIVATE' # your Foursquare ID
CLIENT_SECRET = 'PRIVATE' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 30
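
Hard-coding the ID and secret in a notebook makes it easy to publish them by accident. A minimal sketch of one alternative is to read them from environment variables. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `load_foursquare_credentials` are assumptions made up for this sketch, not something Foursquare requires.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Return (client_id, client_secret) read from environment variables.

    The two variable names are hypothetical and chosen for this sketch only.
    """
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        # Fail early with a readable message instead of sending an empty key.
        raise RuntimeError(f"Missing environment variable: {missing}") from None
```

With both variables set in the shell that starts Jupyter, the cell above could become `CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials()`.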

### Import the geo json data from https://geo.nyu.edu/catalog/nyu_2451_34572 to my local machine

In [17]:
import json

with open('/Users/isaacshareef/Desktop/Python_Code/Jupyter_Notebook/Notebooks/Learning Notebooks/Visualizations & Mapping/nyu-geojson.json') as f:
  data = json.load(f)

## 2. Inspect/Review/Understand the Data

In [18]:
#Review the data and its format
data

{'type': 'FeatureCollection',
 'totalFeatures': 306,
 'features': [{'type': 'Feature',
   'id': 'nyu_2451_34572.1',
   'geometry': {'type': 'Point',
    'coordinates': [-73.84720052054902, 40.89470517661]},
   'geometry_name': 'geom',
   'properties': {'name': 'Wakefield',
    'stacked': 1,
    'annoline1': 'Wakefield',
    'annoline2': None,
    'annoline3': None,
    'annoangle': 0.0,
    'borough': 'Bronx',
    'bbox': [-73.84720052054902,
     40.89470517661,
     -73.84720052054902,
     40.89470517661]}},
  {'type': 'Feature',
   'id': 'nyu_2451_34572.2',
   'geometry': {'type': 'Point',
    'coordinates': [-73.82993910812398, 40.87429419303012]},
   'geometry_name': 'geom',
   'properties': {'name': 'Co-op City',
    'stacked': 2,
    'annoline1': 'Co-op',
    'annoline2': 'City',
    'annoline3': None,
    'annoangle': 0.0,
    'borough': 'Bronx',
    'bbox': [-73.82993910812398,
     40.87429419303012,
     -73.82993910812398,
     40.87429419303012]}},
  ...

In [19]:
#assigning features of JSON data to variable
neighborhoods_info = data['features']

Data review and inspection process

In [20]:
#printing the first item to inspect its features
neighborhoods_info[0]

{'type': 'Feature',
 'id': 'nyu_2451_34572.1',
 'geometry': {'type': 'Point',
  'coordinates': [-73.84720052054902, 40.89470517661]},
 'geometry_name': 'geom',
 'properties': {'name': 'Wakefield',
  'stacked': 1,
  'annoline1': 'Wakefield',
  'annoline2': None,
  'annoline3': None,
  'annoangle': 0.0,
  'borough': 'Bronx',
  'bbox': [-73.84720052054902,
   40.89470517661,
   -73.84720052054902,
   40.89470517661]}}

In [21]:
# creating a variable that contains the dataframe column names
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude'] 

# create a pandas dataframe
neighborhoods = pd.DataFrame(columns=column_names)

In [22]:
#iterate through neighborhoods_info and collect one row per feature
rows = []
for feature in neighborhoods_info:
    borough = feature['properties']['borough']
    neighborhood_name = feature['properties']['name']
    
    #GeoJSON stores coordinates as [longitude, latitude]
    neighborhood_lon, neighborhood_lat = feature['geometry']['coordinates']
    
    rows.append({'Borough': borough,
                 'Neighborhood': neighborhood_name,
                 'Latitude': neighborhood_lat,
                 'Longitude': neighborhood_lon})

#build the neighborhood df in one step (DataFrame.append was removed in pandas 2.0)
neighborhoods = pd.DataFrame(rows, columns=column_names)

We can now review the data in a structured format using a pandas DataFrame.

In [23]:
neighborhoods.head()

| | Borough | Neighborhood | Latitude | Longitude |
|---|---|---|---|---|
| 0 | Bronx | Wakefield | 40.894705 | -73.847201 |
| 1 | Bronx | Co-op City | 40.874294 | -73.829939 |
| 2 | Bronx | Eastchester | 40.887556 | -73.827806 |
| 3 | Bronx | Fieldston | 40.895437 | -73.905643 |
| 4 | Bronx | Riverdale | 40.890834 | -73.912585 |
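
The row-by-row loop used to build this dataframe can also be expressed with `pd.json_normalize`, which flattens nested JSON into dotted column names. The sketch below is one possible variant rather than the approach this notebook ran. It uses two features copied from the GeoJSON output earlier, trimmed to the fields we need, and the helper name `features_to_frame` is made up for illustration.

```python
import pandas as pd

# Two features copied from the GeoJSON output shown earlier (trimmed).
sample_features = [
    {"geometry": {"type": "Point",
                  "coordinates": [-73.84720052054902, 40.89470517661]},
     "properties": {"name": "Wakefield", "borough": "Bronx"}},
    {"geometry": {"type": "Point",
                  "coordinates": [-73.82993910812398, 40.87429419303012]},
     "properties": {"name": "Co-op City", "borough": "Bronx"}},
]


def features_to_frame(features):
    """Flatten GeoJSON point features into Borough/Neighborhood/Latitude/Longitude."""
    flat = pd.json_normalize(features)
    # GeoJSON stores each point as [longitude, latitude], so split in that order.
    coords = pd.DataFrame(flat["geometry.coordinates"].tolist(),
                          columns=["Longitude", "Latitude"])
    return pd.DataFrame({
        "Borough": flat["properties.borough"],
        "Neighborhood": flat["properties.name"],
        "Latitude": coords["Latitude"],
        "Longitude": coords["Longitude"],
    })


sample_frame = features_to_frame(sample_features)
```

Passing `data['features']` instead of `sample_features` should give the same four columns for all 306 neighborhoods.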


In [24]:
neighborhood_latitude = neighborhoods.loc[0, 'Latitude'] # neighborhood latitude value
neighborhood_longitude = neighborhoods.loc[0, 'Longitude'] # neighborhood longitude value

neighborhood_name = neighborhoods.loc[0, 'Neighborhood']  # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Wakefield are 40.89470517661, -73.84720052054902.


### One way to make sure that the data we have incorporated into this dataset meets our needs is to plot it using the Folium mapping library.  This will confirm that we are covering all of the boroughs and staying within the metes and bounds of what we know as New York City.

Since we selected NYC as our target city, we will enter Manhattan as the point on which to center the map.

In [25]:
#Where the user needs to put in their desired location
address = 'Manhattan, NY' # <----------- Manhattan

#gets the coordinates for that location
geolocator = Nominatim(user_agent="NY")
location1 = geolocator.geocode(address)
latitude1 = location1.latitude
longitude1 = location1.longitude
print('The Latitude Longitude for the requested address is {}, {}.'.format(latitude1, longitude1))

The Latitude Longitude for the requested address is 40.7896239, -73.9598939.


In [26]:
# create a map of NYC centered on the Manhattan coordinates
# (named nyc_map so that it does not shadow Python's built-in map)
nyc_map = folium.Map(location=[latitude1, longitude1], zoom_start=11.5)

# add one unfilled blue circle marker per neighborhood
for lat, lng, label in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=False).add_to(nyc_map)  
    
nyc_map
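
The map is a visual check. The same question (do all points fall inside NYC?) can also be asked in code. The sketch below is an illustration under assumptions: the bounding box values are rough limits I picked for the city, not numbers from the dataset, and the "Swapped" row is an invented example of a latitude/longitude mix-up.

```python
# Rough bounding box for New York City. These limits are approximate
# assumptions for this sketch, not values taken from the dataset.
NYC_BOUNDS = {"lat_min": 40.45, "lat_max": 40.95,
              "lon_min": -74.30, "lon_max": -73.65}


def points_outside_bounds(rows, bounds=NYC_BOUNDS):
    """Return the names of rows whose coordinates fall outside the box.

    rows is an iterable of (name, latitude, longitude) tuples.
    """
    return [name for name, lat, lon in rows
            if not (bounds["lat_min"] <= lat <= bounds["lat_max"]
                    and bounds["lon_min"] <= lon <= bounds["lon_max"])]


rows = [
    ("Wakefield", 40.894705, -73.847201),   # from the dataframe above
    ("Bay Ridge", 40.625801, -74.030621),   # from the Brooklyn rows
    ("Swapped", -73.847201, 40.894705),     # invented: lat/lon reversed
]
outliers = points_outside_bounds(rows)
```

For the real data, `neighborhoods[['Neighborhood', 'Latitude', 'Longitude']].itertuples(index=False)` can be passed as `rows`, and `neighborhoods['Borough'].value_counts()` confirms that all five boroughs are present.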

### Looks like the dataset covers all 5 boroughs:
    Manhattan
    Bronx <----Birthplace
    Brooklyn <------My favorite
    Staten Island <------Wu Tang
    Queens

## 3. Prepare the Data

#### We need to prepare the data for modeling our solution

Since we are targeting NYC, we will choose Brooklyn as our test location and evaluate the venues in its different neighborhoods.  Simply put, all the world’s best rappers came from Brooklyn, so we want to analyze the venues in Brooklyn.

In [27]:
#creating a dataframe that is specific to Brooklyn
brooklyn = neighborhoods[neighborhoods['Borough'] == 'Brooklyn'].reset_index(drop=True)
brooklyn.head()

| | Borough | Neighborhood | Latitude | Longitude |
|---|---|---|---|---|
| 0 | Brooklyn | Bay Ridge | 40.625801 | -74.030621 |
| 1 | Brooklyn | Bensonhurst | 40.611009 | -73.99518 |
| 2 | Brooklyn | Sunset Park | 40.645103 | -74.010316 |
| 3 | Brooklyn | Greenpoint | 40.730201 | -73.954241 |
| 4 | Brooklyn | Gravesend | 40.59526 | -73.973471 |


In [28]:
#defining a function to get the venues near each neighborhood
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # query parameters for the explore endpoint
        params = {
            'client_id': CLIENT_ID,
            'client_secret': CLIENT_SECRET,
            'v': VERSION,
            'll': '{},{}'.format(lat, lng),
            'radius': radius,
            'limit': LIMIT}
            
        # make the GET request and stop with an error if the HTTP call failed
        response = requests.get('https://api.foursquare.com/v2/venues/explore', params=params)
        response.raise_for_status()
        results = response.json()['response']['groups'][0]['items']
        
        # keep only the relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    # flatten the per-neighborhood lists into one dataframe
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
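
`getNearbyVenues` assumes that every response has at least one group and that every venue has at least one category. If either is missing, it raises an `IndexError`. The sketch below shows one defensive way to flatten a response, tested offline on a hand-written sample. The helper name `extract_venues` is hypothetical, the sample only mimics the response shape the function indexes into, and "Unlabeled Spot" is an invented venue with no category.

```python
def extract_venues(neighborhood, response_json):
    """Flatten one explore-style response into
    (neighborhood, venue, lat, lng, category) tuples."""
    groups = response_json.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        # Use None instead of failing when a venue has no category.
        category = categories[0]["name"] if categories else None
        rows.append((neighborhood, venue["name"],
                     venue["location"]["lat"], venue["location"]["lng"],
                     category))
    return rows


# Hand-written sample; the second venue is invented for illustration.
sample = {"response": {"groups": [{"items": [
    {"venue": {"name": "Bagel Boy",
               "location": {"lat": 40.627896, "lng": -74.029335},
               "categories": [{"name": "Bagel Shop"}]}},
    {"venue": {"name": "Unlabeled Spot",
               "location": {"lat": 40.62, "lng": -74.03},
               "categories": []}},
]}]}}

rows = extract_venues("Bay Ridge", sample)
```

Rows with a `None` category can then be dropped or relabeled before the one-hot encoding step.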

In [29]:
#running the getNearbyVenues function
brooklyn_venues = getNearbyVenues(names=brooklyn['Neighborhood'],
                                   latitudes=brooklyn['Latitude'],
                                   longitudes=brooklyn['Longitude']
                                  )

Bay Ridge
Bensonhurst
Sunset Park
Greenpoint
Gravesend
Brighton Beach
Sheepshead Bay
Manhattan Terrace
Flatbush
Crown Heights
East Flatbush
Kensington
Windsor Terrace
Prospect Heights
Brownsville
Williamsburg
Bushwick
Bedford Stuyvesant
Brooklyn Heights
Cobble Hill
Carroll Gardens
Red Hook
Gowanus
Fort Greene
Park Slope
Cypress Hills
East New York
Starrett City
Canarsie
Flatlands
Mill Island
Manhattan Beach
Coney Island
Bath Beach
Borough Park
Dyker Heights
Gerritsen Beach
Marine Park
Clinton Hill
Sea Gate
Downtown
Boerum Hill
Prospect Lefferts Gardens
Ocean Hill
City Line
Bergen Beach
Midwood
Prospect Park South
Georgetown
East Williamsburg
North Side
South Side
Ocean Parkway
Fort Hamilton
Ditmas Park
Wingate
Rugby
Remsen Village
New Lots
Paerdegat Basin
Mill Basin
Fulton Ferry
Vinegar Hill
Weeksville
Broadway Junction
Dumbo
Homecrest
Highland Park
Madison
Erasmus


In [30]:
#assessing the size and shape of the new Brooklyn Venues dataframe
print(brooklyn_venues.shape)
brooklyn_venues.head()

(1604, 7)


| | Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|---|
| 0 | Bay Ridge | 40.625801 | -74.030621 | Pilo Arts Day Spa and Salon | 40.624748 | -74.030591 | Spa |
| 1 | Bay Ridge | 40.625801 | -74.030621 | Bagel Boy | 40.627896 | -74.029335 | Bagel Shop |
| 2 | Bay Ridge | 40.625801 | -74.030621 | Cocoa Grinder | 40.623967 | -74.030863 | Juice Bar |
| 3 | Bay Ridge | 40.625801 | -74.030621 | Pegasus Cafe | 40.623168 | -74.031186 | Breakfast Spot |
| 4 | Bay Ridge | 40.625801 | -74.030621 | Leo's Casa Calamari | 40.6242 | -74.030931 | Pizza Place |


In [31]:
brooklyn_venues.groupby('Neighborhood').count()

| Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|
| Bath Beach | 30 | 30 | 30 | 30 | 30 | 30 |
| Bay Ridge | 30 | 30 | 30 | 30 | 30 | 30 |
| Bedford Stuyvesant | 27 | 27 | 27 | 27 | 27 | 27 |
| Bensonhurst | 27 | 27 | 27 | 27 | 27 | 27 |
| Bergen Beach | 6 | 6 | 6 | 6 | 6 | 6 |
| ... | ... | ... | ... | ... | ... | ... |
| Vinegar Hill | 29 | 29 | 29 | 29 | 29 | 29 |
| Weeksville | 16 | 16 | 16 | 16 | 16 | 16 |
| Williamsburg | 30 | 30 | 30 | 30 | 30 | 30 |
| Windsor Terrace | 27 | 27 | 27 | 27 | 27 | 27 |


In [32]:
print('There are {} unique categories.'.format(len(brooklyn_venues['Venue Category'].unique())))

There are 243 unique categories.


In [33]:
# one hot encoding
brooklyn_onehot = pd.get_dummies(brooklyn_venues[['Venue Category']], prefix="", prefix_sep="")

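# add neighborhood column back to dataframe
# (the shape stays at 243 columns after this step, which suggests that a venue
# category named 'Neighborhood' already existed and is overwritten here)
brooklyn_onehot['Neighborhood'] = brooklyn_venues['Neighborhood'] 

# move the last column to the front; this only moves 'Neighborhood' when the
# assignment above appended it as a new last column
fixed_columns = [brooklyn_onehot.columns[-1]] + list(brooklyn_onehot.columns[:-1])
brooklyn_onehot = brooklyn_onehot[fixed_columns]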

brooklyn_onehot.head()

Unnamed: 0,Yoga Studio,Accessories Store,Airport Terminal,American Restaurant,Antique Shop,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,...,Video Game Store,Video Store,Vietnamese Restaurant,Waterfront,Weight Loss Center,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [34]:
brooklyn_onehot.shape

(1604, 243)

In [35]:
brooklyn_grouped = brooklyn_onehot.groupby('Neighborhood').mean().reset_index()
brooklyn_grouped

Unnamed: 0,Neighborhood,Yoga Studio,Accessories Store,Airport Terminal,American Restaurant,Antique Shop,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Arts & Crafts Store,...,Video Game Store,Video Store,Vietnamese Restaurant,Waterfront,Weight Loss Center,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store
0,Bath Beach,0.000000,0.0,0.0,0.000000,0.000000,0.0,0.0,0.000000,0.000000,...,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.0
1,Bay Ridge,0.000000,0.0,0.0,0.000000,0.000000,0.0,0.0,0.000000,0.000000,...,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.0
2,Bedford Stuyvesant,0.000000,0.0,0.0,0.000000,0.000000,0.0,0.0,0.000000,0.000000,...,0.0,0.0,0.0,0.0,0.0,0.000000,0.037037,0.037037,0.0,0.0
3,Bensonhurst,0.000000,0.0,0.0,0.000000,0.000000,0.0,0.0,0.000000,0.000000,...,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.0
4,Bergen Beach,0.000000,0.0,0.0,0.000000,0.000000,0.0,0.0,0.000000,0.000000,...,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
65,Vinegar Hill,0.000000,0.0,0.0,0.034483,0.034483,0.0,0.0,0.068966,0.000000,...,0.0,0.0,0.0,0.0,0.0,0.034483,0.034483,0.034483,0.0,0.0
66,Weeksville,0.000000,0.0,0.0,0.062500,0.000000,0.0,0.0,0.000000,0.000000,...,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.0
67,Williamsburg,0.033333,0.0,0.0,0.000000,0.000000,0.0,0.0,0.033333,0.000000,...,0.0,0.0,0.0,0.0,0.0,0.000000,0.033333,0.000000,0.0,0.0
68,Windsor Terrace,0.000000,0.0,0.0,0.037037,0.037037,0.0,0.0,0.000000,0.037037,...,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.037037,0.0,0.0


In [36]:
brooklyn_grouped.shape

(70, 243)
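
The grouped table holds, for each neighborhood, the share of its venues that fall in each category. A small sketch on invented toy data shows the idea and compares it with `pd.crosstab(..., normalize='index')`, which gives the same shares in one step. The toy rows are made up, and one category is deliberately named "Neighborhood" to show that `crosstab` keeps it as an ordinary column.

```python
import pandas as pd

# Toy data invented for illustration.
toy_venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "B", "B"],
    "Venue Category": ["Pizza Place", "Pizza Place", "Neighborhood",
                       "Bar", "Pizza Place"],
})

# Approach used in this notebook: one-hot encode, then average per neighborhood.
onehot = pd.get_dummies(toy_venues["Venue Category"], dtype=float)
grouped = onehot.groupby(toy_venues["Neighborhood"]).mean()

# One-step alternative: a contingency table normalized so each row sums to 1.
freq = pd.crosstab(toy_venues["Neighborhood"], toy_venues["Venue Category"],
                   normalize="index")
```

Both `grouped` and `freq` have one row per neighborhood and one column per category. Neighborhood A, for example, has two pizza places out of three venues, so its "Pizza Place" share is 2/3.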

#### Let's list the top 5 venues per neighborhood

In [37]:
num_top_venues = 5

for hood in brooklyn_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = brooklyn_grouped[brooklyn_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Bath Beach----
                  venue  freq
0    Italian Restaurant  0.07
1       Bubble Tea Shop  0.07
2              Pharmacy  0.07
3  Fast Food Restaurant  0.07
4    Chinese Restaurant  0.07


----Bay Ridge----
                venue  freq
0         Pizza Place  0.10
1  Italian Restaurant  0.10
2       Grocery Store  0.07
3                 Spa  0.07
4    Greek Restaurant  0.07


----Bedford Stuyvesant----
           venue  freq
0    Coffee Shop  0.11
1            Bar  0.07
2           Café  0.07
3  Deli / Bodega  0.07
4    Pizza Place  0.07


----Bensonhurst----
                venue  freq
0          Donut Shop  0.07
1  Italian Restaurant  0.07
2      Ice Cream Shop  0.07
3    Sushi Restaurant  0.07
4  Chinese Restaurant  0.07


----Bergen Beach----
                venue  freq
0     Harbor / Marina  0.33
1      Baseball Field  0.17
2        Hockey Field  0.17
3          Playground  0.17
4  Athletics & Sports  0.17


----Boerum Hill----
...

              venue  freq
0       Pizza Place   0.4
1    Ice Cream Shop   0.1
2  Video Game Store   0.1
3       Candy Store   0.1
4            Bakery   0.1


----Mill Basin----
                 venue  freq
0   Chinese Restaurant  0.13
1          Pizza Place  0.10
2  Japanese Restaurant  0.07
3                 Bank  0.07
4           Bagel Shop  0.07


----Mill Island----
          venue  freq
0          Pool   0.5
1     Locksmith   0.5
2   Yoga Studio   0.0
3  Outlet Store   0.0
4   Music Venue   0.0


----New Lots----
                 venue  freq
0        Grocery Store  0.14
1          Pizza Place  0.10
2   Chinese Restaurant  0.10
3  Fried Chicken Joint  0.10
4         Intersection  0.05


----North Side----
                 venue  freq
0        Jewelry Store  0.07
1  American Restaurant  0.07
2          Coffee Shop  0.07
3               Bakery  0.07
4          Yoga Studio  0.03


----Ocean Hill----
                             venue  freq
0                    Deli / Bodega  0.17
...

In [38]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

#### Putting the top 10 venues per neighborhood into a dataframe

This is an important aspect of the analysis, as it allows us to see which neighborhoods in our selected borough offer the venues that match our tastes.

In [39]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = brooklyn_grouped['Neighborhood']

for ind in np.arange(brooklyn_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(brooklyn_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head(35)

| | Neighborhood | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Bath Beach | Pharmacy | Bubble Tea Shop | Chinese Restaurant | Fast Food Restaurant | Italian Restaurant | German Restaurant | Cantonese Restaurant | Surf Spot | Sushi Restaurant | Donut Shop |
| 1 | Bay Ridge | Pizza Place | Italian Restaurant | Grocery Store | Greek Restaurant | Spa | Breakfast Spot | Sports Bar | Taco Place | Caucasian Restaurant | Bookstore |
| 2 | Bedford Stuyvesant | Coffee Shop | Deli / Bodega | Pizza Place | Bar | Café | Fried Chicken Joint | Playground | New American Restaurant | Cocktail Bar | Park |
| 3 | Bensonhurst | Sushi Restaurant | Donut Shop | Chinese Restaurant | Ice Cream Shop | Italian Restaurant | Butcher | Supermarket | Spa | Flower Shop | Shabu-Shabu Restaurant |
| 4 | Bergen Beach | Harbor / Marina | Athletics & Sports | Hockey Field | Playground | Baseball Field | Women's Store | Ethiopian Restaurant | Event Service | Event Space | Factory |
| 5 | Boerum Hill | Furniture / Home Store | Yoga Studio | Bar | Coffee Shop | Spa | Italian Restaurant | Japanese Restaurant | Kids Store | Indian Restaurant | Sushi Restaurant |
| 6 | Borough Park | Bank | Pharmacy | Café | Pizza Place | Fast Food Restaurant | Hotel | Bakery | Restaurant | Coffee Shop | Chinese Restaurant |
| 7 | Brighton Beach | Restaurant | Sushi Restaurant | Eastern European Restaurant | Gourmet Shop | Russian Restaurant | Supplement Shop | Korean Restaurant | Supermarket | Beach | Food & Drink Shop |
| 8 | Broadway Junction | Donut Shop | Diner | Fried Chicken Joint | Bakery | Pizza Place | Burger Joint | Ice Cream Shop | Gas Station | Discount Store | Seafood Restaurant |
| 9 | Brooklyn Heights | Yoga Studio | Pet Store | Coffee Shop | Playground | Deli / Bodega | Pizza Place | Cosmetics Shop | Diner | Japanese Restaurant | Scenic Lookout |
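
The column labels in this table come from a three-item suffix list plus a fallback to "th". That works for a top 10, but it would produce "21th" if the list were ever extended past 20. A small helper, sketched here under the hypothetical name `ordinal`, covers the general case.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        # 11th, 12th and 13th are the irregular cases.
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


# Same column labels as the table above, built with the helper.
columns = ["Neighborhood"] + [f"{ordinal(i)} Most Common Venue" for i in range(1, 11)]
```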


## 4. Model the solution 

#### This section is where we select the neighborhood whose venues match our taste and then retrieve the list of hotels in that area.
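
In this notebook the neighborhood is picked by eye from the top-venues table. A minimal sketch of how that choice could be scored is to sum, for each neighborhood, the frequencies of the categories a traveler says they like. The frequencies below are copied from the top-5 listing above. Treating every category outside a neighborhood's top 5 as 0 is a simplification, and the traveler's preference list and the function name `rank_neighborhoods` are hypothetical.

```python
# Frequencies copied from the top-5 listing above; categories outside each
# neighborhood's top 5 are treated as 0 here, which is a simplification.
frequencies = {
    "Bay Ridge": {"Pizza Place": 0.10, "Italian Restaurant": 0.10,
                  "Grocery Store": 0.07, "Spa": 0.07, "Greek Restaurant": 0.07},
    "Bedford Stuyvesant": {"Coffee Shop": 0.11, "Bar": 0.07, "Café": 0.07,
                           "Deli / Bodega": 0.07, "Pizza Place": 0.07},
    "Bergen Beach": {"Harbor / Marina": 0.33, "Baseball Field": 0.17,
                     "Hockey Field": 0.17, "Playground": 0.17,
                     "Athletics & Sports": 0.17},
}


def rank_neighborhoods(frequencies, preferred_categories):
    """Sort neighborhoods by the summed frequency of the preferred categories."""
    scores = {hood: sum(freqs.get(cat, 0.0) for cat in preferred_categories)
              for hood, freqs in frequencies.items()}
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical traveler preferences.
ranking = rank_neighborhoods(frequencies, ["Coffee Shop", "Bar", "Pizza Place"])
```

With the full table, and assuming every preferred category is one of its columns, the same score is `brooklyn_grouped.set_index('Neighborhood')[preferred].sum(axis=1)`, where `preferred` is the list of category names.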

In [40]:
venue_neighborhood = 'Crown Heights, Brooklyn, NY'

geolocator2 = Nominatim(user_agent = 'brooklyn_agent')
location2 = geolocator2.geocode(venue_neighborhood)
latitude2 = location2.latitude
longitude2 = location2.longitude
print(latitude2, longitude2)

40.6688226 -73.9311116


In [41]:
# a new search query for hotels in our selected neighborhood
search_query2 = 'hotel'
radius2 = 10000
LIMIT2 = 10

In [None]:
url2 = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(
    CLIENT_ID,
    CLIENT_SECRET,
    latitude2,
    longitude2,
    VERSION,
    search_query2,
    radius2,
    LIMIT2)
url2
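
Building the URL with `str.format` works for a one-word query like `hotel`, but a query containing spaces or other special characters needs escaping. A sketch of one way to handle that with the standard library follows. The function name `build_search_url`, the placeholder credentials and the two-word query are assumptions for illustration only.

```python
from urllib.parse import urlencode


def build_search_url(client_id, client_secret, lat, lng, version, query, radius, limit):
    """Build the venues/search URL with escaped query parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "ll": f"{lat},{lng}",
        "v": version,
        "query": query,
        "radius": radius,
        "limit": limit,
    }
    return "https://api.foursquare.com/v2/venues/search?" + urlencode(params)


# Placeholder credentials and a hypothetical two-word query.
example_url = build_search_url("ID", "SECRET", 40.6688226, -73.9311116,
                               "20180604", "boutique hotel", 10000, 10)
```

Passing the same dictionary as `params=` to `requests.get` does the escaping as well, and keeps the credentials out of any URL that gets displayed in the notebook.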

In [46]:
results2 = requests.get(url2).json()
results2

{'meta': {'code': 200, 'requestId': '5f1f9519cbd37754cc08ead9'},
 'response': {'venues': [{'id': '58804440ef46947925418207',
    'name': 'Hotel RL Brooklyn',
    'location': {'address': '1080 Broadway',
     'lat': 40.69441592010015,
     'lng': -73.93097178347433,
     'labeledLatLngs': [{'label': 'display',
       'lat': 40.69441592010015,
       'lng': -73.93097178347433},
      {'label': 'entrance', 'lat': 40.694369, 'lng': -73.930877}],
     'distance': 2849,
     'postalCode': '11221',
     'cc': 'US',
     'city': 'Brooklyn',
     'state': 'NY',
     'country': 'United States',
     'formattedAddress': ['1080 Broadway',
      'Brooklyn, NY 11221',
      'United States']},
    'categories': [{'id': '4bf58dd8d48988d1fa931735',
      'name': 'Hotel',
      'pluralName': 'Hotels',
      'shortName': 'Hotel',
      'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/travel/hotel_',
       'suffix': '.png'},
      'primary': True}],
    'referralId': 'v-1595905755',
    ...

In [47]:
# assign relevant part of JSON to venues
venues = results2['response']['venues']

# transform venues into a dataframe
dataframe = pd.json_normalize(venues)
dataframe.head()

Unnamed: 0,id,name,categories,referralId,hasPerk,location.address,location.lat,location.lng,location.labeledLatLngs,location.distance,location.postalCode,location.cc,location.city,location.state,location.country,location.formattedAddress,location.crossStreet,venuePage.id
0,58804440ef46947925418207,Hotel RL Brooklyn,"[{'id': '4bf58dd8d48988d1fa931735', 'name': 'H...",v-1595905755,False,1080 Broadway,40.694416,-73.930972,"[{'label': 'display', 'lat': 40.69441592010015...",2849,11221,US,Brooklyn,NY,United States,"[1080 Broadway, Brooklyn, NY 11221, United Sta...",,
1,4bab688ff964a52011a73ae3,Best Western Plus Arena Hotel,"[{'id': '4bf58dd8d48988d1fa931735', 'name': 'H...",v-1595905755,False,1324 Atlantic Ave,40.678146,-73.948757,"[{'label': 'display', 'lat': 40.67814557015919...",1815,11216,US,Brooklyn,NY,United States,"[1324 Atlantic Ave (at Nostrand Ave), Brooklyn...",at Nostrand Ave,
2,47ba56ddf964a520c94d1fe3,Hotel Delmano,"[{'id': '4bf58dd8d48988d11e941735', 'name': 'C...",v-1595905755,False,128 N 9th St.,40.719774,-73.957869,"[{'label': 'display', 'lat': 40.71977374527582...",6104,11211,US,Brooklyn,NY,United States,"[128 N 9th St. (at Berry St), Brooklyn, NY 112...",at Berry St,81172028.0
3,3fd66200f964a52053eb1ee3,Soho Grand Hotel,"[{'id': '4bf58dd8d48988d1fa931735', 'name': 'H...",v-1595905755,False,54 Watts St,40.723911,-74.005224,"[{'label': 'display', 'lat': 40.72391073, 'lng...",8759,10013,US,New York,NY,United States,[54 Watts St (Between Grand Street & Canal Str...,Between Grand Street & Canal Street,72165042.0
4,515a1b3ee4b0f84f522c4b7f,The High Line Hotel,"[{'id': '4bf58dd8d48988d1fa931735', 'name': 'H...",v-1595905755,False,180 10th Ave,40.745924,-74.005389,"[{'label': 'display', 'lat': 40.74592369192379...",10627,10011,US,New York,NY,United States,"[180 10th Ave (at W 20th St), New York, NY 100...",at W 20th St,


In [48]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
# copy so that the assignment further down does not write into a view of dataframe
dataframe_filtered = dataframe.loc[:, filtered_columns].copy()

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
dataframe_filtered['categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]

dataframe_filtered

Unnamed: 0,name,categories,address,lat,lng,labeledLatLngs,distance,postalCode,cc,city,state,country,formattedAddress,crossStreet,id
0,Hotel RL Brooklyn,Hotel,1080 Broadway,40.694416,-73.930972,"[{'label': 'display', 'lat': 40.69441592010015...",2849,11221.0,US,Brooklyn,NY,United States,"[1080 Broadway, Brooklyn, NY 11221, United Sta...",,58804440ef46947925418207
1,Best Western Plus Arena Hotel,Hotel,1324 Atlantic Ave,40.678146,-73.948757,"[{'label': 'display', 'lat': 40.67814557015919...",1815,11216.0,US,Brooklyn,NY,United States,"[1324 Atlantic Ave (at Nostrand Ave), Brooklyn...",at Nostrand Ave,4bab688ff964a52011a73ae3
2,Hotel Delmano,Cocktail Bar,128 N 9th St.,40.719774,-73.957869,"[{'label': 'display', 'lat': 40.71977374527582...",6104,11211.0,US,Brooklyn,NY,United States,"[128 N 9th St. (at Berry St), Brooklyn, NY 112...",at Berry St,47ba56ddf964a520c94d1fe3
3,Soho Grand Hotel,Hotel,54 Watts St,40.723911,-74.005224,"[{'label': 'display', 'lat': 40.72391073, 'lng...",8759,10013.0,US,New York,NY,United States,[54 Watts St (Between Grand Street & Canal Str...,Between Grand Street & Canal Street,3fd66200f964a52053eb1ee3
4,The High Line Hotel,Hotel,180 10th Ave,40.745924,-74.005389,"[{'label': 'display', 'lat': 40.74592369192379...",10627,10011.0,US,New York,NY,United States,"[180 10th Ave (at W 20th St), New York, NY 100...",at W 20th St,515a1b3ee4b0f84f522c4b7f
5,Hotel Indigo,Hotel,171 Ludlow St,40.721762,-73.988092,"[{'label': 'display', 'lat': 40.72176221543275...",7606,10002.0,US,New York,NY,United States,"[171 Ludlow St (btw Houston & Stanton), New Yo...",btw Houston & Stanton,54c0151d498e4c827296cd41
6,Hotel 50 Bowery NYC,Hotel,50 Bowery,40.715936,-73.996789,"[{'label': 'display', 'lat': 40.7159364, 'lng'...",7631,10013.0,US,New York,NY,United States,"[50 Bowery (btwn Bayard & Canal St), New York,...",btwn Bayard & Canal St,578692f4498e1054905dbde7
7,Metro Court Hotel,Hotel,,40.669278,-73.926662,"[{'label': 'display', 'lat': 40.669278, 'lng':...",379,10001.0,US,Port Charles,NY,United States,"[Port Charles, NY 10001, United States]",,4bac308df964a52016ea3ae3
8,Hotel Chantelle,Bar,92 Ludlow St,40.718482,-73.989056,"[{'label': 'display', 'lat': 40.71848249951786...",7380,10002.0,US,New York,NY,United States,"[92 Ludlow St (btwn Delancey & Broome St), New...",btwn Delancey & Broome St,4cbcafab035d236aebebe64e
9,The Best Exotic Utica Hotel,Resort,,40.665473,-73.931986,"[{'label': 'display', 'lat': 40.665473, 'lng':...",380,,US,,New York,United States,"[New York, United States]",,4fbebecfe4b0ccc91e010a4d


In [49]:
dataframe_filtered.name

0                Hotel RL Brooklyn
1    Best Western Plus Arena Hotel
2                    Hotel Delmano
3                 Soho Grand Hotel
4              The High Line Hotel
5                     Hotel Indigo
6              Hotel 50 Bowery NYC
7                Metro Court Hotel
8                  Hotel Chantelle
9      The Best Exotic Utica Hotel
Name: name, dtype: object
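
A text search for "hotel" within 10 km also returns bars and a resort with "Hotel" in the name, plus hotels across the river in Manhattan. One possible refinement, sketched below, keeps only results in the Hotel category within a chosen distance. The names, categories and distances are copied from the table above. The 3000 m cutoff and the function name `nearby_hotels` are assumptions for illustration.

```python
# (name, category, distance in meters) copied from the search results above.
results = [
    ("Hotel RL Brooklyn", "Hotel", 2849),
    ("Best Western Plus Arena Hotel", "Hotel", 1815),
    ("Hotel Delmano", "Cocktail Bar", 6104),
    ("Soho Grand Hotel", "Hotel", 8759),
    ("The High Line Hotel", "Hotel", 10627),
    ("Hotel Indigo", "Hotel", 7606),
    ("Hotel 50 Bowery NYC", "Hotel", 7631),
    ("Metro Court Hotel", "Hotel", 379),
    ("Hotel Chantelle", "Bar", 7380),
    ("The Best Exotic Utica Hotel", "Resort", 380),
]


def nearby_hotels(results, max_distance_m, allowed_categories=("Hotel",)):
    """Keep results in an allowed category within max_distance_m, nearest first."""
    kept = [r for r in results
            if r[1] in allowed_categories and r[2] <= max_distance_m]
    return sorted(kept, key=lambda r: r[2])


# Hypothetical cutoff of 3000 m around the chosen neighborhood.
shortlist = nearby_hotels(results, max_distance_m=3000)
```

On the dataframe itself, the equivalent filter would be `dataframe_filtered[(dataframe_filtered['categories'] == 'Hotel') & (dataframe_filtered['distance'] <= 3000)].sort_values('distance')`.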

## 5. Results Section

In [50]:
venues_map = folium.Map(location=[latitude2, longitude2], zoom_start=13) # generate map centred around the desired destination

# add a red circle marker to represent the centroid of our desired destination
folium.features.CircleMarker(
    [latitude2, longitude2],
    radius=10,
    color='red',
    popup='Desired Destination',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(venues_map)

# add the hotels as blue circle markers
for lat, lng, label in zip(dataframe_filtered.lat, dataframe_filtered.lng, dataframe_filtered.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=10,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(venues_map)

# display map
venues_map