# Where Should I Move To? Introduction
One of the biggest struggles when moving to a new city is deciding which neighborhood to live in.
This project targets people who are interested in moving to New York City but are unfamiliar with its neighborhoods. It recommends the most suitable neighborhood based on the user's preferences.

# Process
This will be done using the list of New York City neighborhoods from NYU together with venue data from Foursquare.

Steps that will be taken:
1. Extract NYC neighborhood data from NYU and join it with venue data from Foursquare.
2. Apply ratings and likes data from Foursquare and average them by venue category per neighborhood.
3. Create a list per neighborhood of its top 10 to 20 rated/liked venue categories.
4. The user selects the 5 to 10 venue categories that are most important to them.
5. The user's preferences are used as a benchmark to determine which neighborhood is the most suitable.

In [1]:
import pandas as pd
import requests

In [2]:
url1 = 'https://cocl.us/new_york_dataset'

resp = requests.get(url=url1)
newyork_data = resp.json()    
ny_neibourhood = newyork_data['features']

In [3]:
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude']

# collect one row per neighborhood, then build the DataFrame in a single step
# (DataFrame.append was removed in pandas 2.0)
rows = []
for data in ny_neibourhood:
    borough = data['properties']['borough']
    neighborhood_name = data['properties']['name']
    # GeoJSON stores coordinates as [longitude, latitude]
    neighborhood_latlon = data['geometry']['coordinates']
    neighborhood_lat = neighborhood_latlon[1]
    neighborhood_lon = neighborhood_latlon[0]

    rows.append({'Borough': borough,
                 'Neighborhood': neighborhood_name,
                 'Latitude': neighborhood_lat,
                 'Longitude': neighborhood_lon})

neighborhoods = pd.DataFrame(rows, columns=column_names)

In [4]:
neighborhoods.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Bronx,Wakefield,40.894705,-73.847201
1,Bronx,Co-op City,40.874294,-73.829939
2,Bronx,Eastchester,40.887556,-73.827806
3,Bronx,Fieldston,40.895437,-73.905643
4,Bronx,Riverdale,40.890834,-73.912585


In [5]:
from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode('New York City, NY')

lat = location.latitude
lon = location.longitude

print('New York City: {}, {}.'.format(lat, lon))

New York City: 40.7127281, -74.0060152.


In [6]:
import folium
# create map of New York using latitude and longitude values
map_ny = folium.Map(location=[lat, lon], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['Borough'], neighborhoods['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_ny)  
    
map_ny

In [7]:
# NOTE: hard-coding credentials in a shared notebook exposes them;
# prefer loading them from environment variables
CLIENT_ID = 'UA2LVSZWPOI1XSP20TO24BEUWRMFX0BO3VKAPAMKJ3BPMRQQ' # your Foursquare ID
CLIENT_SECRET = 'FD5Y1O3CRUE2WRXK0QGLLMFRGEV4Z44W2E3ESR2G1K3AMWXX' # your Foursquare Secret
VERSION = '20210405' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: UA2LVSZWPOI1XSP20TO24BEUWRMFX0BO3VKAPAMKJ3BPMRQQ
CLIENT_SECRET: FD5Y1O3CRUE2WRXK0QGLLMFRGEV4Z44W2E3ESR2G1K3AMWXX
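
The cell above keeps the Foursquare credentials in the notebook and prints them in full. A minimal sketch of one alternative is shown below: read the values from environment variables and mask them before printing. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and both helper functions are assumptions made for this sketch, not names required by Foursquare.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Return (client_id, client_secret) read from environment variables.

    The variable names are hypothetical; pick whatever your deployment uses.
    """
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing}") from None


def mask(secret, visible=4):
    """Keep the first few characters and replace the rest with asterisks."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)
```

With this in place, the notebook could call `CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials()` and print `mask(CLIENT_SECRET)` instead of the raw value.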


In [8]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
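
The list comprehension in `getNearbyVenues` assumes that every response contains `response -> groups[0] -> items` and that every venue has at least one category. A minimal sketch of a more defensive parsing step follows. The helper name `extract_venue_rows` and the `'Unknown'` placeholder are assumptions of this sketch, and the sample payload only imitates the response shape that the function above already relies on.

```python
def extract_venue_rows(payload, name, lat, lng):
    """Flatten one explore-style JSON payload into venue rows.

    Missing keys yield an empty list, and a venue without a category is
    labelled 'Unknown' (an arbitrary placeholder chosen for this sketch).
    """
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        categories = venue.get("categories") or []
        category = categories[0]["name"] if categories else "Unknown"
        rows.append((name, lat, lng, venue.get("name"),
                     location.get("lat"), location.get("lng"), category))
    return rows


# Hand-written sample; the second venue is hypothetical and has no category.
sample = {"response": {"groups": [{"items": [
    {"venue": {"name": "Lollipops Gelato",
               "location": {"lat": 40.894123, "lng": -73.845892},
               "categories": [{"name": "Dessert Shop"}]}},
    {"venue": {"name": "Unnamed spot",
               "location": {"lat": 40.89, "lng": -73.84},
               "categories": []}},
]}]}}
rows = extract_venue_rows(sample, "Wakefield", 40.894705, -73.847201)
```

Each tuple has the same seven fields, in the same order, as the columns assigned to `nearby_venues`, so the rows could be appended to `venues_list` unchanged.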

In [9]:
ny_venues = getNearbyVenues(names=neighborhoods['Neighborhood'],
                                   latitudes=neighborhoods['Latitude'],
                                   longitudes=neighborhoods['Longitude']
                                  )

Wakefield
Co-op City
Eastchester
Fieldston
Riverdale
Kingsbridge
Marble Hill
Woodlawn
Norwood
Williamsbridge
Baychester
Pelham Parkway
City Island
Bedford Park
University Heights
Morris Heights
Fordham
East Tremont
West Farms
High  Bridge
Melrose
Mott Haven
Port Morris
Longwood
Hunts Point
Morrisania
Soundview
Clason Point
Throgs Neck
Country Club
Parkchester
Westchester Square
Van Nest
Morris Park
Belmont
Spuyten Duyvil
North Riverdale
Pelham Bay
Schuylerville
Edgewater Park
Castle Hill
Olinville
Pelham Gardens
Concourse
Unionport
Edenwald
Bay Ridge
Bensonhurst
Sunset Park
Greenpoint
Gravesend
Brighton Beach
Sheepshead Bay
Manhattan Terrace
Flatbush
Crown Heights
East Flatbush
Kensington
Windsor Terrace
Prospect Heights
Brownsville
Williamsburg
Bushwick
Bedford Stuyvesant
Brooklyn Heights
Cobble Hill
Carroll Gardens
Red Hook
Gowanus
Fort Greene
Park Slope
Cypress Hills
East New York
Starrett City
Canarsie
Flatlands
Mill Island
Manhattan Beach
Coney Island
Bath Beach
Borough Park
Dyker

In [10]:
print(ny_venues.shape)
ny_venues.drop(ny_venues.loc[ny_venues['Venue Category']=='Neighborhood'].index, inplace=True)
print(ny_venues.shape)

ny_venues

(10167, 7)
(10162, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Wakefield,40.894705,-73.847201,Lollipops Gelato,40.894123,-73.845892,Dessert Shop
1,Wakefield,40.894705,-73.847201,Rite Aid,40.896649,-73.844846,Pharmacy
2,Wakefield,40.894705,-73.847201,Walgreens,40.896528,-73.844700,Pharmacy
3,Wakefield,40.894705,-73.847201,Carvel Ice Cream,40.890487,-73.848568,Ice Cream Shop
4,Wakefield,40.894705,-73.847201,Dunkin',40.890459,-73.849089,Donut Shop
...,...,...,...,...,...,...,...
10162,Fox Hills,40.617311,-74.081740,SUBWAY,40.618939,-74.082881,Sandwich Place
10163,Fox Hills,40.617311,-74.081740,Mona's Cuisine,40.618282,-74.084975,African Restaurant
10164,Fox Hills,40.617311,-74.081740,Bums Backyard,40.618083,-74.085603,Cocktail Bar
10165,Fox Hills,40.617311,-74.081740,Stop 1 Supermarket,40.614576,-74.084714,Grocery Store


In [122]:
ny_venues['Venue Category'].unique()

array(['Dessert Shop', 'Pharmacy', 'Ice Cream Shop', 'Donut Shop',
       'Sandwich Place', 'Food', 'Laundromat', 'Pizza Place',
       'Discount Store', 'Post Office', 'Bagel Shop', 'Grocery Store',
       'Restaurant', 'Fast Food Restaurant', 'Baseball Field',
       'Jazz Club', 'Fried Chicken Joint', 'Trail', 'Bus Station', 'Park',
       'Accessories Store', 'Caribbean Restaurant', 'Diner',
       'Seafood Restaurant', 'Deli / Bodega', 'Chinese Restaurant',
       'Bowling Alley', 'Business Service', 'Bus Stop', 'Automotive Shop',
       'Food & Drink Shop', 'Platform', 'Convenience Store', 'Juice Bar',
       'Cosmetics Shop', 'Plaza', 'River', 'Music Venue', 'Bank',
       'Food Truck', 'Gym', 'Playground', 'Gourmet Shop',
       'Latin American Restaurant', 'Pub', 'Beer Bar', 'Burger Joint',
       'Mexican Restaurant', 'Spanish Restaurant', 'Coffee Shop',
       'Thrift / Vintage Store', 'Warehouse Store', 'Bar', 'Wings Joint',
       'Supermarket', 'Bakery', 'Candy Store', 'R

In [23]:
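# count venues per (neighborhood, category) pair
catcount = ny_venues.groupby(['Neighborhood', 'Venue Category']).size().reset_index(name='counts')
# blank out pairs that occur only once (where() turns them into NaN, which makes counts float)
catcount = catcount.where(catcount['counts']!=1)
# drop the blanked rows and sort by count
catcount = catcount.dropna(how='all').sort_values(by=['counts'])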
catcount

Unnamed: 0,Neighborhood,Venue Category,counts
5,Allerton,Deli / Bodega,2.0
3097,Hamilton Heights,Latin American Restaurant,2.0
3093,Hamilton Heights,Indian Restaurant,2.0
5540,Rockaway Beach,Ice Cream Shop,2.0
3083,Hamilton Heights,Cocktail Bar,2.0
...,...,...,...
4788,North Side,Coffee Shop,11.0
3045,Greenwich Village,Italian Restaurant,11.0
535,Belmont,Italian Restaurant,18.0
4337,Midtown South,Korean Restaurant,19.0


In [107]:
catpivot = catcount.pivot(index='Neighborhood', columns='Venue Category', values='counts')
catpivot = catpivot.reset_index(level=['Neighborhood'])
catpivot

Venue Category,Neighborhood,African Restaurant,American Restaurant,Arepa Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bagel Shop,...,Toy / Game Store,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Women's Store,Yoga Studio
0,Allerton,,,,,,,,,,...,,,,,,,,,,
1,Annadale,,,,,,,,,,...,,,,,,,,,,
2,Arrochar,,,,,,,,,,...,,,,,,,,,,
3,Arverne,,,,,,,,,,...,,,,,,,,,,
4,Astoria,,,,,,,,,2.0,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
245,Wingate,,,,,,,,,,...,,,,,,,,,,
246,Woodhaven,,,,,,,,,,...,,,,,,,,,,
247,Woodlawn,,,,,,,,,,...,,,,,,,,,,
248,Woodside,,3.0,,,,,,,,...,,,,,,,,,,


In [112]:
Features = catpivot.iloc[:,1:]
Features

Venue Category,African Restaurant,American Restaurant,Arepa Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bagel Shop,Bakery,...,Toy / Game Store,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Women's Store,Yoga Studio
0,,,,,,,,,,,...,,,,,,,,,,
1,,,,,,,,,,,...,,,,,,,,,,
2,,,,,,,,,,,...,,,,,,,,,,
3,,,,,,,,,,,...,,,,,,,,,,
4,,,,,,,,,2.0,3.0,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
245,,,,,,,,,,,...,,,,,,,,,,
246,,,,,,,,,,,...,,,,,,,,,,
247,,,,,,,,,,,...,,,,,,,,,,
248,,3.0,,,,,,,,4.0,...,,,,,,,,,,


In [136]:
catpivot_norm=((Features-Features.min())/(Features.max()-Features.min()))

In [137]:
catpivot_norm['Neighborhood'] = catpivot['Neighborhood']

In [138]:
catpivot_norm

Venue Category,African Restaurant,American Restaurant,Arepa Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bagel Shop,Bakery,...,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Women's Store,Yoga Studio,Neighborhood
0,,,,,,,,,,,...,,,,,,,,,,Allerton
1,,,,,,,,,,,...,,,,,,,,,,Annadale
2,,,,,,,,,,,...,,,,,,,,,,Arrochar
3,,,,,,,,,,,...,,,,,,,,,,Arverne
4,,,,,,,,,0.0,0.142857,...,,,,,,,,,,Astoria
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
245,,,,,,,,,,,...,,,,,,,,,,Wingate
246,,,,,,,,,,,...,,,,,,,,,,Woodhaven
247,,,,,,,,,,,...,,,,,,,,,,Woodlawn
248,,0.333333,,,,,,,,0.285714,...,,,,,,,,,,Woodside
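
The normalization above is a per-column min-max scaling, and the `NaN` cells behave in ways that are easy to miss. The sketch below repeats the same formula on a small made-up table (the values are illustrative, not taken from the Foursquare data) to show three cases: a normal column, a missing cell, and a column where every non-missing value is identical.

```python
import numpy as np
import pandas as pd

# Toy counts for four hypothetical neighborhoods A-D.
# NaN means the category was absent or was dropped for having a count of 1.
counts = pd.DataFrame(
    {"Bakery": [3.0, 2.0, 9.0, np.nan],
     "Bagel Shop": [2.0, np.nan, np.nan, np.nan]},
    index=["A", "B", "C", "D"],
)

# Same formula as catpivot_norm: (x - min) / (max - min), column by column.
# min() and max() skip NaN, so missing cells do not affect the range.
span = counts.max() - counts.min()
scaled = (counts - counts.min()) / span

print(scaled)
```

In the `Bakery` column the smallest count maps to 0.0 and the largest to 1.0, while the missing cell stays `NaN`. In the `Bagel Shop` column the range is zero, so the only value becomes `NaN` as well (0/0). One consequence worth keeping in mind: a neighborhood holding the minimum count for a category scores 0.0, which a later `sum` treats the same as a neighborhood where the category is missing entirely.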


The user selects the venue categories that matter most to them. These preferences could also be derived from the user's Foursquare profile by looking at what they have rated in the past.

In [149]:
# .copy() makes UserPref an independent DataFrame, so adding a column raises no SettingWithCopyWarning
UserPref = catpivot_norm[['Neighborhood','Fast Food Restaurant','Bar', 'Karaoke Bar', 'Coffee Shop' ,'Korean Restaurant', 'Baseball Stadium', 'Dog Run']].copy()

# sum only the category columns (the text column 'Neighborhood' is excluded); NaN cells are skipped
UserPref['PrefRating'] = UserPref.drop(columns='Neighborhood').sum(axis=1)
Recommended = UserPref.sort_values(by=['PrefRating'], ascending=False)


In [151]:
Recommended.head()

Venue Category,Neighborhood,Fast Food Restaurant,Bar,Karaoke Bar,Coffee Shop,Korean Restaurant,Baseball Stadium,Dog Run,PrefRating
154,Murray Hill,,0.333333,,0.666667,1.0,,,2.0
101,Greenpoint,,0.833333,,0.555556,,,,1.388889
163,North Side,,0.333333,,1.0,0.0,,,1.333333
79,Financial District,,0.333333,,0.888889,,,,1.222222
69,East Village,,1.0,,0.111111,0.05,,,1.161111
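
The scoring step can be wrapped in a small reusable function. The sketch below is one possible shape for it, not the notebook's own code: the name `rank_neighborhoods` and its parameters are assumptions, and the three input rows reuse values from the output above. Requested categories that are not columns of the table (here `Dog Run`) are ignored instead of raising a `KeyError`.

```python
import numpy as np
import pandas as pd


def rank_neighborhoods(normalized, preferred, top_n=5):
    """Rank neighborhoods by the sum of their scores in the preferred categories.

    `normalized` needs a 'Neighborhood' column plus one column per category.
    """
    available = [c for c in preferred if c in normalized.columns]
    # sum() skips NaN, so a missing category contributes nothing to the score
    scores = normalized[available].sum(axis=1)
    ranked = normalized[["Neighborhood"]].assign(PrefRating=scores)
    return ranked.sort_values("PrefRating", ascending=False).head(top_n)


# Three rows copied from the Recommended output, limited to three categories.
table = pd.DataFrame({
    "Neighborhood": ["Greenpoint", "Murray Hill", "East Village"],
    "Bar": [0.833333, 0.333333, 1.0],
    "Coffee Shop": [0.555556, 0.666667, 0.111111],
    "Korean Restaurant": [np.nan, 1.0, 0.05],
})
top = rank_neighborhoods(
    table, ["Bar", "Coffee Shop", "Korean Restaurant", "Dog Run"]
)
print(top)
```

Because the score is a plain sum, every selected category carries equal weight; weighting categories by importance would be a natural extension but is outside what the notebook does.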
