# Capstone Final Project - Neighbourhood Recommender Tool

### This notebook contains the code for the capstone project for the IBM data science course

**The notebook is structured in 4 parts: 1) Data collection, 2) Data curation, 3) Model building, 4) Conclusion and visualization**

            
### 1. Data collection

### 1.1 Getting the neighbourhood list and geodata for Manhattan and the start neighbourhood

Import required modules

In [11]:
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
import json # library to handle JSON files
#!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe
import matplotlib.cm as cm
import matplotlib.colors as colors
from sklearn.cluster import KMeans
#!conda install -c conda-forge folium=0.5.0 --yes
import folium
import io
from bs4 import BeautifulSoup
print('Libraries imported.')

Libraries imported.


Defining the start neighbourhood for example case and getting geodata:

In [3]:
current_hood = "Friedrichshain-Kreuzberg, Berlin"

In [232]:
geolocator = Nominatim(user_agent="berlin_explorer")
location = geolocator.geocode(current_hood)
current_latitude = location.latitude
current_longitude = location.longitude
print('The geographical coordinates of Friedrichshain-Kreuzberg, Berlin are {}, {}.'.format(current_latitude, current_longitude))

The geographical coordinates of Friedrichshain-Kreuzberg, Berlin are 52.5011154, 13.4442848.


In [235]:
current_list = [[current_hood,current_latitude,current_longitude]]
df_current = pd.DataFrame(current_list, columns=["Hood","Latitude","Longitude"])  # flat list gives plain column names, not a MultiIndex

In [236]:
df_current.head()

| | Hood | Latitude | Longitude |
|---|---|---|---|
| 0 | Friedrichshain-Kreuzberg, Berlin | 52.501115 | 13.444285 |


In [238]:
df_current.dtypes

Hood          object
Latitude     float64
Longitude    float64
dtype: object

**Defining target neighbourhoods in Manhattan and getting geodata:**

In [12]:
url = "https://en.wikipedia.org/wiki/List_of_Manhattan_neighborhoods"

In [47]:
# create a soup object based on the webpage and find the tables
data = requests.get(url).text
soup = BeautifulSoup(data,"xml")
tables = soup.find_all('table',{'class':'wikitable sortable'})


In [64]:
table_list = tables[:4]  # keep the first four tables

In [134]:
data = []
for tx in table_list:
    for row in tx.find_all('tr'):
        data.append([t.text.strip() for t in row.find_all('td')])

In [196]:
df = pd.DataFrame(data)

In [197]:
df.head()

| | 0 | 1 |
|---|---|---|
| 0 | | |
| 1 | Upper Manhattan | Above 96th Street |
| 2 | Marble Hill | Physically located on the mainland |
| 3 | Inwood | Above Dyckman Street |
| 4 | Fort George (part of Washington Heights) | East of Broadway between 181st Street and Dyck... |


In [198]:
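df.drop(columns=[1],inplace=True)
df.drop([0],inplace=True)
df.rename(columns={0:"Hood"},inplace=True)
df.dropna(inplace=True)  # drop rows without a name (table header rows)
# strip annotations; regex=True is explicit because pandas >= 2.0 treats patterns as literal text by default
df["Hood"] = df['Hood'].str.replace(r" \(.*\)", "", regex=True)
df["Hood"] = df['Hood'].str.replace(r" ,.*", "", regex=True)
df["Hood"] = df['Hood'].str.replace(r"\[.*", "", regex=True)
df["Hood"] = df['Hood'].str.replace(r" aka.*", "", regex=True)
df["Hood"] = df['Hood'].str.replace("†", "", regex=False)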
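To make the effect of the cleanup patterns above concrete, here is a minimal sketch that applies them with Python's built-in `re` module. The helper name `clean_hood` is hypothetical, and only the first sample string comes from the scraped table; the others are made up to exercise each pattern.

```python
import re

# Patterns copied from the cleanup cell above, applied in the same order.
PATTERNS = [r" \(.*\)", r" ,.*", r"\[.*", r" aka.*", r"†"]

def clean_hood(raw):
    """Hypothetical helper: strip annotations from a raw neighbourhood name."""
    for pattern in PATTERNS:
        raw = re.sub(pattern, "", raw)
    return raw

samples = [
    "Fort George (part of Washington Heights)",  # appears in the scraped table
    "Example Hill[1]",                           # made-up footnote marker
    "Example Square aka Sample Square",          # made-up alias
    "Example Town†",                             # made-up dagger marker
]
cleaned = [clean_hood(s) for s in samples]
print(cleaned)
```

Each pattern removes everything from its marker to the end of the name, so any text following a parenthesis, footnote or alias is dropped as well.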

In [199]:
target_list = df["Hood"].values

In [201]:
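target_data = []
geolocator = Nominatim(user_agent="explorer")  # create the geocoder once, not per neighbourhood
for target in target_list:
    try:
        location = geolocator.geocode(target + ", NY")
    except Exception:
        continue  # skip names whose lookup fails (missing name, network or service error)
    if location is None:
        continue  # skip names Nominatim cannot resolve
    target_data.append([target, location.latitude, location.longitude])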
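The loop above sends its lookups back to back, while the public Nominatim service asks clients to stay at or below one request per second. The following is a minimal sketch of a throttled loop, not the notebook's actual method: `geocode_all`, `FakeLocation` and `KNOWN` are hypothetical names, and a dictionary lookup stands in for `geolocator.geocode` so the example runs offline.

```python
import time

def geocode_all(names, geocode, pause=1.0):
    """Hypothetical helper: look up each name, pausing between requests.

    `geocode` is any callable that returns an object with `.latitude` and
    `.longitude`, or None when the name cannot be resolved.
    """
    rows = []
    for name in names:
        location = geocode(name + ", NY")
        if location is not None:
            rows.append([name, location.latitude, location.longitude])
        time.sleep(pause)  # stay within the service's request rate
    return rows

# Stand-in for geolocator.geocode so the sketch runs without network access.
class FakeLocation:
    def __init__(self, latitude, longitude):
        self.latitude = latitude
        self.longitude = longitude

KNOWN = {"Inwood, NY": FakeLocation(40.869258, -73.920495)}

rows = geocode_all(["Inwood", "Nowhere"], KNOWN.get, pause=0)
print(rows)
```

With a real geocoder, `geolocator.geocode` would be passed in place of `KNOWN.get` and `pause` left at its default.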

In [229]:
df_target = pd.DataFrame(target_data, columns=["Hood","Latitude","Longitude"])

In [230]:
df_target.head()

| | Hood | Latitude | Longitude |
|---|---|---|---|
| 0 | Upper Manhattan | 40.789624 | -73.959894 |
| 1 | Marble Hill | 40.876298 | -73.910429 |
| 2 | Inwood | 40.869258 | -73.920495 |
| 3 | Fort George | 40.859947 | -73.928225 |
| 4 | Washington Heights | 40.840198 | -73.940221 |


## 1.2 Finding the most common venues for the current and target neighbourhoods with the Foursquare API

**Setup Foursquare API and define function**

In [215]:
CLIENT_ID = 'YOUR_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value
ACCESS_TOKEN = 'YOUR_ACCESS_TOKEN' # your Foursquare Access Token

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: YOUR_CLIENT_ID
CLIENT_SECRET: YOUR_CLIENT_SECRET


In [246]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Hood', 
                  'Hood Latitude', 
                  'Hood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [255]:
def getNearbyVenues_current(name, lat, lng, radius=500):
    # single-neighbourhood wrapper: reuse getNearbyVenues with one-element lists
    return getNearbyVenues([name], [lat], [lng], radius=radius)
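The request URL above is assembled with `str.format`. A minimal sketch of an alternative, assuming the same endpoint and parameter names, builds the query string with `urllib.parse.urlencode` so that values are escaped automatically. The helper name `build_explore_url` and the credential strings are placeholders for illustration only.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.foursquare.com/v2/venues/explore"

def build_explore_url(client_id, client_secret, version, lat, lng, radius=500, limit=100):
    """Hypothetical helper: assemble the explore request URL from its parts."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),  # latitude,longitude as one value
        "radius": radius,
        "limit": limit,
    }
    return BASE_URL + "?" + urlencode(params)

# Placeholder credentials; no request is sent here.
url = build_explore_url("MY_ID", "MY_SECRET", "20180605", 52.5011154, 13.444285)
print(url)
```

Because `urlencode` percent-encodes the comma in `ll`, the printed URL contains `%2C` where the hand-formatted one has a literal comma.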

**Finding most popular venues for current neighbourhood**

In [257]:
current_venues = getNearbyVenues_current(name="Friedrichshain-Kreuzberg",lat=52.5011154,lng=13.444285)

In [424]:
# one hot encoding
current_onehot = pd.get_dummies(current_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
current_onehot['Hood'] = current_venues['Hood'] 

# move neighborhood column to the first column
fixed_columns = [current_onehot.columns[-1]] + list(current_onehot.columns[:-1])
current_onehot = current_onehot[fixed_columns]

# group venues by neighbourhood
current_grouped = current_onehot.groupby('Hood').sum().reset_index()

In [425]:
current_grouped.head()

Unnamed: 0,Hood,American Restaurant,Art Gallery,Asian Restaurant,Bakery,Bar,Beer Bar,Beer Garden,Bike Shop,Bookstore,Bridge,Burger Joint,Burrito Place,Café,Canal Lock,Caucasian Restaurant,Cocktail Bar,Coffee Shop,Cupcake Shop,Deli / Bodega,Dive Bar,Doner Restaurant,Exhibit,Falafel Restaurant,German Restaurant,Gourmet Shop,Grocery Store,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Theater,Italian Restaurant,Lebanese Restaurant,Mediterranean Restaurant,Mexican Restaurant,Modern European Restaurant,Monument / Landmark,Multiplex,Museum,Music Venue,Nightclub,Organic Grocery,Park,Performing Arts Venue,Pie Shop,Pizza Place,Plaza,Portuguese Restaurant,Pub,Public Art,Record Shop,Restaurant,Sandwich Place,Scenic Lookout,Schnitzel Restaurant,Shoe Store,Silesian Restaurant,Snack Place,Tea Room,Trail,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wine Shop
0,Friedrichshain-Kreuzberg,1,1,1,4,6,1,1,1,1,1,1,1,10,1,1,2,2,1,1,1,1,1,2,1,1,1,1,5,2,4,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,1,1,2,1,2,1,1,1,1,2,1,1,1,1,1,3,1
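One-hot encoding each venue and then summing per neighbourhood amounts to counting how often each category occurs in each neighbourhood. The sketch below shows that equivalence with the standard library only; the neighbourhood names and venue rows are made up and stand in for the Foursquare results.

```python
from collections import Counter, defaultdict

# Made-up (neighbourhood, venue category) pairs standing in for the venue table.
venues = [
    ("Hood A", "Café"),
    ("Hood A", "Bar"),
    ("Hood A", "Café"),
    ("Hood B", "Bar"),
]

# Equivalent of one-hot encoding each row and summing per neighbourhood.
counts = defaultdict(Counter)
for hood, category in venues:
    counts[hood][category] += 1

# Lay the counts out as one row per neighbourhood, one column per category.
categories = sorted({category for _, category in venues})
table = {hood: [counts[hood][c] for c in categories] for hood in counts}
print(categories)
print(table)
```

Categories a neighbourhood lacks show up as 0, just as in the grouped dataframe.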


**Finding most popular venues for target neighbourhoods**

In [247]:
target_venues = getNearbyVenues(names=df_target['Hood'],latitudes=df_target['Latitude'],longitudes=df_target['Longitude'])

In [307]:
# one hot encoding
target_onehot = pd.get_dummies(target_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
target_onehot['Hood'] = target_venues['Hood'] 

# move neighborhood column to the first column
fixed_columns = [target_onehot.columns[-1]] + list(target_onehot.columns[:-1])
target_onehot = target_onehot[fixed_columns]

# group venues by neighbourhood
target_grouped = target_onehot.groupby('Hood').sum().reset_index()

In [308]:
target_grouped.head()

Unnamed: 0,Hood,Accessories Store,Adult Boutique,African Restaurant,American Restaurant,Amphitheater,Animal Shelter,Antique Shop,Arcade,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auditorium,Australian Restaurant,Austrian Restaurant,Automotive Shop,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Basketball Court,Basketball Stadium,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Big Box Store,Bike Rental / Bike Share,Bike Shop,Bistro,Board Shop,Boat or Ferry,Bookstore,Boutique,Boxing Gym,Brazilian Restaurant,Breakfast Spot,Bridal Shop,Bridge,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Station,Bus Stop,Butcher,Café,Cajun / Creole Restaurant,Camera Store,Candy Store,Cantonese Restaurant,Caribbean Restaurant,Cha Chaan Teng,Cheese Shop,Chinese Restaurant,Chocolate Shop,Christmas Market,Church,Churrascaria,Circus,Climbing Gym,Clothing Store,Club House,Cocktail Bar,Coffee Shop,College Academic Building,College Arts Building,College Bookstore,College Cafeteria,College Theater,Comedy Club,Comfort Food Restaurant,Comic Shop,Community Center,Concert Hall,Convenience Store,Cooking School,Cosmetics Shop,Coworking Space,Creperie,Cuban Restaurant,Cupcake Shop,Cycle Studio,Czech Restaurant,Dance Studio,Daycare,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dive Bar,Doctor's Office,Dog Run,Donut Shop,Drugstore,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Empanada Restaurant,Ethiopian Restaurant,Event Space,Exhibit,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Filipino Restaurant,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Stand,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Gas Station,Gastropub,Gay Bar,General College & University,General Entertainment,German 
Restaurant,Gift Shop,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Gymnastics Gym,Halal Restaurant,Harbor / Marina,Hardware Store,Hawaiian Restaurant,Health & Beauty Service,Health Food Store,Heliport,Historic Site,History Museum,Hobby Shop,Hookah Bar,Hostel,Hot Dog Joint,Hotel,Hotel Bar,Hotpot Restaurant,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indie Theater,Israeli Restaurant,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Jewish Restaurant,Juice Bar,Karaoke Bar,Kebab Restaurant,Kids Store,Kitchen Supply Store,Korean Restaurant,Kosher Restaurant,Latin American Restaurant,Laundry Service,Leather Goods Store,Lebanese Restaurant,Library,Lingerie Store,Liquor Store,Lounge,Luggage Store,Malay Restaurant,Market,Martial Arts School,Massage Studio,Mattress Store,Medical Center,Mediterranean Restaurant,Memorial Site,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Motel,Movie Theater,Moving Target,Multiplex,Museum,Music Store,Music Venue,Nail Salon,New American Restaurant,Newsstand,Nightclub,Non-Profit,Noodle House,North Indian Restaurant,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Other Nightlife,Outdoor Sculpture,Outdoor Supply Store,Outdoors & Recreation,Paella Restaurant,Paper / Office Supplies Store,Park,Pastry Shop,Pedestrian Plaza,Peking Duck Restaurant,Performing Arts Venue,Perfume Shop,Persian Restaurant,Peruvian Restaurant,Pet Café,Pet Service,Pet Store,Pharmacy,Photography Studio,Piano Bar,Pie Shop,Pier,Pilates Studio,Pizza Place,Platform,Playground,Plaza,Poke Place,Pool,Portuguese Restaurant,Post Office,Pub,Public Art,Puerto Rican Restaurant,Ramen Restaurant,Record Shop,Recording Studio,Recreation Center,Rental Car Location,Residential Building (Apartment / Condo),Resort,Restaurant,Rock Club,Roof Deck,Russian 
Restaurant,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Scenic Lookout,School,Sculpture Garden,Seafood Restaurant,Shabu-Shabu Restaurant,Shanghai Restaurant,Shipping Store,Shoe Repair,Shoe Store,Shopping Mall,Skate Park,Skating Rink,Smoke Shop,Smoothie Shop,Snack Place,Soba Restaurant,Soccer Field,Social Club,Soup Place,Southern / Soul Food Restaurant,Souvlaki Shop,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Stables,Stationery Store,Steakhouse,Street Art,Strip Club,Supermarket,Supplement Shop,Sushi Restaurant,Swiss Restaurant,Synagogue,Szechuan Restaurant,Taco Place,Tailor Shop,Taiwanese Restaurant,Tanning Salon,Tapas Restaurant,Tea Room,Tech Startup,Tennis Court,Tennis Stadium,Thai Restaurant,Theater,Theme Restaurant,Thrift / Vintage Store,Tiki Bar,Tourist Information Center,Toy / Game Store,Track,Trail,Train Station,Tree,Turkish Restaurant,Udon Restaurant,Ukrainian Restaurant,Vegetarian / Vegan Restaurant,Veterinarian,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Watch Shop,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,Alphabet City and Loisaida,0,0,0,1,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,1,0,0,8,0,0,0,0,2,0,1,0,0,0,0,0,0,2,0,0,1,1,0,0,0,0,0,0,0,0,0,2,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,6,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,2,0,0,0,1,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,1,1,0,0,1,0,3,0,0,0,0,0,0,1,0,1,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,3,0,2,1,0,0,0,0,0,0,0,2,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,2,0,0,0,0,0,1,0,0,0,1,2,0,0,0,0,0,0,1,0,0,0,1,0,3,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,2,1,0,0,0,0,0,0,1,1,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,3,0,0,0,4,1,0,0,1
1,Astor Row,0,0,3,1,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,1,2,0,0,0,0,1,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,2,0,0,0,0,2,0,0,0,0,1,0,0,1,0,0,0,0,0,0,1,0,1,1,0,0,0,0,0,0,1,0,0,0,0,0,4,0,1,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,2,0,0,1,0,0,0,0,0,0,0,0,0,0,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,2,0,1,2,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,0,0,1,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,1
2,Battery Park City,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,2,0,1,0,2,0,0,0,0,0,1,0,0,0,0,1,0,1,0,1,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,4,0,0,7,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,2,0,0,0,0,0,1,1,0,1,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,2,0,1,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,0,0,2,0,0,0,0,2,0,1,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,4,0,0,2,0,0,0,0,1,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,7,0,0,0,2,0,0,0,0,0,1,0,0,0,0,0,0,2,0,2,3,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,2,0,2,0,0,1,0,0,0,0,0,2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,3,0,0,0
3,Bowery,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,5,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,1,0,1,0,0,1,0,0,0,0,0,0,5,0,0,0,0,0,0,0,4,1,0,0,0,0,0,1,0,6,3,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,2,0,0,1,0,1,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,1,2,3,0,0,0,0,1,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,2,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,3,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,1,0,3,0,0,0,0,2,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,3,0,0,0,2,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,3,0,0,0,3,1,0,0,0
4,Carnegie Hill,0,0,0,2,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,1,2,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,5,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,1,2,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,2,0,0,0,0,0,0,0,0,0,0,0,0,2,0,1,0,0,3,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,2,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,3,0,0,1


In [401]:
common_cols = list(set(current_grouped.columns).intersection(target_grouped.columns))



In [402]:
target_reduced = target_grouped[common_cols].copy()  # copy so later in-place changes do not touch target_grouped

In [406]:
target_reduced.set_index(["Hood"],inplace=True)

In [420]:
target_reduced = target_reduced.reindex(sorted(target_reduced.columns), axis=1)


## 1.3 Build Recommender Model

**Build Neighborhood Profile for current neighbourhood**

In [426]:
current_profile = current_grouped.copy()  # copy so current_grouped keeps its "Hood" column

In [427]:
current_profile.drop(columns=["Hood"],inplace=True)

In [428]:
current_profile = current_profile.transpose()

In [437]:
mult = current_profile[0]

In [442]:
recom = mult*target_reduced
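Multiplying the current profile by the target table and later summing each row gives every target neighbourhood a score equal to the sum, over shared categories, of the current count times the target count. Here is a minimal sketch of that calculation in plain Python; all counts and neighbourhood names are made up, and the function name `score` is hypothetical.

```python
# Made-up category counts; the real profiles come from the Foursquare data.
current = {"Bar": 6, "Café": 10, "Bakery": 4}
targets = {
    "Hood A": {"Bar": 8, "Café": 2, "Gym": 3},
    "Hood B": {"Bar": 0, "Café": 5, "Bakery": 2},
}

def score(current_counts, target_counts):
    """Sum of current x target counts over the categories both have."""
    shared = current_counts.keys() & target_counts.keys()
    return sum(current_counts[c] * target_counts[c] for c in shared)

totals = {hood: score(current, counts) for hood, counts in targets.items()}
ranking = sorted(totals, key=totals.get, reverse=True)
print(totals)   # Hood A: 6*8 + 10*2 = 68, Hood B: 10*5 + 4*2 = 58
print(ranking)
```

Because the counts are not normalised, a neighbourhood with many venues overall can outscore one whose mix of categories is closer to the current profile.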

In [447]:
final_recom = recom.dropna(axis = 1, how = 'all').copy()  # explicit copy avoids SettingWithCopyWarning below


In [449]:
final_recom["Total"] = final_recom.sum(axis=1)


In [455]:
final_recom.sort_values(by=["Total"],ascending=False,inplace=True)


In [456]:
final_recommendation = final_recom["Total"]

In [457]:
final_recommendation

Hood
Bowery                                      153
Lower East Side                             152
Rose Hill                                   150
Nolita                                      142
NoMad                                       139
Financial District                          136
Theater District                            135
Little Italy                                135
Hudson Yards                                132
Flower District                             131
Murray Hill                                 130
Garment District                            130
Times Square                                125
Alphabet City and Loisaida                  124
Koreatown                                   122
Upper West Side                             120
Chelsea                                     117
Herald Square                               112
Meatpacking District                        112
Chinatown                                   111
Tudor City                         

In [458]:
df_final_target = df_target.merge(final_recommendation,how="inner",on=["Hood"])

In [461]:
df_final_target.sort_values(by=["Total"],ascending=False,inplace=True)

## 4) Visualize top locations in Manhattan

**Neighbourhoods scoring at least 150 (the top 3) are marked in green, those scoring 130 to 149 in yellow, and all others in blue**

In [474]:
# create a map centred on Manhattan
map_recom = folium.Map(location=[40.754932, -73.984016], zoom_start=11)

# add one marker per neighbourhood, coloured by its total score
for lat, lng, hood, total in zip(df_final_target['Latitude'],df_final_target['Longitude'], df_final_target['Hood'], df_final_target["Total"]):
    if total >= 150:
        color, fill_color = 'green', '#32a852'
    elif total >= 130:
        color, fill_color = 'yellow', '#eded39'
    else:
        color, fill_color = 'blue', '#3186cc'
    label = folium.Popup(hood, parse_html=True)
    folium.CircleMarker([lat, lng],radius=5,popup=label,color=color,fill=True,fill_color=fill_color,fill_opacity=0.7).add_to(map_recom)
    
    
map_recom