<h1>Defining the problem</h1>

<p1>A real estate company has an important client who lives in a district in the north of Madrid. She has asked us to recommend the most similar neighbourhood in the south of Spain's capital: she plans to move closer to her workplace but is very fond of her current neighbourhood.</p1>

<p1>The client tells us she lives in the "Tetuán" district and would like to move to one of the following neighbourhoods further south: Moratalaz, Retiro or Arganzuela.</p1>
<p1>Given the large amount of venue data available in Foursquare, we will check statistically which of the three candidate neighbourhoods has amenities most similar to those of "Tetuán", so that we can propose a suitable area.</p1>

<h1>Describing data available</h1>

<p1>Since the Foursquare API gives access to the venues located in each of these neighbourhoods, we will retrieve the data and analyse which kinds of venues are available in the client's current and preferred neighbourhood (Tetuán) and in the candidate neighbourhoods (Moratalaz, Retiro and Arganzuela). <br> <br>
Once we have the data, we will define a content-based recommendation model. What are the steps to make that possible? <br>
<ol>
  <li>Defining the 4 neighbourhoods and their coordinates in a pandas dataframe</li>
  <li>Getting the venue data with the Foursquare API and storing it for each neighbourhood</li>
  <li>Processing the data:
    <ul>
      <li>Getting dummy variables from the "Venue Category" column</li>
      <li>Calculating the frequency of occurrence of each category</li>
      <li>Assuming our client gives the maximum rating to her current neighbourhood</li>
      <li>Based on the amenities of her current neighbourhood, calculating which of the three candidate areas is the most similar</li>
    </ul>
  </li>
</ol>
    
</p1>
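<p1>The steps above can be sketched on a tiny, made-up dataset before touching the real API. This is only an illustration in plain Python: the neighbourhood names "Home", "Option A" and "Option B", their venue lists and the helper names are hypothetical and are not part of the Foursquare data used in this notebook.</p1>

```python
from collections import Counter

# Hypothetical venue categories per neighbourhood (not real Foursquare data)
venues = {
    "Home": ["Spanish Restaurant", "Spanish Restaurant", "Hotel", "Pub"],
    "Option A": ["Spanish Restaurant", "Hotel", "Park", "Park"],
    "Option B": ["Playground", "Park", "Bakery", "Pub"],
}

def frequency_profile(categories):
    """Share of each category among the venues of one neighbourhood."""
    counts = Counter(categories)
    return {category: n / len(categories) for category, n in counts.items()}

def dot(profile_a, profile_b):
    """Dot product of two sparse category profiles."""
    return sum(value * profile_b.get(category, 0.0)
               for category, value in profile_a.items())

profiles = {name: frequency_profile(cats) for name, cats in venues.items()}

# The client's current neighbourhood acts as her preference profile
home = profiles["Home"]
scores = {name: dot(home, profile)
          for name, profile in profiles.items() if name != "Home"}

best = max(scores, key=scores.get)
print(scores)
print("Most similar:", best)
```

<p1>In this toy case "Option A" wins because it shares the restaurant and hotel categories that dominate the "Home" profile.</p1>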

<h3>Importing the libraries we will use</h3>

In [12]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON data into a pandas dataframe (top-level import, available since pandas 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

# import BeautifulSoup for web scraping
from bs4 import BeautifulSoup

#import geocoder to get lat and long
import geocoder

print('Libraries imported.')

Libraries imported.


<h3>Visualizing Madrid neighbourhoods</h3>

In [13]:
madrid_map=folium.Map(location=[40.41,-3.70], zoom_start=12)
#Current Neighbourhood of our client
tetuan=folium.map.FeatureGroup()
tetuan.add_child(folium.features.CircleMarker([40.4605,-3.6982],radius=20,color="red",fill_color="Red"))
madrid_map.add_child(tetuan)
#Target Neighbourhoods
moratalaz=folium.map.FeatureGroup()
moratalaz.add_child(folium.features.CircleMarker([40.4071,-3.6411],radius=10,color="blue",fill_color="blue"))
madrid_map.add_child(moratalaz)

retiro=folium.map.FeatureGroup()
retiro.add_child(folium.features.CircleMarker([40.4138,-3.6762],radius=10,color="blue",fill_color="blue"))
madrid_map.add_child(retiro)

arganzuela=folium.map.FeatureGroup()
arganzuela.add_child(folium.features.CircleMarker([40.3990,-3.6944],radius=10,color="blue",fill_color="blue"))
madrid_map.add_child(arganzuela)


<p1>Our client lives in the area marked by the red circle and is used to its amenities, so let's check which of the three blue destination neighbourhoods will feel most familiar.</p1>

<h3>Getting venues data from Foursquare in the given 4 neighbourhoods</h3>

In [15]:
#Getting the neighbourhoods, latitude and longitude into a pandas df:
columns=["Neighbourhood","Latitude","Longitude"]
df=pd.DataFrame({'Neighbourhood':["Tetuán","Arganzuela","Retiro","Moratalaz"],'Latitude':[40.4605,40.3990,40.4138,40.4071],"Longitude":[-3.6982,-3.6944,-3.6762,-3.6411]},columns=columns)
df.head()

Unnamed: 0,Neighbourhood,Latitude,Longitude
0,Tetuán,40.4605,-3.6982
1,Arganzuela,40.399,-3.6944
2,Retiro,40.4138,-3.6762
3,Moratalaz,40.4071,-3.6411


In [16]:
# setting credentials for the Foursquare API
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID (keep real credentials out of shared notebooks)
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 200 # As these are big areas, let's request up to 200 venues per neighbourhood so we have enough data to compare accurately

In [17]:
def getNearbyVenues(names, latitudes, longitudes, radius=1000):
    """Return a dataframe with the venues found within `radius` metres of each neighbourhood."""
    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)

        # build the query for the venues/explore endpoint
        url = 'https://api.foursquare.com/v2/venues/explore'
        params = {
            'client_id': CLIENT_ID,
            'client_secret': CLIENT_SECRET,
            'v': VERSION,
            'll': '{},{}'.format(lat, lng),
            'radius': radius,
            'limit': LIMIT,
        }

        # make the GET request and keep only the list of returned items
        results = requests.get(url, params=params).json()['response']['groups'][0]['items']

        # keep only the relevant information for each nearby venue
        venues_list.append([(
            name,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results])

    # flatten the per-neighbourhood lists into a single dataframe
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood',
                  'Neighborhood Latitude',
                  'Neighborhood Longitude',
                  'Venue',
                  'Venue Latitude',
                  'Venue Longitude',
                  'Venue Category']

    return nearby_venues

mad_venues = getNearbyVenues(names=df['Neighbourhood'],
                                   latitudes=df['Latitude'],
                                   longitudes=df['Longitude']
                                  )

Tetuán
Arganzuela
Retiro
Moratalaz
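
<p1>The function above indexes the response directly, so a reply without groups, or a venue without any category, would raise an exception. The following is a minimal sketch of a more defensive extraction step. The helper name <code>extract_venue_rows</code> and the sample payload are hypothetical; the payload only mirrors the keys the function above already reads, and it is not a real API reply.</p1>

```python
def extract_venue_rows(payload, name, lat, lng):
    """Flatten an explore-style response dict into rows.

    Venues without a category are skipped instead of raising IndexError.
    """
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []

    rows = []
    for item in items:
        venue = item.get("venue", {})
        categories = venue.get("categories") or []
        if not categories:
            continue  # nothing to compare on, so leave this venue out
        location = venue.get("location", {})
        rows.append((
            name,
            lat,
            lng,
            venue.get("name"),
            location.get("lat"),
            location.get("lng"),
            categories[0].get("name"),
        ))
    return rows

# Hypothetical payload with one usable venue and one without a category
sample_payload = {"response": {"groups": [{"items": [
    {"venue": {"name": "Example Bar",
               "location": {"lat": 40.46, "lng": -3.69},
               "categories": [{"name": "Bar"}]}},
    {"venue": {"name": "Uncategorised Place",
               "location": {"lat": 40.47, "lng": -3.70},
               "categories": []}},
]}]}}

rows = extract_venue_rows(sample_payload, "Tetuán", 40.4605, -3.6982)
empty_rows = extract_venue_rows({}, "Tetuán", 40.4605, -3.6982)
print(rows)
print(empty_rows)
```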


In [18]:
mad_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Tetuán,40.4605,-3.6982,Asador Donostiarra,40.461761,-3.695967,Spanish Restaurant
1,Tetuán,40.4605,-3.6982,El Jamón y el Churrasco,40.460516,-3.696212,Spanish Restaurant
2,Tetuán,40.4605,-3.6982,Ducati Madrid,40.462315,-3.696406,Motorcycle Shop
3,Tetuán,40.4605,-3.6982,Sabor Gaucho,40.46019,-3.694397,Brazilian Restaurant
4,Tetuán,40.4605,-3.6982,La Papita,40.459668,-3.69895,Bar


In [19]:
# one hot encoding
mad_onehot = pd.get_dummies(mad_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
mad_onehot['Neighborhood'] = mad_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [mad_onehot.columns[-1]] + list(mad_onehot.columns[:-1])
mad_onehot = mad_onehot[fixed_columns]

mad_onehot.shape

(335, 106)

In [20]:
mad_grouped = mad_onehot.groupby('Neighborhood').mean().reset_index()
mad_grouped

Unnamed: 0,Neighborhood,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Art Studio,Asian Restaurant,Athletics & Sports,BBQ Joint,Bakery,Bar,Beer Bar,Beer Garden,Board Shop,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Burger Joint,Burrito Place,Café,Chinese Restaurant,Circus,Clothing Store,Cocktail Bar,Coffee Shop,Concert Hall,Convenience Store,Cuban Restaurant,Cupcake Shop,Dessert Shop,Diner,Discount Store,Dog Run,Falafel Restaurant,Farmers Market,Flea Market,Food Truck,Fountain,Garden,Gastropub,Gourmet Shop,Grocery Store,Gym,Gym / Fitness Center,Gymnastics Gym,Hotel,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indie Theater,Italian Restaurant,Japanese Restaurant,Jazz Club,Juice Bar,Kebab Restaurant,Korean Restaurant,Market,Mediterranean Restaurant,Metro Station,Mexican Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Motorcycle Shop,Movie Theater,Museum,Music Venue,Nightclub,Paella Restaurant,Park,Performing Arts Venue,Pet Café,Pizza Place,Playground,Plaza,Pool,Pub,Resort,Restaurant,Rock Club,Sandwich Place,Science Museum,Seafood Restaurant,Skate Park,Skating Rink,Snack Place,Soccer Field,Soccer Stadium,Spanish Restaurant,Sporting Goods Shop,Sports Club,Steakhouse,Supermarket,Sushi Restaurant,Tapas Restaurant,Tattoo Parlor,Thai Restaurant,Theater,Theme Restaurant,Trade School,Train Station,Used Bookstore,Vegetarian / Vegan Restaurant,Wine Bar
0,Arganzuela,0.0,0.01,0.02,0.04,0.02,0.0,0.01,0.0,0.0,0.03,0.01,0.01,0.02,0.0,0.02,0.0,0.0,0.02,0.01,0.02,0.0,0.01,0.02,0.01,0.0,0.01,0.02,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.01,0.0,0.0,0.03,0.01,0.0,0.03,0.03,0.02,0.01,0.0,0.01,0.0,0.01,0.03,0.01,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.02,0.0,0.0,0.01,0.01,0.01,0.01,0.0,0.02,0.0,0.0,0.0,0.07,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.0,0.0,0.06,0.0,0.0,0.0,0.0,0.0,0.06,0.01,0.01,0.0,0.01,0.01,0.01,0.0,0.01,0.01
1,Moratalaz,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.057143,0.085714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.0,0.057143,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.028571,0.028571,0.0,0.0,0.028571,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.028571,0.0,0.028571,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.057143,0.0,0.0,0.057143,0.057143,0.057143,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.057143,0.028571,0.085714,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Retiro,0.0,0.0,0.01,0.02,0.01,0.0,0.02,0.02,0.0,0.02,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.03,0.0,0.01,0.0,0.01,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.03,0.02,0.01,0.02,0.01,0.0,0.0,0.05,0.02,0.01,0.0,0.0,0.04,0.0,0.01,0.0,0.0,0.0,0.0,0.03,0.0,0.02,0.0,0.02,0.0,0.01,0.01,0.0,0.0,0.0,0.02,0.01,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.05,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.17,0.01,0.01,0.0,0.02,0.0,0.07,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0
3,Tetuán,0.01,0.0,0.01,0.01,0.0,0.01,0.02,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.02,0.03,0.01,0.01,0.03,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.02,0.01,0.01,0.0,0.04,0.02,0.0,0.0,0.0,0.03,0.03,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.03,0.01,0.1,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.19,0.0,0.0,0.02,0.02,0.01,0.03,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0
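
<p1>Each value in this table is the mean of a one-hot column, which is the same as the share of venues in that category. The values in the Moratalaz row, for instance, are multiples of 1/35 (3/35 ≈ 0.085714, 2/35 ≈ 0.057143, 1/35 ≈ 0.028571), which is consistent with 35 venues being returned for that neighbourhood. A small plain-Python sketch with a hypothetical list of four venues shows the mechanism; the variable names are illustrative only.</p1>

```python
# Hypothetical venue categories for a single neighbourhood
categories = ["Bar", "Bar", "Park", "Café"]

# One column per distinct category, one row per venue (1 = venue is in that category)
columns = sorted(set(categories))
one_hot = [[1 if category == column else 0 for column in columns]
           for category in categories]

# Averaging each column gives the relative frequency of that category
means = {column: sum(row[i] for row in one_hot) / len(one_hot)
         for i, column in enumerate(columns)}

print(columns)
print(one_hot)
print(means)
```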


In [21]:
num_top_venues = 5

for hood in mad_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = mad_grouped[mad_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Arganzuela----
                venue  freq
0          Restaurant  0.07
1  Spanish Restaurant  0.06
2    Tapas Restaurant  0.06
3         Art Gallery  0.04
4              Bakery  0.03


----Moratalaz----
                venue  freq
0  Spanish Restaurant  0.09
1                 Bar  0.09
2         Pizza Place  0.06
3                Park  0.06
4                Café  0.06


----Retiro----
                venue  freq
0  Spanish Restaurant  0.17
1    Tapas Restaurant  0.07
2          Restaurant  0.05
3               Hotel  0.05
4  Italian Restaurant  0.04


----Tetuán----
                venue  freq
0  Spanish Restaurant  0.19
1          Restaurant  0.10
2               Hotel  0.04
3  Italian Restaurant  0.03
4                 Pub  0.03




In [22]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

#running top 10 venues
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = mad_grouped['Neighborhood']

for ind in np.arange(mad_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(mad_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Arganzuela,Restaurant,Tapas Restaurant,Spanish Restaurant,Art Gallery,Indie Theater,Gym,Bakery,Grocery Store,Garden,Coffee Shop
1,Moratalaz,Bar,Spanish Restaurant,Plaza,Park,Pizza Place,Playground,Café,Soccer Field,Bakery,Diner
2,Retiro,Spanish Restaurant,Tapas Restaurant,Hotel,Restaurant,Italian Restaurant,Garden,Mediterranean Restaurant,Coffee Shop,Brewery,Seafood Restaurant
3,Tetuán,Spanish Restaurant,Restaurant,Hotel,Italian Restaurant,Tapas Restaurant,Japanese Restaurant,Pub,Burger Joint,Chinese Restaurant,Ice Cream Shop


In [63]:
userNeighProfile=mad_grouped[mad_grouped["Neighborhood"]=="Tetuán"]
userNeighProfile=userNeighProfile.drop(columns="Neighborhood")
userNeighProfile

Unnamed: 0,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Art Studio,Asian Restaurant,Athletics & Sports,BBQ Joint,Bakery,Bar,Beer Bar,Beer Garden,Board Shop,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Burger Joint,Burrito Place,Café,Chinese Restaurant,Circus,Clothing Store,Cocktail Bar,Coffee Shop,Concert Hall,Convenience Store,Cuban Restaurant,Cupcake Shop,Dessert Shop,Diner,Discount Store,Dog Run,Falafel Restaurant,Farmers Market,Flea Market,Food Truck,Fountain,Garden,Gastropub,Gourmet Shop,Grocery Store,Gym,Gym / Fitness Center,Gymnastics Gym,Hotel,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indie Theater,Italian Restaurant,Japanese Restaurant,Jazz Club,Juice Bar,Kebab Restaurant,Korean Restaurant,Market,Mediterranean Restaurant,Metro Station,Mexican Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Motorcycle Shop,Movie Theater,Museum,Music Venue,Nightclub,Paella Restaurant,Park,Performing Arts Venue,Pet Café,Pizza Place,Playground,Plaza,Pool,Pub,Resort,Restaurant,Rock Club,Sandwich Place,Science Museum,Seafood Restaurant,Skate Park,Skating Rink,Snack Place,Soccer Field,Soccer Stadium,Spanish Restaurant,Sporting Goods Shop,Sports Club,Steakhouse,Supermarket,Sushi Restaurant,Tapas Restaurant,Tattoo Parlor,Thai Restaurant,Theater,Theme Restaurant,Trade School,Train Station,Used Bookstore,Vegetarian / Vegan Restaurant,Wine Bar
3,0.01,0.0,0.01,0.01,0.0,0.01,0.02,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.02,0.03,0.01,0.01,0.03,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.02,0.01,0.01,0.0,0.04,0.02,0.0,0.0,0.0,0.03,0.03,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.03,0.01,0.1,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.19,0.0,0.0,0.02,0.02,0.01,0.03,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0


In [64]:
userNeighProfile=userNeighProfile.transpose()
userNeighProfile

Unnamed: 0,3
American Restaurant,0.01
Arepa Restaurant,0.0
Argentinian Restaurant,0.01
Art Gallery,0.01
Art Museum,0.0
Art Studio,0.01
Asian Restaurant,0.02
Athletics & Sports,0.0
BBQ Joint,0.01
Bakery,0.01


In [65]:
userTargetNeigh=mad_grouped[mad_grouped["Neighborhood"]!="Tetuán"]
userTargetNeigh=userTargetNeigh.drop(columns="Neighborhood")
userTargetNeigh

Unnamed: 0,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Art Studio,Asian Restaurant,Athletics & Sports,BBQ Joint,Bakery,Bar,Beer Bar,Beer Garden,Board Shop,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Burger Joint,Burrito Place,Café,Chinese Restaurant,Circus,Clothing Store,Cocktail Bar,Coffee Shop,Concert Hall,Convenience Store,Cuban Restaurant,Cupcake Shop,Dessert Shop,Diner,Discount Store,Dog Run,Falafel Restaurant,Farmers Market,Flea Market,Food Truck,Fountain,Garden,Gastropub,Gourmet Shop,Grocery Store,Gym,Gym / Fitness Center,Gymnastics Gym,Hotel,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indie Theater,Italian Restaurant,Japanese Restaurant,Jazz Club,Juice Bar,Kebab Restaurant,Korean Restaurant,Market,Mediterranean Restaurant,Metro Station,Mexican Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Motorcycle Shop,Movie Theater,Museum,Music Venue,Nightclub,Paella Restaurant,Park,Performing Arts Venue,Pet Café,Pizza Place,Playground,Plaza,Pool,Pub,Resort,Restaurant,Rock Club,Sandwich Place,Science Museum,Seafood Restaurant,Skate Park,Skating Rink,Snack Place,Soccer Field,Soccer Stadium,Spanish Restaurant,Sporting Goods Shop,Sports Club,Steakhouse,Supermarket,Sushi Restaurant,Tapas Restaurant,Tattoo Parlor,Thai Restaurant,Theater,Theme Restaurant,Trade School,Train Station,Used Bookstore,Vegetarian / Vegan Restaurant,Wine Bar
0,0.0,0.01,0.02,0.04,0.02,0.0,0.01,0.0,0.0,0.03,0.01,0.01,0.02,0.0,0.02,0.0,0.0,0.02,0.01,0.02,0.0,0.01,0.02,0.01,0.0,0.01,0.02,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.01,0.0,0.0,0.03,0.01,0.0,0.03,0.03,0.02,0.01,0.0,0.01,0.0,0.01,0.03,0.01,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.02,0.0,0.0,0.01,0.01,0.01,0.01,0.0,0.02,0.0,0.0,0.0,0.07,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.0,0.0,0.06,0.0,0.0,0.0,0.0,0.0,0.06,0.01,0.01,0.0,0.01,0.01,0.01,0.0,0.01,0.01
1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.057143,0.085714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.0,0.057143,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.028571,0.028571,0.0,0.0,0.028571,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.028571,0.0,0.028571,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.057143,0.0,0.0,0.057143,0.057143,0.057143,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.057143,0.028571,0.085714,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,0.0,0.0,0.01,0.02,0.01,0.0,0.02,0.02,0.0,0.02,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.03,0.0,0.01,0.0,0.01,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.03,0.02,0.01,0.02,0.01,0.0,0.0,0.05,0.02,0.01,0.0,0.0,0.04,0.0,0.01,0.0,0.0,0.0,0.0,0.03,0.0,0.02,0.0,0.02,0.0,0.01,0.01,0.0,0.0,0.0,0.02,0.01,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.05,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.17,0.01,0.01,0.0,0.02,0.0,0.07,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0


In [82]:
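# score each candidate: dot product of its category frequencies with the client's profile
RecomendatedNeighborhood=pd.DataFrame(np.dot(userTargetNeigh,userNeighProfile),columns=["Similarity"])
# label the rows with the candidate names, in the same order as the rows of userTargetNeigh
RecomendatedNeighborhood["Neighborhood"]=mad_grouped.loc[mad_grouped["Neighborhood"]!="Tetuán","Neighborhood"].values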
RecomendatedNeighborhood.sort_values(by="Similarity", ascending=False)

Unnamed: 0,Similarity,Neighborhood
2,0.047,Retiro
0,0.0261,Arganzuela
1,0.023429,Moratalaz
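
<p1>The similarity above is a raw dot product, which tends to reward candidates whose venues are concentrated in the client's single most frequent category. A common alternative, which is not what this notebook uses, is cosine similarity, where the dot product is divided by the lengths of both vectors. The sketch below uses hypothetical four-category profiles (not the Madrid data) to show how the two measures can rank candidates differently.</p1>

```python
import math

def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def cosine_similarity(u, v):
    """Dot product normalised by both vector lengths; 0.0 if either vector is all zeros."""
    norm_u = math.sqrt(dot(u, u))
    norm_v = math.sqrt(dot(v, v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot(u, v) / (norm_u * norm_v)

# Hypothetical category-frequency profiles
home = [0.5, 0.25, 0.25, 0.0]
identical = [0.5, 0.25, 0.25, 0.0]    # same mix of venues as home
concentrated = [1.0, 0.0, 0.0, 0.0]   # only home's most frequent category

raw_identical = dot(home, identical)
raw_concentrated = dot(home, concentrated)
cos_identical = cosine_similarity(home, identical)
cos_concentrated = cosine_similarity(home, concentrated)

print("dot:   ", raw_identical, raw_concentrated)
print("cosine:", round(cos_identical, 4), round(cos_concentrated, 4))
```

<p1>With these made-up profiles the raw dot product prefers the concentrated candidate, while cosine similarity prefers the one whose mix of venues matches the client's neighbourhood exactly.</p1>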


<h1>Discussing the results</h1>

<p1>As the previous table shows, the neighbourhood most similar to "Tetuán", where our client currently lives, is Retiro. The output of the model seems consistent with reality, given that the other two candidates are mainly residential areas, while Retiro and Tetuán are close to the commercial and business area of Madrid, around the city centre.</p1> <br> <br>
<p1>To double-check, let's review the 10 most common venue categories in each neighbourhood again:</p1>

In [83]:
neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Arganzuela,Restaurant,Tapas Restaurant,Spanish Restaurant,Art Gallery,Indie Theater,Gym,Bakery,Grocery Store,Garden,Coffee Shop
1,Moratalaz,Bar,Spanish Restaurant,Plaza,Park,Pizza Place,Playground,Café,Soccer Field,Bakery,Diner
2,Retiro,Spanish Restaurant,Tapas Restaurant,Hotel,Restaurant,Italian Restaurant,Garden,Mediterranean Restaurant,Coffee Shop,Brewery,Seafood Restaurant
3,Tetuán,Spanish Restaurant,Restaurant,Hotel,Italian Restaurant,Tapas Restaurant,Japanese Restaurant,Pub,Burger Joint,Chinese Restaurant,Ice Cream Shop


<p1>As we can see, Retiro and Tetuán have lots of places to wine and dine, which might be our client's main preference, while Arganzuela and Moratalaz have more household-oriented amenities such as gyms, bakeries, grocery stores and playgrounds.</p1> <br> <br>
<p1>So, to provide a good real estate service, we will discuss the results with our client to make sure her preferences are not going to change in the future (e.g. if she were planning to start a family, Moratalaz might be a better fit) and that Retiro will remain a good choice.</p1>