# The Neighborhood Locator App 

### A data science report created by Belvin Thomas


## 1. Introduction


#### 1.1  A description of the problem and a discussion of the background
I currently work in Toronto, Canada. I moved frequently after arriving here until I settled in my current neighborhood. I have now accepted a job offer in New York and will move there in a month. I need to find a neighborhood similar to my current one in Toronto, because I like the amenities around me. Having moved around a lot, I know I would not be happy settling in a very different neighborhood. My conditions are that I can walk to my new office and still enjoy amenities similar to those I have in Toronto. However, I am not familiar with New York, as I have never travelled to the US. I would like to treat this as a data science problem and solve it using the location data available from the Foursquare API.

#### 1.2 Clear definition of business problem and potential audience

The business problem is to identify a neighborhood in New York, USA with facilities similar to those the user enjoys in Toronto, Canada. The source address (St Patrick Street, Toronto) and the target address (West Houston Street, Manhattan, New York) are given. However, the facilities that make the neighborhoods similar need to be identified from location data.

Currently, I am solving this problem for my own purposes, but the problem and its solution can be generalized to other people. It would help anyone moving from one city to another to select a suitable neighborhood in the target city. I am interested in later developing an app based on this solution. The app should accept conditions based on the user's favorite amenities in the current city and the desired location in the target city.





## 2. Data Section

#### 2.1 Source of data
**Given Information :**

Source address: St Patrick Street, Toronto, Canada

Target address: West Houston Street, Manhattan, New York

**Neighborhood Dataset for Toronto:**
The data is obtained by scraping the table of postal codes from the Wikipedia page https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M and transforming it into a pandas dataframe. The dataset is processed as described in the Clustering_SourceCity_Neighborhoods notebook. A few dataframes from that notebook are used to prepare the dataset for the current problem.
    
**Neighborhood Dataset for New York:**
New York City has a total of 5 boroughs and 306 neighborhoods. To segment the neighborhoods and explore them, we need a dataset that contains the 5 boroughs and the neighborhoods in each borough, along with the latitude and longitude coordinates of each neighborhood. This dataset is freely available on the web at https://geo.nyu.edu/catalog/nyu_2451_34572.
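A minimal sketch of turning such a file into a neighborhood table is given here. It assumes a GeoJSON-style document in which each entry of `features` carries `properties.borough`, `properties.name` and a `geometry.coordinates` pair ordered as longitude, latitude. That layout, the function name `features_to_dataframe`, and the two hand-written sample records are assumptions for illustration; the real file should be inspected before relying on them.

```python
import pandas as pd


def features_to_dataframe(geojson):
    """Flatten GeoJSON-style features into Borough/Neighborhood/Latitude/Longitude rows."""
    rows = []
    for feature in geojson["features"]:
        props = feature["properties"]
        # GeoJSON stores point coordinates as [longitude, latitude].
        lon, lat = feature["geometry"]["coordinates"][:2]
        rows.append(
            {
                "Borough": props["borough"],
                "Neighborhood": props["name"],
                "Latitude": lat,
                "Longitude": lon,
            }
        )
    return pd.DataFrame(rows, columns=["Borough", "Neighborhood", "Latitude", "Longitude"])


# Two hand-written records in the assumed layout (not the real file).
sample = {
    "features": [
        {"properties": {"borough": "Manhattan", "name": "Marble Hill"},
         "geometry": {"type": "Point", "coordinates": [-73.91066, 40.876551]}},
        {"properties": {"borough": "Manhattan", "name": "Chinatown"},
         "geometry": {"type": "Point", "coordinates": [-73.994279, 40.715618]}},
    ]
}

neighborhoods = features_to_dataframe(sample)
# Keep only the borough that contains the target address.
manhattan = neighborhoods[neighborhoods["Borough"] == "Manhattan"].reset_index(drop=True)
print(manhattan)
```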

#### 2.2 Description of data

This project uses Foursquare location data acquired for the neighborhoods of Toronto and New York. The final dataset is built by merging selected outputs of the separate clustering analyses of the source city (Toronto) and the target city (New York). The logic used to merge and collate the data is described below.

Clustering the Toronto neighborhoods shows that the two neighborhoods nearest to the source address fall into the same cluster. These two neighborhoods are 1) Kensington Market, Chinatown, Grange Park and 2) Toronto Dominion Centre, Design Exchange. The data for these neighborhoods indicates the user's current interests and can be combined with the New York neighborhood data to solve the problem.

**1) manhattan_toronto_grouped_clustering.csv**
This data is prepared by merging the grouped_clustering dataframes from the Manhattan neighborhood clustering and from the selected neighborhoods of the Toronto clustering. It is the input to the combined k-means clustering.

**2) mh_tor_data.csv**
This contains the merged location information for the selected Toronto neighborhoods and all Manhattan neighborhoods.

**3) mh_tor_neighborhoods_venues_sorted.csv**
This data is used to identify the user's favorite amenities. The neighborhood_venues_sorted dataframe contains the 10 most common venues in each neighborhood, sorted by how frequently they occur in that neighborhood. The rows for the two selected Toronto neighborhoods are taken from this dataframe, which identifies the user's favorite amenities. These rows are then merged with the neighborhood_venues_sorted dataframe from the Manhattan clustering to obtain mh_tor_neighborhoods_venues_sorted.csv.

These datasets are stored in the following CSV files:
1) manhattan_toronto_grouped_clustering.csv
2) mh_tor_data.csv
3) mh_tor_neighborhoods_venues_sorted.csv
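
The merge behind the first file can be sketched as follows, assuming each grouped table has one row per neighborhood and one column per venue category holding its relative frequency. Categories that appear in only one city are filled with 0 so that both cities share the same feature columns. The tiny tables, their values, the placeholder row "Some Other Area" and the helper name `merge_grouped` are made up for illustration and are not the notebook's actual code.

```python
import pandas as pd


def merge_grouped(target_grouped, source_grouped, source_neighborhoods):
    """Stack two per-neighborhood frequency tables over the union of their venue columns."""
    selected = source_grouped[source_grouped["Neighborhood"].isin(source_neighborhoods)]
    merged = pd.concat([target_grouped, selected], ignore_index=True, sort=False)
    # A category missing from one city simply has frequency 0 there.
    return merged.fillna(0.0)


# Hypothetical frequencies, for illustration only.
manhattan_grouped = pd.DataFrame(
    {"Neighborhood": ["West Village", "Soho"],
     "Coffee Shop": [0.10, 0.05],
     "Jazz Club": [0.02, 0.00]}
)
toronto_grouped = pd.DataFrame(
    {"Neighborhood": ["Kensington Market, Chinatown, Grange Park", "Some Other Area"],
     "Coffee Shop": [0.08, 0.01],
     "Dumpling Restaurant": [0.03, 0.00]}
)

merged = merge_grouped(
    manhattan_grouped,
    toronto_grouped,
    ["Kensington Market, Chinatown, Grange Park"],
)
# Only the numeric columns would be passed on to k-means.
features = merged.drop(columns="Neighborhood")
print(merged)
```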

#### 2.3 How the data is used to solve the problem

First, the source city neighborhoods are clustered (see the Clustering_SourceCity_Neighborhoods notebook for details). From the resulting clusters, the cluster containing most of the neighborhoods nearest to the source address is identified. It turns out to be the cluster containing 1) Kensington Market, Chinatown, Grange Park and 2) Toronto Dominion Centre, Design Exchange. The data for these neighborhoods indicates the user's current interests. Once the user's interests are extracted from this cluster, they are combined with the New York neighborhood data to solve the problem.

The relevant data from these two neighborhoods and the Manhattan neighborhoods is merged and saved as explained in the previous section. This combined data is then segmented using k-means clustering. In this final clustering, the cluster containing the two selected Toronto neighborhoods identifies the New York neighborhoods that match them in terms of amenities. From this reduced list of New York neighborhoods, the user can pick the one closest to his prospective office, which satisfies all of his conditions.

## 3. Methodology 

The goal is to help a user locate a neighborhood close to his prospective New York office. The user's source address is a street address in Toronto, Canada. A clustering analysis was performed in the source city clustering notebook (Clustering_SourceCity_Neighborhoods). From those clustering results, the neighborhoods close to the source address (Kensington Market and Toronto Dominion Centre) are taken to contain the user's favorite venues. The data for those neighborhoods is merged with the corresponding data for New York neighborhoods. Specifically, Manhattan neighborhoods are used because the target address is in Manhattan. The data sources, merging logic and data description are presented in the Data section.

The merged dataset is then used to solve the problem using machine learning techniques as explained below:

In particular, k-means clustering is applied to segment the neighborhoods into clusters with similar features. The cluster containing the two selected Toronto neighborhoods identifies the New York neighborhoods that match them in terms of amenities. From this reduced list, the user can pick the neighborhood closest to his prospective office, which satisfies all of his conditions.
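
What k-means does with such a feature matrix can be illustrated with a bare-bones NumPy version of the standard Lloyd iteration. This is a teaching sketch, not scikit-learn's `KMeans` (which, among other refinements, uses k-means++ initialisation by default); the function name `lloyd_kmeans` and the toy frequency rows are invented for illustration.

```python
import numpy as np


def lloyd_kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd iteration: assign rows to the nearest centroid, then recompute centroids."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct rows chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every row to every centroid.
        distances = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = distances.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break  # assignments have stabilised
        centroids = new_centroids
    return labels, centroids


# Hypothetical venue frequencies: columns are Café, Italian Restaurant, Park.
X = [
    [0.30, 0.02, 0.01],  # café-heavy neighborhood
    [0.28, 0.03, 0.02],  # café-heavy neighborhood
    [0.02, 0.25, 0.03],  # Italian-heavy neighborhood
    [0.01, 0.27, 0.02],  # Italian-heavy neighborhood
]
labels, centroids = lloyd_kmeans(X, k=2)
print(labels)
```

Neighborhoods with similar venue profiles end up with the same label; the label numbers themselves are arbitrary and depend on the initialisation.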


Import all the dependencies needed.

In [1]:


import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if needed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON data into a pandas dataframe (top-level import since pandas 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if needed
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


Let's get the geographical coordinates of Manhattan.

In [2]:
address = 'Manhattan, NY'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Manhattan are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Manhattan are 40.7896239, -73.9598939.


#### Cluster the combined set of Neighborhoods
Run *k*-means to cluster the neighborhoods into 5 clusters.

In [3]:
# set number of clusters
kclusters = 5

# load the merged grouped data prepared in the Data section
mh_toronto = pd.read_csv('manhattan_toronto_grouped_clustering.csv')
# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(mh_toronto)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([3, 3, 3, 3, 3, 3, 3, 3, 1, 1])

Let's get the dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [4]:
neighborhoods_venues_sorted=pd.read_csv('mh_tor_neighborhoods_venues_sorted.csv')
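
The file loaded above holds the ten most common venue categories per neighborhood. A sketch of how such a table can be derived from a per-neighborhood frequency table is given here, assuming one `Neighborhood` column plus one numeric column per venue category. The function name `top_venues`, the neighborhood names "A" and "B" and the frequencies are hypothetical.

```python
import pandas as pd


def top_venues(grouped, num_top_venues=10):
    """Return one row per neighborhood listing its most frequent venue categories."""
    freq = grouped.set_index("Neighborhood")
    records = []
    for name, row in freq.iterrows():
        # Sort categories by frequency, highest first, and keep the leading names.
        ranked = row.sort_values(ascending=False).index[:num_top_venues]
        record = {"Neighborhood": name}
        for rank, category in enumerate(ranked, start=1):
            # Simple ordinal suffix rule, valid for ranks 1 to 20.
            suffix = {1: "st", 2: "nd", 3: "rd"}.get(rank, "th")
            record[f"{rank}{suffix} Most Common Venue"] = category
        records.append(record)
    return pd.DataFrame(records)


# Hypothetical frequencies, for illustration only.
grouped = pd.DataFrame(
    {"Neighborhood": ["A", "B"],
     "Café": [0.30, 0.05],
     "Park": [0.10, 0.20],
     "Bar": [0.20, 0.40]}
)
top2 = top_venues(grouped, num_top_venues=2)
print(top2)
```

Raising `num_top_venues` is all that is needed to rank more categories per neighborhood.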

Add clustering labels and make the manhattan_merged dataframe

In [5]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

manhattan_merged = pd.read_csv('mh_tor_data.csv')

# join the location data with the cluster labels and top venues for each neighborhood
manhattan_merged = manhattan_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

manhattan_merged # check the last two rows (the Toronto neighborhoods)

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Manhattan,Marble Hill,40.876551,-73.91066,2,Coffee Shop,Gym,Sandwich Place,Discount Store,Diner,Pizza Place,Steakhouse,Supplement Shop,Big Box Store,Seafood Restaurant
1,Manhattan,Chinatown,40.715618,-73.994279,3,Chinese Restaurant,Cocktail Bar,Bakery,Dessert Shop,Spa,Salon / Barbershop,American Restaurant,Optical Shop,Vietnamese Restaurant,Noodle House
2,Manhattan,Washington Heights,40.851903,-73.9369,0,Café,Bakery,Chinese Restaurant,Deli / Bodega,Mobile Phone Shop,Grocery Store,Sandwich Place,Bank,Tapas Restaurant,Coffee Shop
3,Manhattan,Inwood,40.867684,-73.92121,1,Mexican Restaurant,Restaurant,Café,Lounge,Spanish Restaurant,Bakery,Park,Pizza Place,Chinese Restaurant,Caribbean Restaurant
4,Manhattan,Hamilton Heights,40.823604,-73.949688,1,Pizza Place,Café,Coffee Shop,Mexican Restaurant,Yoga Studio,Park,Caribbean Restaurant,School,Chinese Restaurant,Sandwich Place
5,Manhattan,Manhattanville,40.816934,-73.957385,2,Coffee Shop,Seafood Restaurant,Mexican Restaurant,Sushi Restaurant,Italian Restaurant,Chinese Restaurant,Ramen Restaurant,Café,Boutique,Diner
6,Manhattan,Central Harlem,40.815976,-73.943211,3,African Restaurant,Cosmetics Shop,Chinese Restaurant,Fried Chicken Joint,French Restaurant,Bar,American Restaurant,Seafood Restaurant,Boutique,Bookstore
7,Manhattan,East Harlem,40.792249,-73.944182,3,Mexican Restaurant,Bakery,Thai Restaurant,Latin American Restaurant,Deli / Bodega,Sandwich Place,Spanish Restaurant,Liquor Store,Gas Station,Taco Place
8,Manhattan,Upper East Side,40.775639,-73.960508,0,Italian Restaurant,Bakery,Exhibit,Coffee Shop,Gym / Fitness Center,Yoga Studio,Hotel,Juice Bar,French Restaurant,Cosmetics Shop
9,Manhattan,Yorkville,40.77593,-73.947118,0,Italian Restaurant,Gym,Coffee Shop,Bar,Sushi Restaurant,Deli / Bodega,Wine Shop,Diner,Japanese Restaurant,Bagel Shop


Finally, let's visualize the resulting clusters.
#### (Tip: zoom out on the map to see the clusters in both Toronto and Manhattan)

In [6]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add one circle marker per neighborhood, colored by its cluster label
for lat, lon, poi, cluster in zip(manhattan_merged['Latitude'], manhattan_merged['Longitude'], manhattan_merged['Neighborhood'], manhattan_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],  # index directly by the 0-based cluster label
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

#### Examine the Clusters 
Now, let us examine each cluster and determine the discriminating venue categories that distinguish each cluster.

#### Cluster 0

In [7]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 0, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Washington Heights,Café,Bakery,Chinese Restaurant,Deli / Bodega,Mobile Phone Shop,Grocery Store,Sandwich Place,Bank,Tapas Restaurant,Coffee Shop
8,Upper East Side,Italian Restaurant,Bakery,Exhibit,Coffee Shop,Gym / Fitness Center,Yoga Studio,Hotel,Juice Bar,French Restaurant,Cosmetics Shop
9,Yorkville,Italian Restaurant,Gym,Coffee Shop,Bar,Sushi Restaurant,Deli / Bodega,Wine Shop,Diner,Japanese Restaurant,Bagel Shop
12,Upper West Side,Italian Restaurant,Bar,Café,Wine Bar,Coffee Shop,Vegetarian / Vegan Restaurant,Indian Restaurant,Ice Cream Shop,Mediterranean Restaurant,Thai Restaurant
24,West Village,Italian Restaurant,New American Restaurant,American Restaurant,Cocktail Bar,Wine Bar,Park,Jazz Club,Theater,Pizza Place,Coffee Shop
35,Turtle Bay,Italian Restaurant,Coffee Shop,Sushi Restaurant,Deli / Bodega,Japanese Restaurant,Park,Ramen Restaurant,French Restaurant,Seafood Restaurant,Karaoke Bar
40,"Kensington Market, Chinatown, Grange Park",Café,Vegetarian / Vegan Restaurant,Coffee Shop,Vietnamese Restaurant,Bar,Mexican Restaurant,Park,Grocery Store,Pizza Place,Dumpling Restaurant
41,"Toronto Dominion Centre, Design Exchange",Coffee Shop,Hotel,Café,Restaurant,American Restaurant,Seafood Restaurant,Gastropub,Salad Place,Italian Restaurant,Japanese Restaurant


#### Cluster 1

In [8]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 1, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,Inwood,Mexican Restaurant,Restaurant,Café,Lounge,Spanish Restaurant,Bakery,Park,Pizza Place,Chinese Restaurant,Caribbean Restaurant
4,Hamilton Heights,Pizza Place,Café,Coffee Shop,Mexican Restaurant,Yoga Studio,Park,Caribbean Restaurant,School,Chinese Restaurant,Sandwich Place
10,Lenox Hill,Italian Restaurant,Sushi Restaurant,Coffee Shop,Pizza Place,Cocktail Bar,Café,Gym,Gym / Fitness Center,Burger Joint,Thai Restaurant
18,Greenwich Village,Italian Restaurant,Sushi Restaurant,Clothing Store,Café,Ice Cream Shop,Indian Restaurant,Gym,Boutique,Chinese Restaurant,Burger Joint
19,East Village,Bar,Pizza Place,Mexican Restaurant,Wine Bar,Ice Cream Shop,Cocktail Bar,Italian Restaurant,Vietnamese Restaurant,Speakeasy,Korean Restaurant
27,Gramercy,Bar,Italian Restaurant,American Restaurant,Bagel Shop,Thai Restaurant,Coffee Shop,Pizza Place,Cocktail Bar,Diner,Playground
29,Financial District,Coffee Shop,Bar,Hotel,Cocktail Bar,Pizza Place,Steakhouse,Gym,Café,Park,American Restaurant
38,Flatiron,Cycle Studio,New American Restaurant,Italian Restaurant,Japanese Restaurant,Furniture / Home Store,Gym,Mediterranean Restaurant,American Restaurant,Sporting Goods Shop,Mexican Restaurant
39,Hudson Yards,Hotel,Gym / Fitness Center,Café,American Restaurant,Thai Restaurant,Italian Restaurant,Gym,Park,Dog Run,Nightclub


#### Cluster 2

In [9]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 2, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Marble Hill,Coffee Shop,Gym,Sandwich Place,Discount Store,Diner,Pizza Place,Steakhouse,Supplement Shop,Big Box Store,Seafood Restaurant
5,Manhattanville,Coffee Shop,Seafood Restaurant,Mexican Restaurant,Sushi Restaurant,Italian Restaurant,Chinese Restaurant,Ramen Restaurant,Café,Boutique,Diner
13,Lincoln Square,Plaza,Café,Concert Hall,Performing Arts Venue,Theater,Indie Movie Theater,American Restaurant,Gym / Fitness Center,Wine Shop,Italian Restaurant
15,Midtown,Hotel,Bakery,Coffee Shop,Clothing Store,Sporting Goods Shop,Theater,Steakhouse,Sandwich Place,Bookstore,Japanese Restaurant
20,Lower East Side,Chinese Restaurant,Bakery,Park,Pizza Place,Ramen Restaurant,Art Gallery,Coffee Shop,Café,Japanese Restaurant,Yoga Studio
22,Little Italy,Bakery,Coffee Shop,Café,Mediterranean Restaurant,Chinese Restaurant,Italian Restaurant,Bubble Tea Shop,Cocktail Bar,Ice Cream Shop,Pizza Place
25,Manhattan Valley,Bar,Coffee Shop,Playground,Mexican Restaurant,Yoga Studio,Pizza Place,Park,Fried Chicken Joint,Korean Restaurant,Latin American Restaurant
26,Morningside Heights,Park,Bookstore,American Restaurant,Coffee Shop,Deli / Bodega,Burger Joint,Seafood Restaurant,Frozen Yogurt Shop,Supermarket,Mediterranean Restaurant
33,Midtown South,Korean Restaurant,Hotel,Japanese Restaurant,Burger Joint,Dessert Shop,Café,Gym / Fitness Center,Bakery,American Restaurant,Cosmetics Shop


#### Cluster 3

In [10]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 3, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Chinatown,Chinese Restaurant,Cocktail Bar,Bakery,Dessert Shop,Spa,Salon / Barbershop,American Restaurant,Optical Shop,Vietnamese Restaurant,Noodle House
6,Central Harlem,African Restaurant,Cosmetics Shop,Chinese Restaurant,Fried Chicken Joint,French Restaurant,Bar,American Restaurant,Seafood Restaurant,Boutique,Bookstore
7,East Harlem,Mexican Restaurant,Bakery,Thai Restaurant,Latin American Restaurant,Deli / Bodega,Sandwich Place,Spanish Restaurant,Liquor Store,Gas Station,Taco Place
14,Clinton,Theater,American Restaurant,Gym / Fitness Center,Coffee Shop,Sandwich Place,Italian Restaurant,Gym,Wine Shop,Spa,Hotel
17,Chelsea,Coffee Shop,American Restaurant,Art Gallery,Italian Restaurant,Bakery,Ice Cream Shop,French Restaurant,Cycle Studio,Theater,Market
28,Battery Park City,Park,Hotel,Gym,Coffee Shop,Memorial Site,Sandwich Place,Playground,Plaza,Burger Joint,Gourmet Shop
30,Carnegie Hill,Coffee Shop,Café,Yoga Studio,Bookstore,Gym / Fitness Center,Gym,Italian Restaurant,French Restaurant,Cosmetics Shop,Pizza Place
32,Civic Center,Coffee Shop,Yoga Studio,Gym / Fitness Center,Spa,Cocktail Bar,Hotel,Wine Shop,French Restaurant,Park,American Restaurant


#### Cluster 4

In [11]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 4, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
11,Roosevelt Island,Park,Food & Drink Shop,Gym,Greek Restaurant,Kosher Restaurant,Coffee Shop,Dry Cleaner,Liquor Store,Sandwich Place,Scenic Lookout
16,Murray Hill,Sandwich Place,Coffee Shop,Hotel,Gym / Fitness Center,American Restaurant,Japanese Restaurant,Pizza Place,Gym,Restaurant,Sushi Restaurant
21,Tribeca,Park,American Restaurant,Italian Restaurant,Spa,Wine Shop,Wine Bar,Café,Greek Restaurant,Coffee Shop,Men's Store
23,Soho,Clothing Store,Italian Restaurant,Coffee Shop,Boutique,Bakery,Mediterranean Restaurant,Shoe Store,Sporting Goods Shop,Salon / Barbershop,Pizza Place
31,Noho,Italian Restaurant,Hotel,Pizza Place,Yoga Studio,Grocery Store,Coffee Shop,Mexican Restaurant,Wine Bar,Wine Shop,Bookstore
34,Sutton Place,Italian Restaurant,Park,Coffee Shop,Furniture / Home Store,Gym / Fitness Center,Pizza Place,Grocery Store,Bagel Shop,Bakery,Bar
36,Tudor City,Park,Café,Mexican Restaurant,Coffee Shop,Deli / Bodega,Diner,Garden,Thai Restaurant,Greek Restaurant,Seafood Restaurant
37,Stuyvesant Town,Park,Baseball Field,Heliport,Gas Station,Farmers Market,Boat or Ferry,Bistro,Gym / Fitness Center,Bar,Cocktail Bar


## 4. Results
Five clusters are formed by the **k-means clustering** performed in the previous section. The neighborhoods selected from the source city (Toronto) are clustered together with several target city (New York) neighborhoods in **Cluster 0**. The map can now be used to check which of the Cluster 0 neighborhoods is closest to the prospective job location.
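
The step of picking out the target-city neighborhoods that share a cluster with the Toronto neighborhoods can also be done programmatically. The following is a small sketch, assuming a table with `Neighborhood` and `Cluster Labels` columns as in the notebook; the helper name `neighborhoods_sharing_cluster` is hypothetical, and only a handful of rows from the cluster tables above are reproduced.

```python
import pandas as pd


def neighborhoods_sharing_cluster(labelled, source_neighborhoods):
    """Rows in the same cluster(s) as the source neighborhoods, excluding the sources."""
    is_source = labelled["Neighborhood"].isin(source_neighborhoods)
    source_clusters = labelled.loc[is_source, "Cluster Labels"].unique()
    same_cluster = labelled["Cluster Labels"].isin(source_clusters)
    return labelled[same_cluster & ~is_source]


# A few rows copied from the cluster tables above.
labelled = pd.DataFrame(
    {
        "Neighborhood": [
            "West Village",
            "Turtle Bay",
            "Soho",
            "Chelsea",
            "Kensington Market, Chinatown, Grange Park",
            "Toronto Dominion Centre, Design Exchange",
        ],
        "Cluster Labels": [0, 0, 4, 3, 0, 0],
    }
)
sources = [
    "Kensington Market, Chinatown, Grange Park",
    "Toronto Dominion Centre, Design Exchange",
]
candidates = neighborhoods_sharing_cluster(labelled, sources)
print(candidates["Neighborhood"].tolist())  # ['West Village', 'Turtle Bay']
```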

Of all the Manhattan neighborhoods clustered with the user's favorite Toronto neighborhoods, the **West Village neighborhood** is the closest to the user's Manhattan office address (West Houston Street, Manhattan, New York). Hence it is selected as the preferred neighborhood.
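
Instead of reading the closest neighborhood off the map, the distances can be computed with the haversine formula. The sketch below uses only the standard library. The first three coordinates are taken from the merged table shown earlier; the West Village and office coordinates are approximate values assumed for illustration (in practice they would come from the dataset and from a geocoder), and the function names are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(a))


def nearest(candidates, target):
    """Return the (name, distance_km) pair of the candidate closest to the target point."""
    distances = {
        name: haversine_km(lat, lon, target[0], target[1])
        for name, (lat, lon) in candidates.items()
    }
    best = min(distances, key=distances.get)
    return best, distances[best]


cluster0 = {
    "Washington Heights": (40.851903, -73.9369),   # from the merged table
    "Upper East Side": (40.775639, -73.960508),    # from the merged table
    "Yorkville": (40.77593, -73.947118),           # from the merged table
    "West Village": (40.734, -74.006),             # approximate, assumed
}
office = (40.728, -74.005)  # approximate point on West Houston Street, assumed

name, km = nearest(cluster0, office)
print(f"{name}: {km:.1f} km")
```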

## 5. Discussion

In this report, location data derived from https://foursquare.com/ has been used to solve a data science problem. The neighborhood data from two different cities is merged to define a new dataset. The solution can be generalized for a wider audience.

The system is designed so that the **user only needs to provide the source address and the target address**. The **favorite amenities of the user are then identified automatically by the system** using the location data from the source city. These amenities are matched against the target city neighborhoods to choose the right one for the user: the neighborhood that satisfies all of the user's conditions, including proximity to the target address.

A quick comparison of the neighborhoods in the selected cluster (Cluster 0) shows that the **user gets many of his favorite amenities in his new city** as well. The user could choose any of the neighborhoods in Cluster 0, but **West Village** is chosen because of its proximity to the target address.
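
A comparison of this kind can be made concrete by intersecting the top-10 venue lists. The sketch below copies the three relevant rows from the Cluster 0 table and counts the categories that West Village shares with the two Toronto neighborhoods; the variable names are illustrative.

```python
# Top-10 venue categories copied from the Cluster 0 table above.
west_village = {
    "Italian Restaurant", "New American Restaurant", "American Restaurant",
    "Cocktail Bar", "Wine Bar", "Park", "Jazz Club", "Theater",
    "Pizza Place", "Coffee Shop",
}
kensington = {
    "Café", "Vegetarian / Vegan Restaurant", "Coffee Shop",
    "Vietnamese Restaurant", "Bar", "Mexican Restaurant", "Park",
    "Grocery Store", "Pizza Place", "Dumpling Restaurant",
}
toronto_dominion = {
    "Coffee Shop", "Hotel", "Café", "Restaurant", "American Restaurant",
    "Seafood Restaurant", "Gastropub", "Salad Place", "Italian Restaurant",
    "Japanese Restaurant",
}

# Treat the union of both Toronto lists as the user's favorite categories.
toronto_favorites = kensington | toronto_dominion
shared = sorted(west_village & toronto_favorites)
print(len(shared), shared)
# 5 ['American Restaurant', 'Coffee Shop', 'Italian Restaurant', 'Park', 'Pizza Place']
```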

Here the 10 most common venue categories of each neighborhood were used to reach the conclusion. This approach **can be extended to include more venues** in the future, which could make the system more robust. There could also be a **provision for the user to add a few extra amenities** to the list of favorites, which would enhance the **user experience**.


## 6. Conclusion

To conclude, a **machine learning** based approach has been described for identifying similar neighborhoods in two different cities. Location data derived from **https://foursquare.com/** has been used to solve this as a **data science problem**. It could help people moving from one city to another to locate a neighborhood with their favorite amenities. This is just one application of the solution, which can have a **variety of other use cases involving international travellers**.