<h1>Capstone Project - The Battle of Neighborhoods Moncton, NB </h1>
<i>Author: Philip Igwubor</i>

<h2><hr>1.0 Introduction: Background</h2>
 
Canada is one of the largest countries in the world by landmass and is divided into seven main regions: <b>Atlantic, Canadian Shield, Cordillera, Great Lakes, Prairies, North, and St. Lawrence River.</b> The Atlantic region, which comprises New Brunswick, Nova Scotia, Prince Edward Island, and Newfoundland and Labrador, is accepting a large number of immigrants to settle in its provinces.

<b>New Brunswick (NB)</b> is the focus of this analysis because it is one of the fastest-growing provinces in the Atlantic region and the only officially bilingual province. It offers many opportunities for immigrants to settle, and it creates an enabling environment for new companies through moderate taxes and mortgages, along with good amenities and social life for new residents. Many new immigrants, myself included, face the decision of which city is best to settle in, in terms of education, amenities, community gatherings, social life, a multicultural society for bilingual speakers, and so on.

<b>Moncton</b> is the city of choice because it is multicultural and diverse and serves as the financial hub of New Brunswick. It ranks among the major cities working to boost the province's wealth and, more broadly, Canada's GDP.


<h3>1.1 Problem to be resolved:</h3>
    
The challenge is to check how this city compares with big cities (Toronto, New York, etc.) in terms of settlement, amenities, business, and entertainment services, and to find out whether it offers similar characteristics and benefits:

- Check the educational system, since immigrants are concerned about the availability of quality education for their children
- Examine community settlements and how viable they are for business-minded immigrants
- Identify amenities and venues similar to those in cities such as Toronto and New York

<h3>1.2 Targeted Audience</h3>

This project is relevant for immigrants considering a move to the Atlantic region of Canada, since the approach and methodology used here are centered on a major city in the region. Foursquare data and mapping techniques, combined with data analysis, will help answer the key questions listed in the problem description.


<h2>2.0 Data Source & Description</h2>

The data needed for my analysis is sourced from the City of Moncton website. The data file can be downloaded from www.moncton.ca in CSV format, along with the geospatial coordinates of the locations covering all the categories of activity analyzed in this report. Based on the problem definition, the factors that will influence my decision are:

1. The demographics of Moncton and its neighboring communities (Dieppe, Riverview, LSD)
2. The location of schools and educational facilities for students and working immigrants alike
3. Which borough is the most vibrant for settlers
4. Segmentation of the boroughs
5. Whether there are venues such as gyms, entertainment zones, and parks for recreational activities
6. Untapped resources for business-minded immigrants

The following data sources are needed to extract or generate the required information:
- Part 1: http://open.moncton.ca/datasets/points-of-interest : This dataset contains points representing specific locations within the City of Moncton, such as municipal facilities, schools, and hospitals

- Part 3: https://catalogue-moncton.opendata.arcgis.com/datasets/points-of-interest/data : Coordinates of the neighborhoods, downloaded as a GeoJSON file (https://opendata.arcgis.com/datasets/f13ec4ddadde46f6b0875047b13ec333_0.geojson)


<h2>3.0 Methodology</h2>

#### Download the dependencies needed to explore and analyze our data

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # comment out this line if geopy is already installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON into a pandas dataframe (pandas >= 1.0 also exposes this as pd.json_normalize)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes # comment out this line if folium is already installed
import folium # map rendering library

print('Libraries imported.')


<h4>Download and Explore the Dataset</h4>

In [2]:
!wget -q -O 'moncton_data.json' https://opendata.arcgis.com/datasets/f13ec4ddadde46f6b0875047b13ec333_0.geojson
print('Data downloaded!')

Data downloaded!


In [3]:
with open('moncton_data.json') as json_data:
    moncton_data = json.load(json_data)

In [4]:
# A preview of how the data looks (display disabled because the output is long)
#moncton_data

In [5]:
neighborhoods_data = moncton_data['features']

In [6]:
#take a look at the first item in this list.
neighborhoods_data[0]

{'type': 'Feature',
 'properties': {'OBJECTID': 1,
  'CATEGORY': 'EDUCATION',
  'SUBCATEGORY': 'SCHOOL ELEMENTARY D-2',
  'NAME': 'Edith Cavell School',
  'NAMEFR': 'École Édith Cavell',
  'ADDRESS': '125 PARK ST',
  'ADDRESSFR': None,
  'POSTCODE': 'E1C 2B4',
  'PHONENUM': '506-856-3473',
  'JURISDICTION': 'Moncton',
  'GlobalID': '5e04d169-0e84-4984-bb28-825a60daa069'},
 'geometry': {'type': 'Point',
  'coordinates': [-64.78761599156698, 46.09177539919614]}}

#### Transform the data into a *pandas* dataframe

In [7]:
# define the dataframe columns
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude'] 

# instantiate the dataframe
neighborhoods = pd.DataFrame(columns=column_names)

In [8]:
rows = []
for data in neighborhoods_data:
    borough = data['properties']['JURISDICTION']
    neighborhood_name = data['properties']['NAME']

    # GeoJSON stores coordinates as [longitude, latitude]
    neighborhood_latlon = data['geometry']['coordinates']
    neighborhood_lat = neighborhood_latlon[1]
    neighborhood_lon = neighborhood_latlon[0]

    rows.append({'Borough': borough,
                 'Neighborhood': neighborhood_name,
                 'Latitude': neighborhood_lat,
                 'Longitude': neighborhood_lon})

# build the dataframe in one step; DataFrame.append was removed in pandas 2.0
neighborhoods = pd.DataFrame(rows, columns=column_names)
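
The loop above can also be expressed as a small helper that turns one GeoJSON feature into a flat record. The following is a minimal sketch using only the standard library; `feature_to_row` is a hypothetical helper name, and the sample feature is a trimmed copy of the first item shown earlier. Note that GeoJSON stores coordinates as [longitude, latitude], so the order is swapped when reading them.

```python
# Minimal sketch: flatten GeoJSON point features into row dictionaries.
# `feature_to_row` is a hypothetical helper, not part of the original notebook.

def feature_to_row(feature):
    """Return Borough/Neighborhood/Latitude/Longitude for one GeoJSON feature."""
    props = feature['properties']
    lon, lat = feature['geometry']['coordinates']  # GeoJSON order is [lon, lat]
    return {
        'Borough': props['JURISDICTION'],
        'Neighborhood': props['NAME'],
        'Latitude': lat,
        'Longitude': lon,
    }

# Sample feature trimmed from the first item of the dataset shown above.
sample_feature = {
    'type': 'Feature',
    'properties': {'NAME': 'Edith Cavell School', 'JURISDICTION': 'Moncton'},
    'geometry': {'type': 'Point',
                 'coordinates': [-64.78761599156698, 46.09177539919614]},
}

rows = [feature_to_row(f) for f in [sample_feature]]
print(rows[0])
```

A list of such rows can be passed to `pd.DataFrame(rows)` in a single call.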

Quickly examine the resulting dataframe to get a clear picture of the dataset we are working with.

In [9]:
# Quickly examine the resulting dataframe.
neighborhoods.tail()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
340,Moncton,CapitolTheatre Lot,46.088927,-64.779495
341,Moncton,Victoria Lot,46.091027,-64.776185
342,Moncton,Riverfront Park Lot,46.086284,-64.775204
343,Moncton,Moncton Market Lot,46.087236,-64.778162
344,Moncton,Robinson Lot,46.087432,-64.778823


Confirm the dataset has all 4 boroughs and 345 neighborhoods.

In [10]:
# Confirm the dataset has all 4 boroughs and 345 neighborhoods.

print('The dataframe has {} boroughs and {} neighborhoods.'.format(
        len(neighborhoods['Borough'].unique()),
        neighborhoods.shape[0]
    )
)

The dataframe has 4 boroughs and 345 neighborhoods.


#### Use geopy library to get the latitude and longitude values of Moncton City

In [11]:
address = 'Moncton City, NB'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Moncton City are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Moncton City are 46.097995, -64.80011.
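
One caveat with the lookup above: geopy's `geocode` returns `None` when it finds no match, in which case reading `location.latitude` raises an `AttributeError`. The sketch below shows one way to guard against that. `safe_coordinates` and `fake_geocode` are hypothetical names introduced for illustration, and the stub stands in for `geolocator.geocode` so the example runs offline.

```python
# Minimal sketch: guard against a failed geocoding lookup.
# `safe_coordinates` and `fake_geocode` are hypothetical names for illustration.
from collections import namedtuple

Location = namedtuple('Location', ['latitude', 'longitude'])

def safe_coordinates(geocode, address):
    """Call a geocode function and fail clearly if it finds nothing."""
    location = geocode(address)
    if location is None:  # geopy returns None when there is no match
        raise ValueError('No geocoding result for {!r}'.format(address))
    return location.latitude, location.longitude

def fake_geocode(address):
    """Offline stand-in for geolocator.geocode, using the value printed above."""
    known = {'Moncton City, NB': Location(46.097995, -64.80011)}
    return known.get(address)

print(safe_coordinates(fake_geocode, 'Moncton City, NB'))
```

In the notebook, `safe_coordinates(geolocator.geocode, address)` would take the place of the direct attribute access.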


#### Create a map of Moncton with neighborhoods superimposed on top.

In [12]:
# create map of Moncton using latitude and longitude values
map_moncton = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['Borough'], neighborhoods['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_moncton)  
    
map_moncton

**Folium** is the visualization library used to draw the map above. To simplify the analysis, I will segment and cluster only the neighborhoods in Dieppe. So let's slice the original dataframe and create a new dataframe of the Dieppe data.

In [13]:
Dieppe_data = neighborhoods[neighborhoods['Borough'] == 'Dieppe'].reset_index(drop=True)
Dieppe_data.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Dieppe,École Anna-Malenfant,46.085724,-64.71785
1,Dieppe,Lou Macnarin School,46.097661,-64.732614
2,Dieppe,École Mathieu-Martin,46.100219,-64.73382
3,Dieppe,École Amirault,46.069517,-64.717828
4,Dieppe,CCNB Dieppe,46.100411,-64.739181


In [14]:
address = 'Dieppe, NB'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Dieppe are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Dieppe are 46.0945258, -64.7354772.


In [15]:
# create map of Dieppe using latitude and longitude values
map_Dieppe = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(Dieppe_data['Latitude'], Dieppe_data['Longitude'], Dieppe_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_Dieppe)  
    
map_Dieppe

#### Define Foursquare Credentials and Version to explore the neighborhoods and segment them

In [17]:
CLIENT_ID = 'DxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxT' # your Foursquare ID
CLIENT_SECRET = '5xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx1' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: xxxxxxxxx')
print('CLIENT_SECRET:xxxxxxxxxxx')

Your credentials:
CLIENT_ID: xxxxxxxxx
CLIENT_SECRET:xxxxxxxxxxx


Let's explore the neighborhood at index 15 (the sixteenth row) in our dataframe.
First, get the neighborhood's name.

In [18]:
Dieppe_data.loc[15, 'Neighborhood']

'Dieppe Market'

In [19]:
# Get the neighborhood's latitude and longitude values

neighborhood_latitude = Dieppe_data.loc[15, 'Latitude'] # neighborhood latitude value
neighborhood_longitude = Dieppe_data.loc[15, 'Longitude'] # neighborhood longitude value

neighborhood_name = Dieppe_data.loc[15, 'Neighborhood'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Dieppe Market are 46.09416663082347, -64.74625446549567.


#### Now, let's get the top 100 venues within a radius of 2000 meters of Dieppe Market.
First, let's create the GET request URL.

In [20]:

LIMIT = 100 # limit of number of venues returned by Foursquare API
radius = 2000 # define radius


# create URL
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)
#url # display URL
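
As an alternative to assembling the query string by hand, the parameters can be kept in a dictionary and encoded in one step. The sketch below uses only the standard library; `build_explore_url` is a hypothetical helper, and the credential values are placeholders rather than real keys.

```python
# Minimal sketch: build the explore URL with urlencode instead of str.format.
# The credential values below are placeholders, not real Foursquare keys.
from urllib.parse import urlencode, urlsplit, parse_qs

BASE_URL = 'https://api.foursquare.com/v2/venues/explore'

def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Return the GET URL for the venues/explore endpoint used above."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }
    return BASE_URL + '?' + urlencode(params)

example_url = build_explore_url('MY_ID', 'MY_SECRET', '20180605',
                                46.09416663082347, -64.74625446549567, 2000, 100)

# Decode the query string again to check that the values survived encoding.
query = parse_qs(urlsplit(example_url).query)
print(query['ll'], query['radius'])
```

With `requests`, passing the same dictionary as `requests.get(BASE_URL, params=params)` performs the encoding automatically.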

I send the GET request and examine the results (N.B. the JSON result is long, so I have disabled its display).

In [21]:
results = requests.get(url).json()
#results

All the information we need is in the *items* key. Before proceeding, I borrow the **get_category_type** function from the Foursquare lab.

In [22]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']
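
To make the behavior of this function concrete, here is a small, self-contained sketch that applies it to three made-up rows: one with the flattened `venue.categories` key, one with a plain `categories` key, and one with no categories at all. The rows are illustrative only and are not taken from the API.

```python
# Minimal sketch: what get_category_type returns for typical rows.
# The sample rows are made up for illustration.

def get_category_type(row):
    """Return the first category name of a venue row, or None if it has none."""
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
    if len(categories_list) == 0:
        return None
    return categories_list[0]['name']

flattened_row = {'venue.name': 'Cafe Archibald',
                 'venue.categories': [{'name': 'Café'}]}
plain_row = {'name': 'Starbucks', 'categories': [{'name': 'Coffee Shop'}]}
uncategorized_row = {'name': 'Unknown place', 'categories': []}

for row in (flattened_row, plain_row, uncategorized_row):
    print(get_category_type(row))
```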

Now we are ready to clean the JSON and structure it into a pandas dataframe.

In [23]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

Unnamed: 0,name,categories,lat,lng
0,Marché de Dieppe,Farmers Market,46.094012,-64.746237
1,Cafe Archibald,Café,46.094318,-64.747311
2,Alcool NB Liquor,Liquor Store,46.098184,-64.757415
3,Sports Rock,Restaurant,46.095047,-64.756737
4,Starbucks,Coffee Shop,46.097181,-64.741218
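
As a sanity check on the search radius, the distance between the search center and a returned venue can be computed with the haversine formula. The sketch below is a standard-library illustration; `haversine_m` is a hypothetical helper, the Earth radius of 6,371,000 m is an assumed mean value, and the coordinates are copied from the outputs above.

```python
# Minimal sketch: great-circle distance between the market and one venue.
# `haversine_m` is a hypothetical helper; 6371000 m is an assumed mean Earth radius.
from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lon1, lat2, lon2, earth_radius_m=6371000.0):
    """Return the haversine distance in meters between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(a))

market = (46.09416663082347, -64.74625446549567)  # Dieppe Market, from above
liquor_store = (46.098184, -64.757415)            # Alcool NB Liquor, from the table

distance = haversine_m(*market, *liquor_store)
print('Distance: {:.0f} m, inside 2000 m radius: {}'.format(distance, distance <= 2000))
```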


In [None]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

#### Let's create a function to repeat the same process for all the neighborhoods in Dieppe

In [24]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
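
The function above assumes that every response contains a `groups` list and that every venue has at least one category. A more defensive way to parse one response might look like the sketch below. `extract_venues` is a hypothetical helper, and the payload is a made-up example that only mirrors the keys accessed in `getNearbyVenues`.

```python
# Minimal sketch: defensive parsing of one explore response.
# `extract_venues` is a hypothetical helper; the payload below is a made-up
# example that mirrors the keys accessed in getNearbyVenues.

def extract_venues(payload):
    """Return (name, lat, lng, category) tuples, tolerating missing pieces."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    venues = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories', [])
        category = categories[0]['name'] if categories else None
        venues.append((venue['name'],
                       venue['location']['lat'],
                       venue['location']['lng'],
                       category))
    return venues

sample_payload = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Tim Hortons',
               'location': {'lat': 46.09871, 'lng': -64.731075},
               'categories': [{'name': 'Coffee Shop'}]}},
    {'venue': {'name': 'Unlabelled venue',
               'location': {'lat': 46.0, 'lng': -64.7},
               'categories': []}},
]}]}}

print(extract_venues(sample_payload))
print(extract_venues({'response': {}}))  # a response without any groups
```

Returning an empty list for a response without groups lets the outer loop continue with the next neighborhood instead of stopping on a `KeyError` or `IndexError`.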

#### Run the above function on each neighborhood and create a new dataframe called *Dieppe_venues*.

In [25]:
Dieppe_venues = getNearbyVenues(names=Dieppe_data['Neighborhood'],
                                   latitudes=Dieppe_data['Latitude'],
                                   longitudes=Dieppe_data['Longitude']
                                  )


École Anna-Malenfant
Lou Macnarin School
École Mathieu-Martin
École Amirault
CCNB Dieppe
École Sainte-Thérèse
École Carrefour de l'Acadie
Dieppe Public Works Department
Dieppe Fire Station #1
Codiac RCMP Community Policing Office
Dieppe Fire Station # 2
Champlain Place
Dieppe Library
Fox Creek Golf Course
Dieppe City Hall
Dieppe Market
Greater Moncton International Airport
J-Albert-Cormier Ball Field
Our Lady of Cavalry Cemetery
Tiferes Israel
Club Rotary Lodge
Club d'age d'or de St.-Anselme
Dieppe Boys and Girls Club
Club d'age d'or de Dieppe
Aquatic and Sports Centre
Arthur-J.-LeBlanc Centre
Aréna Centenaire
Rotary Park Splash pad
Vélodrome de Dieppe
Skateboard Park
Tennis Club
St. Anselme Rotary Park
Place 1604
Dover Park
Lakeburn
Doreen
Dolbeau
Cimes
Thaddée
Horizon
Anna-Malenfant
Bahama - Juniper
Lavoie - Frédéric
Amand
Beauséjour
Gaspé
Cousteau
Avant-Garde
Centrale
Golf
Du Moulin
Domaine du faisan
Rita-McNeil (Parc Rotary)
Copains
Pélagie
Dover
Bayview
JF Bourgeois
Yvonne
Outdoor

In [28]:
#Check the size of resulting Dataframe

print(Dieppe_venues.shape)
Dieppe_venues.head()

(384, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,École Anna-Malenfant,46.085724,-64.71785,Riverview Rec center.,46.084455,-64.723929,Bar
1,Lou Macnarin School,46.097661,-64.732614,Dairy Queen,46.098747,-64.733476,Fast Food Restaurant
2,Lou Macnarin School,46.097661,-64.732614,Tim Hortons,46.09871,-64.731075,Coffee Shop
3,Lou Macnarin School,46.097661,-64.732614,Korean Restaurant/Acadian Pizza & Donair,46.098352,-64.732883,Korean Restaurant
4,Lou Macnarin School,46.097661,-64.732614,Pura Vida Yoga Dieppe,46.097186,-64.731399,Yoga Studio


In [29]:
#Let's check how many venues were returned for each neighborhood

Dieppe_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Aquatic and Sports Centre,5,5,5,5,5,5
Arthur-J.-LeBlanc Centre,2,2,2,2,2,2
Aréna Centenaire,10,10,10,10,10,10
Avant-Garde,1,1,1,1,1,1
Blue Olive Grocery Store (Middle Eastern),34,34,34,34,34,34
CCNB Dieppe,4,4,4,4,4,4
Centrale,2,2,2,2,2,2
Champlain Place,39,39,39,39,39,39
Chez Bernard Beauty Academy,9,9,9,9,9,9
Cimes,4,4,4,4,4,4


In [30]:
#Let's find out how many unique categories can be curated from all the returned venues

print('There are {} unique categories.'.format(len(Dieppe_venues['Venue Category'].unique())))

There are 65 unique categories.




## 3.3 Analyze Each Neighborhood

In [31]:
# one hot encoding
Dieppe_onehot = pd.get_dummies(Dieppe_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
Dieppe_onehot['Neighborhood'] = Dieppe_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [Dieppe_onehot.columns[-1]] + list(Dieppe_onehot.columns[:-1])
Dieppe_onehot = Dieppe_onehot[fixed_columns]

Dieppe_onehot.tail()

Unnamed: 0,Neighborhood,Airport,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Art Gallery,Bagel Shop,Bakery,Bank,Bar,Bike Shop,Bookstore,Breakfast Spot,Business Service,Café,Chinese Restaurant,Clothing Store,Coffee Shop,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Cricket Ground,Dessert Shop,Discount Store,Farmers Market,Fast Food Restaurant,Food,Gas Station,Golf Course,Grocery Store,Gym,Gym / Fitness Center,Health & Beauty Service,Home Service,Hotel,Ice Cream Shop,Italian Restaurant,Japanese Restaurant,Juice Bar,Korean Restaurant,Lingerie Store,Liquor Store,Mediterranean Restaurant,Men's Store,Movie Theater,Multiplex,Optical Shop,Park,Pharmacy,Photography Studio,Pizza Place,Rental Car Location,Restaurant,Sandwich Place,Shopping Mall,Skating Rink,Spa,Sporting Goods Shop,Sushi Restaurant,Tea Room,Thai Restaurant,Tourist Information Center,Toy / Game Store,Yoga Studio
379,Medes College,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
380,Medes College,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0
381,Medes College,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
382,Medes College,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
383,Medes College,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [32]:
#Examine the size of the new dataframe
Dieppe_onehot.shape

(384, 66)

#### Next, let's group rows by neighborhood and take the mean of the frequency of occurrence of each category

In [33]:
Dieppe_grouped = Dieppe_onehot.groupby('Neighborhood').mean().reset_index()
Dieppe_grouped

Unnamed: 0,Neighborhood,Airport,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Art Gallery,Bagel Shop,Bakery,Bank,Bar,Bike Shop,Bookstore,Breakfast Spot,Business Service,Café,Chinese Restaurant,Clothing Store,Coffee Shop,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Cricket Ground,Dessert Shop,Discount Store,Farmers Market,Fast Food Restaurant,Food,Gas Station,Golf Course,Grocery Store,Gym,Gym / Fitness Center,Health & Beauty Service,Home Service,Hotel,Ice Cream Shop,Italian Restaurant,Japanese Restaurant,Juice Bar,Korean Restaurant,Lingerie Store,Liquor Store,Mediterranean Restaurant,Men's Store,Movie Theater,Multiplex,Optical Shop,Park,Pharmacy,Photography Studio,Pizza Place,Rental Car Location,Restaurant,Sandwich Place,Shopping Mall,Skating Rink,Spa,Sporting Goods Shop,Sushi Restaurant,Tea Room,Thai Restaurant,Tourist Information Center,Toy / Game Store,Yoga Studio
0,Aquatic and Sports Centre,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.2,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Arthur-J.-LeBlanc Centre,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Aréna Centenaire,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.1,0.0,0.0,0.0,0.2,0.0,0.0,0.1,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Avant-Garde,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Blue Olive Grocery Store (Middle Eastern),0.0,0.0,0.0,0.0,0.029412,0.0,0.029412,0.0,0.029412,0.029412,0.0,0.0,0.0,0.0,0.0,0.029412,0.029412,0.088235,0.0,0.0,0.0,0.029412,0.0,0.029412,0.0,0.0,0.147059,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.029412,0.029412,0.029412,0.0,0.029412,0.0,0.029412,0.029412,0.029412,0.0,0.0,0.0,0.0,0.0,0.088235,0.0,0.0,0.0,0.029412,0.029412,0.029412,0.0,0.0,0.029412,0.0,0.029412,0.0,0.0,0.029412,0.0
5,CCNB Dieppe,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Centrale,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Champlain Place,0.0,0.0,0.0,0.0,0.025641,0.0,0.025641,0.0,0.025641,0.025641,0.0,0.025641,0.0,0.0,0.0,0.025641,0.025641,0.102564,0.0,0.0,0.0,0.025641,0.0,0.025641,0.0,0.0,0.128205,0.0,0.0,0.0,0.051282,0.0,0.0,0.0,0.0,0.051282,0.025641,0.025641,0.0,0.025641,0.0,0.025641,0.025641,0.025641,0.0,0.025641,0.025641,0.0,0.0,0.051282,0.0,0.0,0.0,0.025641,0.025641,0.025641,0.0,0.0,0.051282,0.0,0.025641,0.0,0.0,0.025641,0.0
8,Chez Bernard Beauty Academy,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.222222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.222222,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0
9,Cimes,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.25,0.0,0.0,0.25,0.0,0.0
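
The mean of a 0/1 column within a group is simply the share of that neighborhood's venues that fall in the category. For example, Aquatic and Sports Centre has five venues, each in a different category, so each of those categories gets 0.2. The sketch below reproduces that arithmetic without pandas; the (neighborhood, category) pairs are made up for illustration, and `category_frequencies` is a hypothetical helper.

```python
# Minimal sketch: category frequencies per neighborhood without pandas.
# The (neighborhood, category) pairs below are made up for illustration.
from collections import Counter, defaultdict

venue_rows = [
    ('Alpha', 'Coffee Shop'),
    ('Alpha', 'Coffee Shop'),
    ('Alpha', 'Park'),
    ('Alpha', 'Pharmacy'),
    ('Beta', 'Park'),
    ('Beta', 'Skating Rink'),
]

def category_frequencies(rows):
    """Return {neighborhood: {category: share of that neighborhood's venues}}."""
    counts = defaultdict(Counter)
    for neighborhood, category in rows:
        counts[neighborhood][category] += 1
    return {
        neighborhood: {cat: n / sum(counter.values()) for cat, n in counter.items()}
        for neighborhood, counter in counts.items()
    }

frequencies = category_frequencies(venue_rows)
print(frequencies['Alpha'])
print(frequencies['Beta'])
```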


#### Let's confirm the size of the new grouping

In [34]:
Dieppe_grouped.shape

(52, 66)

#### Let's print each neighborhood along with the top 5 most common venues

In [35]:
num_top_venues = 5

for hood in Dieppe_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = Dieppe_grouped[Dieppe_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Aquatic and Sports Centre----
                  venue  freq
0  Gym / Fitness Center   0.2
1        Discount Store   0.2
2                   Gym   0.2
3        Ice Cream Shop   0.2
4              Pharmacy   0.2


----Arthur-J.-LeBlanc Centre----
            venue  freq
0            Park   0.5
1    Skating Rink   0.5
2         Airport   0.0
3           Hotel   0.0
4  Ice Cream Shop   0.0


----Aréna Centenaire----
          venue  freq
0          Café   0.2
1   Coffee Shop   0.1
2        Bakery   0.1
3  Concert Hall   0.1
4         Hotel   0.1


----Avant-Garde----
                     venue  freq
0             Home Service   1.0
1  Health & Beauty Service   0.0
2                    Hotel   0.0
3           Ice Cream Shop   0.0
4       Italian Restaurant   0.0


----Blue Olive Grocery Store (Middle Eastern)----
                  venue  freq
0  Fast Food Restaurant  0.15
1              Pharmacy  0.09
2           Coffee Shop  0.09
3         Grocery Store  0.06
4          Liquor Store  0

#### Let's put that into a *pandas* dataframe

First, let's write a function to sort the venues in descending order.

In [36]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

Now let's create the new dataframe and display the top 10 venues for each neighborhood.

In [37]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = Dieppe_grouped['Neighborhood']

for ind in np.arange(Dieppe_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(Dieppe_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Aquatic and Sports Centre,Gym / Fitness Center,Pharmacy,Ice Cream Shop,Discount Store,Gym,American Restaurant,Concert Hall,Golf Course,Gas Station,Food
1,Arthur-J.-LeBlanc Centre,Park,Skating Rink,Gas Station,Food,Fast Food Restaurant,Farmers Market,Discount Store,Dessert Shop,Café,Cricket Ground
2,Aréna Centenaire,Café,Skating Rink,Concert Hall,Park,Bike Shop,Japanese Restaurant,Bakery,Hotel,Coffee Shop,Dessert Shop
3,Avant-Garde,Home Service,Yoga Studio,Clothing Store,Golf Course,Gas Station,Food,Fast Food Restaurant,Farmers Market,Discount Store,Dessert Shop
4,Blue Olive Grocery Store (Middle Eastern),Fast Food Restaurant,Pharmacy,Coffee Shop,Grocery Store,Bank,Italian Restaurant,Chinese Restaurant,Juice Bar,Clothing Store,Lingerie Store
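
The column labels above are built by indexing a three-item suffix list and falling back to "th". That works for the top 10, but would produce labels such as "21th" for longer lists. A small ordinal helper avoids this; the sketch below is one possible version, with `ordinal` being a hypothetical name not used in the original notebook.

```python
# Minimal sketch: English ordinal labels for the column headers.
# `ordinal` is a hypothetical helper, not part of the original notebook.

def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 11th, 12th, 13th, and the rest of the teens
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

num_top_venues = 10
columns = ['Neighborhood'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, num_top_venues + 1)
]
print(columns)
```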


## 4. Cluster Neighborhoods

Run k-means to cluster the neighborhoods into 5 clusters.

In [38]:
# set number of clusters
kclusters = 5

Dieppe_grouped_clustering = Dieppe_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(Dieppe_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([1, 2, 1, 3, 1, 1, 3, 1, 1, 1], dtype=int32)
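
For readers unfamiliar with what `KMeans` does internally, the sketch below implements the basic iteration (Lloyd's algorithm) in plain Python: assign each point to its nearest centroid, then move each centroid to the mean of its members, until the assignments stop changing. It is an illustration only. The deterministic start from the first k points is an assumption made for brevity, whereas scikit-learn uses k-means++ initialization by default, and the toy vectors are made up.

```python
# Minimal sketch of Lloyd's algorithm, the iteration behind k-means.
# Starting from the first k points is an assumption made for brevity;
# scikit-learn's KMeans uses k-means++ initialization by default.

def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of vectors."""
    return [sum(coords) / len(points) for coords in zip(*points)]

def simple_kmeans(points, k, max_iter=100):
    """Return (labels, centroids) for a list of equal-length numeric vectors."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_point(members)
    return labels, centroids

# Made-up frequency vectors: two cafe-heavy and two park-heavy neighborhoods.
toy_vectors = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2], [0.2, 0.8]]
toy_labels, toy_centroids = simple_kmeans(toy_vectors, k=2)
print(toy_labels)
```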

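Let's create a new dataframe that includes the cluster label as well as the top 10 venues for each neighborhood. Neighborhoods for which Foursquare returned no venues are absent from the grouped data, so the join leaves their cluster label as NaN, which is why the labels appear as floats.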

In [39]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

Dieppe_merged = Dieppe_data

# join Dieppe_data with neighborhoods_venues_sorted to add the cluster label and top venues for each neighborhood
Dieppe_merged = Dieppe_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

Dieppe_merged.head() # check the last columns!

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Dieppe,École Anna-Malenfant,46.085724,-64.71785,1.0,Bar,Yoga Studio,Clothing Store,Golf Course,Gas Station,Food,Fast Food Restaurant,Farmers Market,Discount Store,Dessert Shop
1,Dieppe,Lou Macnarin School,46.097661,-64.732614,1.0,Yoga Studio,Korean Restaurant,Coffee Shop,Fast Food Restaurant,Spa,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Dessert Shop
2,Dieppe,École Mathieu-Martin,46.100219,-64.73382,1.0,Yoga Studio,Coffee Shop,Fast Food Restaurant,Korean Restaurant,Chinese Restaurant,American Restaurant,Art Gallery,Golf Course,Gas Station,Food
3,Dieppe,École Amirault,46.069517,-64.717828,2.0,Skating Rink,Dessert Shop,Coffee Shop,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Cricket Ground,Yoga Studio,Chinese Restaurant
4,Dieppe,CCNB Dieppe,46.100411,-64.739181,1.0,Coffee Shop,Fast Food Restaurant,Japanese Restaurant,Bike Shop,Cricket Ground,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Yoga Studio


Finally, let's visualize the resulting clusters

In [40]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map, colored by cluster
# neighborhoods without venues have no cluster label (NaN) and are skipped
markers_colors = []
for lat, lon, poi, cluster in zip(Dieppe_merged['Latitude'], Dieppe_merged['Longitude'], Dieppe_merged['Neighborhood'], Dieppe_merged['Cluster Labels']):
    if pd.isna(cluster):
        continue
    cluster = int(cluster)
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
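
For the map to separate the clusters visually, each cluster label needs its own fill color. The snippet below is a minimal sketch of that idea using only the standard library: it swaps in `colorsys` for the matplotlib `cm.rainbow` colormap used in the cell above, and `cluster_palette` is a hypothetical helper name that is not part of the notebook.

```python
import colorsys

def cluster_palette(k):
    """Return k evenly spaced hex colors, one for each cluster label 0..k-1."""
    palette = []
    for i in range(k):
        # spread the hues evenly around the color wheel
        r, g, b = colorsys.hsv_to_rgb(i / k, 0.9, 0.9)
        palette.append('#{:02x}{:02x}{:02x}'.format(
            round(r * 255), round(g * 255), round(b * 255)))
    return palette

# assumption: 5 clusters, matching the five cluster tables in this notebook
palette = cluster_palette(5)
for label, color in enumerate(palette):
    print(label, color)
```

A marker for a neighborhood with label `c` would then use `palette[c]` as its fill color.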

## 5.0 Examine the Clusters

Now I examine each cluster and determine the venue categories that distinguish it from the others. Based on the defining categories, I assign a name to each cluster.

#### Cluster 1

In [41]:
Dieppe_merged.loc[Dieppe_merged['Cluster Labels'] == 0, Dieppe_merged.columns[[1] + list(range(5, Dieppe_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
17,J-Albert-Cormier Ball Field,Gas Station,Yoga Studio,Gym,Golf Course,Food,Fast Food Restaurant,Farmers Market,Discount Store,Dessert Shop,Cricket Ground
22,Dieppe Boys and Girls Club,Gas Station,Yoga Studio,Gym,Golf Course,Food,Fast Food Restaurant,Farmers Market,Discount Store,Dessert Shop,Cricket Ground
34,Lakeburn,Gas Station,Yoga Studio,Gym,Golf Course,Food,Fast Food Restaurant,Farmers Market,Discount Store,Dessert Shop,Cricket Ground


#### Cluster 2

In [42]:
Dieppe_merged.loc[Dieppe_merged['Cluster Labels'] == 1, Dieppe_merged.columns[[1] + list(range(5, Dieppe_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,École Anna-Malenfant,Bar,Yoga Studio,Clothing Store,Golf Course,Gas Station,Food,Fast Food Restaurant,Farmers Market,Discount Store,Dessert Shop
1,Lou Macnarin School,Yoga Studio,Korean Restaurant,Coffee Shop,Fast Food Restaurant,Spa,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Dessert Shop
2,École Mathieu-Martin,Yoga Studio,Coffee Shop,Fast Food Restaurant,Korean Restaurant,Chinese Restaurant,American Restaurant,Art Gallery,Golf Course,Gas Station,Food
4,CCNB Dieppe,Coffee Shop,Fast Food Restaurant,Japanese Restaurant,Bike Shop,Cricket Ground,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Yoga Studio
5,École Sainte-Thérèse,Café,Pizza Place,Skating Rink,Concert Hall,Pharmacy,Bakery,Hotel,Discount Store,Dessert Shop,Clothing Store
6,École Carrefour de l'Acadie,Yoga Studio,Korean Restaurant,Coffee Shop,Fast Food Restaurant,Spa,Chinese Restaurant,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop
8,Dieppe Fire Station #1,Yoga Studio,Fast Food Restaurant,Korean Restaurant,Coffee Shop,Clothing Store,Golf Course,Gas Station,Food,Farmers Market,Discount Store
9,Codiac RCMP Community Policing Office,Fast Food Restaurant,Coffee Shop,Grocery Store,Sporting Goods Shop,Pharmacy,Hotel,Mediterranean Restaurant,Italian Restaurant,Liquor Store,Lingerie Store
11,Champlain Place,Fast Food Restaurant,Coffee Shop,Grocery Store,Sporting Goods Shop,Pharmacy,Hotel,Mediterranean Restaurant,Italian Restaurant,Liquor Store,Lingerie Store
12,Dieppe Library,Café,Pizza Place,Fast Food Restaurant,Farmers Market,Concert Hall,Hotel,Skating Rink,Bakery,Pharmacy,Restaurant


#### Cluster 3

In [43]:
Dieppe_merged.loc[Dieppe_merged['Cluster Labels'] == 2, Dieppe_merged.columns[[1] + list(range(5, Dieppe_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,École Amirault,Skating Rink,Dessert Shop,Coffee Shop,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Cricket Ground,Yoga Studio,Chinese Restaurant
20,Club Rotary Lodge,Park,Construction & Landscaping,Business Service,Skating Rink,Fast Food Restaurant,Farmers Market,Discount Store,Dessert Shop,Chinese Restaurant,Cricket Ground
25,Arthur-J.-LeBlanc Centre,Park,Skating Rink,Gas Station,Food,Fast Food Restaurant,Farmers Market,Discount Store,Dessert Shop,Café,Cricket Ground
28,Vélodrome de Dieppe,Park,Skating Rink,Gas Station,Food,Fast Food Restaurant,Farmers Market,Discount Store,Dessert Shop,Café,Cricket Ground
31,St. Anselme Rotary Park,Park,Construction & Landscaping,Skating Rink,Food,Fast Food Restaurant,Farmers Market,Discount Store,Dessert Shop,Café,Cricket Ground
45,Gaspé,Park,Skating Rink,Gas Station,Food,Fast Food Restaurant,Farmers Market,Discount Store,Dessert Shop,Café,Cricket Ground


#### Cluster 4

In [44]:
Dieppe_merged.loc[Dieppe_merged['Cluster Labels'] == 3, Dieppe_merged.columns[[1] + list(range(5, Dieppe_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
21,Club d'age d'or de St.-Anselme,Park,Home Service,Art Gallery,Convenience Store,Cricket Ground,Coffee Shop,Concert Hall,Construction & Landscaping,Cosmetics Shop,Discount Store
27,Rotary Park Splash pad,Park,Business Service,Construction & Landscaping,Cricket Ground,Coffee Shop,Concert Hall,Convenience Store,Cosmetics Shop,Dessert Shop,Chinese Restaurant
33,Dover Park,Park,Home Service,Art Gallery,Convenience Store,Cricket Ground,Coffee Shop,Concert Hall,Construction & Landscaping,Cosmetics Shop,Discount Store
38,Thaddée,Convenience Store,Yoga Studio,Gym,Golf Course,Gas Station,Food,Fast Food Restaurant,Farmers Market,Discount Store,Dessert Shop
47,Avant-Garde,Home Service,Yoga Studio,Clothing Store,Golf Course,Gas Station,Food,Fast Food Restaurant,Farmers Market,Discount Store,Dessert Shop
48,Centrale,Home Service,Construction & Landscaping,Yoga Studio,Clothing Store,Golf Course,Gas Station,Food,Fast Food Restaurant,Farmers Market,Discount Store
52,Rita-McNeil (Parc Rotary),Park,Business Service,Construction & Landscaping,Cricket Ground,Coffee Shop,Concert Hall,Convenience Store,Cosmetics Shop,Dessert Shop,Chinese Restaurant
53,Copains,Construction & Landscaping,Business Service,Yoga Studio,Clothing Store,Golf Course,Gas Station,Food,Fast Food Restaurant,Farmers Market,Discount Store
55,Dover,Park,Home Service,Art Gallery,Convenience Store,Cricket Ground,Coffee Shop,Concert Hall,Construction & Landscaping,Cosmetics Shop,Discount Store


#### Cluster 5

In [45]:
Dieppe_merged.loc[Dieppe_merged['Cluster Labels'] == 4, Dieppe_merged.columns[[1] + list(range(5, Dieppe_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
36,Dolbeau,Cricket Ground,Yoga Studio,Gym,Golf Course,Gas Station,Food,Fast Food Restaurant,Farmers Market,Discount Store,Dessert Shop


<h3> 5.1 Insight into Neighborhood</h3>

- <b>Cluster 1</b>: The three most common venues are Gas Station, Yoga Studio and Gym. This suggests neighborhoods that are lively for recreational activities, with gas stations close by for refuelling a vehicle. Food, Fast Food Restaurant and Farmers Market rank as the 5th, 6th and 7th most common venues, which suggests that a restaurant business has a reasonable chance of surviving in these neighborhoods.

- <b>Cluster 2</b>: These neighborhoods have a thriving Fast-Moving Consumer Goods sector, dominated by restaurants, fast food outlets and coffee shops. An entrepreneur who wants to open a restaurant here would likely meet stiff competition in the food industry. This is the business hub of Dieppe, as the most common venues cover a wide range of categories.


- <b>Cluster 3</b>: This is the Disneyland of Moncton. It clearly offers many recreational venues, such as parks and skating rinks, for family outings. Those who love to play cricket also have a ground to play on.

- <b>Cluster 4</b>: Activities are evenly distributed in these neighborhoods. From my perspective, many people would prefer an area that offers a balanced mix of amenities for social life, and the neighborhoods in the fourth cluster have it all.

- <b>Cluster 5</b>: Very few activities in this neighborhood.
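
The reading of each cluster can be cross-checked by counting how often each category appears as the 1st most common venue within a cluster. The snippet below is a minimal sketch using only the standard library. The `top_venue_by_cluster` dictionary is a sample copied by hand from the "1st Most Common Venue" column of the cluster tables shown in section 5.0 (not the full `Dieppe_merged` frame), and `dominant_categories` is a hypothetical helper name.

```python
from collections import Counter

# Hand-copied from the "1st Most Common Venue" column of the cluster tables;
# keys are the raw 0-based cluster labels (label 0 is "Cluster 1", and so on).
top_venue_by_cluster = {
    0: ["Gas Station", "Gas Station", "Gas Station"],
    1: ["Bar", "Yoga Studio", "Yoga Studio", "Coffee Shop", "Café",
        "Yoga Studio", "Yoga Studio", "Fast Food Restaurant",
        "Fast Food Restaurant", "Café"],
    2: ["Skating Rink", "Park", "Park", "Park", "Park", "Park"],
    3: ["Park", "Park", "Park", "Convenience Store", "Home Service",
        "Home Service", "Park", "Construction & Landscaping", "Park"],
    4: ["Cricket Ground"],
}

def dominant_categories(venues, n=2):
    """Return the n most frequent categories as (category, count) pairs."""
    return Counter(venues).most_common(n)

summary = {label: dominant_categories(venues)
           for label, venues in top_venue_by_cluster.items()}

for label, categories in summary.items():
    print(f"Cluster {label + 1}: {categories}")
```

On this sample, Park leads in Clusters 3 and 4 and Gas Station is the only top venue in Cluster 1, which is consistent with the descriptions above.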


 <h2> 6. Results and Discussion</h2>
 
The aim of this project is to help people who want to relocate to the safest borough in Moncton, New Brunswick. It lets immigrants, as well as citizens who aspire to relocate to this area, choose the neighborhoods in which to settle based on the most common venues there. For example, if a person is looking for a neighborhood with a Fast-Moving Consumer Goods industry and a full range of activities, Clusters 2 and 4 have restaurants, cuisine and cafés among the most common venues. If a person is looking for a neighborhood with recreational venues and a gas station in close proximity, then the neighborhoods in the first cluster are suitable. For a family, I feel that the neighborhoods in Cluster 4 are more suitable because they have the most activities: their common venues include parks, gyms/fitness centers, construction services, restaurants, grocery stores and cricket grounds, which is ideal for a family.
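
The selection logic described here (pick the neighborhoods whose most common venues match what a person is looking for) can be sketched as a simple filter. This is an illustrative sketch only: `top_venues` holds the first three most common venues for a handful of neighborhoods, copied by hand from the cluster tables in section 5.0, and `neighborhoods_with` is a hypothetical helper that is not part of the notebook.

```python
# First three most common venues for a few neighborhoods, copied by hand
# from the cluster tables in section 5.0 (a sample, not the full data set).
top_venues = {
    "Lakeburn": ["Gas Station", "Yoga Studio", "Gym"],
    "Champlain Place": ["Fast Food Restaurant", "Coffee Shop", "Grocery Store"],
    "Dieppe Library": ["Café", "Pizza Place", "Fast Food Restaurant"],
    "Gaspé": ["Park", "Skating Rink", "Gas Station"],
    "Dover Park": ["Park", "Home Service", "Art Gallery"],
}

def neighborhoods_with(wanted, venues_by_neighborhood):
    """Return the neighborhoods whose listed venues include every wanted category."""
    wanted = set(wanted)
    return [name for name, venues in venues_by_neighborhood.items()
            if wanted.issubset(venues)]

parks = neighborhoods_with(["Park"], top_venues)
food_and_coffee = neighborhoods_with(["Fast Food Restaurant", "Coffee Shop"], top_venues)
print(parks)
print(food_and_coffee)
```

Passing several categories narrows the result to neighborhoods that offer all of them, which mirrors how a family or an entrepreneur would combine requirements.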



<h2>7. Conclusion</h2>

This project gives a better understanding of the neighborhoods with respect to the most common venues in each of them. Technology makes it possible to get a clear view of a place without having to travel miles, i.e. to find out more about an area before moving into a neighborhood. Future work on this project could take other factors into account, such as community integration, to shortlist the boroughs based on safety and a predefined measure of the residents' welfare.


