# Battle of the Neighborhoods

## Business problem

Dubai is a dynamic city with many different neighborhoods, both in the old, historic part of the city, which existed before the discovery of oil and the economic boom that followed, and in the new part, which is full of magnificent masterpieces by world-renowned architects. Many international workers, the so-called expats, move to Dubai every year, some bringing their families and loved ones. This is when the key challenge arises: where to rent or buy an apartment or house?
Each person or family will have a certain set of criteria for choosing a home, from a supermarket in close proximity and nearby playgrounds, parks, schools and nurseries to a variety of restaurants and cafes around the corner.

The business problem that I have chosen for the project is identifying the most suitable neighborhood based on criteria provided by the client, who could be a real estate agent serving a customer or the prospective tenants/owners themselves.


## Data to be used

The main component of the data for this project is venue data from the Foursquare API: for each neighborhood in Dubai, the nearby venues and their categories. A neighborhood will be considered family friendly if, for instance, it contains nurseries or schools, playgrounds or parks, grocery stores, gyms and recreational facilities.
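One way to make the family-friendly criterion concrete is to count, per neighborhood, how many venues fall into a chosen set of categories. The sketch below is illustrative only: the set `FAMILY_CATEGORIES`, the helper `family_score` and the sample rows are assumptions made for this example, not part of the Foursquare API.

```python
from collections import Counter

# Assumed set of categories treated as family friendly (a hypothetical choice).
FAMILY_CATEGORIES = {
    "Park", "Playground", "Grocery Store", "Supermarket", "Gym", "Recreation Center",
}

def family_score(venues):
    """Count family-friendly venues per neighborhood.

    `venues` is an iterable of (neighborhood, venue_category) pairs.
    """
    scores = Counter()
    for neighborhood, category in venues:
        if category in FAMILY_CATEGORIES:
            scores[neighborhood] += 1
    return scores

# Illustrative sample rows, not real query results.
sample = [
    ("ABU HAIL", "Park"),
    ("ABU HAIL", "Pizza Place"),
    ("AL BARSHA FIRST", "Gym"),
    ("AL BARSHA FIRST", "Gym"),
    ("AL BARSHA FIRST", "Coffee Shop"),
]
scores = family_score(sample)
print(scores.most_common())
```

A client with different priorities would only need a different category set, so the same counting step could serve several kinds of customers.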

## Preparation and cleaning of data

In [1]:
import pandas as pd # library for data analysis
import requests

In [2]:
# import the Dubai Government dataset with geospatial data about neighborhoods

# use BeautifulSoup to read the XML (KML) file
from bs4 import BeautifulSoup as Soup

with open('Community.kml') as data:
    kml_soup = Soup(data, 'lxml-xml') # Parse as XML

l = []

placeMarks = kml_soup.find_all('Placemark')  
for placeMark in placeMarks:
    d = {}
    #for each neighborhood take the first longitude and latitude coordinates
    coordinates = placeMark.find_all('coordinates')
    for coordinate in coordinates:
        coords = coordinate.text.split(",")
        d["Longitude"] =coords[0]
        d["Latitude"] =coords[1]
    
    # for each neighborhood take the english name
    ExtendedDatas = placeMark.find_all('ExtendedData')
    for ExtendedData in ExtendedDatas:
        simpleDatas = ExtendedData.find_all('SimpleData')
        for simpleData in simpleDatas:
            simpleDataAttrs  = dict(simpleData.attrs)
            if simpleDataAttrs[u'name'] == "CNAME_E":
                d["Neighborhood"] = simpleData.text
    l.append(d)
# create a dataframe based on the list generated from xml    
neighborhoods = pd.DataFrame(l) 
neighborhoods.head()     

Unnamed: 0,Latitude,Longitude,Neighborhood
0,25.2651156252942,55.3011216866639,AL HAMRIYA
1,25.2595208149389,55.287777388341,AL SOUQ AL KABEER
2,25.1473131093492,55.2876875487575,NADD AL SHIBA FIRST
3,25.3119978356916,55.349689019013,AL MAMZAR
4,25.1884458878827,55.2410061879357,JUMEIRA THIRD


In [3]:
neighborhoods.shape

(226, 3)

In [4]:
neighborhoods

Unnamed: 0,Latitude,Longitude,Neighborhood
0,25.2651156252942,55.3011216866639,AL HAMRIYA
1,25.2595208149389,55.287777388341,AL SOUQ AL KABEER
2,25.1473131093492,55.2876875487575,NADD AL SHIBA FIRST
3,25.3119978356916,55.349689019013,AL MAMZAR
4,25.1884458878827,55.2410061879357,JUMEIRA THIRD
5,25.1473131070924,55.2876875487583,HADAEQ SHEIKH MOHAMMED BIN RASHID
6,25.0523737867498,55.4076823230994,UMM NAHAD THIRD
7,24.8975228512717,54.9407412384163,SAIH SHUAIB 1
8,25.0568630048816,55.4053403994602,UMM NAHAD FIRST
9,25.0424225344569,55.314737341017,UMM NAHAD SECOND
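The parsing cell keeps a single longitude/latitude pair per neighborhood, taken from the start of a `<coordinates>` string. In KML that string is a whitespace-separated list of `lon,lat[,alt]` tuples describing the boundary, so the stored point is a boundary vertex rather than a centre. Rows 2 and 5 above carry almost identical coordinates, which would be consistent with two adjacent polygons sharing a vertex. The following sketch shows one possible alternative, averaging the vertices; the helper names and the square ring are made up for illustration.

```python
def parse_kml_coordinates(text):
    """Parse a KML <coordinates> string into a list of (lon, lat) tuples."""
    points = []
    for token in text.split():
        parts = token.split(",")
        points.append((float(parts[0]), float(parts[1])))
    return points

def mean_point(points):
    """Arithmetic mean of the vertices: a rough centre, not a true polygon centroid."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

# Made-up square boundary (closed ring: the first vertex is repeated at the end).
ring = "55.0,25.0,0 55.2,25.0,0 55.2,25.2,0 55.0,25.2,0 55.0,25.0,0"
points = parse_kml_coordinates(ring)
first_vertex = points[0]
centre = mean_point(points[:-1])  # drop the repeated closing vertex
print(first_vertex, centre)
```

Querying venues around a central point instead of a corner may matter with a small search radius, since a corner-centred circle covers partly the neighboring area.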


Overall, the dataset contains 226 neighborhoods that we will use for the analysis.

Convert the coordinates to floating-point numbers.

In [5]:
neighborhoods["Latitude"] = pd.to_numeric(neighborhoods["Latitude"])
neighborhoods["Longitude"] = pd.to_numeric(neighborhoods["Longitude"])

In [6]:
neighborhoods.dtypes

Latitude        float64
Longitude       float64
Neighborhood     object
dtype: object

Import all required libraries for further analysis.

In [7]:
import numpy as np # library to handle data in a vectorized manner

pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


In [8]:
address = 'Dubai, UAE'

geolocator = Nominatim(user_agent="tnt_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Dubai are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Dubai are 25.0750095, 55.1887608818332.


In [9]:
# create map of Dubai using latitude and longitude values
map_tnt = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['Neighborhood']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_tnt)  
    
map_tnt

Set up the client credentials required for Foursquare.

In [88]:
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 20

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: YOUR_FOURSQUARE_CLIENT_ID
CLIENT_SECRET: YOUR_FOURSQUARE_CLIENT_SECRET
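Anything assigned or printed in a notebook cell is saved with the notebook, so API credentials are usually better kept outside it. A minimal sketch of reading them from environment variables follows; the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and both helper functions are assumptions chosen for this example.

```python
import os

def load_foursquare_credentials(env=os.environ):
    """Read credentials from environment variables (the variable names are an assumption)."""
    client_id = env.get("FOURSQUARE_CLIENT_ID")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError("Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first")
    return client_id, client_secret

def mask(secret, visible=4):
    """Show only the first few characters so a printed value cannot be reused."""
    return secret[:visible] + "*" * (len(secret) - visible)

# Demonstration with a fake mapping instead of real credentials.
fake_env = {
    "FOURSQUARE_CLIENT_ID": "EXAMPLEID123",
    "FOURSQUARE_CLIENT_SECRET": "EXAMPLESECRET456",
}
client_id, client_secret = load_foursquare_credentials(fake_env)
print("CLIENT_ID:", mask(client_id))
```

In the notebook itself the call would be `load_foursquare_credentials()` with no argument, so the values come from the real environment.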


In [89]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [90]:
def getNearbyVenues(names, latitudes, longitudes, radius=400):
    """Query Foursquare for up to LIMIT venues within `radius` metres of each neighborhood."""
    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)

        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            lat,
            lng,
            radius,
            LIMIT)

        # make the GET request and keep the list of recommended items
        results = requests.get(url).json()["response"]['groups'][0]['items']

        # keep only the relevant information for each nearby venue
        venues_list.append([(
            name,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results])

    # flatten the per-neighborhood lists into a single dataframe
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood',
                  'Neighborhood Latitude',
                  'Neighborhood Longitude',
                  'Venue',
                  'Venue Latitude',
                  'Venue Longitude',
                  'Venue Category']

    return nearby_venues
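The function above indexes straight into the response, so a payload without `groups` or a venue without any category would raise an exception in the middle of the loop. The sketch below shows one defensive way to pull out the same fields; `extract_venues` is a hypothetical helper and the sample payload is hand-written to mirror only the keys the function above reads.

```python
def extract_venues(payload):
    """Return (name, lat, lng, category) tuples from an explore-style response dict.

    Missing groups yield an empty list; venues without categories get None.
    """
    groups = payload.get("response", {}).get("groups", [])
    if not groups:
        return []
    rows = []
    for item in groups[0].get("items", []):
        venue = item["venue"]
        categories = venue.get("categories", [])
        category = categories[0]["name"] if categories else None
        rows.append((venue["name"], venue["location"]["lat"],
                     venue["location"]["lng"], category))
    return rows

# Hand-written payload for illustration, not a recorded API response.
sample = {"response": {"groups": [{"items": [
    {"venue": {"name": "Coffee Museum",
               "location": {"lat": 25.263633, "lng": 55.300122},
               "categories": [{"name": "Museum"}]}},
    {"venue": {"name": "Unnamed Spot",
               "location": {"lat": 25.26, "lng": 55.30},
               "categories": []}},
]}]}}

rows = extract_venues(sample)
print(rows)
print(extract_venues({}))  # payload without the expected keys
```

Rows with a `None` category could then be dropped or kept explicitly before the one-hot encoding step.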

In [94]:
dubai_venues = getNearbyVenues(names=neighborhoods['Neighborhood'],
                                   latitudes=neighborhoods['Latitude'],
                                   longitudes=neighborhoods['Longitude']
                                  )

AL HAMRIYA
AL SOUQ AL KABEER
NADD AL SHIBA FIRST
AL MAMZAR
JUMEIRA THIRD
HADAEQ SHEIKH MOHAMMED BIN RASHID
UMM NAHAD THIRD
SAIH SHUAIB 1
UMM NAHAD FIRST
UMM NAHAD SECOND
AL YALAYIS 1
UMM NAHAD FOURTH
JUMEIRA SECOND
RAS AL KHOR IND. THIRD
AL O'SHOOSH
AL LAYAN 2
AL NAHDA SECOND
LE HEMAIRA
SAIH SHUA'ALAH
AL GARHOUD
AL MANARA
NADD SHAMMA
AL SAFOUH FIRST
MUGATRAH
ENKHALI
WADI AL SAFA 4
AL SHINDAGHA
AL TWAR FIRST
ME'AISEM FIRST
MUHAISNAH FIRST
RAS AL KHOR
AL TWAR SECOND
AL HUDAIBA
LEHBAB FIRST
AL BARSHA THIRD
AL MUTEENA
UMM ESELAY
NAIF
TRADE CENTER FIRST
AL SATWA
LEHBAB SECOND
UMM HURAIR SECOND
BU KADRA
MANKHOOL
AL MIZHAR SECOND
WADI AL SAFA 6
AL MERYAL
UMM SUQEIM THIRD
AL THANYAH  FOURTH
NADD AL SHIBA THIRD
AYAL NASIR
OUD METHA
JABAL ALI FIRST
AL AWIR FIRST
HEFAIR
AL WARQA'A SECOND
AL QUSAIS IND. SECOND
OUD AL MUTEENA FIRST
WADI AL SAFA 5
AL JAFILIYA
JABAL ALI SECOND
AL HAMRIYA PORT
AL ROWAIYAH SECOND
AL YALAYIS 2
WARSAN SECOND
AL QUSAIS IND. FIRST
AL SAFOUH SECOND
HESSYAN SECOND
HOR AL ANZ

In [95]:
dubai_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,AL HAMRIYA,25.265116,55.301122,Sheikh Mohamed Centre For Cultural Understandi...,25.264091,55.300795,Historic Site
1,AL HAMRIYA,25.265116,55.301122,Coffee Museum,25.263633,55.300122,Museum
2,AL HAMRIYA,25.265116,55.301122,Baskin Robbins,25.26445,55.299994,Ice Cream Shop
3,AL HAMRIYA,25.265116,55.301122,Arabian Tea House Cafe (كافية بيت الشاي العربي),25.263399,55.299695,Tea Room
4,AL HAMRIYA,25.265116,55.301122,Sikka Art Fair,25.264001,55.300027,Art Gallery


In [96]:
print(dubai_venues.shape)

(1119, 7)


In [97]:
# count the venues returned for each neighborhood
dubai_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
ABU HAIL,4,4,4,4,4,4
AL BADA',20,20,20,20,20,20
AL BARAHA,4,4,4,4,4,4
AL BARSHA FIRST,18,18,18,18,18,18
AL BARSHA SECOND,7,7,7,7,7,7
AL BARSHA SOUTH FIFTH,1,1,1,1,1,1
AL BARSHA SOUTH FIRST,1,1,1,1,1,1
AL BARSHA SOUTH THIRD,2,2,2,2,2,2
AL BUTEEN,11,11,11,11,11,11
AL CORNICHE,8,8,8,8,8,8
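The counts vary widely: some neighborhoods reach the limit of 20 venues while others return only one or two, and a frequency profile built from a single venue says little about an area. A possible way to separate the two groups is sketched below; the threshold of 5 and the helper name are assumptions, and the counts are copied from the rows shown above.

```python
def split_by_venue_count(counts, min_venues=5):
    """Split neighborhoods into (enough, sparse) name lists by venue count."""
    enough = sorted(n for n, c in counts.items() if c >= min_venues)
    sparse = sorted(n for n, c in counts.items() if c < min_venues)
    return enough, sparse

# Venue counts copied from the first rows of the table above.
counts = {
    "ABU HAIL": 4,
    "AL BADA'": 20,
    "AL BARAHA": 4,
    "AL BARSHA FIRST": 18,
    "AL BARSHA SECOND": 7,
    "AL BARSHA SOUTH FIFTH": 1,
    "AL BARSHA SOUTH FIRST": 1,
    "AL BARSHA SOUTH THIRD": 2,
    "AL BUTEEN": 11,
    "AL CORNICHE": 8,
}

enough, sparse = split_by_venue_count(counts)
print("enough data:", enough)
print("sparse:", sparse)
```

With the full dataframe, the same dictionary could be obtained from `dubai_venues['Neighborhood'].value_counts().to_dict()`.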


In [98]:
print('There are {} unique categories.'.format(len(dubai_venues['Venue Category'].unique())))

There are 200 unique categories.


In [99]:
# one hot encoding
dubai_onehot = pd.get_dummies(dubai_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe (named 'Neighborhood_Name' because 'Neighborhood' is also a venue category)
dubai_onehot['Neighborhood_Name'] = dubai_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [dubai_onehot.columns[-1]] + list(dubai_onehot.columns[:-1])
dubai_onehot = dubai_onehot[fixed_columns]

dubai_onehot.head()

Unnamed: 0,Neighborhood_Name,Accessories Store,Afghan Restaurant,African Restaurant,Airport,Airport Food Court,Airport Lounge,American Restaurant,Arcade,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Garage,Auto Workshop,BBQ Joint,Baby Store,Badminton Court,Bakery,Bar,Basketball Court,Beach,Bed & Breakfast,Beer Garden,Bike Trail,Bistro,Boat or Ferry,Bookstore,Boutique,Breakfast Spot,Buffet,Burger Joint,Burrito Place,Bus Station,Business Service,Cafeteria,Café,Campground,Canal,Carpet Store,Castle,Chinese Restaurant,Chocolate Shop,Clothing Store,Cocktail Bar,Coffee Shop,Comedy Club,Comfort Food Restaurant,Convenience Store,Cosmetics Shop,Country Dance Club,Currency Exchange,Dance Studio,Department Store,Dessert Shop,Diner,Dive Bar,Dog Run,Donut Shop,Eastern European Restaurant,Electronics Store,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fishing Spot,Fishing Store,Flea Market,Flower Shop,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden Center,Gastropub,General Entertainment,Gift Shop,Gourmet Shop,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Health & Beauty Service,Health Food Store,Historic Site,Hobby Shop,Hockey Field,Home Service,Hong Kong Restaurant,Hookah Bar,Hotel,Hotel Bar,Housing Development,IT Services,Ice Cream Shop,Indian Chinese Restaurant,Indian Restaurant,Indie Movie Theater,Indonesian Restaurant,Indoor Play Area,Intersection,Italian Restaurant,Japanese Restaurant,Jewelry Store,Juice Bar,Kebab Restaurant,Korean Restaurant,Latin American Restaurant,Lounge,Market,Massage Studio,Mattress Store,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Mobility Store,Molecular Gastronomy Restaurant,Motel,Mountain,Movie Theater,Moving Target,Multiplex,Museum,Music Store,Neighborhood,Nightclub,North Indian Restaurant,Office,Optical Shop,Organic Grocery,Other Great Outdoors,Other Nightlife,Outdoor Gym,Outdoor Supply Store,Pakistani Restaurant,Park,Performing Arts Venue,Persian Restaurant,Pet Store,Pharmacy,Photography Studio,Pier,Pizza Place,Playground,Plaza,Pool,Pool Hall,Pub,Public Art,Punjabi Restaurant,Racetrack,Recreation Center,Rental Car Location,Residential Building (Apartment / Condo),Resort,Restaurant,Roof Deck,Sake Bar,Salon / Barbershop,Sandwich Place,Science Museum,Seafood Restaurant,Shawarma Place,Shoe Store,Shop & Service,Shopping Mall,Ski Area,Smoke Shop,Snack Place,Soccer Field,Soccer Stadium,South Indian Restaurant,Spa,Sporting Goods Shop,Sports Bar,Stables,Stadium,Steakhouse,Supermarket,Tea Room,Thai Restaurant,Theater,Theme Park,Toy / Game Store,Tunnel,Turkish Restaurant,Vegetarian / Vegan Restaurant,Volleyball Court,Warehouse Store,Water Park,Wine Bar,Women's Store,Yoga Studio
0,AL HAMRIYA,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,AL HAMRIYA,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,AL HAMRIYA,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,AL HAMRIYA,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0
4,AL HAMRIYA,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [100]:
# group rows by neighborhood and take the mean of each one-hot column, i.e. the frequency of occurrence of each category
dubai_grouped = dubai_onehot.groupby('Neighborhood_Name').mean().reset_index()
dubai_grouped.head()

Unnamed: 0,Neighborhood_Name,Accessories Store,Afghan Restaurant,African Restaurant,Airport,Airport Food Court,Airport Lounge,American Restaurant,Arcade,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Garage,Auto Workshop,BBQ Joint,Baby Store,Badminton Court,Bakery,Bar,Basketball Court,Beach,Bed & Breakfast,Beer Garden,Bike Trail,Bistro,Boat or Ferry,Bookstore,Boutique,Breakfast Spot,Buffet,Burger Joint,Burrito Place,Bus Station,Business Service,Cafeteria,Café,Campground,Canal,Carpet Store,Castle,Chinese Restaurant,Chocolate Shop,Clothing Store,Cocktail Bar,Coffee Shop,Comedy Club,Comfort Food Restaurant,Convenience Store,Cosmetics Shop,Country Dance Club,Currency Exchange,Dance Studio,Department Store,Dessert Shop,Diner,Dive Bar,Dog Run,Donut Shop,Eastern European Restaurant,Electronics Store,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fishing Spot,Fishing Store,Flea Market,Flower Shop,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden Center,Gastropub,General Entertainment,Gift Shop,Gourmet Shop,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Health & Beauty Service,Health Food Store,Historic Site,Hobby Shop,Hockey Field,Home Service,Hong Kong Restaurant,Hookah Bar,Hotel,Hotel Bar,Housing Development,IT Services,Ice Cream Shop,Indian Chinese Restaurant,Indian Restaurant,Indie Movie Theater,Indonesian Restaurant,Indoor Play Area,Intersection,Italian Restaurant,Japanese Restaurant,Jewelry Store,Juice Bar,Kebab Restaurant,Korean Restaurant,Latin American Restaurant,Lounge,Market,Massage Studio,Mattress Store,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Mobility Store,Molecular Gastronomy Restaurant,Motel,Mountain,Movie Theater,Moving Target,Multiplex,Museum,Music Store,Neighborhood,Nightclub,North Indian Restaurant,Office,Optical Shop,Organic Grocery,Other Great Outdoors,Other Nightlife,Outdoor Gym,Outdoor Supply Store,Pakistani Restaurant,Park,Performing Arts Venue,Persian Restaurant,Pet Store,Pharmacy,Photography Studio,Pier,Pizza Place,Playground,Plaza,Pool,Pool Hall,Pub,Public Art,Punjabi Restaurant,Racetrack,Recreation Center,Rental Car Location,Residential Building (Apartment / Condo),Resort,Restaurant,Roof Deck,Sake Bar,Salon / Barbershop,Sandwich Place,Science Museum,Seafood Restaurant,Shawarma Place,Shoe Store,Shop & Service,Shopping Mall,Ski Area,Smoke Shop,Snack Place,Soccer Field,Soccer Stadium,South Indian Restaurant,Spa,Sporting Goods Shop,Sports Bar,Stables,Stadium,Steakhouse,Supermarket,Tea Room,Thai Restaurant,Theater,Theme Park,Toy / Game Store,Tunnel,Turkish Restaurant,Vegetarian / Vegan Restaurant,Volleyball Court,Warehouse Store,Water Park,Wine Bar,Women's Store,Yoga Studio
0,ABU HAIL,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,AL BADA',0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.15,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.1,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.05,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,AL BARAHA,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,AL BARSHA FIRST,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,AL BARSHA SECOND,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.142857,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
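Taking the mean of 0/1 indicator columns within a neighborhood gives each category's share of that neighborhood's venues, which is why ABU HAIL, with four venues in four different categories, shows 0.25 four times. The plain-Python sketch below reproduces that calculation without pandas; `category_frequencies` is a hypothetical helper written for this illustration.

```python
from collections import Counter, defaultdict

def category_frequencies(venues):
    """Map each neighborhood to {category: share of its venues}."""
    counts = defaultdict(Counter)
    for neighborhood, category in venues:
        counts[neighborhood][category] += 1
    return {
        hood: {cat: n / sum(c.values()) for cat, n in c.items()}
        for hood, c in counts.items()
    }

# The four ABU HAIL categories that have a non-zero frequency in the table above.
venues = [
    ("ABU HAIL", "Pizza Place"),
    ("ABU HAIL", "Park"),
    ("ABU HAIL", "Hookah Bar"),
    ("ABU HAIL", "Cafeteria"),
]
freqs = category_frequencies(venues)
print(freqs["ABU HAIL"])
```

Because the shares in a row always sum to 1, a neighborhood with one venue and a neighborhood with twenty venues of the same category look identical to the clustering step.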


In [101]:
#print each neighborhood along with the top 5 most common venues
num_top_venues = 5

for hood in dubai_grouped['Neighborhood_Name']:
    print("----"+hood+"----")
    temp = dubai_grouped[dubai_grouped['Neighborhood_Name'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----ABU HAIL----
                  venue  freq
0           Pizza Place  0.25
1                  Park  0.25
2            Hookah Bar  0.25
3             Cafeteria  0.25
4  Other Great Outdoors  0.00


----AL BADA'----
                       venue  freq
0           Asian Restaurant  0.15
1                       Café  0.10
2  Middle Eastern Restaurant  0.10
3                Pizza Place  0.05
4                      Hotel  0.05


----AL BARAHA----
             venue  freq
0     Dessert Shop  0.25
1       Sports Bar  0.25
2             Park  0.25
3           Bakery  0.25
4  Organic Grocery  0.00


----AL BARSHA FIRST----
                 venue  freq
0          Pizza Place  0.11
1                  Gym  0.11
2       Clothing Store  0.06
3          Coffee Shop  0.06
4  Sporting Goods Shop  0.06


----AL BARSHA SECOND----
                       venue  freq
0                Art Gallery  0.14
1        Arts & Crafts Store  0.14
2                  BBQ Joint  0.14
3  Middle Eastern Restaurant  0.14
4 

In [103]:
#a function to sort the venues in descending order.
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [134]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood_Name']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood_Name'] = dubai_grouped['Neighborhood_Name']

for ind in np.arange(dubai_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(dubai_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood_Name,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,ABU HAIL,Pizza Place,Hookah Bar,Park,Cafeteria,Farm,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store
1,AL BADA',Asian Restaurant,Middle Eastern Restaurant,Café,Convenience Store,Cafeteria,Hotel,Juice Bar,Fried Chicken Joint,Pakistani Restaurant,Dessert Shop
2,AL BARAHA,Dessert Shop,Park,Bakery,Sports Bar,Yoga Studio,Fast Food Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop
3,AL BARSHA FIRST,Gym,Pizza Place,Indian Restaurant,Coffee Shop,Clothing Store,Fast Food Restaurant,Café,Burger Joint,Sandwich Place,Food Truck
4,AL BARSHA SECOND,Ice Cream Shop,BBQ Joint,Middle Eastern Restaurant,Art Gallery,Arts & Crafts Store,Gym / Fitness Center,Health & Beauty Service,Food Truck,Food Court,Food & Drink Shop
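Note that the table always lists ten categories, even for neighborhoods with fewer distinct ones: ABU HAIL has only four venues, so the entries from Farm onward have a frequency of zero and are just filling the remaining slots. One possible variant that keeps only categories actually present is sketched below; the helper name and the alphabetical tie-break are assumptions, not the behavior of `return_most_common_venues`.

```python
def top_categories(freqs, num_top=10):
    """Return up to `num_top` categories with non-zero frequency, most frequent first.

    Ties are broken alphabetically so the result is deterministic.
    """
    present = [(cat, f) for cat, f in freqs.items() if f > 0]
    present.sort(key=lambda item: (-item[1], item[0]))
    return [cat for cat, _ in present[:num_top]]

# Frequencies for ABU HAIL taken from the grouped table, plus two zero-frequency fillers.
abu_hail = {
    "Pizza Place": 0.25,
    "Hookah Bar": 0.25,
    "Park": 0.25,
    "Cafeteria": 0.25,
    "Farm": 0.0,
    "Food Court": 0.0,
}
top = top_categories(abu_hail)
print(top)
```

Shorter lists would need padding (for example with `None`) before being written into the fixed-width dataframe used above.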


In [135]:
# Create clusters for the neighborhoods based on the nearby venues
# set number of clusters
kclusters = 8

dubai_grouped_clustering = dubai_grouped.drop('Neighborhood_Name', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(dubai_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 0, 0, 0, 0, 3, 0, 0, 0, 0], dtype=int32)
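Nine of the first ten labels are 0, which hints that one cluster may hold most neighborhoods. Counting members per label is a quick check before interpreting the clusters. The sketch uses only the ten labels printed above; `cluster_sizes` is a hypothetical helper.

```python
from collections import Counter

def cluster_sizes(labels):
    """Return {cluster_label: number of members}, largest cluster first."""
    return dict(Counter(labels).most_common())

# The first ten labels from the output above.
labels = [0, 0, 0, 0, 0, 3, 0, 0, 0, 0]
sizes = cluster_sizes(labels)
print(sizes)
```

For the full result the call would be `cluster_sizes(kmeans.labels_.tolist())`.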

In [136]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

dubai_merged = neighborhoods

# join the cluster labels and sorted venues onto the neighborhoods to keep latitude/longitude;
# the inner join drops neighborhoods for which no venues were returned
dubai_merged = dubai_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood_Name'), on='Neighborhood',how='inner')

dubai_merged.head() # check the last columns!

Unnamed: 0,Latitude,Longitude,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,25.265116,55.301122,AL HAMRIYA,0,Art Gallery,Café,Historic Site,Hotel,Coffee Shop,Museum,Tea Room,Middle Eastern Restaurant,Boat or Ferry,Donut Shop
1,25.259521,55.287777,AL SOUQ AL KABEER,0,Hotel,Nightclub,Persian Restaurant,Asian Restaurant,Roof Deck,Gastropub,Electronics Store,Korean Restaurant,Chinese Restaurant,Supermarket
2,25.147313,55.287688,NADD AL SHIBA FIRST,0,Bike Trail,IT Services,Spa,Gym / Fitness Center,Farmers Market,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market
3,25.311998,55.349689,AL MAMZAR,0,Beach,Tea Room,Fried Chicken Joint,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store,Fishing Spot
4,25.188446,55.241006,JUMEIRA THIRD,0,Café,Coffee Shop,Ice Cream Shop,Gym / Fitness Center,Department Store,Plaza,Donut Shop,Burger Joint,French Restaurant,Grocery Store
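Because the join uses `how='inner'`, any neighborhood that has no row in the venue table disappears from `dubai_merged` without a message. A small sketch for listing such neighborhoods before merging follows; the helper is hypothetical and `EXAMPLE EMPTY AREA` is a made-up name used only to show the behavior.

```python
def neighborhoods_without_venues(all_names, names_with_venues):
    """Names in the full list that are absent from the venue results, in original order."""
    seen = set(names_with_venues)
    return [name for name in all_names if name not in seen]

# Illustrative input: the third name is invented to stand for an area with no venues.
all_names = ["AL HAMRIYA", "AL SOUQ AL KABEER", "EXAMPLE EMPTY AREA"]
with_venues = ["AL SOUQ AL KABEER", "AL HAMRIYA"]
missing = neighborhoods_without_venues(all_names, with_venues)
print(missing)
```

In the notebook the two arguments would be `neighborhoods['Neighborhood']` and `dubai_grouped['Neighborhood_Name']`.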


In [137]:
dubai_merged.dtypes

Latitude                  float64
Longitude                 float64
Neighborhood               object
Cluster Labels              int32
1st Most Common Venue      object
2nd Most Common Venue      object
3rd Most Common Venue      object
4th Most Common Venue      object
5th Most Common Venue      object
6th Most Common Venue      object
7th Most Common Venue      object
8th Most Common Venue      object
9th Most Common Venue      object
10th Most Common Venue     object
dtype: object

In [138]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one distinct color per cluster label
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, poi, cluster in zip(dubai_merged['Latitude'], dubai_merged['Longitude'], dubai_merged['Neighborhood'], dubai_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
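Each cluster label needs one stable, distinct color on the map. The sketch below builds such a palette indexed directly by label. It deliberately swaps in the standard-library `colorsys` module instead of matplotlib's colormap, so the colors will differ from the ones in the map above; the saturation and value settings are arbitrary assumptions.

```python
import colorsys

def cluster_palette(n_clusters):
    """Return n hex colors evenly spaced around the HSV hue wheel."""
    palette = []
    for i in range(n_clusters):
        r, g, b = colorsys.hsv_to_rgb(i / n_clusters, 0.8, 0.9)
        palette.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return palette

palette = cluster_palette(8)
# Map every cluster label 0..7 to its own color.
color_for = {label: palette[label] for label in range(8)}
print(color_for)
```

A dictionary keyed by label also makes it easy to print a legend next to the map.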

In [147]:
#Go through each of the clusters to identify the basis for grouping them together
#CLUSTER 0
dubai_merged.loc[dubai_merged['Cluster Labels'] == 0, dubai_merged.columns[[2] + list(range(4, dubai_merged.shape[1]))]]

# this cluster consists of very diverse neighborhoods, which could be used by the real estate agents to filter based on customer preferences

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,AL HAMRIYA,Art Gallery,Café,Historic Site,Hotel,Coffee Shop,Museum,Tea Room,Middle Eastern Restaurant,Boat or Ferry,Donut Shop
1,AL SOUQ AL KABEER,Hotel,Nightclub,Persian Restaurant,Asian Restaurant,Roof Deck,Gastropub,Electronics Store,Korean Restaurant,Chinese Restaurant,Supermarket
2,NADD AL SHIBA FIRST,Bike Trail,IT Services,Spa,Gym / Fitness Center,Farmers Market,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market
3,AL MAMZAR,Beach,Tea Room,Fried Chicken Joint,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store,Fishing Spot
4,JUMEIRA THIRD,Café,Coffee Shop,Ice Cream Shop,Gym / Fitness Center,Department Store,Plaza,Donut Shop,Burger Joint,French Restaurant,Grocery Store
5,HADAEQ SHEIKH MOHAMMED BIN RASHID,Bike Trail,IT Services,Spa,Gym / Fitness Center,Farmers Market,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market
7,SAIH SHUAIB 1,Comfort Food Restaurant,Yoga Studio,Farmers Market,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store
12,JUMEIRA SECOND,Beach,Spa,Café,Yoga Studio,Fried Chicken Joint,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market
13,RAS AL KHOR IND. THIRD,Department Store,Dessert Shop,Toy / Game Store,Cafeteria,Gym / Fitness Center,BBQ Joint,Yoga Studio,Filipino Restaurant,Food Truck,Food Court
16,AL NAHDA SECOND,Indian Restaurant,Convenience Store,Dessert Shop,Other Nightlife,Farmers Market,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market


In [139]:
#CLUSTER 1
dubai_merged.loc[dubai_merged['Cluster Labels'] == 1, dubai_merged.columns[[2] + list(range(4, dubai_merged.shape[1]))]]

# this cluster consists of neighborhoods suitable for lovers of yoga, camping, fishing, food trucks and flowers

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
11,UMM NAHAD FOURTH,Campground,Yoga Studio,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store,Fishing Spot
70,AL WARQA'A FIFTH,Intersection,Campground,Yoga Studio,Farmers Market,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store
134,AL KHWANEEJ FIRST,Campground,Yoga Studio,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store,Fishing Spot


In [140]:
#CLUSTER 2
dubai_merged.loc[dubai_merged['Cluster Labels'] == 2, dubai_merged.columns[[2] + list(range(4, dubai_merged.shape[1]))]]

# this cluster consists of only one neighborhood, with a pet store and a yoga studio as its most common venues

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
194,NADD AL SHIBA FOURTH,Pet Store,Yoga Studio,Farm,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store,Fishing Spot


In [141]:
#CLUSTER 3
dubai_merged.loc[dubai_merged['Cluster Labels'] == 3, dubai_merged.columns[[2] + list(range(4, dubai_merged.shape[1]))]]

# this cluster consists of neighborhoods suitable for lovers of pools, yoga, farm life and food courts

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
45,WADI AL SAFA 6,Pool,Yoga Studio,Farm,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store,Fishing Spot
207,AL BARSHA SOUTH FIFTH,Pool,Yoga Studio,Farm,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store,Fishing Spot


In [142]:
#CLUSTER 4
dubai_merged.loc[dubai_merged['Cluster Labels'] == 4, dubai_merged.columns[[2] + list(range(4, dubai_merged.shape[1]))]]

# this cluster consists of only one neighborhood, suited to barbecue lovers, with mainly restaurants and food courts as common venues

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
205,MIRDIF,BBQ Joint,Yoga Studio,Fast Food Restaurant,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store


In [143]:
#CLUSTER 5
dubai_merged.loc[dubai_merged['Cluster Labels'] == 5, dubai_merged.columns[[2] + list(range(4, dubai_merged.shape[1]))]]

# this cluster consists of neighborhoods suitable for lovers of mountains, Indian cuisine, flea markets and food courts

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
153,AL TTAY,Indian Restaurant,Mountain,Dance Studio,Fast Food Restaurant,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market
156,AL QUSAIS IND. FIFTH,Mountain,Shop & Service,Farmers Market,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store,Fishing Spot
173,AL QOUZ THIRD,Indian Restaurant,Farmers Market,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store,Fishing Spot


In [144]:
#CLUSTER 6
dubai_merged.loc[dubai_merged['Cluster Labels'] == 6, dubai_merged.columns[[2] + list(range(4, dubai_merged.shape[1]))]]

# this cluster consists of neighborhoods mainly with cafes, yoga studios, fast food restaurants and fishing stores

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
42,BU KADRA,Café,Yoga Studio,Fast Food Restaurant,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store
108,AL HEBIAH FIRST,Café,Housing Development,Gym,Yoga Studio,Fast Food Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market
155,AL QUSAIS IND. FOURTH,Coffee Shop,Café,Yoga Studio,Farmers Market,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store
172,NADD AL SHIBA SECOND,Café,Yoga Studio,Fast Food Restaurant,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store


In [148]:
#CLUSTER 7
dubai_merged.loc[dubai_merged['Cluster Labels'] == 7, dubai_merged.columns[[2] + list(range(4, dubai_merged.shape[1]))]]

# this cluster consists of only one neighborhood, with a rental car location and a yoga studio as its most common venues

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
161,UMM RAMOOL,Rental Car Location,Yoga Studio,Farm,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store,Fishing Spot
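
The cluster-by-cluster inspection above can be condensed into a short summary of cluster sizes and dominant venues. The following is a minimal sketch on toy data, not the notebook's own code: `toy_merged` is a hypothetical stand-in for `dubai_merged`, and its values are illustrative only.

```python
import pandas as pd

# Hypothetical stand-in for dubai_merged; values are illustrative only.
toy_merged = pd.DataFrame({
    "Neighborhood": ["A", "B", "C", "D", "E"],
    "Cluster Labels": [0, 0, 0, 1, 2],
    "1st Most Common Venue": ["Café", "Café", "Hotel", "Campground", "Pet Store"],
})

# Number of neighborhoods in each cluster, ordered by cluster label.
cluster_sizes = toy_merged["Cluster Labels"].value_counts().sort_index()

# Most frequent top-ranked venue within each cluster.
top_venue = toy_merged.groupby("Cluster Labels")["1st Most Common Venue"].agg(
    lambda s: s.value_counts().idxmax()
)

print(cluster_sizes)
print(top_venue)
```

A summary like this makes it easier to spot single-neighborhood clusters, such as clusters 2, 4 and 7 above, before reading each table in detail.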


In [184]:
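def filt(df, cols, venues):
    # for each column, collect the rows whose venue in that column is one of `venues`;
    # a row that matches in several columns appears once per matching column
    matches = [df.loc[df[col].isin(venues)] for col in cols]
    # DataFrame.append was removed in pandas 2.0; pd.concat builds the same result
    return pd.concat(matches, ignore_index=True)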
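
Because `filt` collects matches column by column, a neighborhood that matches in several columns shows up more than once in the result. If each neighborhood should appear only once, one possible alternative is a single boolean mask over all venue columns. This is a sketch under that assumption; `filt_unique` and the toy DataFrame are hypothetical names introduced here for illustration.

```python
import pandas as pd

def filt_unique(df, cols, venues):
    """Return rows where any column in `cols` holds a value from `venues`.

    Each matching row is returned once, in its original order.
    """
    mask = df[cols].isin(venues).any(axis=1)
    return df.loc[mask].reset_index(drop=True)

# Toy data, illustrative only.
toy = pd.DataFrame({
    "Neighborhood": ["A", "B", "C"],
    "1st Most Common Venue": ["Pool", "Café", "Gym"],
    "2nd Most Common Venue": ["Yoga Studio", "Hotel", "Beach"],
})

result = filt_unique(
    toy,
    ["1st Most Common Venue", "2nd Most Common Venue"],
    ["Pool", "Yoga Studio", "Gym"],
)
print(result["Neighborhood"].tolist())
```

Neighborhood "A" matches in both columns but is listed once, so the length of the result equals the number of distinct matching neighborhoods.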

In [185]:
# define criteria for search
CLUSTER = 0
VENUES = ['Playground','Indoor Play Area','Gym','Yoga Studio','Pool','Park']
COLS = ["1st Most Common Venue","2nd Most Common Venue","3rd Most Common Venue","4th Most Common Venue","5th Most Common Venue","6th Most Common Venue","7th Most Common Venue","8th Most Common Venue","9th Most Common Venue","10th Most Common Venue"]

dubai_cluster = dubai_merged.loc[dubai_merged['Cluster Labels'] == CLUSTER, dubai_merged.columns[[2] + list(range(4, dubai_merged.shape[1]))]]
filtered_cluster = filt(dubai_cluster, COLS, VENUES)

In [186]:
filtered_cluster

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,MUHAISNAH FIRST,Indoor Play Area,Chocolate Shop,Yoga Studio,Farmers Market,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store
1,JABAL ALI FIRST,Pool,Carpet Store,Stadium,Restaurant,Yoga Studio,Farm,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store
2,JABAL ALI SECOND,Pool,Carpet Store,Stadium,Restaurant,Yoga Studio,Farm,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store
3,AL BARSHA FIRST,Gym,Pizza Place,Indian Restaurant,Coffee Shop,Clothing Store,Fast Food Restaurant,Café,Burger Joint,Sandwich Place,Food Truck
4,SAIH SHUAIB 1,Comfort Food Restaurant,Yoga Studio,Farmers Market,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store
5,NADD AL SHIBA THIRD,Cafeteria,Yoga Studio,Fast Food Restaurant,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store
6,MUHAISANAH THIRD,Clothing Store,Gym,Café,Yoga Studio,Fast Food Restaurant,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market
7,AL KHWANEEJ SECOND,Health Food Store,Yoga Studio,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store,Fishing Spot,Filipino Restaurant
8,AL HEBIAH FOURTH,Soccer Field,Pool,Gym / Fitness Center,Bistro,Beer Garden,Outdoor Gym,Hockey Field,Sports Bar,Farmers Market,Food & Drink Shop
9,ZAA'BEEL SECOND,Punjabi Restaurant,Yoga Studio,Farm,Food Truck,Food Court,Food & Drink Shop,Flower Shop,Flea Market,Fishing Store,Fishing Spot


## As a result, we have identified 78 neighborhoods in cluster 0 that have venues matching the criteria given by the client
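
With this many matches, a client may want the candidates ordered by how many of the requested venue types appear among a neighborhood's most common venues. The following is a minimal sketch of one way to do that, not part of the original analysis: `rank_by_matches` and the `Matches` column are hypothetical names, and the toy data is illustrative only.

```python
import pandas as pd

def rank_by_matches(df, cols, venues):
    """Count how many of `cols` contain a value from `venues`, then sort by that count."""
    scored = df.copy()
    scored["Matches"] = df[cols].isin(venues).sum(axis=1)
    # Keep only neighborhoods with at least one matching venue.
    scored = scored[scored["Matches"] > 0]
    # A stable sort keeps the original order among equal counts.
    return scored.sort_values("Matches", ascending=False, kind="stable").reset_index(drop=True)

# Toy data, illustrative only.
toy = pd.DataFrame({
    "Neighborhood": ["A", "B", "C"],
    "1st Most Common Venue": ["Gym", "Café", "Pool"],
    "2nd Most Common Venue": ["Beach", "Hotel", "Yoga Studio"],
})

ranked = rank_by_matches(
    toy,
    ["1st Most Common Venue", "2nd Most Common Venue"],
    ["Pool", "Yoga Studio", "Gym"],
)
print(ranked[["Neighborhood", "Matches"]])
```

The count treats every venue column equally. Weighting the 1st most common venue more heavily than the 10th would be a natural refinement, but it is left out here to keep the sketch short.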