# Capstone Project - The Battle of the Neighborhoods 
### Optimal location for my healthy food brand in Doha

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## Introduction: Business Problem <a name="introduction"></a>

This project is based on a real life problem. I have a home-run healthy snacks company and I am looking for potential spot to set up shop here in Doha, Qatar. I'm going to try and be as practical as I can with this decision seeing how I am personally invested.

The idea of healthy eating is gaining popularity here in Qatar. Slowly but surely, people are making an effort to eat responsibly, but it is important for people to easily find a healthy option near to wherever they are. In this project we want to focus on localities that we popular in Doha since that means most foot-traffic for our business.

Using all my data science experience, I set forth to discover the best location possible for this investment.

## Data <a name="data"></a>

The most important factors to consider are:
* popular spots aroud Doha

Qatar is divided into eight municipalities, and each municipality is further divided into zones.
We will focus this research on the zones under the Doha municipality, which will serve as our defined neighbourhoods.

I am relying on the following sources to get the data I will need:
* coordinates of Doha using Google Maps
* coordinates of the zones using latlong.net
* Nearby popular spots using Foursquare API

In [53]:
import requests # library to handle requests
import pandas as pd # library for data analsysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation

!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 
    
# tranforming json file into a pandas dataframe library
from pandas.io.json import json_normalize

!conda install -c conda-forge folium=0.5.0 --yes
import folium # plotting library

print('Folium installed')
print('Libraries imported.')


Solving environment: done

# All requested packages already installed.

Solving environment: done

# All requested packages already installed.

Folium installed
Libraries imported.


### Defining the neighbourhoods

Let's create a dataframe with the latitudes and longitudes of the zones we want to focus on

In [84]:
from bs4 import BeautifulSoup

res = requests.get("https://en.wikipedia.org/wiki/Zones_of_Qatar")
soup = BeautifulSoup(res.content,'lxml')

tables = soup.find_all('table', class_='wikitable')

d = {'Zone': [3,4,14,15,16,17,22,24,25,26,27,30,31,32,33,34,35,36,37,38,40,41,42,45,46,61,63,64,67,68],'Districts': ['Fereej Mohamed Bin Jasim','Mushayrib','Fereej Abdel Aziz','Ad Dawhah al Jadidah','Old Al Ghanim','Al Rufaa','Fereej Bin Mahmoud','Rawdat Al Khail','Fereej Bin Durham','Najma','Umm Ghuwailina','Duhail','Umm Lekhba','Madinat Khalifa North','Al Markhiya','Madinat Khalifa South','Fereej Kulaib','Al Messila','Fereej Bin Omran','Al Sadd','New Salatah','Nuaija','Al Hilal','Old Airport','Al Thumama','Al Dafna','Onaiza','Lejbailat','Hazm Al Markhiya','Jelaiah'], 'Population': [4886,28069,15706,15920,16334,6026,28327,18200,37082,28228,33262,7705,11897,12364,6242,38247,6507,6803,26121,41673,16086,33379,11671,48525,21367,4022,37461,4151,8967,5521], 'Latitude':[25.2865,25.2818,25.2777,25.2776,25.28,25.2853,25.2803,25.286,25.2693,25.2683,25.2766,25.3477,25.3477,25.329,25.3388,25.3156,25.3138,25.3006,25.3038,25.2838,25.2623,25.2467,25.2599,25.2481,25.2316,25.3077,25.3469,25.3212,25.3388,25.3522], 'Longitude':[51.5296,51.5275,51.5242,51.5321,51.54,51.5444,51.5124,51.5142,51.5295,51.5387,51.5492,51.4675,51.4675,51.4756,51.4992,51.4808,51.4914,51.4808,51.4953,51.4914,51.5094,51.5334,51.5439,51.5544,51.5413,51.5163,51.5176,51.5032,51.4992,51.4861]}
df = pd.DataFrame(data=d)
df



Unnamed: 0,Zone,Districts,Population,Latitude,Longitude
0,3,Fereej Mohamed Bin Jasim,4886,25.2865,51.5296
1,4,Mushayrib,28069,25.2818,51.5275
2,14,Fereej Abdel Aziz,15706,25.2777,51.5242
3,15,Ad Dawhah al Jadidah,15920,25.2776,51.5321
4,16,Old Al Ghanim,16334,25.28,51.54
5,17,Al Rufaa,6026,25.2853,51.5444
6,22,Fereej Bin Mahmoud,28327,25.2803,51.5124
7,24,Rawdat Al Khail,18200,25.286,51.5142
8,25,Fereej Bin Durham,37082,25.2693,51.5295
9,26,Najma,28228,25.2683,51.5387


We will now print the neihgbourhoods on a map

In [100]:
import folium

print('Folium installed and imported!')

latitude = 25.2856329
longitude = 51.5264162
# create map and display it
sanfran_map = folium.Map(location=[latitude, longitude], zoom_start=10)

Folium installed and imported!


In [103]:
# instantiate a feature group for the incidents in the dataframe
incidents = folium.map.FeatureGroup()


for lat, lng, station in zip(df.Latitude, df.Longitude,df.Districts):
    incidents.add_child(
        folium.features.CircleMarker(
            [lat, lng],
            radius=5, # define how big you want the circle markers to be
            color='yellow',
            fill=True,
            fill_color='blue',
            fill_opacity=0.6
        )
    )

# add incidents to map
sanfran_map.add_child(incidents)

### Foursquare
Now that we have the latitudes and longitudes of each neighbourhood, lets use foursquare to check for fitness centers nearby which is our priority since we expect most of our business to be generated by health professionals.

In [112]:
CLIENT_ID = 'WBOFWONU5OGJYAT2FA5MCFEFGRTX1VYNQTCFMK2HKJVHONFB' # your Foursquare ID
CLIENT_SECRET = 'WWJELALMLDLOWY5RK1AIIIB3SMWWBOIU4ECBE4045IZCWAK0' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 30
print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentails:
CLIENT_ID: WBOFWONU5OGJYAT2FA5MCFEFGRTX1VYNQTCFMK2HKJVHONFB
CLIENT_SECRET:WWJELALMLDLOWY5RK1AIIIB3SMWWBOIU4ECBE4045IZCWAK0


In [169]:
address = 'Doha'
geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

25.2856329 51.5264162


In [170]:
def getNearbyVenues(names, latitudes, longitudes, radius=3000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
          # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [171]:
Venues = getNearbyVenues(names=df['Districts'],
                                   latitudes=df['Latitude'],
                                   longitudes=df['Longitude']
                                  )

Fereej Mohamed Bin Jasim
Mushayrib
Fereej Abdel Aziz
Ad Dawhah al Jadidah
Old Al Ghanim
Al Rufaa
Fereej Bin Mahmoud
Rawdat Al Khail
Fereej Bin Durham
Najma
Umm Ghuwailina
Duhail
Umm Lekhba
Madinat Khalifa North
Al Markhiya
Madinat Khalifa South
Fereej Kulaib
Al Messila
Fereej Bin Omran
Al Sadd
New Salatah
Nuaija
Al Hilal
Old Airport
Al Thumama
Al Dafna
Onaiza
Lejbailat
Hazm Al Markhiya
Jelaiah


In [172]:
print(Venues.shape)
Venues

(900, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Fereej Mohamed Bin Jasim,25.2865,51.5296,Usta Turkish Kebap & Doner,25.286076,51.531224,Turkish Restaurant
1,Fereej Mohamed Bin Jasim,25.2865,51.5296,Souq Waqif (سوق واقف),25.286988,51.533154,Flea Market
2,Fereej Mohamed Bin Jasim,25.2865,51.5296,Souk Waqif Art Center,25.286446,51.532142,Art Gallery
3,Fereej Mohamed Bin Jasim,25.2865,51.5296,Jasmine Thai Restaurant,25.288038,51.532121,Thai Restaurant
4,Fereej Mohamed Bin Jasim,25.2865,51.5296,% Arabica,25.285486,51.530014,Coffee Shop
5,Fereej Mohamed Bin Jasim,25.2865,51.5296,حلويات العكر,25.286898,51.532861,Dessert Shop
6,Fereej Mohamed Bin Jasim,25.2865,51.5296,Beirut Restaurant,25.286164,51.531401,Middle Eastern Restaurant
7,Fereej Mohamed Bin Jasim,25.2865,51.5296,شاي و رقاق,25.288565,51.531535,Tea Room
8,Fereej Mohamed Bin Jasim,25.2865,51.5296,Al bidda Hotel (فندق البدع),25.289460,51.532763,Boarding House
9,Fereej Mohamed Bin Jasim,25.2865,51.5296,Zaatar W Zeit (زعتر وٓ زيت),25.286681,51.532546,Mediterranean Restaurant


In [173]:
Venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Ad Dawhah al Jadidah,30,30,30,30,30,30
Al Dafna,30,30,30,30,30,30
Al Hilal,30,30,30,30,30,30
Al Markhiya,30,30,30,30,30,30
Al Messila,30,30,30,30,30,30
Al Rufaa,30,30,30,30,30,30
Al Sadd,30,30,30,30,30,30
Al Thumama,30,30,30,30,30,30
Duhail,30,30,30,30,30,30
Fereej Abdel Aziz,30,30,30,30,30,30


In [145]:
print('There are {} uniques categories.'.format(len(Venues['Venue Category'].unique())))

There are 114 uniques categories.


Now we will analyze each neighbourhood

In [146]:
# one hot encoding
hot= pd.get_dummies(Venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
hot['Neighborhood'] = Venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [hot.columns[-1]] + list(hot.columns[:-1])
hot = hot[fixed_columns]

print(hot.shape)
hot.head()

(900, 115)


Unnamed: 0,Neighborhood,Afghan Restaurant,African Restaurant,Airport,American Restaurant,Arcade,Art Gallery,Art Museum,Asian Restaurant,Athletics & Sports,...,Tennis Stadium,Tex-Mex Restaurant,Thai Restaurant,Theater,Theme Restaurant,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Volleyball Court,Waterfront
0,Fereej Mohamed Bin Jasim,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,1,0,0,0,0
1,Fereej Mohamed Bin Jasim,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Fereej Mohamed Bin Jasim,0,0,0,0,0,1,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Fereej Mohamed Bin Jasim,0,0,0,0,0,0,0,0,0,...,0,0,1,0,0,0,0,0,0,0
4,Fereej Mohamed Bin Jasim,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [147]:
hot_grouped = hot.groupby('Neighborhood').mean().reset_index()
hot_grouped

Unnamed: 0,Neighborhood,Afghan Restaurant,African Restaurant,Airport,American Restaurant,Arcade,Art Gallery,Art Museum,Asian Restaurant,Athletics & Sports,...,Tennis Stadium,Tex-Mex Restaurant,Thai Restaurant,Theater,Theme Restaurant,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Volleyball Court,Waterfront
0,Ad Dawhah al Jadidah,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,...,0.0,0.0,0.033333,0.0,0.0,0.033333,0.033333,0.0,0.0,0.0
1,Al Dafna,0.0,0.033333,0.0,0.066667,0.0,0.0,0.0,0.0,0.066667,...,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333
2,Al Hilal,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.033333,0.0,...,0.0,0.033333,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0
3,Al Markhiya,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Al Messila,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Al Rufaa,0.0,0.0,0.0,0.0,0.0,0.033333,0.066667,0.0,0.0,...,0.0,0.0,0.033333,0.0,0.0,0.033333,0.0,0.0,0.0,0.0
6,Al Sadd,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,...,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0
7,Al Thumama,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.033333,0.0,...,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Duhail,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Fereej Abdel Aziz,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,...,0.0,0.0,0.033333,0.0,0.0,0.033333,0.033333,0.0,0.0,0.0


In [148]:
hot_grouped.shape

(30, 115)

In [151]:
num_top_venues = 5

for hood in hot_grouped['Neighborhood']:
    print(hood)
    temp = hot_grouped[hot_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

Ad Dawhah al Jadidah
                           venue  freq
0                          Hotel  0.13
1       Mediterranean Restaurant  0.07
2                           Café  0.07
3                     Restaurant  0.07
4  Vegetarian / Vegan Restaurant  0.03


Al Dafna
                 venue  freq
0          Coffee Shop  0.17
1                 Café  0.17
2  American Restaurant  0.07
3                 Park  0.07
4   Athletics & Sports  0.07


Al Hilal
                  venue  freq
0                  Café  0.20
1           Coffee Shop  0.10
2                 Hotel  0.07
3            Restaurant  0.03
4  Brazilian Restaurant  0.03


Al Markhiya
                  venue  freq
0                  Café  0.20
1           Coffee Shop  0.17
2  Fast Food Restaurant  0.07
3           Supermarket  0.07
4         Grocery Store  0.07


Al Messila
                  venue  freq
0                  Café  0.13
1            Restaurant  0.10
2                 Hotel  0.10
3  Fast Food Restaurant  0.07
4    Persian

In [152]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = hot_grouped['Neighborhood']

for ind in np.arange(hot_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(hot_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Ad Dawhah al Jadidah,Hotel,Café,Mediterranean Restaurant,Restaurant,Gym,Dessert Shop,Coffee Shop,Middle Eastern Restaurant,Fast Food Restaurant,Burger Joint
1,Al Dafna,Café,Coffee Shop,American Restaurant,Athletics & Sports,Park,Waterfront,Snack Place,African Restaurant,Food Court,Food Truck
2,Al Hilal,Café,Coffee Shop,Hotel,Italian Restaurant,Pizza Place,Brazilian Restaurant,Restaurant,Fast Food Restaurant,Malay Restaurant,Lounge
3,Al Markhiya,Café,Coffee Shop,Grocery Store,Fast Food Restaurant,Supermarket,Spa,Hotel,Kebab Restaurant,Donut Shop,Park
4,Al Messila,Café,Hotel,Restaurant,Persian Restaurant,Middle Eastern Restaurant,Fast Food Restaurant,Hobby Shop,Snack Place,Park,Convenience Store
5,Al Rufaa,Hotel,Restaurant,Middle Eastern Restaurant,Art Museum,Dessert Shop,Coffee Shop,Mediterranean Restaurant,Flea Market,Café,Boarding House
6,Al Sadd,Hotel,Middle Eastern Restaurant,Café,Coffee Shop,French Restaurant,Seafood Restaurant,Italian Restaurant,Spa,Asian Restaurant,Bar
7,Al Thumama,Café,Coffee Shop,Shopping Mall,Grocery Store,Pizza Place,Spa,Park,Middle Eastern Restaurant,Restaurant,Market
8,Duhail,Coffee Shop,Café,Shopping Mall,Dessert Shop,Restaurant,Breakfast Spot,Supermarket,Italian Restaurant,French Restaurant,Park
9,Fereej Abdel Aziz,Hotel,Pakistani Restaurant,Mediterranean Restaurant,Chaat Place,Bar,Coffee Shop,Middle Eastern Restaurant,Café,Fast Food Restaurant,Pool Hall


Next we want to run cluster the neighbourhood into 5 clusters

In [154]:
from sklearn.cluster import KMeans

kclusters = 5

Cluster = hot_grouped.drop('Neighborhood', 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(Cluster)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([2, 3, 3, 3, 4, 2, 0, 3, 1, 0], dtype=int32)

In [156]:
Merged = df

# add clustering labels
Merged['Cluster Labels'] = kmeans.labels_

# merge hot_grouped with df_data-1 to add latitude/longitude for each neighborhood
Merged = Merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Districts')

Merged # check the last columns!

Unnamed: 0,Zone,Districts,Population,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,3,Fereej Mohamed Bin Jasim,4886,25.2865,51.5296,2,Café,Hotel,Middle Eastern Restaurant,Bakery,Waterfront,Mediterranean Restaurant,Flea Market,Boarding House,Fast Food Restaurant,Dessert Shop
1,4,Mushayrib,28069,25.2818,51.5275,3,Hotel,Mediterranean Restaurant,Middle Eastern Restaurant,Café,Bakery,Jewelry Store,Fried Chicken Joint,Flea Market,Filipino Restaurant,Fast Food Restaurant
2,14,Fereej Abdel Aziz,15706,25.2777,51.5242,3,Hotel,Pakistani Restaurant,Mediterranean Restaurant,Chaat Place,Bar,Coffee Shop,Middle Eastern Restaurant,Café,Fast Food Restaurant,Pool Hall
3,15,Ad Dawhah al Jadidah,15920,25.2776,51.5321,3,Hotel,Café,Mediterranean Restaurant,Restaurant,Gym,Dessert Shop,Coffee Shop,Middle Eastern Restaurant,Fast Food Restaurant,Burger Joint
4,16,Old Al Ghanim,16334,25.28,51.54,4,Hotel,Middle Eastern Restaurant,Café,Restaurant,Boarding House,Gym,Jewelry Store,Lebanese Restaurant,Mediterranean Restaurant,Flea Market
5,17,Al Rufaa,6026,25.2853,51.5444,2,Hotel,Restaurant,Middle Eastern Restaurant,Art Museum,Dessert Shop,Coffee Shop,Mediterranean Restaurant,Flea Market,Café,Boarding House
6,22,Fereej Bin Mahmoud,28327,25.2803,51.5124,0,Café,Italian Restaurant,Coffee Shop,Hotel,Seafood Restaurant,Department Store,Middle Eastern Restaurant,French Restaurant,Mexican Restaurant,Korean Restaurant
7,24,Rawdat Al Khail,18200,25.286,51.5142,3,Italian Restaurant,Café,Hotel,Department Store,Coffee Shop,Seafood Restaurant,Burger Joint,Bar,Bakery,Flea Market
8,25,Fereej Bin Durham,37082,25.2693,51.5295,1,Hotel,Turkish Restaurant,Café,Coffee Shop,Filipino Restaurant,Burger Joint,Pool Hall,Department Store,Fast Food Restaurant,Flea Market
9,26,Najma,28228,25.2683,51.5387,0,Café,Hotel,Turkish Restaurant,Mediterranean Restaurant,Coffee Shop,Italian Restaurant,Flea Market,Brazilian Restaurant,Pool Hall,Malay Restaurant


In [158]:
import matplotlib.cm as cm
import matplotlib.colors as colors
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(Merged['Latitude'], Merged['Longitude'], Merged['Districts'], Merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

Lastly, we will examine each cluster

In [159]:
##Cluster1
test0=Merged.loc[Merged['Cluster Labels'] == 0]
print(test0.shape)
test0

(6, 16)


Unnamed: 0,Zone,Districts,Population,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
6,22,Fereej Bin Mahmoud,28327,25.2803,51.5124,0,Café,Italian Restaurant,Coffee Shop,Hotel,Seafood Restaurant,Department Store,Middle Eastern Restaurant,French Restaurant,Mexican Restaurant,Korean Restaurant
9,26,Najma,28228,25.2683,51.5387,0,Café,Hotel,Turkish Restaurant,Mediterranean Restaurant,Coffee Shop,Italian Restaurant,Flea Market,Brazilian Restaurant,Pool Hall,Malay Restaurant
11,30,Duhail,7705,25.3477,51.4675,0,Coffee Shop,Café,Shopping Mall,Dessert Shop,Restaurant,Breakfast Spot,Supermarket,Italian Restaurant,French Restaurant,Park
22,42,Al Hilal,11671,25.2599,51.5439,0,Café,Coffee Shop,Hotel,Italian Restaurant,Pizza Place,Brazilian Restaurant,Restaurant,Fast Food Restaurant,Malay Restaurant,Lounge
26,63,Onaiza,37461,25.3469,51.5176,0,Café,Beach,Italian Restaurant,Restaurant,Chinese Restaurant,Opera House,Office,Nightclub,Music Venue,Cupcake Shop
27,64,Lejbailat,4151,25.3212,51.5032,0,Café,Coffee Shop,Indian Restaurant,Athletics & Sports,African Restaurant,Spa,Pastry Shop,Department Store,Pizza Place,Donut Shop


In [160]:
##Cluster2
test1=Merged.loc[Merged['Cluster Labels'] == 1]
print(test1.shape)
test1

(4, 16)


Unnamed: 0,Zone,Districts,Population,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
8,25,Fereej Bin Durham,37082,25.2693,51.5295,1,Hotel,Turkish Restaurant,Café,Coffee Shop,Filipino Restaurant,Burger Joint,Pool Hall,Department Store,Fast Food Restaurant,Flea Market
16,35,Fereej Kulaib,6507,25.3138,51.4914,1,Café,Middle Eastern Restaurant,Spa,Fast Food Restaurant,Restaurant,Coffee Shop,Department Store,Persian Restaurant,Pizza Place,Donut Shop
18,37,Fereej Bin Omran,26121,25.3038,51.4953,1,Restaurant,Coffee Shop,Hotel,Indian Restaurant,Café,Office,Department Store,Fast Food Restaurant,Donut Shop,Middle Eastern Restaurant
29,68,Jelaiah,5521,25.3522,51.4861,1,Coffee Shop,Café,Supermarket,Grocery Store,Dessert Shop,Shopping Mall,Gym / Fitness Center,Playground,Plaza,Cafeteria


In [161]:
##Cluster3
test2=Merged.loc[Merged['Cluster Labels'] == 2]
print(test2.shape)
test2

(8, 16)


Unnamed: 0,Zone,Districts,Population,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,3,Fereej Mohamed Bin Jasim,4886,25.2865,51.5296,2,Café,Hotel,Middle Eastern Restaurant,Bakery,Waterfront,Mediterranean Restaurant,Flea Market,Boarding House,Fast Food Restaurant,Dessert Shop
5,17,Al Rufaa,6026,25.2853,51.5444,2,Hotel,Restaurant,Middle Eastern Restaurant,Art Museum,Dessert Shop,Coffee Shop,Mediterranean Restaurant,Flea Market,Café,Boarding House
10,27,Umm Ghuwailina,33262,25.2766,51.5492,2,Hotel,Restaurant,Café,Middle Eastern Restaurant,Flea Market,Burger Joint,Chinese Restaurant,Park,Fast Food Restaurant,Italian Restaurant
14,33,Al Markhiya,6242,25.3388,51.4992,2,Café,Coffee Shop,Grocery Store,Fast Food Restaurant,Supermarket,Spa,Hotel,Kebab Restaurant,Donut Shop,Park
20,40,New Salatah,16086,25.2623,51.5094,2,Hotel,Coffee Shop,Department Store,Sushi Restaurant,Italian Restaurant,Movie Theater,Café,Pizza Place,Dessert Shop,Diner
21,41,Nuaija,33379,25.2467,51.5334,2,Coffee Shop,Café,Pizza Place,Italian Restaurant,Soccer Stadium,Hotel,Ice Cream Shop,Indian Restaurant,Volleyball Court,Market
25,61,Al Dafna,4022,25.3077,51.5163,2,Café,Coffee Shop,American Restaurant,Athletics & Sports,Park,Waterfront,Snack Place,African Restaurant,Food Court,Food Truck
28,67,Hazm Al Markhiya,8967,25.3388,51.4992,2,Café,Coffee Shop,Grocery Store,Fast Food Restaurant,Supermarket,Spa,Hotel,Kebab Restaurant,Donut Shop,Park


In [162]:
##Cluster4
test3=Merged.loc[Merged['Cluster Labels'] == 3]
print(test3.shape)
test3

(8, 16)


Unnamed: 0,Zone,Districts,Population,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,4,Mushayrib,28069,25.2818,51.5275,3,Hotel,Mediterranean Restaurant,Middle Eastern Restaurant,Café,Bakery,Jewelry Store,Fried Chicken Joint,Flea Market,Filipino Restaurant,Fast Food Restaurant
2,14,Fereej Abdel Aziz,15706,25.2777,51.5242,3,Hotel,Pakistani Restaurant,Mediterranean Restaurant,Chaat Place,Bar,Coffee Shop,Middle Eastern Restaurant,Café,Fast Food Restaurant,Pool Hall
3,15,Ad Dawhah al Jadidah,15920,25.2776,51.5321,3,Hotel,Café,Mediterranean Restaurant,Restaurant,Gym,Dessert Shop,Coffee Shop,Middle Eastern Restaurant,Fast Food Restaurant,Burger Joint
7,24,Rawdat Al Khail,18200,25.286,51.5142,3,Italian Restaurant,Café,Hotel,Department Store,Coffee Shop,Seafood Restaurant,Burger Joint,Bar,Bakery,Flea Market
15,34,Madinat Khalifa South,38247,25.3156,51.4808,3,Café,Fast Food Restaurant,Middle Eastern Restaurant,Restaurant,Persian Restaurant,Pharmacy,Shawarma Place,Office,Spa,Donut Shop
17,36,Al Messila,6803,25.3006,51.4808,3,Café,Hotel,Restaurant,Persian Restaurant,Middle Eastern Restaurant,Fast Food Restaurant,Hobby Shop,Snack Place,Park,Convenience Store
23,45,Old Airport,48525,25.2481,51.5544,3,Café,Coffee Shop,Fast Food Restaurant,Chinese Restaurant,Spa,Convenience Store,Pizza Place,Middle Eastern Restaurant,Restaurant,Mediterranean Restaurant
24,46,Al Thumama,21367,25.2316,51.5413,3,Café,Coffee Shop,Shopping Mall,Grocery Store,Pizza Place,Spa,Park,Middle Eastern Restaurant,Restaurant,Market


In [163]:
##Cluster5
test4=Merged.loc[Merged['Cluster Labels'] == 4]
print(test4.shape)
test4

(4, 16)


Unnamed: 0,Zone,Districts,Population,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,16,Old Al Ghanim,16334,25.28,51.54,4,Hotel,Middle Eastern Restaurant,Café,Restaurant,Boarding House,Gym,Jewelry Store,Lebanese Restaurant,Mediterranean Restaurant,Flea Market
12,31,Umm Lekhba,11897,25.3477,51.4675,4,Coffee Shop,Café,Shopping Mall,Dessert Shop,Restaurant,Breakfast Spot,Supermarket,Italian Restaurant,French Restaurant,Park
13,32,Madinat Khalifa North,12364,25.329,51.4756,4,Café,Coffee Shop,Shopping Mall,Clothing Store,Dessert Shop,Restaurant,Electronics Store,Burger Joint,Shawarma Place,French Restaurant
19,38,Al Sadd,41673,25.2838,51.4914,4,Hotel,Middle Eastern Restaurant,Café,Coffee Shop,French Restaurant,Seafood Restaurant,Italian Restaurant,Spa,Asian Restaurant,Bar
