<a href="https://cognitiveclass.ai"><img src = "https://ibm.box.com/shared/static/9gegpsmnsoo25ikkbl4qzlvlyjbgxs5x.png" width = 400> </a>

<h1 align=center><font size = 5>Segmenting and Clustering Neighborhoods in Doha, Qatar</font></h1>

## Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">

<font size = 3>

1. <a href="#item1">Importing Libraries</a>

2. <a href="#item2">Web Scraping and Data Wrangling</a>

3. <a href="#item3">Adding Longitude and Latitude to DataFrame</a>

4. <a href="#item4">Exploring Data and Clustering Neighborhoods</a>

5. <a href="#item5">Examine Clusters</a>    
</font>
</div>

## 1. Importing Libraries

Before we get the data and start exploring it, let's download all the dependencies that we will need.

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # installs geopy; comment this line out if geopy is already installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

!conda install -c anaconda beautifulsoup4 --yes
from bs4 import BeautifulSoup

!pip install html5lib

!pip install lxml

!pip install geocoder

import geocoder

print('All Libraries imported!')


<a id='item1'></a>

## 2. Web Scraping and Data Wrangling

Wikipedia has a page listing the communities in Doha. Let's scrape it with BeautifulSoup and convert the table into a pandas DataFrame.

In [3]:
url = "https://en.wikipedia.org/wiki/List_of_communities_in_Doha"

In [4]:
res = requests.get(url)

In [5]:
soup = BeautifulSoup(res.content,'html5lib')
table = soup.find_all('table')[0] 

In [6]:
df = pd.read_html(str(table))[0]

In [7]:
df.head()

Unnamed: 0,Community,Area(km2),Population (2010),Population density(/km2)
0,Al Bidda,0.8 km²,1067.0,"1,398.0/km²"
1,Al Dafna,1.1 km²,19.0,17.7/km²
2,Ad Dawhah al Jadidah,0.5 km²,13059.0,"27,358.5/km²"
3,Al Egla,,,
4,Al Hilal,1.8 km²,11257.0,"6,393.4/km²"


In [8]:
df.shape

(60, 4)

Let's append the city and country names to each community name so that the geocoder is more likely to find its coordinates.

In [9]:
df["Address"] = df["Community"] + ", Doha, Qatar"

In [10]:
df.head()

Unnamed: 0,Community,Area(km2),Population (2010),Population density(/km2),Address
0,Al Bidda,0.8 km²,1067.0,"1,398.0/km²","Al Bidda, Doha, Qatar"
1,Al Dafna,1.1 km²,19.0,17.7/km²,"Al Dafna, Doha, Qatar"
2,Ad Dawhah al Jadidah,0.5 km²,13059.0,"27,358.5/km²","Ad Dawhah al Jadidah, Doha, Qatar"
3,Al Egla,,,,"Al Egla, Doha, Qatar"
4,Al Hilal,1.8 km²,11257.0,"6,393.4/km²","Al Hilal, Doha, Qatar"


Let's find the coordinates of all communities.

In [12]:
from geopy.extra.rate_limiter import RateLimiter

In [13]:
locator = Nominatim(user_agent="myGeocoder")

In [14]:
# 1 - convenience wrapper that adds a delay between geocoding calls
geocode = RateLimiter(locator.geocode, min_delay_seconds=1)
# 2 - create the location column
df['location'] = df['Address'].apply(geocode)
# 3 - extract (latitude, longitude, altitude) from the location column as a tuple; None if no match
df['point'] = df['location'].apply(lambda loc: tuple(loc.point) if loc else None)
# 4 - split the point column into latitude, longitude and altitude columns
df[['latitude', 'longitude', 'altitude']] = pd.DataFrame(df['point'].tolist(), index=df.index)
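
Some addresses will not be found by the geocoder, in which case the `point` value is `None`. The following is a minimal sketch of step 4 that handles those rows explicitly rather than relying on how `pd.DataFrame` treats `None` entries in a list. The `points` series is a hypothetical stand-in for `df['point']`, and `coords` is an illustrative name.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df['point']: (lat, lon, alt) tuples, or None
# where the geocoder found no match.
points = pd.Series([
    (25.2902432, 51.526697, 0.0),
    None,
    (25.2603758, 51.5464884, 0.0),
])

# Replace each missing entry with a tuple of NaNs so every row has 3 values.
filled = [p if p is not None else (np.nan, np.nan, np.nan) for p in points]

# Expand the tuples into three named columns, keeping the original index.
coords = pd.DataFrame(
    filled,
    columns=['latitude', 'longitude', 'altitude'],
    index=points.index,
)
print(coords)
```

Rows without a match end up with NaN in all three columns, which is what the later `dropna` step relies on.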

In [15]:
df.head()

Unnamed: 0,Community,Area(km2),Population (2010),Population density(/km2),Address,location,point,latitude,longitude,altitude
0,Al Bidda,0.8 km²,1067.0,"1,398.0/km²","Al Bidda, Doha, Qatar","(البدع, الدوحة, ‏قطر‎, (25.2902432, 51.526697))","(25.2902432, 51.526697, 0.0)",25.290243,51.526697,0.0
1,Al Dafna,1.1 km²,19.0,17.7/km²,"Al Dafna, Doha, Qatar","(Al Dafna (61), الدوحة, 00000, ‏قطر‎, (25.3195...","(25.3195836, 51.536284, 0.0)",25.319584,51.536284,0.0
2,Ad Dawhah al Jadidah,0.5 km²,13059.0,"27,358.5/km²","Ad Dawhah al Jadidah, Doha, Qatar",,,,,
3,Al Egla,,,,"Al Egla, Doha, Qatar",,,,,
4,Al Hilal,1.8 km²,11257.0,"6,393.4/km²","Al Hilal, Doha, Qatar","(الهلال, الدوحة, +974, ‏قطر‎, (25.2603758, 51....","(25.2603758, 51.5464884, 0.0)",25.260376,51.546488,0.0


In [164]:
df

Unnamed: 0,Community,Area(km2),Population (2010),Population density(/km2),Address,location,point,latitude,longitude,altitude
0,Al Bidda,0.8 km²,1067.0,"1,398.0/km²","Al Bidda, Doha, Qatar","(البدع, الدوحة, ‏قطر‎, (25.2902432, 51.526697))","(25.2902432, 51.526697, 0.0)",25.290243,51.526697,0.0
1,Al Dafna,1.1 km²,19.0,17.7/km²,"Al Dafna, Doha, Qatar","(Al Dafna (61), الدوحة, 00000, ‏قطر‎, (25.3195...","(25.3195836, 51.536284, 0.0)",25.319584,51.536284,0.0
2,Ad Dawhah al Jadidah,0.5 km²,13059.0,"27,358.5/km²","Ad Dawhah al Jadidah, Doha, Qatar",,,,,
3,Al Egla,,,,"Al Egla, Doha, Qatar",,,,,
4,Al Hilal,1.8 km²,11257.0,"6,393.4/km²","Al Hilal, Doha, Qatar","(الهلال, الدوحة, +974, ‏قطر‎, (25.2603758, 51....","(25.2603758, 51.5464884, 0.0)",25.260376,51.546488,0.0
5,Al Jasrah,0.4 km²,240.0,672.7/km²,"Al Jasrah, Doha, Qatar",,,,,
6,Al Kharayej,,,,"Al Kharayej, Doha, Qatar","(الخرايج, حدائق العين, الريان, 40466, ‏قطر‎, (...","(25.2098364, 51.4549741, 0.0)",25.209836,51.454974,0.0
7,Al Khulaifat,,,,"Al Khulaifat, Doha, Qatar","(الشرق - الخليفات, الدوحة, 1911, ‏قطر‎, (25.28...","(25.2850614, 51.5527151, 0.0)",25.285061,51.552715,0.0
8,Al Mansoura,,,,"Al Mansoura, Doha, Qatar","(المنصورة, الدوحة, 10849, ‏قطر‎, (25.2656514, ...","(25.2656514, 51.5321106, 0.0)",25.265651,51.532111,0.0
9,Al Markhiya,2.7 km²,5197.0,"1,894.2/km²","Al Markhiya, Doha, Qatar","(Al Markhiya (33), الدوحة, ‏قطر‎, (25.3281657,...","(25.3281657, 51.4936494, 0.0)",25.328166,51.493649,0.0


In [16]:
df_doha = df[["Community", "latitude", "longitude"]]

In [17]:
df_doha.head()

Unnamed: 0,Community,latitude,longitude
0,Al Bidda,25.290243,51.526697
1,Al Dafna,25.319584,51.536284
2,Ad Dawhah al Jadidah,,
3,Al Egla,,
4,Al Hilal,25.260376,51.546488


In [18]:
df_doha.rename(columns={"Community":"Neighborhood", "latitude": "Latitude", "longitude":"Longitude"}, inplace = True)

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return super().rename(**kwargs)
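
The warning above appears because `df_doha` was created by selecting columns from `df`, so pandas cannot tell whether the in-place `rename` is meant to affect `df` as well. The in-place `dropna` call triggers it for the same reason. One common way around it, sketched here on a toy frame, is to take an explicit copy and chain the non-in-place versions of the same steps. The names `df_demo` and `df_doha_demo` are illustrative and not part of the notebook.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the geocoded frame; values are taken from the outputs above.
df_demo = pd.DataFrame({
    'Community': ['Al Bidda', 'Al Egla', 'Al Hilal'],
    'latitude': [25.290243, np.nan, 25.260376],
    'longitude': [51.526697, np.nan, 51.546488],
    'altitude': [0.0, np.nan, 0.0],
})

df_doha_demo = (
    df_demo[['Community', 'latitude', 'longitude']]
    .copy()  # explicit copy, so later changes cannot touch df_demo
    .rename(columns={'Community': 'Neighborhood',
                     'latitude': 'Latitude',
                     'longitude': 'Longitude'})
    .dropna(axis=0)           # drop communities that were not geocoded
    .reset_index(drop=True)   # renumber the remaining rows from 0
)
print(df_doha_demo)
```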


In [19]:
df_doha.head()

Unnamed: 0,Neighborhood,Latitude,Longitude
0,Al Bidda,25.290243,51.526697
1,Al Dafna,25.319584,51.536284
2,Ad Dawhah al Jadidah,,
3,Al Egla,,
4,Al Hilal,25.260376,51.546488


In [20]:
df_doha.dropna(axis = 0, inplace = True)

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  """Entry point for launching an IPython kernel.


In [21]:
df_doha.head()

Unnamed: 0,Neighborhood,Latitude,Longitude
0,Al Bidda,25.290243,51.526697
1,Al Dafna,25.319584,51.536284
4,Al Hilal,25.260376,51.546488
6,Al Kharayej,25.209836,51.454974
7,Al Khulaifat,25.285061,51.552715


In [22]:
df_doha.reset_index(drop=True, inplace = True)

In [23]:
df_doha.shape

(51, 3)

In [24]:
print(df_doha.shape)
df_doha.head()

(51, 3)


Unnamed: 0,Neighborhood,Latitude,Longitude
0,Al Bidda,25.290243,51.526697
1,Al Dafna,25.319584,51.536284
2,Al Hilal,25.260376,51.546488
3,Al Kharayej,25.209836,51.454974
4,Al Khulaifat,25.285061,51.552715


The finalized DataFrame is saved as a CSV file so that it can be reused for further analysis without scraping and cleaning the data again.

In [25]:
df_doha.to_csv("Neighborhoods in Doha.csv", index = False)

## 4. Exploring Data and Clustering Neighborhoods

Next, let's load the CSV file that was created in the previous part.

In [26]:
df_new = pd.read_csv("Neighborhoods in Doha.csv")
df_new.head()

Unnamed: 0,Neighborhood,Latitude,Longitude
0,Al Bidda,25.290243,51.526697
1,Al Dafna,25.319584,51.536284
2,Al Hilal,25.260376,51.546488
3,Al Kharayej,25.209836,51.454974
4,Al Khulaifat,25.285061,51.552715


In [27]:
df_new.shape

(51, 3)

#### Use geopy library to get the latitude and longitude values of Doha.

In order to define an instance of the geocoder, we need to define a user_agent. We will name our agent <em>to_explorer</em>, as shown below.

In [28]:
address = 'Doha, Qatar'

geolocator = Nominatim(user_agent="to_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Doha, Qatar are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Doha, Qatar are 25.30132655, 51.4957047073798.


In [29]:
print(location)

الدوحة, قطر Qatar


Here are the coordinates of Ooredoo HQ2.

In [31]:
hq2_lat = 25.2665
hq2_lon = 51.5535

#### Create a map of Doha with neighborhoods superimposed on top. Ooredoo HQ2 is also marked in red to highlight it.

In [33]:
# create map of Doha using latitude and longitude values
map_doha = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
for lat, lng, neighborhood in zip(df_new['Latitude'], df_new['Longitude'], df_new['Neighborhood']):
    
    label = folium.Popup(neighborhood, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_doha)  
    
label = folium.Popup("Ooredoo HQ2", parse_html=True)
folium.CircleMarker(
    [hq2_lat, hq2_lon],
    radius=10,
    popup=label,
    color='red',
    fill=True,
    fill_color='#3186cc',
    fill_opacity=0.7,
    parse_html=False).add_to(map_doha)  

map_doha

Next, we are going to start utilizing the Foursquare API to explore the neighborhoods and segment them.

#### Define Foursquare Credentials and Version

In [34]:
CLIENT_ID = 'KJFDYN4KQ10PWG33LUKASFSY3KUP0FNSMNKW4XIK2XPG50WE' # your Foursquare ID
CLIENT_SECRET = '5OGC3AHRCNYJLVS2VTVMKIKLOIPRL35Q3QBCMUSW435OYHXU' # your Foursquare Secret
VERSION = '20191021' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: KJFDYN4KQ10PWG33LUKASFSY3KUP0FNSMNKW4XIK2XPG50WE
CLIENT_SECRET:5OGC3AHRCNYJLVS2VTVMKIKLOIPRL35Q3QBCMUSW435OYHXU
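
Hard-coding and printing API keys makes them visible to anyone who reads the notebook. A minimal sketch of one alternative is to read them from environment variables and mask them when printing. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helpers `load_foursquare_credentials` and `mask` are assumptions made for this illustration, not part of any Foursquare tooling.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Return (client_id, client_secret) read from environment variables.

    The variable names used here are hypothetical; pick whatever suits your setup.
    """
    try:
        return env['FOURSQUARE_CLIENT_ID'], env['FOURSQUARE_CLIENT_SECRET']
    except KeyError as missing:
        raise RuntimeError('Missing environment variable: {}'.format(missing)) from None


def mask(secret, visible=4):
    """Show only the first few characters of a secret when printing."""
    return secret[:visible] + '*' * max(len(secret) - visible, 0)


# Demonstration with a fake mapping instead of real keys.
fake_env = {
    'FOURSQUARE_CLIENT_ID': 'EXAMPLEID123',
    'FOURSQUARE_CLIENT_SECRET': 'EXAMPLESECRET456',
}
client_id, client_secret = load_foursquare_credentials(fake_env)
print('CLIENT_ID: ' + mask(client_id))
print('CLIENT_SECRET: ' + mask(client_secret))
```

In a real session, calling `load_foursquare_credentials()` without arguments would read from the process environment.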


#### Let's explore the first neighborhood in our dataframe.

Get the neighborhood's name.

In [35]:
df_new.loc[0, 'Neighborhood']

'Al Bidda'

Get the neighborhood's latitude and longitude values.

In [36]:
neighborhood_latitude = df_new.loc[0, 'Latitude'] # neighborhood latitude value
neighborhood_longitude = df_new.loc[0, 'Longitude'] # neighborhood longitude value

neighborhood_name = df_new.loc[0, 'Neighborhood'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Al Bidda are 25.2902432, 51.526697.


#### Now, let's get the top 100 venues that are in Al Bidda within a radius of 1000 meters.

First, let's create the GET request URL. Name your URL **url**.

In [37]:
LIMIT = 100
radius = 1000
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)
url




'https://api.foursquare.com/v2/venues/explore?&client_id=KJFDYN4KQ10PWG33LUKASFSY3KUP0FNSMNKW4XIK2XPG50WE&client_secret=5OGC3AHRCNYJLVS2VTVMKIKLOIPRL35Q3QBCMUSW435OYHXU&v=20191021&ll=25.2902432,51.526697&radius=1000&limit=100'
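
The same URL can also be assembled from a dictionary of query parameters, which keeps each parameter next to its name. This is a minimal sketch using the standard library. `build_explore_url` is a hypothetical helper, and the `'ID'` and `'SECRET'` values are placeholders.

```python
from urllib.parse import urlencode


def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Build the explore URL from a dict of query parameters."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }
    # safe=',' keeps the comma inside the ll value unescaped
    return 'https://api.foursquare.com/v2/venues/explore?' + urlencode(params, safe=',')


demo_url = build_explore_url('ID', 'SECRET', '20191021', 25.2902432, 51.526697, 1000, 100)
print(demo_url)
```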


Send the GET request and examine the results.

In [38]:
results = requests.get(url).json()
#results

From the Foursquare lab in the previous module, we know that all the information is in the *items* key. Before we proceed, let's borrow the **get_category_type** function from the Foursquare lab.

In [39]:
# function that extracts the category of the venue
def get_category_type(row):
    # raw venue rows use 'categories'; flattened rows use 'venue.categories'
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    # a venue may have no category at all
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']
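
To see what the function returns for each kind of input, here is a small self-contained sketch. It repeats the function, catching `KeyError` specifically, and applies it to three hypothetical rows written as plain dictionaries: a raw row, a flattened row, and a row with no category.

```python
def get_category_type(row):
    """Return the first category name of a venue row, or None if it has none."""
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
    if len(categories_list) == 0:
        return None
    return categories_list[0]['name']


# Hypothetical rows, shaped like the Foursquare data used in this notebook.
raw_row = {'categories': [{'name': 'Thai Restaurant'}]}
flattened_row = {'venue.categories': [{'name': 'Waterfront'}]}
empty_row = {'venue.categories': []}

for row in (raw_row, flattened_row, empty_row):
    print(get_category_type(row))
```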

Now we are ready to clean the JSON and structure it into a *pandas* dataframe.

In [40]:
venues = results['response']['groups'][0]['items']


In [41]:
venues[0]

{'reasons': {'count': 0,
  'items': [{'summary': 'This spot is popular',
    'type': 'general',
    'reasonName': 'globalInteractionReason'}]},
 'venue': {'id': '4bee0a21a40fc9285bf9820c',
  'name': 'Corniche (الكورنيش)',
  'location': {'address': 'Al Corniche St.',
   'crossStreet': 'Corniche | شارع الكورنيش',
   'lat': 25.294656745035823,
   'lng': 51.529692987933366,
   'distance': 576,
   'cc': 'QA',
   'city': 'الدوحة',
   'state': 'الدوحة',
   'country': 'قطر',
   'formattedAddress': ['Al Corniche St. (Corniche | شارع الكورنيش)',
    'الدوحة',
    'قطر']},
  'categories': [{'id': '56aa371be4b08b9a8d5734c3',
    'name': 'Waterfront',
    'pluralName': 'Waterfronts',
    'shortName': 'Waterfront',
    'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/parks_outdoors/river_',
     'suffix': '.png'},
    'primary': True}],
  'photos': {'count': 0, 'groups': []}},
 'referralId': 'e-0-4bee0a21a40fc9285bf9820c-0'}

In [42]:
nearby_venues = json_normalize(venues) # flatten JSON
nearby_venues.head()

Unnamed: 0,referralId,reasons.count,reasons.items,venue.id,venue.name,venue.location.address,venue.location.crossStreet,venue.location.lat,venue.location.lng,venue.location.distance,venue.location.cc,venue.location.city,venue.location.state,venue.location.country,venue.location.formattedAddress,venue.categories,venue.photos.count,venue.photos.groups,venue.location.labeledLatLngs,venue.location.neighborhood,venue.location.postalCode
0,e-0-4bee0a21a40fc9285bf9820c-0,0,"[{'summary': 'This spot is popular', 'type': '...",4bee0a21a40fc9285bf9820c,Corniche (الكورنيش),Al Corniche St.,Corniche | شارع الكورنيش,25.294657,51.529693,576,QA,الدوحة,الدوحة,قطر,"[Al Corniche St. (Corniche | شارع الكورنيش), ا...","[{'id': '56aa371be4b08b9a8d5734c3', 'name': 'W...",0,[],,,
1,e-0-4cfa59462d80a1435f0444d8-1,0,"[{'summary': 'This spot is popular', 'type': '...",4cfa59462d80a1435f0444d8,Jasmine Thai Restaurant,Souq Waqif,Souq Waqif,25.288038,51.532121,598,QA,الدوحة,الدوحة,قطر,"[Souq Waqif (Souq Waqif), الدوحة, قطر]","[{'id': '4bf58dd8d48988d149941735', 'name': 'T...",0,[],"[{'label': 'display', 'lat': 25.28803842598862...",,
2,e-0-4cfa4c4a20fe37043a4f4cf8-2,0,"[{'summary': 'This spot is popular', 'type': '...",4cfa4c4a20fe37043a4f4cf8,Souq Waqif (سوق واقف),Al Souq St.,Abdullah Bin Jassim St.,25.287797,51.533051,695,QA,الدوحة,الدوحة,قطر,"[Al Souq St. (Abdullah Bin Jassim St.), الدوحة...","[{'id': '4bf58dd8d48988d1f7941735', 'name': 'F...",0,[],,الجسرة,
3,e-0-5c88d217ccad6b002cc43a2e-3,0,"[{'summary': 'This spot is popular', 'type': '...",5c88d217ccad6b002cc43a2e,Usta Turkish Kebap & Doner,,,25.286076,51.531224,650,QA,,,قطر,[قطر],"[{'id': '4f04af1f2fb6e1c99f3db0bb', 'name': 'T...",0,[],"[{'label': 'display', 'lat': 25.28607626847948...",,
4,e-0-517289e8e4b0d686752ade50-4,0,"[{'summary': 'This spot is popular', 'type': '...",517289e8e4b0d686752ade50,Argan Restaurant,Al Jasra Hotel,,25.289311,51.531295,474,QA,,,قطر,"[Al Jasra Hotel, قطر]","[{'id': '4bf58dd8d48988d1c3941735', 'name': 'M...",0,[],"[{'label': 'display', 'lat': 25.28931146542933...",,
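
The dotted column names in the table above come from flattening nested dictionaries. The sketch below shows that on two trimmed, hypothetical items. It uses `pd.json_normalize`, which is how the function is exposed in pandas 1.0 and later; the `items` list and `flat` are illustrative names.

```python
import pandas as pd

# Trimmed, hypothetical items shaped like results['response']['groups'][0]['items'].
items = [
    {'referralId': 'e-0-a-0',
     'venue': {'name': 'Corniche',
               'location': {'lat': 25.294657, 'lng': 51.529693},
               'categories': [{'name': 'Waterfront'}]}},
    {'referralId': 'e-0-b-1',
     'venue': {'name': 'Jasmine Thai Restaurant',
               'location': {'lat': 25.288038, 'lng': 51.532121},
               'categories': [{'name': 'Thai Restaurant'}]}},
]

# Nested dicts become dotted column names; lists stay as-is in a single cell.
flat = pd.json_normalize(items)
print(sorted(flat.columns))
```

Because lists are not expanded, `venue.categories` still holds a list of dictionaries, which is why a separate step is needed to pull out the category name.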


In [43]:
# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]
nearby_venues.head()

Unnamed: 0,venue.name,venue.categories,venue.location.lat,venue.location.lng
0,Corniche (الكورنيش),"[{'id': '56aa371be4b08b9a8d5734c3', 'name': 'W...",25.294657,51.529693
1,Jasmine Thai Restaurant,"[{'id': '4bf58dd8d48988d149941735', 'name': 'T...",25.288038,51.532121
2,Souq Waqif (سوق واقف),"[{'id': '4bf58dd8d48988d1f7941735', 'name': 'F...",25.287797,51.533051
3,Usta Turkish Kebap & Doner,"[{'id': '4f04af1f2fb6e1c99f3db0bb', 'name': 'T...",25.286076,51.531224
4,Argan Restaurant,"[{'id': '4bf58dd8d48988d1c3941735', 'name': 'M...",25.289311,51.531295


In [44]:
# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

In [45]:
# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

Unnamed: 0,name,categories,lat,lng
0,Corniche (الكورنيش),Waterfront,25.294657,51.529693
1,Jasmine Thai Restaurant,Thai Restaurant,25.288038,51.532121
2,Souq Waqif (سوق واقف),Flea Market,25.287797,51.533051
3,Usta Turkish Kebap & Doner,Turkish Restaurant,25.286076,51.531224
4,Argan Restaurant,Moroccan Restaurant,25.289311,51.531295


In [46]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

46 venues were returned by Foursquare.


<a id='item2'></a>

#### Let's create a function to repeat the same process for all the selected neighborhoods in Doha

In [65]:
def getNearbyVenues(names, latitudes, longitudes, radius=1000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
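            # fall back to None when a venue has no category (avoids IndexError)
            v['venue']['categories'][0]['name'] if v['venue']['categories'] else None
        ) for v in results])
    # flatten the per-neighborhood lists into one dataframe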
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
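
The function above indexes straight into `["response"]['groups'][0]['items']`, which raises an exception if a response comes back without any groups. A minimal sketch of a more defensive lookup follows. `extract_items` is a hypothetical helper, and the shape of the failed payload is an assumption made for illustration rather than a documented Foursquare response.

```python
def extract_items(payload):
    """Return the venue items from an explore payload, or [] if none are present."""
    groups = payload.get('response', {}).get('groups', [])
    if not groups:
        return []
    return groups[0].get('items', [])


# Hypothetical payloads: one successful, one without any groups.
ok_payload = {'response': {'groups': [{'items': [{'venue': {'name': 'Corniche'}}]}]}}
failed_payload = {'meta': {'code': 429}, 'response': {}}

print(len(extract_items(ok_payload)))
print(len(extract_items(failed_payload)))
```

With a helper like this, a neighborhood whose request fails contributes no rows instead of stopping the whole loop.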

#### Run the above function on each neighborhood and create a new dataframe called *doha_venues*.

In [66]:
doha_venues = getNearbyVenues(names=df_new['Neighborhood'],
                                   latitudes=df_new['Latitude'],
                                   longitudes=df_new['Longitude']
                                  )



Al Bidda
Al Dafna
Al Hilal
Al Kharayej
Al Khulaifat
Al Mansoura
Al Markhiya
Al Messila
Al Mirqab
Al Najada
Al Qassar
Al Rufaa
Al Sadd
Al Souq
Al Tarfa
Al Thumama
Barahat Al Jufairi
Doha International Airport
Doha Port
Duhail
Fereej Abdul Aziz
Fereej Al Nasr
Fereej Bin Mahmoud
Fereej Bin Omran
Fereej Kulaib
Hamad Medical City
Hazm Al Markhiya
Industrial Area
Jabal Thuaileb
Jelaiah
Jeryan Nejaima
Lejbailat
Lekhwair
Madinat Khalifa North
Madinat Khalifa South
Musheireb
Najma
New Al Hitmi
New Al Mirqab
New Salata
Nuaija
Old Airport
Old Al Ghanim
Old Al Hitmi
Onaiza
Rawdat Al Khail
Salata
Umm Ghuwailina
Umm Lekhba
Wadi Al Banat
Wadi Al Sail


#### Let's check the size of the resulting dataframe

In [67]:
print(doha_venues.shape)
doha_venues.head()

(2398, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Al Bidda,25.290243,51.526697,Corniche (الكورنيش),25.294657,51.529693,Waterfront
1,Al Bidda,25.290243,51.526697,Jasmine Thai Restaurant,25.288038,51.532121,Thai Restaurant
2,Al Bidda,25.290243,51.526697,Souq Waqif (سوق واقف),25.287797,51.533051,Flea Market
3,Al Bidda,25.290243,51.526697,Usta Turkish Kebap & Doner,25.286076,51.531224,Turkish Restaurant
4,Al Bidda,25.290243,51.526697,Argan Restaurant,25.289311,51.531295,Moroccan Restaurant


Let's check how many venues were returned for each neighborhood

In [68]:
doha_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Al Bidda,46,46,46,46,46,46
Al Dafna,100,100,100,100,100,100
Al Hilal,27,27,27,27,27,27
Al Kharayej,3,3,3,3,3,3
Al Khulaifat,78,78,78,78,78,78
Al Mansoura,74,74,74,74,74,74
Al Markhiya,44,44,44,44,44,44
Al Messila,100,100,100,100,100,100
Al Mirqab,100,100,100,100,100,100
Al Najada,100,100,100,100,100,100


#### Let's find out how many unique categories can be curated from all the returned venues

In [69]:
print('There are {} unique categories.'.format(len(doha_venues['Venue Category'].unique())))

There are 185 unique categories.


<a id='item3'></a>

### Analyze Each Neighborhood

In [70]:
# one hot encoding
doha_onehot = pd.get_dummies(doha_venues[['Venue Category']], prefix="", prefix_sep="")
#doha_onehot["Total"] = doha_onehot.sum(axis = 1)
doha_onehot.head()

Unnamed: 0,Afghan Restaurant,African Restaurant,Airport,American Restaurant,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auditorium,Auto Garage,BBQ Joint,Bagel Shop,Bakery,Bar,Baseball Stadium,Beach,Bed & Breakfast,Beer Garden,Boarding House,Boat or Ferry,Bookstore,Boutique,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Bubble Tea Shop,Buffet,Building,Burger Joint,Bus Station,Business Service,Cafeteria,Café,Campground,Candy Store,Caribbean Restaurant,Chaat Place,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Construction & Landscaping,Convenience Store,Cosmetics Shop,Cupcake Shop,Currency Exchange,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Donut Shop,Electronics Store,Empanada Restaurant,Exhibit,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Garden,Garden Center,Gas Station,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Halal Restaurant,Harbor / Marina,Health & Beauty Service,Health Food Store,History Museum,Hobby Shop,Hookah Bar,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indonesian Restaurant,Intersection,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Kebab Restaurant,Korean Restaurant,Laser Tag,Latin American Restaurant,Lebanese Restaurant,Lounge,Malay Restaurant,Market,Massage Studio,Medical Center,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Moroccan Restaurant,Motel,Movie Theater,Museum,Music Venue,Nightclub,Noodle House,North Indian Restaurant,Office,Optical Shop,Other Repair Shop,Outdoors & Recreation,Pakistani Restaurant,Palace,Paper / Office Supplies Store,Park,Pastry Shop,Persian Restaurant,Peruvian Restaurant,Pet Store,Pharmacy,Photography Studio,Pie Shop,Pier,Pizza Place,Plaza,Pool,Pool Hall,Pub,Residential Building (Apartment / Condo),Resort,Restaurant,Salon / Barbershop,Sandwich Place,Seafood Restaurant,Shawarma Place,Shipping Store,Shoe Store,Shop & Service,Shopping Mall,Snack Place,Soccer Field,South Indian Restaurant,Spa,Spanish Restaurant,Sporting Goods Shop,Sports Bar,Sri Lankan Restaurant,Stationery Store,Steakhouse,Street Art,Supermarket,Supplement Shop,Sushi Restaurant,Tailor Shop,Tea Room,Tennis Court,Tennis Stadium,Tex-Mex Restaurant,Thai Restaurant,Theme Restaurant,Toy / Game Store,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Volleyball Court,Waterfront,Whisky Bar,Women's Store
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
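
With 185 columns the table above is hard to read, so here is a minimal sketch of the same one-hot encoding on a toy frame with two categories. `demo` and `demo_onehot` are illustrative names. `dtype=int` is passed so that the result shows 0/1 on pandas versions where `get_dummies` returns booleans by default.

```python
import pandas as pd

# Toy stand-in for doha_venues[['Venue Category']].
demo = pd.DataFrame({'Venue Category': ['Waterfront', 'Thai Restaurant', 'Waterfront']})

# One column per category, named after the category itself (no prefix).
demo_onehot = pd.get_dummies(demo[['Venue Category']], prefix="", prefix_sep="", dtype=int)
print(demo_onehot)
```

Each row has exactly one 1, in the column of that venue's category.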


In [71]:
# add neighborhood column back to dataframe
doha_onehot['Neighborhood'] = doha_venues['Neighborhood']
doha_onehot.head()

Unnamed: 0,Afghan Restaurant,African Restaurant,Airport,American Restaurant,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auditorium,Auto Garage,BBQ Joint,Bagel Shop,Bakery,Bar,Baseball Stadium,Beach,Bed & Breakfast,Beer Garden,Boarding House,Boat or Ferry,Bookstore,Boutique,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Bubble Tea Shop,Buffet,Building,Burger Joint,Bus Station,Business Service,Cafeteria,Café,Campground,Candy Store,Caribbean Restaurant,Chaat Place,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Construction & Landscaping,Convenience Store,Cosmetics Shop,Cupcake Shop,Currency Exchange,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Donut Shop,Electronics Store,Empanada Restaurant,Exhibit,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Garden,Garden Center,Gas Station,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Halal Restaurant,Harbor / Marina,Health & Beauty Service,Health Food Store,History Museum,Hobby Shop,Hookah Bar,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indonesian Restaurant,Intersection,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Kebab Restaurant,Korean Restaurant,Laser Tag,Latin American Restaurant,Lebanese Restaurant,Lounge,Malay Restaurant,Market,Massage Studio,Medical Center,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Moroccan Restaurant,Motel,Movie Theater,Museum,Music Venue,Nightclub,Noodle House,North Indian Restaurant,Office,Optical Shop,Other Repair Shop,Outdoors & Recreation,Pakistani Restaurant,Palace,Paper / Office Supplies Store,Park,Pastry Shop,Persian Restaurant,Peruvian Restaurant,Pet Store,Pharmacy,Photography Studio,Pie Shop,Pier,Pizza Place,Plaza,Pool,Pool Hall,Pub,Residential Building (Apartment / Condo),Resort,Restaurant,Salon / Barbershop,Sandwich Place,Seafood Restaurant,Shawarma Place,Shipping Store,Shoe Store,Shop & Service,Shopping Mall,Snack Place,Soccer Field,South Indian Restaurant,Spa,Spanish Restaurant,Sporting Goods Shop,Sports Bar,Sri Lankan Restaurant,Stationery Store,Steakhouse,Street Art,Supermarket,Supplement Shop,Sushi Restaurant,Tailor Shop,Tea Room,Tennis Court,Tennis Stadium,Tex-Mex Restaurant,Thai Restaurant,Theme Restaurant,Toy / Game Store,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Volleyball Court,Waterfront,Whisky Bar,Women's Store,Neighborhood
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,Al Bidda
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,Al Bidda
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Al Bidda
3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,Al Bidda
4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Al Bidda


In [72]:
doha_onehot.columns[-1]

'Neighborhood'

In [73]:
doha_onehot.columns[:-1]

Index(['Afghan Restaurant', 'African Restaurant', 'Airport',
       'American Restaurant', 'Arcade', 'Art Gallery', 'Art Museum',
       'Arts & Crafts Store', 'Asian Restaurant', 'Athletics & Sports',
       ...
       'Thai Restaurant', 'Theme Restaurant', 'Toy / Game Store',
       'Turkish Restaurant', 'Vegetarian / Vegan Restaurant',
       'Video Game Store', 'Volleyball Court', 'Waterfront', 'Whisky Bar',
       'Women's Store'],
      dtype='object', length=185)

In [74]:
# move neighborhood column to the first column
fixed_columns = [doha_onehot.columns[-1]] + list(doha_onehot.columns[:-1])

In [75]:
doha_onehot = doha_onehot[fixed_columns]

doha_onehot.head()

Unnamed: 0,Neighborhood,Afghan Restaurant,African Restaurant,Airport,American Restaurant,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auditorium,Auto Garage,BBQ Joint,Bagel Shop,Bakery,Bar,Baseball Stadium,Beach,Bed & Breakfast,Beer Garden,Boarding House,Boat or Ferry,Bookstore,Boutique,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Bubble Tea Shop,Buffet,Building,Burger Joint,Bus Station,Business Service,Cafeteria,Café,Campground,Candy Store,Caribbean Restaurant,Chaat Place,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Construction & Landscaping,Convenience Store,Cosmetics Shop,Cupcake Shop,Currency Exchange,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Donut Shop,Electronics Store,Empanada Restaurant,Exhibit,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Garden,Garden Center,Gas Station,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Halal Restaurant,Harbor / Marina,Health & Beauty Service,Health Food Store,History Museum,Hobby Shop,Hookah Bar,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indonesian Restaurant,Intersection,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Kebab Restaurant,Korean Restaurant,Laser Tag,Latin American Restaurant,Lebanese Restaurant,Lounge,Malay Restaurant,Market,Massage Studio,Medical Center,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Moroccan Restaurant,Motel,Movie Theater,Museum,Music Venue,Nightclub,Noodle House,North Indian Restaurant,Office,Optical Shop,Other Repair Shop,Outdoors & Recreation,Pakistani Restaurant,Palace,Paper / Office Supplies Store,Park,Pastry Shop,Persian Restaurant,Peruvian 
Restaurant,Pet Store,Pharmacy,Photography Studio,Pie Shop,Pier,Pizza Place,Plaza,Pool,Pool Hall,Pub,Residential Building (Apartment / Condo),Resort,Restaurant,Salon / Barbershop,Sandwich Place,Seafood Restaurant,Shawarma Place,Shipping Store,Shoe Store,Shop & Service,Shopping Mall,Snack Place,Soccer Field,South Indian Restaurant,Spa,Spanish Restaurant,Sporting Goods Shop,Sports Bar,Sri Lankan Restaurant,Stationery Store,Steakhouse,Street Art,Supermarket,Supplement Shop,Sushi Restaurant,Tailor Shop,Tea Room,Tennis Court,Tennis Stadium,Tex-Mex Restaurant,Thai Restaurant,Theme Restaurant,Toy / Game Store,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Volleyball Court,Waterfront,Whisky Bar,Women's Store
0,Al Bidda,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
1,Al Bidda,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0
2,Al Bidda,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Al Bidda,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0
4,Al Bidda,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [66]:
doha_onehot.head()

Unnamed: 0,Neighborhood,Afghan Restaurant,African Restaurant,American Restaurant,Arcade,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Garage,BBQ Joint,Bakery,Bar,Baseball Field,Baseball Stadium,Beach,Boarding House,Boat or Ferry,Bookstore,Brazilian Restaurant,Breakfast Spot,Buffet,Building,Burger Joint,Bus Station,Cafeteria,Café,Chaat Place,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Construction & Landscaping,Convenience Store,Cosmetics Shop,Currency Exchange,Department Store,Dessert Shop,Dog Run,Electronics Store,Ethiopian Restaurant,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food Court,Food Stand,Food Truck,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Garden,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Health & Beauty Service,Health Food Store,Hookah Bar,Hotel,Hotel Bar,Hotel Pool,Ice Cream Shop,Indian Restaurant,Indonesian Restaurant,Intersection,Italian Restaurant,Japanese Restaurant,Jewelry Store,Juice Bar,Kebab Restaurant,Korean Restaurant,Latin American Restaurant,Lebanese Restaurant,Lounge,Malay Restaurant,Market,Massage Studio,Medical Center,Mediterranean Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Moroccan Restaurant,Motel,Movie Theater,Museum,Nightclub,North Indian Restaurant,Office,Optical Shop,Other Repair Shop,Pakistani Restaurant,Palace,Park,Pastry Shop,Persian Restaurant,Pet Store,Pizza Place,Playground,Pool,Pool Hall,Pub,Rental Car Location,Residential Building (Apartment / Condo),Resort,Restaurant,Salon / Barbershop,Sandwich Place,Seafood Restaurant,Shawarma Place,Shoe Store,Shop & Service,Shopping Mall,South Indian Restaurant,Spa,Steakhouse,Supermarket,Sushi Restaurant,Tea Room,Tennis Court,Tennis Stadium,Thai Restaurant,Turkish Restaurant,Vegetarian / Vegan Restaurant,Waterfront
0,Al Bidda,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Al Bidda,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Al Bidda,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Al Bidda,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Al Bidda,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


And let's examine the new dataframe size.

In [76]:
doha_onehot.shape

(2398, 186)

#### Next, let's group rows by neighborhood and take the mean of the frequency of occurrence of each category

In [77]:
doha_grouped = doha_onehot.groupby('Neighborhood').mean().reset_index()

#### Let's confirm the new size

In [78]:
doha_grouped.shape

(50, 186)
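
To make the grouping step concrete, here is a minimal sketch on a toy table. The neighborhood names and venue categories are made up for illustration and are not taken from the Doha data. Averaging the 0/1 indicator rows per neighborhood gives each category's share of that neighborhood's venues, so every row of the grouped frame sums to 1.

```python
import pandas as pd

# Toy venue list; the rows are illustrative only
venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Café', 'Café', 'Hotel', 'Hotel', 'Bakery'],
})

# One-hot encode the category, then put the neighborhood column first
onehot = pd.get_dummies(venues['Venue Category'], dtype=int)
onehot.insert(0, 'Neighborhood', venues['Neighborhood'])

# Mean of the 0/1 indicators per neighborhood = frequency of each category
grouped = onehot.groupby('Neighborhood').mean().reset_index()
print(grouped)
```

This is also why the grouped frame keeps all 186 columns while the row count drops to one per neighborhood.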

#### Let's print each neighborhood along with the top 10 most common venues

In [80]:
num_top_venues = 10

for hood in doha_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = doha_grouped[doha_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Al Bidda----
                       venue  freq
0                       Café  0.17
1  Middle Eastern Restaurant  0.15
2                      Hotel  0.11
3         Italian Restaurant  0.04
4                 Restaurant  0.04
5         Turkish Restaurant  0.04
6               Dessert Shop  0.04
7                     Bakery  0.04
8                 Waterfront  0.02
9        Moroccan Restaurant  0.02


----Al Dafna----
                       venue  freq
0                Coffee Shop  0.12
1                      Hotel  0.11
2                       Café  0.09
3                        Bar  0.04
4  Middle Eastern Restaurant  0.04
5         Italian Restaurant  0.04
6                 Restaurant  0.04
7                     Lounge  0.03
8         Chinese Restaurant  0.03
9        Sporting Goods Shop  0.02


----Al Hilal----
                       venue  freq
0                       Café  0.07
1                Pizza Place  0.07
2                Coffee Shop  0.07
3                  Cafeteria  0.07


#### Let's put that into a *pandas* dataframe

First, let's write a function to sort the venues in descending order.

In [81]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
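
As a quick check of what the function returns, here is a small usage sketch. The row below is a made-up stand-in for one row of `doha_grouped`; only the function body is copied from the cell above.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    # Skip the first entry (the neighborhood name), then sort the frequencies
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# Illustrative row: a name followed by category frequencies (values are made up)
row = pd.Series({'Neighborhood': 'Example', 'Café': 0.5, 'Hotel': 0.3, 'Bakery': 0.2})
top2 = return_most_common_venues(row, 2)
print(list(top2))
```

Note that categories with equal frequency may come back in a different order than they appear in the columns, which would explain small differences between the printed lists and the sorted table for tied venues.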

Now let's create the new dataframe and display the top 5 venues for each neighborhood.

In [93]:
num_top_venues = 5

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = doha_grouped['Neighborhood']

for ind in np.arange(doha_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(doha_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,Al Bidda,Café,Middle Eastern Restaurant,Hotel,Italian Restaurant,Turkish Restaurant
1,Al Dafna,Coffee Shop,Hotel,Café,Bar,Italian Restaurant
2,Al Hilal,Pizza Place,Cafeteria,Café,Coffee Shop,Mobile Phone Shop
3,Al Kharayej,Middle Eastern Restaurant,Coffee Shop,Health & Beauty Service,Women's Store,Farmers Market
4,Al Khulaifat,Hotel,Restaurant,Café,Indian Restaurant,Middle Eastern Restaurant
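
The column names above are built by indexing a three-item suffix list and falling back to `th` when the index runs out. A small helper can make that logic explicit. This is only a sketch: `ordinal` is a hypothetical name that does not appear in the notebook, and it additionally handles cases such as 21st and 22nd, which the notebook never needs because it stops at 20.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    if 10 <= n % 100 <= 20:
        # 11th, 12th, 13th (and 111th, 112th, ...) always take 'th'
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

num_top_venues = 5
columns = ['Neighborhood'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, num_top_venues + 1)
]
print(columns)
```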


<a id='item4'></a>

### Cluster Neighborhoods

Run *k*-means to cluster the neighborhoods into 4 clusters.

In [94]:
# set number of clusters
kclusters = 4

doha_grouped_clustering = doha_grouped.drop('Neighborhood', axis=1)

In [95]:
# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(doha_grouped_clustering)

In [96]:
# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([0, 0, 0, 1, 0, 0, 0, 0, 0, 0], dtype=int32)
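
The notebook fixes `kclusters = 4` without showing how that value was chosen. One common check, which is not part of the original analysis, is to compare silhouette scores across several values of *k*. The sketch below uses synthetic data as a stand-in for `doha_grouped_clustering`; the blob centers and the range of *k* are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the feature matrix: three tight, well-separated blobs
rng = np.random.RandomState(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + 0.1 * rng.randn(20, 2) for c in centers])

# Fit k-means for each candidate k and record the mean silhouette score
scores = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print(scores)
print('best k:', best_k)
```

On the real data the scores are unlikely to be this clear-cut, since most neighborhoods land in a single large cluster, so the result would be one input to the choice of *k* rather than a final answer.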

Let's create a new dataframe that includes the cluster as well as the top 5 venues for each neighborhood.

In [97]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

In [98]:
doha_merged = df_new

In [99]:
# join the cluster labels and top venues onto the latitude/longitude table for each neighborhood
doha_merged = doha_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

In [100]:
doha_merged.head()

Unnamed: 0,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,Al Bidda,25.290243,51.526697,0.0,Café,Middle Eastern Restaurant,Hotel,Italian Restaurant,Turkish Restaurant
1,Al Dafna,25.319584,51.536284,0.0,Coffee Shop,Hotel,Café,Bar,Italian Restaurant
2,Al Hilal,25.260376,51.546488,0.0,Pizza Place,Cafeteria,Café,Coffee Shop,Mobile Phone Shop
3,Al Kharayej,25.209836,51.454974,1.0,Middle Eastern Restaurant,Coffee Shop,Health & Beauty Service,Women's Store,Farmers Market
4,Al Khulaifat,25.285061,51.552715,0.0,Hotel,Restaurant,Café,Indian Restaurant,Middle Eastern Restaurant


In [101]:
doha_merged.shape

(51, 9)

In [102]:
doha_merged.dropna(axis = 0, inplace = True)
doha_merged.reset_index(drop = True, inplace = True)
doha_merged.shape


(50, 9)
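
The row count drops from 51 to 50 because `join` keeps every row of the left table: a neighborhood with coordinates but no row in `neighborhoods_venues_sorted` gets `NaN` in the joined columns, and `dropna` then removes it. The `NaN` is also why the cluster labels show up as floats such as `0.0`. Here is a minimal sketch of that behavior on made-up tables (which Doha neighborhood was dropped is not shown in the output above, so none is assumed here):

```python
import pandas as pd

# Hypothetical coordinates table; 'C' has no matching venue data
coords = pd.DataFrame({
    'Neighborhood': ['A', 'B', 'C'],
    'Latitude': [25.1, 25.2, 25.3],
    'Longitude': [51.1, 51.2, 51.3],
})
labels = pd.DataFrame({
    'Neighborhood': ['A', 'B'],
    'Cluster Labels': [0, 1],
})

# Left join on the neighborhood name: 'C' gets NaN, so the column becomes float
merged = coords.join(labels.set_index('Neighborhood'), on='Neighborhood')
print(merged['Cluster Labels'].tolist())

# Drop the unmatched row, then restore integer labels
merged = merged.dropna(axis=0).reset_index(drop=True)
merged['Cluster Labels'] = merged['Cluster Labels'].astype(int)
print(merged)
```

Casting back to `int` after `dropna` is optional, but it avoids the `int(cluster)` conversion later when the labels are used as list indices.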

Finally, let's visualize the resulting clusters.

In [103]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(doha_merged['Latitude'], doha_merged['Longitude'], doha_merged['Neighborhood'], doha_merged['Cluster Labels']):
    cluster = int(cluster)
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
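
One detail in the marker loop is worth spelling out: the labels are zero-based, so `rainbow[cluster-1]` looks up index `-1` for cluster 0, which Python resolves to the last color in the list. Every cluster still gets its own color, only shifted by one position. The sketch below shows this with a hypothetical four-color palette standing in for `rainbow`:

```python
# Hypothetical palette standing in for the `rainbow` list (one color per cluster)
palette = ['#8000ff', '#2adddd', '#d4dd80', '#ff0000']

for cluster in range(len(palette)):
    # Negative indexing makes cluster 0 wrap around to the last color
    print(cluster, 'shifted:', palette[cluster - 1], 'direct:', palette[cluster])

# Colors actually used by the `cluster - 1` lookup, in label order
shifted = [palette[c - 1] for c in range(len(palette))]
```

Indexing with `palette[cluster]` directly would give the same map with the colors in label order.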

<a id='item5'></a>

### Examine Clusters

Now, let us examine each cluster and determine the discriminating venue categories that distinguish each cluster.

#### Cluster 1

In [104]:
doha_merged.loc[doha_merged['Cluster Labels'] == 0, doha_merged.columns[[0] + list(range(4, doha_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,Al Bidda,Café,Middle Eastern Restaurant,Hotel,Italian Restaurant,Turkish Restaurant
1,Al Dafna,Coffee Shop,Hotel,Café,Bar,Italian Restaurant
2,Al Hilal,Pizza Place,Cafeteria,Café,Coffee Shop,Mobile Phone Shop
4,Al Khulaifat,Hotel,Restaurant,Café,Indian Restaurant,Middle Eastern Restaurant
5,Al Mansoura,Fast Food Restaurant,Café,Hotel,Indian Restaurant,Convenience Store
6,Al Markhiya,Café,Coffee Shop,Asian Restaurant,Pharmacy,Middle Eastern Restaurant
7,Al Messila,Hotel,Middle Eastern Restaurant,Café,Restaurant,Coffee Shop
8,Al Mirqab,Hotel,Café,Middle Eastern Restaurant,Restaurant,Coffee Shop
9,Al Najada,Hotel,Middle Eastern Restaurant,Café,Restaurant,Coffee Shop
10,Al Qassar,Beach,Coffee Shop,Hotel,Restaurant,Café


#### Cluster 2

In [105]:
doha_merged.loc[doha_merged['Cluster Labels'] == 1, doha_merged.columns[[0] + list(range(4, doha_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
3,Al Kharayej,Middle Eastern Restaurant,Coffee Shop,Health & Beauty Service,Women's Store,Farmers Market
14,Al Tarfa,Coffee Shop,Food Court,Café,Food & Drink Shop,Food
15,Al Thumama,Coffee Shop,Boutique,Middle Eastern Restaurant,Convenience Store,Shopping Mall


#### Cluster 3

In [106]:
doha_merged.loc[doha_merged['Cluster Labels'] == 2, doha_merged.columns[[0] + list(range(4, doha_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
27,Industrial Area,Electronics Store,Business Service,Women's Store,Fast Food Restaurant,Food Court


#### Cluster 4

In [107]:
doha_merged.loc[doha_merged['Cluster Labels'] == 3, doha_merged.columns[[0] + list(range(4, doha_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
29,Jelaiah,Coffee Shop,Convenience Store,Intersection,Clothing Store,Tennis Stadium
30,Jeryan Nejaima,Auto Garage,Intersection,Convenience Store,Coffee Shop,Farmers Market


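#### Cluster 5

With `kclusters = 4` the labels run from 0 to 3, so no neighborhood carries label 4 and this selection is empty.

In [108]:
doha_merged.loc[doha_merged['Cluster Labels'] == 4, doha_merged.columns[[0] + list(range(4, doha_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue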
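
Rather than repeating one cell per label, the per-cluster tables can be produced in a loop, which also avoids querying a label that does not exist. The following is a sketch on a toy frame with the same leading columns as `doha_merged`; the values are made up, and the names `sizes` and `keep` are introduced here for illustration.

```python
import pandas as pd

# Toy frame with the same leading columns as doha_merged (values are made up)
merged = pd.DataFrame({
    'Neighborhood': ['A', 'B', 'C', 'D'],
    'Latitude': [25.1, 25.2, 25.3, 25.4],
    'Longitude': [51.1, 51.2, 51.3, 51.4],
    'Cluster Labels': [0, 1, 0, 2],
    '1st Most Common Venue': ['Café', 'Hotel', 'Café', 'Auto Garage'],
})

# Number of neighborhoods in each cluster
sizes = merged['Cluster Labels'].value_counts().sort_index()
print(sizes)

# Keep 'Neighborhood' plus every column after 'Cluster Labels'
label_pos = merged.columns.get_loc('Cluster Labels')
keep = ['Neighborhood'] + list(merged.columns[label_pos + 1:])

# One table per label that actually occurs in the data
for label, group in merged.groupby('Cluster Labels'):
    print('Cluster', label + 1)
    print(group[keep])
```

Selecting columns by name instead of by position also keeps the selection correct if columns are added before `Cluster Labels`.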


Let's repeat the process for the top 20 venues, this time only for Al Dafna and three selected neighborhoods near Ooredoo HQ2.

In [115]:
num_top_venues = 20

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = doha_grouped['Neighborhood']

for ind in np.arange(doha_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(doha_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,Al Bidda,Café,Middle Eastern Restaurant,Hotel,Italian Restaurant,Turkish Restaurant,Bakery,Restaurant,Dessert Shop,BBQ Joint,Pet Store,Waterfront,Moroccan Restaurant,Art Gallery,Steakhouse,Mediterranean Restaurant,Pakistani Restaurant,Boarding House,Resort,Persian Restaurant,Tea Room
1,Al Dafna,Coffee Shop,Hotel,Café,Bar,Italian Restaurant,Middle Eastern Restaurant,Restaurant,Lounge,Chinese Restaurant,Asian Restaurant,Buffet,Beach,Sporting Goods Shop,Steakhouse,Spa,French Restaurant,Gym,Ice Cream Shop,Brazilian Restaurant,Cocktail Bar
2,Al Hilal,Pizza Place,Cafeteria,Café,Coffee Shop,Mobile Phone Shop,Middle Eastern Restaurant,Business Service,Sandwich Place,Restaurant,Malay Restaurant,Filipino Restaurant,Fast Food Restaurant,Music Venue,Supermarket,Department Store,Clothing Store,Gym,Gym / Fitness Center,Asian Restaurant,Electronics Store
3,Al Kharayej,Middle Eastern Restaurant,Coffee Shop,Health & Beauty Service,Women's Store,Farmers Market,Food & Drink Shop,Food,Flower Shop,Flea Market,Fish Market,Fish & Chips Shop,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant,Food Truck,Exhibit,Empanada Restaurant,Electronics Store,Donut Shop,Diner
4,Al Khulaifat,Hotel,Restaurant,Café,Indian Restaurant,Middle Eastern Restaurant,Harbor / Marina,Lounge,Fast Food Restaurant,Beach,Chinese Restaurant,Athletics & Sports,Hotel Bar,Hookah Bar,Nightclub,Pizza Place,Palace,Pier,Pool,Clothing Store,Cocktail Bar


In [158]:
# keep only the four neighborhoods of interest; copy so later inserts do not touch the full table
selected = ["Al Dafna", "Najma", "Al Hilal", "Umm Ghuwailina"]
df_final = neighborhoods_venues_sorted[neighborhoods_venues_sorted.Neighborhood.isin(selected)].copy()

df_final.reset_index(drop = True, inplace = True)
df_final

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,Al Dafna,Coffee Shop,Hotel,Café,Bar,Italian Restaurant,Middle Eastern Restaurant,Restaurant,Lounge,Chinese Restaurant,Asian Restaurant,Buffet,Beach,Sporting Goods Shop,Steakhouse,Spa,French Restaurant,Gym,Ice Cream Shop,Brazilian Restaurant,Cocktail Bar
1,Al Hilal,Pizza Place,Cafeteria,Café,Coffee Shop,Mobile Phone Shop,Middle Eastern Restaurant,Business Service,Sandwich Place,Restaurant,Malay Restaurant,Filipino Restaurant,Fast Food Restaurant,Music Venue,Supermarket,Department Store,Clothing Store,Gym,Gym / Fitness Center,Asian Restaurant,Electronics Store
2,Najma,Hotel,Café,Asian Restaurant,Department Store,Gym,Fast Food Restaurant,Filipino Restaurant,Lounge,Malay Restaurant,Bookstore,Shopping Mall,Middle Eastern Restaurant,Sandwich Place,Seafood Restaurant,Fried Chicken Joint,Building,Restaurant,Burger Joint,Pizza Place,Cafeteria
3,Umm Ghuwailina,Hotel,Pizza Place,Middle Eastern Restaurant,Department Store,Indian Restaurant,Fast Food Restaurant,American Restaurant,Gym,Sandwich Place,Lounge,Café,Convenience Store,Food Truck,Burger Joint,Seafood Restaurant,Restaurant,Clothing Store,Fish & Chips Shop,Filipino Restaurant,Coffee Shop


In [159]:
# same four neighborhoods, taken from the frequency table
selected = ["Al Dafna", "Najma", "Al Hilal", "Umm Ghuwailina"]
df_final_grouped = doha_grouped[doha_grouped.Neighborhood.isin(selected)].copy()

df_final_grouped.reset_index(drop = True, inplace = True)
df_final_grouped

Unnamed: 0,Neighborhood,Afghan Restaurant,African Restaurant,Airport,American Restaurant,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auditorium,Auto Garage,BBQ Joint,Bagel Shop,Bakery,Bar,Baseball Stadium,Beach,Bed & Breakfast,Beer Garden,Boarding House,Boat or Ferry,Bookstore,Boutique,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Bubble Tea Shop,Buffet,Building,Burger Joint,Bus Station,Business Service,Cafeteria,Café,Campground,Candy Store,Caribbean Restaurant,Chaat Place,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Construction & Landscaping,Convenience Store,Cosmetics Shop,Cupcake Shop,Currency Exchange,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Donut Shop,Electronics Store,Empanada Restaurant,Exhibit,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Garden,Garden Center,Gas Station,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Halal Restaurant,Harbor / Marina,Health & Beauty Service,Health Food Store,History Museum,Hobby Shop,Hookah Bar,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indonesian Restaurant,Intersection,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Kebab Restaurant,Korean Restaurant,Laser Tag,Latin American Restaurant,Lebanese Restaurant,Lounge,Malay Restaurant,Market,Massage Studio,Medical Center,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Moroccan Restaurant,Motel,Movie Theater,Museum,Music Venue,Nightclub,Noodle House,North Indian Restaurant,Office,Optical Shop,Other Repair Shop,Outdoors & Recreation,Pakistani Restaurant,Palace,Paper / Office Supplies Store,Park,Pastry Shop,Persian Restaurant,Peruvian 
Restaurant,Pet Store,Pharmacy,Photography Studio,Pie Shop,Pier,Pizza Place,Plaza,Pool,Pool Hall,Pub,Residential Building (Apartment / Condo),Resort,Restaurant,Salon / Barbershop,Sandwich Place,Seafood Restaurant,Shawarma Place,Shipping Store,Shoe Store,Shop & Service,Shopping Mall,Snack Place,Soccer Field,South Indian Restaurant,Spa,Spanish Restaurant,Sporting Goods Shop,Sports Bar,Sri Lankan Restaurant,Stationery Store,Steakhouse,Street Art,Supermarket,Supplement Shop,Sushi Restaurant,Tailor Shop,Tea Room,Tennis Court,Tennis Stadium,Tex-Mex Restaurant,Thai Restaurant,Theme Restaurant,Toy / Game Store,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Volleyball Court,Waterfront,Whisky Bar,Women's Store
0,Al Dafna,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.04,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.09,0.0,0.0,0.0,0.0,0.03,0.01,0.01,0.12,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.11,0.0,0.02,0.01,0.0,0.0,0.0,0.04,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.03,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0
1,Al Hilal,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.074074,0.074074,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.074074,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.037037,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.037037,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.074074,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Najma,0.0,0.0,0.0,0.019231,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.0,0.019231,0.0,0.0,0.0,0.019231,0.019231,0.0,0.0,0.019231,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.0,0.019231,0.057692,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.019231,0.0,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.057692,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.096154,0.0,0.0,0.019231,0.0,0.0,0.019231,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.019231,0.019231,0.0,0.0,0.0,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019231,0.0,0.0,0.019231,0.0,0.0,0.0,0.0,0.0,0.0
3,Umm Ghuwailina,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.016667,0.0,0.016667,0.0,0.0,0.016667,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.016667,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.016667,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.016667,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.15,0.0,0.0,0.05,0.0,0.016667,0.016667,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.016667,0.0,0.016667,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.033333,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [160]:
# set number of clusters
kclusters = 2

doha_grouped_clustering = df_final_grouped.drop('Neighborhood', axis=1)

In [161]:
# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(doha_grouped_clustering)

In [162]:
kmeans.labels_

array([0, 1, 0, 0], dtype=int32)
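
With only four rows, the split returned by *k*-means can be sanity-checked by looking at the quantity it works with: the Euclidean distance between the frequency vectors. The sketch below computes a pairwise distance table on a toy frequency frame; the row names and values are made up, and on the real data the input would be the numeric columns of `df_final_grouped`.

```python
import numpy as np
import pandas as pd

# Toy frequency table: P and Q are similar, R is dominated by other categories
freq = pd.DataFrame(
    {
        'Café': [0.10, 0.08, 0.40],
        'Hotel': [0.12, 0.10, 0.00],
        'Pizza Place': [0.00, 0.02, 0.30],
    },
    index=['P', 'Q', 'R'],
)

values = freq.to_numpy()
# Pairwise differences via broadcasting, shape (n, n, n_features)
diff = values[:, None, :] - values[None, :, :]
# Euclidean distance between every pair of rows, shape (n, n)
dist = pd.DataFrame(
    np.sqrt((diff ** 2).sum(axis=2)), index=freq.index, columns=freq.index
)
print(dist.round(3))
```

A row whose distances to all the others are large is the one most likely to end up in a cluster of its own.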

In [163]:
df_final.insert(0, 'Cluster Labels', kmeans.labels_)
df_final

Unnamed: 0,Cluster Labels,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,0,Al Dafna,Coffee Shop,Hotel,Café,Bar,Italian Restaurant,Middle Eastern Restaurant,Restaurant,Lounge,Chinese Restaurant,Asian Restaurant,Buffet,Beach,Sporting Goods Shop,Steakhouse,Spa,French Restaurant,Gym,Ice Cream Shop,Brazilian Restaurant,Cocktail Bar
1,1,Al Hilal,Pizza Place,Cafeteria,Café,Coffee Shop,Mobile Phone Shop,Middle Eastern Restaurant,Business Service,Sandwich Place,Restaurant,Malay Restaurant,Filipino Restaurant,Fast Food Restaurant,Music Venue,Supermarket,Department Store,Clothing Store,Gym,Gym / Fitness Center,Asian Restaurant,Electronics Store
2,0,Najma,Hotel,Café,Asian Restaurant,Department Store,Gym,Fast Food Restaurant,Filipino Restaurant,Lounge,Malay Restaurant,Bookstore,Shopping Mall,Middle Eastern Restaurant,Sandwich Place,Seafood Restaurant,Fried Chicken Joint,Building,Restaurant,Burger Joint,Pizza Place,Cafeteria
3,0,Umm Ghuwailina,Hotel,Pizza Place,Middle Eastern Restaurant,Department Store,Indian Restaurant,Fast Food Restaurant,American Restaurant,Gym,Sandwich Place,Lounge,Café,Convenience Store,Food Truck,Burger Joint,Seafood Restaurant,Restaurant,Clothing Store,Fish & Chips Shop,Filipino Restaurant,Coffee Shop


### Thank you