This notebook is for the Capstone project: clustering the neighbourhoods of the city of New Orleans.

    1. This segment of code scrapes a wiki page and converts it to a dataframe in the required format.

        a. The code below reads the wiki table and creates the raw dataframe 'df_wiki'.

In [1]:
import pandas as pd #import pandas

url_wiki = 'https://en.wikipedia.org/wiki/Neighborhoods_in_New_Orleans' #define wiki url
df_wiki = pd.read_html(url_wiki)[0] # read table in dataframe using read_html
print(df_wiki.shape)
df_wiki.head()

(72, 3)


Unnamed: 0,Neighborhood,Longitude,Latitude
0,U.S. NAVAL BASE,-90.026093,29.946085
1,ALGIERS POINT,-90.051606,29.952462
2,WHITNEY,-90.042357,29.9472
3,AUDUBON,-90.12145,29.932994
4,OLD AURORA,-90.0,29.92444


        b. The code below counts the missing values in each column and discards any rows that contain them. The new dataframe is named 'orleans_df'.

In [2]:
print(df_wiki['Neighborhood'].isnull().sum(), df_wiki['Latitude'].isnull().sum(), df_wiki['Longitude'].isnull().sum())
# drop rows with a missing name or coordinate (none here, as the counts below show)
orleans_df = df_wiki.dropna(subset=['Neighborhood', 'Latitude', 'Longitude']).copy()

0 0 0
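
The check above only covers missing values. As a minimal sketch of how unusual values could be screened as well, the example below uses a small made-up table (not the scraped data) and treats coordinates outside the valid latitude/longitude ranges as unusual. The column names match the notebook; the toy values and the range rule are assumptions.

```python
import pandas as pd

# Toy data standing in for the scraped table (values are made up for illustration)
toy = pd.DataFrame({
    'Neighborhood': ['A', 'B', None, 'D'],
    'Latitude': [29.95, None, 29.93, 95.0],       # 95.0 is not a valid latitude
    'Longitude': [-90.05, -90.04, -90.12, -90.0],
})

required = ['Neighborhood', 'Latitude', 'Longitude']

# Count missing values per required column
missing_counts = toy[required].isnull().sum()

# Drop rows with any missing required value
clean = toy.dropna(subset=required)

# Keep only rows whose coordinates fall inside the valid ranges
valid = clean['Latitude'].between(-90, 90) & clean['Longitude'].between(-180, 180)
clean = clean[valid].reset_index(drop=True)

print(missing_counts.to_dict())
print(clean)
```

A tighter rule, such as a bounding box around the city, would catch more errors, but the box limits would have to be chosen by hand.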


       c. Import relevant libraries

In [4]:
import numpy as np
import json

#!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim

import requests 

from pandas.io.json import json_normalize 

import matplotlib.cm as cm
import matplotlib.colors as colors

from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes
import folium


      d. Get the latitude and longitude of New Orleans and create a map.

In [5]:
address = 'New Orleans'

geolocator = Nominatim(user_agent="no_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of New Orleans are {}, {}.'.format(latitude, longitude))

map_no = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, neighborhood in zip(orleans_df['Latitude'], orleans_df['Longitude'], orleans_df['Neighborhood']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker([lat, lng], radius=4, popup=label, color='blue', fill=True, fill_color='#3186cc', fill_opacity=0.7,
        parse_html=False).add_to(map_no)

map_no

The geographical coordinates of New Orleans are 29.9499323, -90.0701156.


        c. Next, we get data from Foursquare. The credentials are defined in the code below, along with the search radius and the maximum number of venues per request.

In [6]:
CLIENT_ID = 'U34XN2WI0DBDFTIT4YBITSG2XG1PQ3TZ5OLGBOXQ5TPI3O5I' # Foursquare ID
CLIENT_SECRET = 'XJC2GU4JL3B41BNALVGPRNI0JDQRQDUPBNXRBNYVRZCUU5HE' # Foursquare Secret
VERSION = '20180605' # Foursquare API version
limit=100
radius=1000
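
Hard-coding credentials in a shared notebook exposes them to every reader. One common alternative, sketched below, is to read them from environment variables. The helper `load_foursquare_credentials` and the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` are hypothetical, not something Foursquare prescribes.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Return (client_id, client_secret) read from environment variables.

    The variable names are hypothetical; use whatever names you export.
    """
    try:
        return env['FOURSQUARE_CLIENT_ID'], env['FOURSQUARE_CLIENT_SECRET']
    except KeyError as missing:
        raise RuntimeError('Missing environment variable: {}'.format(missing)) from None
```

In the notebook this could be used as `CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials()`, after exporting both variables in the shell that starts Jupyter.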

        d. A function that explores the venues around each neighbourhood and returns a dataframe with venue details such as category. It calls the Foursquare API, reads the JSON response, and flattens it into a dataframe. We request up to 100 venues within a radius of 1000 meters of each latitude/longitude pair.

In [7]:
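def getNearbyVenues(names, la, lo):
    """Return a dataframe of venues near each (name, latitude, longitude) triple.

    Uses the module-level CLIENT_ID, CLIENT_SECRET, VERSION, radius and limit.
    """
    venues_list=[]
    for name, lat, lng in zip(names, la, lo):

        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, CLIENT_SECRET, VERSION, lat, lng,  radius,  limit)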
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(name, lat, lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 'Neighborhood Latitude', 'Neighborhood Longitude', 
                             'Venue', 'Venue Latitude', 'Venue Longitude', 'Venue Category']
    
    return(nearby_venues)
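
The function above indexes `categories[0]` directly, so a venue with an empty category list would raise an `IndexError`. The sketch below shows one way to flatten the same nested structure (`response` → `groups[0]` → `items` → `venue`) defensively. It runs on a mock payload rather than a live API call. The helper name `extract_venues` and the second venue are made up for illustration; the first venue is taken from the notebook's output.

```python
def extract_venues(payload):
    """Flatten an explore-style payload into (name, lat, lng, category) tuples."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # fall back to None instead of raising IndexError on an empty list
        category = categories[0]['name'] if categories else None
        location = venue['location']
        rows.append((venue['name'], location['lat'], location['lng'], category))
    return rows


# Mock payload: the second venue is hypothetical and has no category
sample = {
    'response': {'groups': [{'items': [
        {'venue': {'name': 'Behrman Stadium',
                   'location': {'lat': 29.939423, 'lng': -90.030863},
                   'categories': [{'name': 'Stadium'}]}},
        {'venue': {'name': 'Uncategorised Place',
                   'location': {'lat': 29.94, 'lng': -90.03},
                   'categories': []}},
    ]}]}
}

rows = extract_venues(sample)
print(rows)
```

Rows whose category is `None` can then be dropped or labelled before one-hot encoding.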

        e. Call the function above with the names, latitudes and longitudes of all neighbourhoods, and store the venues found around them in a dataframe.

In [8]:
orleans_venues = getNearbyVenues(names=orleans_df['Neighborhood'], la=orleans_df['Latitude'],lo=orleans_df['Longitude'])
print(orleans_venues.shape)
orleans_venues.head()

(2703, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,U.S. NAVAL BASE,29.946085,-90.026093,Behrman Stadium,29.939423,-90.030863,Stadium
1,U.S. NAVAL BASE,29.946085,-90.026093,Federal City Inn & Suites,29.947682,-90.032891,Hotel
2,U.S. NAVAL BASE,29.946085,-90.026093,Subway,29.950125,-90.034375,Sandwich Place
3,U.S. NAVAL BASE,29.946085,-90.026093,The Mighty Missisippi,29.949695,-90.02371,Boat or Ferry
4,U.S. NAVAL BASE,29.946085,-90.026093,A Beautiful Estate Inc.,29.940376,-90.023239,Construction & Landscaping


        f. Explore the venues around all neighbourhoods: how many unique categories are there, and how many venues were returned for each neighbourhood?

In [9]:
print('There are {} unique categories.'.format(len(orleans_venues['Venue Category'].unique())))
orleans_venues.groupby('Neighborhood').count()

There are 276 unique categories.


Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
ALGIERS POINT,27,27,27,27,27,27
AUDUBON,28,28,28,28,28,28
B. W. COOPER,31,31,31,31,31,31
BAYOU ST. JOHN,69,69,69,69,69,69
BEHRMAN,16,16,16,16,16,16
BLACK PEARL,56,56,56,56,56,56
BROADMOOR,53,53,53,53,53,53
BYWATER,36,36,36,36,36,36
CENTRAL BUSINESS DISTRICT,100,100,100,100,100,100
CENTRAL CITY,41,41,41,41,41,41
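
`count()` repeats the same number in every column. As a small sketch on made-up data, `groupby(...).size()` gives one count per neighbourhood and `nunique()` counts the distinct categories directly. The column names match the notebook; the values are toy data.

```python
import pandas as pd

# Toy venues table (made-up values)
toy_venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Bar', 'Bar', 'Café', 'Park', 'Bar'],
})

# Number of distinct venue categories
n_categories = toy_venues['Venue Category'].nunique()

# Number of venues returned per neighbourhood, as a single Series
venues_per_neighborhood = toy_venues.groupby('Neighborhood').size()

print(n_categories)
print(venues_per_neighborhood)
```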


        g. Analyze each neighbourhood. One-hot encode the previous dataframe so that each venue row has one indicator column per venue category; the next step aggregates these rows by neighbourhood.

In [10]:
# one hot encoding
orleans_onehot = pd.get_dummies(orleans_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
orleans_onehot['Neighborhood'] = orleans_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [orleans_onehot.columns[-1]] + list(orleans_onehot.columns[:-1])

orleans_onehot = orleans_onehot[fixed_columns]
print(orleans_onehot.shape)
orleans_onehot.head()

(2703, 276)


Unnamed: 0,Yoga Studio,Accessories Store,African Restaurant,American Restaurant,Antique Shop,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,...,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Winery,Wings Joint,Women's Store
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
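
One thing to watch with this pattern: if a venue category happens to be named 'Neighborhood', assigning `orleans_onehot['Neighborhood']` overwrites that indicator column in place instead of appending a new one, and the "move last column first" step then moves a different column. The sketch below, on made-up data, avoids the collision by inserting the key under a name that cannot clash. The key name `_neighborhood` is a hypothetical choice.

```python
import pandas as pd

# Toy venues table; one category is deliberately called 'Neighborhood'
toy_venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'B'],
    'Venue Category': ['Bar', 'Neighborhood', 'Park'],
})

# One indicator column per category
onehot = pd.get_dummies(toy_venues['Venue Category'])

# Insert the grouping key as the first column under a non-clashing name
key = '_neighborhood'  # hypothetical name
if key in onehot.columns:
    raise ValueError('key column clashes with a venue category')
onehot.insert(0, key, toy_venues['Neighborhood'])

print(onehot.columns.tolist())
```

Using `insert` also makes the column-reordering step unnecessary, because the key lands in position 0 directly.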


        h. Group the rows by neighbourhood, taking the mean frequency of occurrence of each category.

In [11]:
orleans_grouped = orleans_onehot.groupby(['Neighborhood']).mean().reset_index()
print(orleans_grouped.shape)
orleans_grouped

(70, 276)


Unnamed: 0,Neighborhood,Yoga Studio,Accessories Store,African Restaurant,American Restaurant,Antique Shop,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,...,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Winery,Wings Joint,Women's Store
0,ALGIERS POINT,0.000000,0.000,0.0000,0.000000,0.000000,0.0,0.000000,0.000000,0.037037,...,0.000000,0.000000,0.0000,0.000000,0.000000,0.000000,0.000000,0.000000,0.037037,0.000000
1,AUDUBON,0.000000,0.000,0.0000,0.000000,0.000000,0.0,0.035714,0.000000,0.000000,...,0.000000,0.000000,0.0000,0.000000,0.000000,0.035714,0.000000,0.000000,0.000000,0.000000
2,B. W. COOPER,0.000000,0.000,0.0000,0.000000,0.000000,0.0,0.032258,0.000000,0.000000,...,0.000000,0.000000,0.0000,0.000000,0.000000,0.000000,0.032258,0.000000,0.032258,0.000000
3,BAYOU ST. JOHN,0.000000,0.000,0.0000,0.028986,0.000000,0.0,0.000000,0.000000,0.000000,...,0.014493,0.000000,0.0000,0.000000,0.000000,0.028986,0.014493,0.000000,0.000000,0.000000
4,BEHRMAN,0.000000,0.000,0.0000,0.000000,0.000000,0.0,0.000000,0.062500,0.000000,...,0.000000,0.000000,0.0000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
5,BLACK PEARL,0.000000,0.000,0.0000,0.000000,0.000000,0.0,0.017857,0.000000,0.000000,...,0.000000,0.000000,0.0000,0.035714,0.000000,0.000000,0.000000,0.000000,0.000000,0.017857
6,BROADMOOR,0.000000,0.000,0.0000,0.000000,0.000000,0.0,0.000000,0.000000,0.018868,...,0.000000,0.000000,0.0000,0.000000,0.000000,0.000000,0.000000,0.000000,0.018868,0.000000
7,BYWATER,0.000000,0.000,0.0000,0.000000,0.027778,0.0,0.027778,0.000000,0.027778,...,0.027778,0.000000,0.0000,0.000000,0.000000,0.027778,0.000000,0.000000,0.000000,0.000000
8,CENTRAL BUSINESS DISTRICT,0.010000,0.000,0.0000,0.000000,0.000000,0.0,0.010000,0.000000,0.000000,...,0.000000,0.000000,0.0000,0.010000,0.000000,0.000000,0.010000,0.000000,0.000000,0.000000
9,CENTRAL CITY,0.000000,0.000,0.0000,0.048780,0.000000,0.0,0.000000,0.000000,0.048780,...,0.000000,0.000000,0.0000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
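
To see why the mean of one-hot rows is a frequency, consider a toy neighbourhood with four venues, two of which are bars: the 'Bar' column averages to 2/4 = 0.5. The sketch below checks this on made-up data and confirms that each row of the grouped table sums to 1.

```python
import pandas as pd

# Toy one-hot table: one row per venue (made-up values)
toy_onehot = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Bar':  [1, 1, 0, 0, 0, 1],
    'Park': [0, 0, 1, 0, 1, 0],
    'Café': [0, 0, 0, 1, 0, 0],
})

# Mean of the indicators per neighbourhood = share of venues in each category
grouped = toy_onehot.groupby('Neighborhood').mean().reset_index()

# Each neighbourhood's shares add up to 1
row_sums = grouped.drop(columns='Neighborhood').sum(axis=1)

print(grouped)
print(row_sums.tolist())
```

Because the rows are shares rather than counts, a neighbourhood with 16 venues and one with 100 venues are compared on the same scale.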


        i. Create a new dataframe with the top 10 venue categories of each neighbourhood as columns.

In [34]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # positions beyond the third use the generic 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = orleans_grouped['Neighborhood']

for ind in np.arange(orleans_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(orleans_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,ALGIERS POINT,Coffee Shop,Bar,Cupcake Shop,Grocery Store,Seafood Restaurant,Dance Studio,Scenic Lookout,Food & Drink Shop,Sandwich Place,Spa
1,AUDUBON,Coffee Shop,Grocery Store,College Arts Building,College Auditorium,Gym,Skate Park,Smoothie Shop,Café,Sandwich Place,Speakeasy
2,B. W. COOPER,Brewery,Bakery,Fast Food Restaurant,Basketball Stadium,Bus Stop,Gas Station,Furniture / Home Store,Light Rail Station,Fried Chicken Joint,Food Truck
3,BAYOU ST. JOHN,Coffee Shop,Grocery Store,Bar,Cajun / Creole Restaurant,Southern / Soul Food Restaurant,Seafood Restaurant,Café,Light Rail Station,Sandwich Place,American Restaurant
4,BEHRMAN,Shopping Mall,Martial Arts Dojo,Breakfast Spot,Auto Garage,Café,Seafood Restaurant,Massage Studio,Furniture / Home Store,Stadium,Clothing Store
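
The `['st', 'nd', 'rd']` lookup used for the column names is correct for a top 10, but it would label position 21 as '21th' if `num_top_venues` were ever raised that far. A minimal sketch of a general ordinal helper follows; the function name `ordinal` is hypothetical.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)


# Same column layout as the notebook, built with the helper
columns = ['Neighborhood'] + ['{} Most Common Venue'.format(ordinal(i)) for i in range(1, 11)]
print(columns)
```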


        j. Run k-means with 5 clusters.

In [35]:
# set number of clusters
kclusters = 5

orleans_grouped_clustering = orleans_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(orleans_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([1, 1, 0, 1, 0, 1, 0, 1, 1, 0], dtype=int32)
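
For readers unfamiliar with what k-means does to the frequency vectors, the sketch below implements the basic assign-then-update loop (Lloyd's algorithm) in NumPy on made-up 2-D points. It is an illustration only and not scikit-learn's implementation, which by default uses the more careful k-means++ initialisation. The function name `lloyd_kmeans` and the toy points are assumptions.

```python
import numpy as np


def lloyd_kmeans(X, k, n_iter=100, seed=0):
    """Bare-bones k-means: returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # start from k distinct data points chosen at random
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # distance from every point to every center, shape (n_samples, k)
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # move each center to the mean of its points (keep it if the cluster is empty)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Two well-separated toy groups
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centers = lloyd_kmeans(points, k=2)
print(labels)
```

The label numbers themselves are arbitrary: only the grouping is meaningful, which is why the clusters are named by inspecting their venues later on.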

        k. Create a new data frame to include cluster as well as top 10 venues.

In [36]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

orleans_merged = orleans_df

# join the cluster labels and top venues onto orleans_df, which holds latitude/longitude for each neighborhood
orleans_merged = orleans_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')
orleans_merged.dropna(subset=['Cluster Labels'],inplace=True)
orleans_merged.reset_index(inplace=True)

orleans_merged

Unnamed: 0,index,Neighborhood,Longitude,Latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,0,U.S. NAVAL BASE,-90.026093,29.946085,1.0,Hotel,Credit Union,Basketball Court,Auto Garage,Sandwich Place,Stadium,Boat or Ferry,Construction & Landscaping,Flea Market,Fish Market
1,1,ALGIERS POINT,-90.051606,29.952462,1.0,Coffee Shop,Bar,Cupcake Shop,Grocery Store,Seafood Restaurant,Dance Studio,Scenic Lookout,Food & Drink Shop,Sandwich Place,Spa
2,2,WHITNEY,-90.042357,29.947200,1.0,Hotel,Sandwich Place,Park,Dance Studio,Seafood Restaurant,Spa,Scenic Lookout,Boat or Ferry,New American Restaurant,Bed & Breakfast
3,3,AUDUBON,-90.121450,29.932994,1.0,Coffee Shop,Grocery Store,College Arts Building,College Auditorium,Gym,Skate Park,Smoothie Shop,Café,Sandwich Place,Speakeasy
4,4,OLD AURORA,-90.000000,29.924440,0.0,Ice Cream Shop,Pool,American Restaurant,Pizza Place,Home Service,Donut Shop,Sandwich Place,Discount Store,Fast Food Restaurant,Eye Doctor
5,5,B. W. COOPER,-90.091753,29.951774,0.0,Brewery,Bakery,Fast Food Restaurant,Basketball Stadium,Bus Stop,Gas Station,Furniture / Home Store,Light Rail Station,Fried Chicken Joint,Food Truck
6,6,BAYOU ST. JOHN,-90.086517,29.976071,1.0,Coffee Shop,Grocery Store,Bar,Cajun / Creole Restaurant,Southern / Soul Food Restaurant,Seafood Restaurant,Café,Light Rail Station,Sandwich Place,American Restaurant
7,7,BEHRMAN,-90.026436,29.934817,0.0,Shopping Mall,Martial Arts Dojo,Breakfast Spot,Auto Garage,Café,Seafood Restaurant,Massage Studio,Furniture / Home Store,Stadium,Clothing Store
8,8,BLACK PEARL,-90.134883,29.935895,1.0,Bar,Café,Sandwich Place,Vietnamese Restaurant,Clothing Store,Thai Restaurant,Coffee Shop,Pizza Place,Bookstore,South American Restaurant
9,9,BROADMOOR,-90.103812,29.946568,0.0,Fried Chicken Joint,Brewery,Coffee Shop,Café,Discount Store,Fast Food Restaurant,Bar,Shoe Store,Liquor Store,Pizza Place
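
The cluster labels appear as 1.0, 0.0 and so on because the join introduces NaN for neighbourhoods that returned no venues, which forces the integer column to float. The sketch below reproduces this on made-up data and casts the labels back to integers after dropping the unmatched rows.

```python
import pandas as pd

# 'B' has no venues, so it is absent from the right-hand table
left = pd.DataFrame({'Neighborhood': ['A', 'B', 'C']})
right = pd.DataFrame({'Neighborhood': ['A', 'C'], 'Cluster Labels': [1, 0]})

merged = left.join(right.set_index('Neighborhood'), on='Neighborhood')
dtype_after_join = merged['Cluster Labels'].dtype  # float64: NaN for 'B' forces the upcast

# Drop unmatched rows, then restore integer labels
merged = merged.dropna(subset=['Cluster Labels']).reset_index(drop=True)
merged['Cluster Labels'] = merged['Cluster Labels'].astype(int)

print(dtype_after_join)
print(merged)
```

This sketch uses `reset_index(drop=True)`, whereas the notebook keeps the old index as an `index` column. The notebook's later cells select columns by position and count on that extra column, so the positions would shift by one if it were dropped.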


        l. Visualize the clusters.

In [37]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, poi, cluster in zip(orleans_merged['Latitude'], orleans_merged['Longitude'], orleans_merged['Neighborhood'], orleans_merged['Cluster Labels']):
    cluster = int(cluster)  # labels became floats after the join
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],  # labels are 0-based, so index the palette directly
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
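
The map needs one distinct hex colour per cluster label. As a sketch of that mapping without matplotlib, the helper below spaces hues evenly around the colour wheel using the standard library. It is an alternative to the `cm.rainbow` palette used above, not the same colours, and the name `cluster_palette` is hypothetical.

```python
import colorsys


def cluster_palette(n):
    """Return n visually distinct hex colours, one per 0-based cluster label."""
    palette = []
    for i in range(n):
        # evenly spaced hues with fixed saturation and brightness
        r, g, b = colorsys.hsv_to_rgb(i / n, 0.8, 0.9)
        palette.append('#{:02x}{:02x}{:02x}'.format(round(r * 255), round(g * 255), round(b * 255)))
    return palette


palette = cluster_palette(5)
print(palette)
```

A marker for a neighbourhood with label `cluster` would then use `palette[int(cluster)]`.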

        m. Check the size of each cluster.

In [38]:
# columns to show: the index column plus everything from 'Cluster Labels' onward
cluster_columns = orleans_merged.columns[[0] + list(range(4, orleans_merged.shape[1]))]
for k in range(kclusters):
    print('Cluster {}: '.format(k + 1), orleans_merged.loc[orleans_merged['Cluster Labels'] == k, cluster_columns].shape)

Cluster 1:  (28, 12)
Cluster 2:  (38, 12)
Cluster 3:  (2, 12)
Cluster 4:  (1, 12)
Cluster 5:  (1, 12)
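
If only the sizes are needed, `value_counts` gives them in one call. The sketch below uses a short made-up label column; note that it reports the raw 0-based labels, whereas the printout above numbers the clusters from 1.

```python
import pandas as pd

# Toy label column (floats, as they appear after the join)
labels = pd.Series([1.0, 1.0, 0.0, 1.0, 0.0, 2.0], name='Cluster Labels')

# Number of neighbourhoods per cluster, ordered by label
sizes = labels.astype(int).value_counts().sort_index()

print(sizes)
```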


Clusters 1 and 2 dominate the map: together they contain 66 of the 70 neighbourhoods. This means the neighbourhoods are mostly homogeneous, with only a few exceptions.

        n. Name cluster 1

In [39]:
orleans_merged.loc[orleans_merged['Cluster Labels'] == 0, orleans_merged.columns[[0] + list(range(4, orleans_merged.shape[1]))]].head()

Unnamed: 0,index,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,4,0.0,Ice Cream Shop,Pool,American Restaurant,Pizza Place,Home Service,Donut Shop,Sandwich Place,Discount Store,Fast Food Restaurant,Eye Doctor
5,5,0.0,Brewery,Bakery,Fast Food Restaurant,Basketball Stadium,Bus Stop,Gas Station,Furniture / Home Store,Light Rail Station,Fried Chicken Joint,Food Truck
7,7,0.0,Shopping Mall,Martial Arts Dojo,Breakfast Spot,Auto Garage,Café,Seafood Restaurant,Massage Studio,Furniture / Home Store,Stadium,Clothing Store
9,9,0.0,Fried Chicken Joint,Brewery,Coffee Shop,Café,Discount Store,Fast Food Restaurant,Bar,Shoe Store,Liquor Store,Pizza Place
10,10,0.0,Gas Station,Pizza Place,Breakfast Spot,Grocery Store,Bar,Café,Fried Chicken Joint,Southern / Soul Food Restaurant,Bowling Alley,Liquor Store


The top venue categories in this cluster are stores, restaurants, bars, cafés and similar places, mostly suitable for families.

        o. Name cluster 2

In [40]:
orleans_merged.loc[orleans_merged['Cluster Labels'] == 1, orleans_merged.columns[[0] + list(range(4, orleans_merged.shape[1]))]].head()

Unnamed: 0,index,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,0,1.0,Hotel,Credit Union,Basketball Court,Auto Garage,Sandwich Place,Stadium,Boat or Ferry,Construction & Landscaping,Flea Market,Fish Market
1,1,1.0,Coffee Shop,Bar,Cupcake Shop,Grocery Store,Seafood Restaurant,Dance Studio,Scenic Lookout,Food & Drink Shop,Sandwich Place,Spa
2,2,1.0,Hotel,Sandwich Place,Park,Dance Studio,Seafood Restaurant,Spa,Scenic Lookout,Boat or Ferry,New American Restaurant,Bed & Breakfast
3,3,1.0,Coffee Shop,Grocery Store,College Arts Building,College Auditorium,Gym,Skate Park,Smoothie Shop,Café,Sandwich Place,Speakeasy
6,6,1.0,Coffee Shop,Grocery Store,Bar,Cajun / Creole Restaurant,Southern / Soul Food Restaurant,Seafood Restaurant,Café,Light Rail Station,Sandwich Place,American Restaurant


This cluster has more variety, mostly places that young people would prefer to go to, so I will name it "Youngster".

        p. Name cluster 3

In [41]:
orleans_merged.loc[orleans_merged['Cluster Labels'] == 2, orleans_merged.columns[[0] + list(range(4, orleans_merged.shape[1]))]].head()

Unnamed: 0,index,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
57,59,2.0,Park,Food,Wings Joint,Plaza,Music Venue,Field,Eye Doctor,Falafel Restaurant,Farm,Farmers Market
58,60,2.0,Park,Music Venue,Wings Joint,Film Studio,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Field,Women's Store


This cluster is suitable for outings, so I am naming it "Leisure".

        q. Name cluster 4

In [42]:
orleans_merged.loc[orleans_merged['Cluster Labels'] == 3, orleans_merged.columns[[0] + list(range(4, orleans_merged.shape[1]))]].head()

Unnamed: 0,index,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
17,17,3.0,Clothing Store,Field,Event Space,Eye Doctor,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Women's Store,Event Service


This cluster represents the countryside, so I will call it "Country".

        r. Name cluster 5

In [43]:
orleans_merged.loc[orleans_merged['Cluster Labels'] == 4, orleans_merged.columns[[0] + list(range(4, orleans_merged.shape[1]))]].head()

Unnamed: 0,index,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
56,58,4.0,Skate Park,Park,Field,Eye Doctor,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Women's Store,Event Service


This cluster represents a lot of open space, so I will call it "Open".