# 1. Introduction

## Background

I was recently offered two jobs: one is based in Miami, FL; the other is based in Columbia, SC. I'm really torn on which job to take, given that:
1. both jobs are with reputable companies;
2. both jobs are a great fit for my background and experience, letting me continue doing what I've been trained for and am becoming good at;
3. the people I'd report to at both companies are very respectable and easy to get along with;
4. one is a well-established multinational firm, while the other is a leading regional firm where I would have more exposure to senior management;
5. both offer competitive pay.

Since I'm having a hard time deciding between the two offers on career progression alone, I'm going to apply what I learned in the machine learning module and the previous Capstone project to compare the dining and entertainment options in both locations. Hopefully, this will help me make a more informed decision.

In [58]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


# 2. Data


In this section, I'm going to get the neighborhood locations for the two cities: Miami and Columbia.

## 2.1 Get neighborhood information for Miami, FL

### 2.1.1 Get coordinates for Miami

In [59]:
miami = pd.read_html('https://en.wikipedia.org/wiki/List_of_neighborhoods_in_Miami')

miami = miami[0]
miami = miami.drop([11,25])
miami.head()

Unnamed: 0,Neighborhood,Demonym,Population2010,Population/Km²,Sub-neighborhoods,Coordinates
0,Allapattah,,54289,4401,,25.815-80.224
1,Arts & Entertainment District,,11033,7948,,25.799-80.190
2,Brickell,Brickellite,31759,14541,West Brickell,25.758-80.193
3,Buena Vista,,9058,3540,Buena Vista East Historic District and Design ...,25.813-80.192
4,Coconut Grove,Grovite,20076,3091,"Center Grove, Northeast Coconut Grove, Southwe...",25.712-80.257


### 2.1.2 Split coordinates into latitude and longitude

In [60]:
for index, row in miami.iterrows():
    miami.loc[index,'Lat'] = miami.Coordinates[index].split('-')[0]
    miami.loc[index,'Lng'] = '-'+miami.Coordinates[index].split('-')[1]

miami.Lat = miami.Lat.astype(float)
miami.Lng = miami.Lng.astype(float)
miami.head()

Unnamed: 0,Neighborhood,Demonym,Population2010,Population/Km²,Sub-neighborhoods,Coordinates,Lat,Lng
0,Allapattah,,54289,4401,,25.815-80.224,25.815,-80.224
1,Arts & Entertainment District,,11033,7948,,25.799-80.190,25.799,-80.19
2,Brickell,Brickellite,31759,14541,West Brickell,25.758-80.193,25.758,-80.193
3,Buena Vista,,9058,3540,Buena Vista East Historic District and Design ...,25.813-80.192,25.813,-80.192
4,Coconut Grove,Grovite,20076,3091,"Center Grove, Northeast Coconut Grove, Southwe...",25.712-80.257,25.712,-80.257
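
The row-by-row loop above can also be written as a single vectorized step. The following is a minimal sketch rather than the notebook's original code. It assumes every `Coordinates` value has the form `<lat>-<lng>` with a positive latitude and a negative longitude, as in the rows shown, and it uses a small stand-in dataframe (`demo`, a hypothetical name) instead of the scraped table.

```python
import pandas as pd

# Stand-in for the scraped table; the values are copied from the rows shown above.
demo = pd.DataFrame({
    'Neighborhood': ['Allapattah', 'Arts & Entertainment District', 'Brickell'],
    'Coordinates': ['25.815-80.224', '25.799-80.190', '25.758-80.193'],
})

# Capture the latitude and the signed longitude in one pass.
pattern = r'^(?P<Lat>\d+\.\d+)(?P<Lng>-\d+\.\d+)$'
parts = demo['Coordinates'].str.extract(pattern).astype(float)

# Attach the two numeric columns to the original rows.
demo = demo.join(parts)
print(demo[['Neighborhood', 'Lat', 'Lng']])
```

A row whose `Coordinates` value does not match the pattern would get `NaN` in both new columns, which makes malformed entries easy to spot.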


### 2.1.3 Generate a map for the neighborhoods with available coordinates for Miami, FL

In [61]:
# Note: the geolocator can be unstable at times; re-run this cell if the request fails
address = 'Miami, FL'

geolocator = Nominatim(user_agent="hw")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Miami are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Miami are 25.7742658, -80.1936589.
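
Instead of re-running the cell by hand when the geocoder fails, the call can be wrapped in a small retry loop. This is only a sketch: `call_with_retry` is a hypothetical helper and not part of geopy, and `flaky_geocode` is a stand-in that simulates an unstable service so the example runs without network access.

```python
import time

def call_with_retry(func, *args, attempts=3, wait_seconds=1.0, **kwargs):
    """Call func(*args, **kwargs), retrying when it raises or returns None.

    Hypothetical helper for illustration; it is not part of geopy.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            result = func(*args, **kwargs)
            if result is not None:
                return result
        except Exception as error:  # e.g. a timeout raised by the geocoding service
            last_error = error
        if attempt < attempts:
            time.sleep(wait_seconds)
    if last_error is not None:
        raise last_error
    return None


# Stand-in for geolocator.geocode that fails twice before succeeding.
calls = {'count': 0}

def flaky_geocode(address):
    calls['count'] += 1
    if calls['count'] < 3:
        raise TimeoutError('service unavailable')
    return (25.7742658, -80.1936589)

coords = call_with_retry(flaky_geocode, 'Miami, FL', attempts=5, wait_seconds=0)
print(coords, 'after', calls['count'], 'calls')
```

In the notebook, the same helper could presumably be used as `call_with_retry(geolocator.geocode, address)`.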


In [62]:

map_miami = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, label in zip(miami['Lat'], miami['Lng'], miami['Neighborhood']):
    
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_miami)  
    
map_miami

## 2.2 Get neighborhood data for Columbia, SC

### 2.2.1 Get coordinates for Columbia

Since I couldn't find a tabulated list of neighborhoods and their coordinates for Columbia online, I instead used an article describing the neighborhoods in Columbia, SC. In this subsection, I manually type in the name of each neighborhood and use the *geolocator.geocode* function to find its coordinates.
In addition, *geolocator.geocode* had trouble finding coordinates for some neighborhoods, so I wrap the call in *try*/*except* and keep only the neighborhoods with usable coordinates.

Note: the geolocator can be unstable at times; re-run the cell if a request fails.

In [63]:
columbia_neighborhoods = ['Melrose Heights, SC','Cottontown, SC','Shandon, SC','Forest Acres, SC','Forest Hills, SC','Heathwood, SC','Rosewood, SC','Wildewood, SC','Lake Carolina,SC','Spring Valley,SC']

columbia = pd.DataFrame(columns = ['Neighborhood','Lat','Lng'])

geolocator = Nominatim(user_agent="hw")  # create the geocoder once and reuse it

for i, neigh in enumerate(columbia_neighborhoods):
    try:
        location = geolocator.geocode(neigh)
        latitude = location.latitude
        longitude = location.longitude
        print('The geographical coordinates of', neigh, 'are {}, {}.'.format(latitude, longitude))
        columbia.loc[i, 'Neighborhood'] = neigh
        columbia.loc[i, 'Lat'] = latitude
        columbia.loc[i, 'Lng'] = longitude
    except Exception:
        # the lookup failed (geocode returned None or raised); skip this neighborhood
        pass

columbia

The geographical coordinates of Melrose Heights, SC are 34.0065447, -81.002591.
The geographical coordinates of Shandon, SC are 33.9982115, -81.0056467.
The geographical coordinates of Forest Acres, SC are 34.0193221, -80.9898128.
The geographical coordinates of Forest Hills, SC are 34.9793027, -81.2331298.
The geographical coordinates of Heathwood, SC are 33.9990448, -80.9864797.
The geographical coordinates of Rosewood, SC are 34.9126262, -81.8526023.
The geographical coordinates of Wildewood, SC are 34.1043185, -80.8820322.
The geographical coordinates of Lake Carolina,SC are 34.17484295, -80.8862830494948.
The geographical coordinates of Spring Valley,SC are 34.9112573, -80.929798.


Unnamed: 0,Neighborhood,Lat,Lng
0,"Melrose Heights, SC",34.0065,-81.0026
2,"Shandon, SC",33.9982,-81.0056
3,"Forest Acres, SC",34.0193,-80.9898
4,"Forest Hills, SC",34.9793,-81.2331
5,"Heathwood, SC",33.999,-80.9865
6,"Rosewood, SC",34.9126,-81.8526
7,"Wildewood, SC",34.1043,-80.882
8,"Lake Carolina,SC",34.1748,-80.8863
9,"Spring Valley,SC",34.9113,-80.9298


In [64]:
# Note: the geolocator can be unstable at times; re-run this cell if the request fails
address2 = 'Columbia, SC'

geolocator2 = Nominatim(user_agent="hw")
location2 = geolocator2.geocode(address2)
latitude2 = location2.latitude
longitude2 = location2.longitude
print('The geographical coordinates of Columbia are {}, {}.'.format(latitude2, longitude2))

The geographical coordinates of Columbia are 34.0007493, -81.0343313.
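
Three of the geocoded points (Forest Hills, Rosewood, and Spring Valley) have latitudes near 34.9, roughly one degree north of the Columbia center printed above. That suggests the geocoder may have matched places with the same names elsewhere in South Carolina. One way to flag such cases is to measure each point's distance from the city center. The sketch below is not part of the original analysis: `haversine_km` is a helper written for this example, and the 50 km cutoff is an assumption.

```python
import math

def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometers between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

# City center and neighborhood coordinates as printed in this section.
center = (34.0007493, -81.0343313)
geocoded = {
    'Melrose Heights, SC': (34.0065447, -81.002591),
    'Shandon, SC': (33.9982115, -81.0056467),
    'Forest Acres, SC': (34.0193221, -80.9898128),
    'Forest Hills, SC': (34.9793027, -81.2331298),
    'Heathwood, SC': (33.9990448, -80.9864797),
    'Rosewood, SC': (34.9126262, -81.8526023),
    'Wildewood, SC': (34.1043185, -80.8820322),
    'Lake Carolina,SC': (34.17484295, -80.8862830494948),
    'Spring Valley,SC': (34.9112573, -80.929798),
}
max_km = 50  # assumed cutoff; tune it to the size of the metro area

distances = {name: haversine_km(*center, *point) for name, point in geocoded.items()}
nearby = [name for name, km in distances.items() if km <= max_km]
far = [name for name, km in distances.items() if km > max_km]
print('kept:', nearby)
print('flagged:', far)
```

Flagged neighborhoods could then be geocoded again with a more specific query, or dropped before the Foursquare step.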


### 2.2.2 Generate a map for the neighborhoods with available coordinates for Columbia, SC

In [65]:

map_columbia = folium.Map(location=[latitude2, longitude2], zoom_start=8)

# add markers to map
for lat, lng, label in zip(columbia['Lat'], columbia['Lng'], columbia['Neighborhood']):
    
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_columbia)  
    
map_columbia

## 2.3 Foursquare API Data

In this subsection, I will use the Foursquare API to get the venues in each neighborhood in both Miami and Columbia.
The resulting data will be used to collect information on the number of venues available as well as the types of venues: for example, how many restaurants, coffee shops, or parks are present in each neighborhood; for restaurant-type venues, what cuisines are available; and, lastly, what the reviews are for these venues.
Once I have collected all that information, I will summarize my findings and decide whether one city is more in line with my preferences.

### 2.3.1 Define Foursquare Credentials and Version

In [66]:
CLIENT_ID = 'YOUR_CLIENT_ID'  # your Foursquare ID (redacted; do not publish real credentials)
CLIENT_SECRET = 'YOUR_CLIENT_SECRET'  # your Foursquare Secret (redacted)
VERSION = '20180605'

print ('done')

done
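
Credentials are usually kept out of a shared notebook. A minimal sketch of one common approach is to read them from environment variables. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `load_foursquare_credentials` are assumptions made for this example, not something Foursquare prescribes.

```python
import os

def load_foursquare_credentials():
    """Read the credentials from environment variables (the variable names are assumptions)."""
    client_id = os.environ.get('FOURSQUARE_CLIENT_ID')
    client_secret = os.environ.get('FOURSQUARE_CLIENT_SECRET')
    if not client_id or not client_secret:
        raise RuntimeError('Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first.')
    return client_id, client_secret

# For demonstration only: placeholder values set in-process so the sketch runs.
os.environ.setdefault('FOURSQUARE_CLIENT_ID', 'demo-id')
os.environ.setdefault('FOURSQUARE_CLIENT_SECRET', 'demo-secret')

CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials()
print('credentials loaded:', bool(CLIENT_ID and CLIENT_SECRET))
```

In real use, the two `setdefault` lines would be removed and the variables set in the shell or the notebook environment before starting Jupyter.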


#### 2.3.1.1 Define a function that fetches up to 100 venues within a given radius of each listed neighborhood in the target cities, Miami and Columbia (borrowed from the lab with minor modifications)

In [67]:
def getNearbyVenues(names, latitudes, longitudes, radius=500, limit=100):
    """Return one row per venue found within `radius` of each neighborhood."""
    columns = ['Neighborhood',
               'Neighborhood Latitude',
               'Neighborhood Longitude',
               'Venue',
               'Venue Latitude',
               'Venue Longitude',
               'Venue Category']

    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)

        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            lat,
            lng,
            radius,
            limit)

        try:
            # make the GET request
            results = requests.get(url).json()['response']['groups'][0]['items']

            # keep only the relevant information for each nearby venue
            venues_list.append([(
                name,
                lat,
                lng,
                v['venue']['name'],
                v['venue']['location']['lat'],
                v['venue']['location']['lng'],
                v['venue']['categories'][0]['name']) for v in results])

        except Exception:
            # skip neighborhoods whose request fails or returns an unexpected payload
            print('unable to fetch data')

    # flatten the per-neighborhood lists into a single dataframe;
    # passing `columns` here also works when no venues were returned at all
    nearby_venues = pd.DataFrame(
        [item for venue_list in venues_list for item in venue_list],
        columns=columns)

    return nearby_venues
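
The request URL above is assembled with string formatting. As a hedged alternative, the query string can be built from a dictionary so that each value is escaped automatically. The sketch below uses only the standard library; `build_explore_url` is a hypothetical helper, and the credentials passed to it are placeholders.

```python
from urllib.parse import urlencode

def build_explore_url(lat, lng, radius, limit, client_id, client_secret, version):
    """Build the same explore URL as above, letting urlencode escape each value."""
    base = 'https://api.foursquare.com/v2/venues/explore'
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }
    # safe=',' keeps the comma in "lat,lng" readable instead of escaping it as %2C
    return base + '?' + urlencode(params, safe=',')

url = build_explore_url(25.815, -80.224, 500, 100, 'demo-id', 'demo-secret', '20180605')
print(url)
```

With `requests`, the same dictionary could instead be passed through the `params` argument of `requests.get`, which performs the encoding itself.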

### 2.3.2 Run the above function on each neighborhood and create two new dataframes called *miami_venues* and *columbia_venues*, respectively.

Given that the venues are more spread out in Columbia, SC, I will set the radius to 1,000 meters for Columbia; Miami uses the function's default of 500 meters.

In [68]:
LIMIT = 100

print("***** Miami ********")
miami_venues   = getNearbyVenues(names=miami['Neighborhood'],
                                   latitudes=miami['Lat'],
                                   longitudes=miami['Lng']
                                  )

print("***** Columbia *******")
columbia_venues = getNearbyVenues(names = columbia['Neighborhood'],
                                 latitudes = columbia['Lat'],
                                 longitudes = columbia['Lng'],
                                 radius = 1000)

***** Miami ********
Allapattah
Arts & Entertainment District
Brickell
Buena Vista
Coconut Grove
Coral Way
Design District
Downtown
Edgewater
Flagami
Grapeland Heights
Liberty City
Little Haiti
Little Havana
Lummus Park
Midtown
Overtown
Park West
The Roads
Upper Eastside
Venetian Islands
Virginia Key
West Flagler
Wynwood
***** Columbia *******
Melrose Heights, SC
Shandon, SC
Forest Acres, SC
Forest Hills, SC
Heathwood, SC
Rosewood, SC
Wildewood, SC
Lake Carolina,SC
Spring Valley,SC


### 2.3.3 Check the size of the resulting dataframe

In [69]:
print("Miami:",miami_venues.shape)
miami_venues.head()

Miami: (569, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Allapattah,25.815,-80.224,Three Fingers Liquor & Lounge,25.815523,-80.224406,Lounge
1,Allapattah,25.815,-80.224,Ross,25.81582,-80.221753,Department Store
2,Allapattah,25.815,-80.224,Showtime Boxing Gym,25.812364,-80.224504,Boxing Gym
3,Allapattah,25.815,-80.224,noor market,25.818165,-80.224197,Convenience Store
4,Arts & Entertainment District,25.799,-80.19,Bunnie Cakes,25.799544,-80.190953,Cupcake Shop


In [70]:
print("Columbia:",columbia_venues.shape)
columbia_venues.head()

Columbia: (128, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Melrose Heights, SC",34.006545,-81.002591,Henry's Grill & Bar,33.998214,-81.001353,American Restaurant
1,"Melrose Heights, SC",34.006545,-81.002591,Craft And Draft,33.998067,-81.005111,Beer Store
2,"Melrose Heights, SC",34.006545,-81.002591,Cantina 76,33.998153,-81.000935,Mexican Restaurant
3,"Melrose Heights, SC",34.006545,-81.002591,Anytime Fitness,33.99798,-81.00315,Gym / Fitness Center
4,"Melrose Heights, SC",34.006545,-81.002591,Nightcaps,33.998201,-81.004431,Bar


# 3. Methodology

In this section, I will use the venue data collected from Foursquare to analyze the dining and entertainment options in both cities. 

## 3.1 Check how many venues were returned for each neighborhood in *Miami* and *Columbia*

In [71]:
print("Miami Venue Count")
miami_venues.groupby('Neighborhood').count()

Miami Venue Count


Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Allapattah,4,4,4,4,4,4
Arts & Entertainment District,21,21,21,21,21,21
Brickell,60,60,60,60,60,60
Buena Vista,45,45,45,45,45,45
Coconut Grove,2,2,2,2,2,2
Coral Way,11,11,11,11,11,11
Design District,45,45,45,45,45,45
Downtown,63,63,63,63,63,63
Edgewater,46,46,46,46,46,46
Flagami,9,9,9,9,9,9
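
Because `count()` reports the number of non-null values per column, the table above repeats the same figure six times in each row. A shorter view uses `size()`, which returns one number per neighborhood. The sketch below runs on a small stand-in dataframe; the names `Venue A` to `Venue C` are made up for illustration.

```python
import pandas as pd

# Small stand-in for miami_venues; only the two columns used here are included.
venues = pd.DataFrame({
    'Neighborhood': ['Allapattah', 'Allapattah', 'Brickell', 'Brickell', 'Brickell'],
    'Venue': ['Ross', 'noor market', 'Venue A', 'Venue B', 'Venue C'],
})

# One count per neighborhood, largest first.
counts = (venues.groupby('Neighborhood')
                .size()
                .sort_values(ascending=False)
                .rename('Venue Count'))
print(counts)
```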


In [72]:
print("Columbia Venue Count")
columbia_venues.groupby('Neighborhood').count()

Columbia Venue Count


Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
"Forest Acres, SC",28,28,28,28,28,28
"Forest Hills, SC",1,1,1,1,1,1
"Heathwood, SC",10,10,10,10,10,10
"Lake Carolina,SC",7,7,7,7,7,7
"Melrose Heights, SC",29,29,29,29,29,29
"Rosewood, SC",2,2,2,2,2,2
"Shandon, SC",42,42,42,42,42,42
"Spring Valley,SC",2,2,2,2,2,2
"Wildewood, SC",7,7,7,7,7,7


## 3.2 Find out how many unique categories can be curated from all the returned venues

In [73]:
print('There are {} unique categories in Miami'.format(len(miami_venues['Venue Category'].unique())))

There are 149 unique categories in Miami


In [74]:
print('There are {} unique categories in Columbia'.format(len(columbia_venues['Venue Category'].unique())))

There are 75 unique categories in Columbia


## 3.3 Examine the available venue category in both cities

In this subsection, I will first collect the unique venue categories for both cities, then compare the two lists to figure out in which categories one city is richer than the other.

### 3.3.1 Get the list of unique venue categories in both cities

In [75]:
miami_list = miami_venues['Venue Category'].unique()

print('The available venue types in Miami include {}'.format(miami_list))

The available venue types in Miami include ['Lounge' 'Department Store' 'Boxing Gym' 'Convenience Store'
 'Cupcake Shop' 'Wine Shop' 'American Restaurant' 'Spa'
 'French Restaurant' 'Restaurant' 'Paper / Office Supplies Store'
 'Sandwich Place' 'Gym' 'Park' 'Cuban Restaurant' 'Ice Cream Shop'
 'Pizza Place' 'Salon / Barbershop' 'Tapas Restaurant' 'Smoothie Shop'
 'Breakfast Spot' 'Coffee Shop' 'Tennis Court' 'Hotel'
 'Argentinian Restaurant' 'Steakhouse' 'Japanese Restaurant'
 'Spanish Restaurant' 'Italian Restaurant' 'Dog Run' 'Café'
 'Latin American Restaurant' 'Athletics & Sports' 'Pharmacy'
 'Scenic Lookout' 'Seafood Restaurant' 'Gym / Fitness Center'
 'Mediterranean Restaurant' 'Burger Joint' 'Nightclub' 'Juice Bar'
 'Venezuelan Restaurant' 'Salad Place' 'Mexican Restaurant'
 'New American Restaurant' 'Bar' 'Playground' 'Bank' 'Harbor / Marina'
 'Neighborhood' 'Leather Goods Store' 'Boutique' 'Jewelry Store' 'Bakery'
 'Indian Restaurant' 'Furniture / Home Store' 'Art Museum' 'Cock

In [76]:
columbia_list = columbia_venues['Venue Category'].unique()

print('The available venue types in Columbia include {}'.format(columbia_list))

The available venue types in Columbia include ['American Restaurant' 'Beer Store' 'Mexican Restaurant'
 'Gym / Fitness Center' 'Bar' 'Mediterranean Restaurant' 'Café'
 'Burger Joint' 'Gourmet Shop' 'Pharmacy' 'Butcher' 'Organic Grocery'
 'Pet Store' 'Flower Shop' 'Boutique' 'Pet Service' 'Gym'
 'Electronics Store' 'Chinese Restaurant' 'Jewelry Store' 'Eye Doctor'
 'Kids Store' 'Furniture / Home Store' 'Dance Studio' 'Cosmetics Shop'
 'Gift Shop' 'Bakery' 'Pizza Place' 'Breakfast Spot'
 'Outdoor Supply Store' 'Italian Restaurant' 'Sandwich Place'
 'Middle Eastern Restaurant' 'Beer Garden' 'Ice Cream Shop'
 'Thai Restaurant' 'Nightclub' 'Mobile Phone Shop' "Men's Store"
 'Boarding House' 'Massage Studio' 'Music Store' 'Supplement Shop'
 'Liquor Store' 'Southern / Soul Food Restaurant' 'Bookstore'
 'Asian Restaurant' 'Japanese Restaurant' 'Optical Shop' 'Video Store'
 'Restaurant' 'Salon / Barbershop' 'Grocery Store' 'Theater' 'Coffee Shop'
 'Movie Theater' 'Department Store' 'Home Servic

### 3.3.2 Find the difference in venue categories between the two cities

In [77]:
len(set(miami_list.tolist()) - set(columbia_list.tolist()))

100

In [78]:
len(set(columbia_list.tolist())- set(miami_list.tolist()))

26

In [79]:
print(set(columbia_list.tolist())- set(miami_list.tolist()))

{'Butcher', 'Music Store', 'Flower Shop', 'Thai Restaurant', 'Eye Doctor', 'Pool', 'Food Service', 'Movie Theater', 'Kids Store', 'Massage Studio', 'Trail', 'Boarding House', 'Lake', 'Outdoor Supply Store', 'Supplement Shop', 'Bookstore', 'Video Store', 'Organic Grocery', 'Gourmet Shop', 'Dry Cleaner', 'Beer Store', 'Home Service', 'Garden Center', 'Baseball Field', 'Gift Shop', 'Electronics Store'}
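
The reported numbers are consistent with each other: 149 Miami categories minus 100 Miami-only categories leaves 49 shared ones, and 75 Columbia categories minus 26 Columbia-only categories leaves the same 49. The sketch below shows the three set operations involved on small illustrative subsets. These toy sets are abbreviated picks from the printed lists, so their sizes do not reflect the full result.

```python
# Abbreviated, illustrative subsets of the category lists printed above.
miami_categories = {'Cuban Restaurant', 'Harbor / Marina', 'Coffee Shop', 'Pizza Place'}
columbia_categories = {'Coffee Shop', 'Pizza Place', 'Butcher', 'Bookstore'}

miami_only = miami_categories - columbia_categories      # in Miami but not Columbia
columbia_only = columbia_categories - miami_categories   # in Columbia but not Miami
shared = miami_categories & columbia_categories          # in both cities

# Each city's total splits into its exclusive categories plus the shared ones.
assert len(miami_categories) == len(miami_only) + len(shared)
assert len(columbia_categories) == len(columbia_only) + len(shared)

print('Miami only:', sorted(miami_only))
print('Columbia only:', sorted(columbia_only))
print('Shared:', sorted(shared))
```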


### 3.3.3 Venue categories that are of particular interest to me

The above preliminary analysis shows that there are more dining and entertainment categories available in Miami than in Columbia. In this subsection, I check the venue categories that I care most about.

In [80]:
my_list = ['Italian Restaurant','Chinese Restaurant','Coffee Shop','Golf Course','Pet Service']

In [81]:
miami_me = miami_venues[miami_venues['Venue Category'].isin(my_list) ]
columbia_me = columbia_venues[columbia_venues['Venue Category'].isin(my_list)]

In [82]:
miami_me.groupby('Venue Category')['Venue'].count()

Venue Category
Chinese Restaurant     4
Coffee Shop           20
Golf Course            2
Italian Restaurant    22
Pet Service            1
Name: Venue, dtype: int64

In [83]:
columbia_me.groupby('Venue Category')['Venue'].count()

Venue Category
Chinese Restaurant    1
Coffee Shop           1
Italian Restaurant    1
Pet Service           2
Name: Venue, dtype: int64

The above analysis shows that Miami not only has more venue categories than Columbia, it also has more to offer in four of the five categories that I care most about; Pet Service is the only one where Columbia comes out ahead (2 venues versus 1).
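
The two count tables are easier to compare side by side. A minimal sketch, using the counts printed above and filling in 0 where a category is missing for a city (Golf Course does not appear in the Columbia output):

```python
import pandas as pd

# Counts copied from the two groupby outputs above.
miami_counts = pd.Series({'Chinese Restaurant': 4, 'Coffee Shop': 20, 'Golf Course': 2,
                          'Italian Restaurant': 22, 'Pet Service': 1})
columbia_counts = pd.Series({'Chinese Restaurant': 1, 'Coffee Shop': 1,
                             'Italian Restaurant': 1, 'Pet Service': 2})

# Align on category name; categories absent from a city become 0.
comparison = (pd.DataFrame({'Miami': miami_counts, 'Columbia': columbia_counts})
                .fillna(0)
                .astype(int))
print(comparison)
```

In the notebook, the two series would come directly from the `groupby('Venue Category')['Venue'].count()` calls.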

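Below is a quick peek at the Miami venues that appeal to me. Note that the same venue can be returned for several neighborhoods whose search radii overlap (for example, Blackbrick Chinese appears under Buena Vista, Design District, and Midtown), so the counts above overstate the number of distinct venues.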

In [84]:
miami_me[['Venue Category','Venue','Neighborhood']].sort_values(['Venue Category','Venue'])

Unnamed: 0,Venue Category,Venue,Neighborhood
118,Chinese Restaurant,Blackbrick Chinese,Buena Vista
178,Chinese Restaurant,Blackbrick Chinese,Design District
396,Chinese Restaurant,Blackbrick Chinese,Midtown
199,Chinese Restaurant,First Hong Kong Cafe,Downtown
105,Coffee Shop,Angelina's Coffee & Yogurt,Buena Vista
161,Coffee Shop,Angelina's Coffee & Yogurt,Design District
386,Coffee Shop,Angelina's Coffee & Yogurt,Midtown
88,Coffee Shop,Blue Bottle Coffee,Buena Vista
146,Coffee Shop,Blue Bottle Coffee,Design District
23,Coffee Shop,Bold Brew Cafe,Arts & Entertainment District
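
Since the table lists the same venue under several neighborhoods, distinct venues can be counted by dropping duplicates on the venue name and its coordinates. This is a sketch under assumptions: the venue names are taken from the table above, but the coordinates are made up for the example.

```python
import pandas as pd

venues = pd.DataFrame({
    'Neighborhood': ['Buena Vista', 'Design District', 'Midtown', 'Downtown'],
    'Venue': ['Blackbrick Chinese', 'Blackbrick Chinese', 'Blackbrick Chinese',
              'First Hong Kong Cafe'],
    'Venue Category': ['Chinese Restaurant'] * 4,
    # Coordinates are made up for this sketch; the real ones come from Foursquare.
    'Venue Latitude': [25.81, 25.81, 25.81, 25.77],
    'Venue Longitude': [-80.19, -80.19, -80.19, -80.19],
})

# Keep one row per physical venue, regardless of how many neighborhoods returned it.
distinct = venues.drop_duplicates(subset=['Venue', 'Venue Latitude', 'Venue Longitude'])
print(distinct.groupby('Venue Category')['Venue'].count())
```

Including the coordinates in `subset` keeps separate branches of a chain apart, which matching on the name alone would merge.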


## 3.4 More about Miami

Given that the above analysis shows Miami seems to have more to offer in terms of dining and entertainment, in this subsection I will focus on Miami and further explore its neighborhoods.

In [85]:
# one hot encoding
miami_onehot = pd.get_dummies(miami_venues[['Venue Category']], prefix="", prefix_sep="")

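# add neighborhood column back to dataframe; note that this overwrites the one-hot column
# for the venue category that Foursquare itself names 'Neighborhood' (it appears in the Miami list above)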
miami_onehot['Neighborhood'] = miami_venues['Neighborhood'] 

cols = list(miami_onehot)
cols.insert(0,cols.pop(cols.index('Neighborhood')))
miami_onehot = miami_onehot.loc[:,cols]

miami_onehot.head()


Unnamed: 0,Neighborhood,American Restaurant,Aquarium,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Asian Restaurant,Athletics & Sports,Auto Garage,BBQ Joint,Bakery,Bank,Bar,Beach,Beer Garden,Big Box Store,Bistro,Boat or Ferry,Boutique,Boxing Gym,Brazilian Restaurant,Breakfast Spot,Brewery,Burger Joint,Bus Station,Café,Caribbean Restaurant,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Comfort Food Restaurant,Concert Hall,Convenience Store,Cosmetics Shop,Cuban Restaurant,Cupcake Shop,Dance Studio,Department Store,Dessert Shop,Diner,Discount Store,Dive Bar,Dog Run,Donut Shop,Eastern European Restaurant,Empanada Restaurant,Event Space,Fast Food Restaurant,Fish Market,Flea Market,Food,Food Truck,Football Stadium,French Restaurant,Frozen Yogurt Shop,Furniture / Home Store,Garden,Gastropub,Golf Course,Greek Restaurant,Grocery Store,Gun Range,Gym,Gym / Fitness Center,Harbor / Marina,Health & Beauty Service,Historic Site,Hobby Shop,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indonesian Restaurant,Intersection,Italian Restaurant,Japanese Restaurant,Jewelry Store,Juice Bar,Latin American Restaurant,Lawyer,Leather Goods Store,Lingerie Store,Liquor Store,Lounge,Martial Arts Dojo,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Mobile Phone Shop,Motel,Museum,New American Restaurant,Nightclub,Optical Shop,Paper / Office Supplies Store,Park,Performing Arts Venue,Peruvian Restaurant,Pet Service,Pet Store,Pharmacy,Pilates Studio,Pizza Place,Playground,Print Shop,Pub,Public Art,Record Shop,Residential Building (Apartment / Condo),Resort,Restaurant,River,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shop & Service,Shopping Mall,Shopping Plaza,Smoothie Shop,Soccer Field,South American Restaurant,Southern / Soul Food Restaurant,Spa,Spanish Restaurant,Sporting Goods Shop,Steakhouse,Sushi Restaurant,Taco Place,Tapas Restaurant,Tennis Court,Theater,Thrift / Vintage Store,Tiki Bar,Vegetarian / Vegan Restaurant,Venezuelan Restaurant,Water Park,Wine Bar,Wine Shop,Winery,Wings Joint,Yoga Studio
0,Allapattah,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Allapattah,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Allapattah,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Allapattah,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Arts & Entertainment District,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


### 3.4.1 Group rows by neighborhood, taking the mean of the frequency of occurrence of each category

In [86]:
miami_grouped = miami_onehot.groupby('Neighborhood').mean().reset_index()
miami_grouped

Unnamed: 0,Neighborhood,American Restaurant,Aquarium,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Asian Restaurant,Athletics & Sports,Auto Garage,BBQ Joint,Bakery,Bank,Bar,Beach,Beer Garden,Big Box Store,Bistro,Boat or Ferry,Boutique,Boxing Gym,Brazilian Restaurant,Breakfast Spot,Brewery,Burger Joint,Bus Station,Café,Caribbean Restaurant,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Comfort Food Restaurant,Concert Hall,Convenience Store,Cosmetics Shop,Cuban Restaurant,Cupcake Shop,Dance Studio,Department Store,Dessert Shop,Diner,Discount Store,Dive Bar,Dog Run,Donut Shop,Eastern European Restaurant,Empanada Restaurant,Event Space,Fast Food Restaurant,Fish Market,Flea Market,Food,Food Truck,Football Stadium,French Restaurant,Frozen Yogurt Shop,Furniture / Home Store,Garden,Gastropub,Golf Course,Greek Restaurant,Grocery Store,Gun Range,Gym,Gym / Fitness Center,Harbor / Marina,Health & Beauty Service,Historic Site,Hobby Shop,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indonesian Restaurant,Intersection,Italian Restaurant,Japanese Restaurant,Jewelry Store,Juice Bar,Latin American Restaurant,Lawyer,Leather Goods Store,Lingerie Store,Liquor Store,Lounge,Martial Arts Dojo,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Mobile Phone Shop,Motel,Museum,New American Restaurant,Nightclub,Optical Shop,Paper / Office Supplies Store,Park,Performing Arts Venue,Peruvian Restaurant,Pet Service,Pet Store,Pharmacy,Pilates Studio,Pizza Place,Playground,Print Shop,Pub,Public Art,Record Shop,Residential Building (Apartment / Condo),Resort,Restaurant,River,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shop & Service,Shopping Mall,Shopping Plaza,Smoothie Shop,Soccer Field,South American Restaurant,Southern / Soul Food Restaurant,Spa,Spanish Restaurant,Sporting Goods Shop,Steakhouse,Sushi Restaurant,Taco Place,Tapas Restaurant,Tennis Court,Theater,Thrift / Vintage Store,Tiki Bar,Vegetarian / Vegan Restaurant,Venezuelan Restaurant,Water Park,Wine Bar,Wine Shop,Winery,Wings Joint,Yoga Studio
0,Allapattah,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Arts & Entertainment District,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.047619,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.095238,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.095238,0.0,0.0,0.047619,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.047619,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0
2,Brickell,0.016667,0.0,0.0,0.033333,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.016667,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.016667,0.016667,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.05,0.0,0.033333,0.016667,0.0,0.0,0.0,0.0,0.016667,0.0,0.016667,0.0,0.016667,0.0,0.0,0.0,0.0,0.016667,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.016667,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.016667,0.033333,0.033333,0.016667,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.033333,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0
3,Buena Vista,0.022222,0.0,0.022222,0.0,0.044444,0.044444,0.022222,0.0,0.0,0.0,0.044444,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.044444,0.0,0.022222,0.022222,0.022222,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.044444,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.022222,0.022222,0.0,0.044444,0.022222,0.0,0.0,0.0,0.044444,0.0,0.044444,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.044444,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.044444,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0
4,Coconut Grove,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Coral Way,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.272727,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Design District,0.022222,0.0,0.022222,0.0,0.044444,0.044444,0.022222,0.0,0.0,0.0,0.044444,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.044444,0.0,0.022222,0.022222,0.022222,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.044444,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.022222,0.0,0.022222,0.0,0.0,0.044444,0.022222,0.0,0.0,0.0,0.044444,0.0,0.044444,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.044444,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.044444,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0
7,Downtown,0.031746,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.015873,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.031746,0.0,0.0,0.0,0.0,0.015873,0.0,0.015873,0.0,0.031746,0.063492,0.0,0.0,0.015873,0.031746,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.015873,0.015873,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.015873,0.015873,0.0,0.031746,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.015873,0.0,0.079365,0.015873,0.015873,0.0,0.015873,0.0,0.0,0.015873,0.0,0.047619,0.0,0.0,0.0,0.0,0.015873,0.015873,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.015873,0.0,0.047619,0.0,0.015873,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.015873,0.031746,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0
8,Edgewater,0.043478,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.043478,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.021739,0.021739,0.0,0.0,0.021739,0.021739,0.021739,0.043478,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.021739,0.0,0.021739,0.0,0.021739,0.021739,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.065217,0.0,0.0,0.0,0.065217,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.021739,0.021739,0.0,0.021739
9,Flagami,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.222222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


Print each neighborhood along with the top 5 most common venues

In [87]:
num_top_venues = 5

for hood in miami_grouped['Neighborhood']:
    print("****"+hood+"****")
    temp = miami_grouped[miami_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

****Allapattah****
                       venue  freq
0                 Boxing Gym  0.25
1                     Lounge  0.25
2          Convenience Store  0.25
3           Department Store  0.25
4  Latin American Restaurant  0.00


****Arts & Entertainment District****
                 venue  freq
0           Restaurant  0.10
1                  Gym  0.10
2  American Restaurant  0.05
3         Tennis Court  0.05
4        Smoothie Shop  0.05


****Brickell****
                 venue  freq
0                Hotel  0.10
1   Italian Restaurant  0.08
2                 Café  0.05
3           Restaurant  0.05
4  Japanese Restaurant  0.05


****Buena Vista****
                    venue  freq
0                Boutique  0.07
1             Coffee Shop  0.07
2             Pizza Place  0.04
3  Furniture / Home Store  0.04
4           Jewelry Store  0.04


****Coconut Grove****
                 venue  freq
0        Boat or Ferry   0.5
1                 Park   0.5
2  American Restaurant   0.0
3         

#### Put the above information into a *pandas* dataframe
I borrow the pre-defined function from the lab.

In [43]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
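
To see what this helper returns, here is a minimal sketch on a made-up row. The neighborhood name, venue categories, and frequencies are invented for illustration and are not taken from the Miami data.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    # skip the first entry (the neighborhood name), then rank the frequencies
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# hypothetical row: one neighborhood with three venue-category frequencies
toy_row = pd.Series({'Neighborhood': 'Toyville', 'Bakery': 0.1, 'Park': 0.6, 'Gym': 0.3})

top_two = list(return_most_common_venues(toy_row, 2))
print(top_two)
```

The function returns category names rather than frequencies, which is why the resulting dataframe holds venue labels only.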

Next I'm going to create the new dataframe and display the top 10 venues for each neighborhood.

In [88]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # positions beyond the 3rd fall back to the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = miami_grouped['Neighborhood']

for ind in np.arange(miami_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(miami_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Allapattah,Department Store,Convenience Store,Lounge,Boxing Gym,Event Space,Food Truck,Food,Flea Market,Fish Market,Fast Food Restaurant
1,Arts & Entertainment District,Gym,Restaurant,American Restaurant,French Restaurant,Pizza Place,Park,Paper / Office Supplies Store,Coffee Shop,Salon / Barbershop,Sandwich Place
2,Brickell,Hotel,Italian Restaurant,Restaurant,Japanese Restaurant,Café,Seafood Restaurant,Steakhouse,Juice Bar,Bar,Salon / Barbershop
3,Buena Vista,Boutique,Coffee Shop,Bakery,Italian Restaurant,Pizza Place,Nightclub,Furniture / Home Store,Jewelry Store,Café,Art Museum
4,Coconut Grove,Boat or Ferry,Park,French Restaurant,Food Truck,Food,Flea Market,Fish Market,Fast Food Restaurant,Event Space,Empanada Restaurant
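
The column names above rely on a three-item suffix list with a fallback to 'th'. That is fine for 10 columns, but it would label a 21st column as "21th". Below is a small sketch of a more general suffix helper; the name `ordinal` is my own and does not come from the lab.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

num_top_venues = 10
columns = ['Neighborhood'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, num_top_venues + 1)
]
print(columns)
```

For `num_top_venues = 10` this produces the same column names as the cell above.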


# 4. Clustering the Neighborhoods

## 4.1 K-means on neighborhoods in Miami
Run *k*-means to cluster the neighborhoods into 5 clusters.


In [89]:
# set number of clusters
kclusters = 5

# drop the name column so that only the venue frequencies are used as features
miami_grouped_clustering = miami_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(miami_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 1, 1, 1, 2, 1, 1, 1, 1, 1], dtype=int32)
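
The choice of 5 clusters follows the lab and is not tuned here. One possible way to sanity-check it is to compare silhouette scores over a few values of *k*. The sketch below does this on synthetic data (`toy_features` is generated for illustration only), so the numbers it prints say nothing about Miami.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# synthetic stand-in for the feature matrix: three well-separated blobs
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
toy_features = np.vstack([c + rng.normal(scale=0.3, size=(20, 2)) for c in centers])

# fit k-means for several k and record the mean silhouette score of each fit
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(toy_features)
    scores[k] = silhouette_score(toy_features, labels)

best_k = max(scores, key=scores.get)
print(scores)
print('k with the highest silhouette score:', best_k)
```

To apply this to the notebook, `toy_features` would be replaced with `miami_grouped_clustering`. With only a couple dozen neighborhoods the scores are likely to be noisy, so I would treat the result as a rough guide rather than a definitive answer.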

Next, I create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [93]:
miami_data = miami[['Neighborhood','Lat','Lng']]
miami_data.head()

Unnamed: 0,Neighborhood,Lat,Lng
0,Allapattah,25.815,-80.224
1,Arts & Entertainment District,25.799,-80.19
2,Brickell,25.758,-80.193
3,Buena Vista,25.813,-80.192
4,Coconut Grove,25.712,-80.257


In [None]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)


In [106]:
miami_merged = miami_data

# join miami_data with neighborhoods_venues_sorted so each neighborhood has its coordinates, cluster label and top venues
miami_merged = miami_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

miami_merged.head()

Unnamed: 0,Neighborhood,Lat,Lng,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Allapattah,25.815,-80.224,0,Department Store,Convenience Store,Lounge,Boxing Gym,Event Space,Food Truck,Food,Flea Market,Fish Market,Fast Food Restaurant
1,Arts & Entertainment District,25.799,-80.19,1,Gym,Restaurant,American Restaurant,French Restaurant,Pizza Place,Park,Paper / Office Supplies Store,Coffee Shop,Salon / Barbershop,Sandwich Place
2,Brickell,25.758,-80.193,1,Hotel,Italian Restaurant,Restaurant,Japanese Restaurant,Café,Seafood Restaurant,Steakhouse,Juice Bar,Bar,Salon / Barbershop
3,Buena Vista,25.813,-80.192,1,Boutique,Coffee Shop,Bakery,Italian Restaurant,Pizza Place,Nightclub,Furniture / Home Store,Jewelry Store,Café,Art Museum
4,Coconut Grove,25.712,-80.257,2,Boat or Ferry,Park,French Restaurant,Food Truck,Food,Flea Market,Fish Market,Fast Food Restaurant,Event Space,Empanada Restaurant
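
One thing to keep in mind: `DataFrame.join` performs a left join by default, so any neighborhood in `miami_data` that has no row in `neighborhoods_venues_sorted` (for example, one for which no venues were returned) ends up with a missing cluster label. The sketch below illustrates this with invented data; the neighborhood names and coordinates are hypothetical.

```python
import pandas as pd

# hypothetical coordinates for three neighborhoods
coords = pd.DataFrame({
    'Neighborhood': ['A', 'B', 'C'],
    'Lat': [25.70, 25.75, 25.80],
    'Lng': [-80.20, -80.25, -80.30],
})

# hypothetical venue summary that only covers two of them
venues_sorted = pd.DataFrame({
    'Cluster Labels': [0, 1],
    'Neighborhood': ['A', 'C'],
    '1st Most Common Venue': ['Park', 'Café'],
})

# same join pattern as above: left join on the neighborhood name
merged = coords.join(venues_sorted.set_index('Neighborhood'), on='Neighborhood')
missing = merged['Cluster Labels'].isna()
print(merged[missing])

# drop unlabeled rows and restore integer labels before using them as indices
clean = merged.dropna(subset=['Cluster Labels']).astype({'Cluster Labels': int})
print(clean)
```

Dropping the unlabeled rows (or handling them separately) keeps the cluster labels as integers, which matters when they are later used to index a list of colors.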


## 4.2 Visualize the clusters

In [102]:
# Note: the geolocator can be unstable sometimes; re-run this cell if the request fails
address = 'Miami, FL'

geolocator = Nominatim(user_agent="hw")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Miami are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Miami are 25.7742658, -80.1936589.


In [103]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(miami_merged['Lat'], miami_merged['Lng'], miami_merged['Neighborhood'], miami_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
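
The color assignment in the cell above can be looked at separately from the map. The sketch below rebuilds the palette the same way and shows which hex color each cluster label receives; `label_to_color` is a name I introduce here for illustration.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 5

# sample kclusters evenly spaced colors from the rainbow colormap
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(c) for c in colors_array]

# same indexing as the map cell: label 0 wraps around to the last color
label_to_color = {label: rainbow[label - 1] for label in range(kclusters)}
print(label_to_color)
```

Because of the `cluster-1` index, label 0 takes the last color in the list through Python's negative indexing. Each label still gets a distinct color, so the map remains readable.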


# 5. Examine Clusters

In this section, I take a closer look at each cluster to see which neighborhoods are grouped together and which venues are most common in each cluster.

#### Cluster 1

In [107]:
miami_merged.loc[miami_merged['Cluster Labels'] == 0]

Unnamed: 0,Neighborhood,Lat,Lng,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Allapattah,25.815,-80.224,0,Department Store,Convenience Store,Lounge,Boxing Gym,Event Space,Food Truck,Food,Flea Market,Fish Market,Fast Food Restaurant


#### Cluster 2 

In [108]:
miami_merged.loc[miami_merged['Cluster Labels'] == 1]

Unnamed: 0,Neighborhood,Lat,Lng,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Arts & Entertainment District,25.799,-80.19,1,Gym,Restaurant,American Restaurant,French Restaurant,Pizza Place,Park,Paper / Office Supplies Store,Coffee Shop,Salon / Barbershop,Sandwich Place
2,Brickell,25.758,-80.193,1,Hotel,Italian Restaurant,Restaurant,Japanese Restaurant,Café,Seafood Restaurant,Steakhouse,Juice Bar,Bar,Salon / Barbershop
3,Buena Vista,25.813,-80.192,1,Boutique,Coffee Shop,Bakery,Italian Restaurant,Pizza Place,Nightclub,Furniture / Home Store,Jewelry Store,Café,Art Museum
5,Coral Way,25.75,-80.283,1,Liquor Store,Mobile Phone Shop,Golf Course,Burger Joint,Historic Site,Pharmacy,Dive Bar,Café,Seafood Restaurant,Flea Market
6,Design District,25.813,-80.193,1,Boutique,Coffee Shop,Bakery,Italian Restaurant,Pizza Place,Nightclub,Furniture / Home Store,Jewelry Store,Café,Art Museum
7,Downtown,25.774,-80.193,1,Italian Restaurant,Coffee Shop,Peruvian Restaurant,Hotel,Pharmacy,Lounge,Cosmetics Shop,Brazilian Restaurant,American Restaurant,Cocktail Bar
8,Edgewater,25.802,-80.19,1,Restaurant,Sandwich Place,Cuban Restaurant,Pizza Place,Gym,American Restaurant,Coffee Shop,Art Gallery,Breakfast Spot,Italian Restaurant
9,Flagami,25.762,-80.316,1,Seafood Restaurant,Cuban Restaurant,Department Store,Spanish Restaurant,Restaurant,Fast Food Restaurant,Latin American Restaurant,Bakery,Empanada Restaurant,Flea Market
10,Grapeland Heights,25.792,-80.258,1,Bar,Hotel Bar,Hotel,Bus Station,Restaurant,Auto Garage,Gym,Golf Course,Yoga Studio,Eastern European Restaurant
13,Little Haiti,25.824,-80.191,1,Yoga Studio,Sushi Restaurant,Pilates Studio,Pub,Record Shop,Donut Shop,Clothing Store,Caribbean Restaurant,Mobile Phone Shop,Fast Food Restaurant


#### Cluster 3 

In [109]:
miami_merged.loc[miami_merged['Cluster Labels'] == 2]

Unnamed: 0,Neighborhood,Lat,Lng,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,Coconut Grove,25.712,-80.257,2,Boat or Ferry,Park,French Restaurant,Food Truck,Food,Flea Market,Fish Market,Fast Food Restaurant,Event Space,Empanada Restaurant
21,Venetian Islands,25.791,-80.161,2,Park,Boat or Ferry,Shop & Service,Lounge,Football Stadium,Food,Flea Market,Fish Market,Fast Food Restaurant,Event Space


#### Cluster 4

In [110]:
miami_merged.loc[miami_merged['Cluster Labels'] == 3]

Unnamed: 0,Neighborhood,Lat,Lng,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
22,Virginia Key,25.736,-80.155,3,Beach,Yoga Studio,Event Space,Football Stadium,Food Truck,Food,Flea Market,Fish Market,Fast Food Restaurant,Empanada Restaurant


#### Cluster 5

In [111]:
miami_merged.loc[miami_merged['Cluster Labels'] == 4]

Unnamed: 0,Neighborhood,Lat,Lng,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
12,Liberty City,25.832,-80.225,4,Park,Bar,Seafood Restaurant,Grocery Store,Football Stadium,Food,Flea Market,Fish Market,Fast Food Restaurant,Event Space
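
A quick way to compare cluster sizes without opening each table is to count the rows per label. The sketch below uses only the first ten labels printed by `kmeans.labels_[0:10]` earlier, so it is a partial view and not the full result.

```python
import pandas as pd

# the first ten labels as printed above by kmeans.labels_[0:10]
first_ten_labels = [0, 1, 1, 1, 2, 1, 1, 1, 1, 1]

# count how many neighborhoods fall into each cluster label
cluster_sizes = (
    pd.Series(first_ten_labels, name='Cluster Labels')
    .value_counts()
    .sort_index()
)
print(cluster_sizes)
```

In the notebook itself, the same count could be obtained with `miami_merged['Cluster Labels'].value_counts()`.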


# Conclusion

Using the techniques acquired in the labs, I first collected the neighborhood information for Columbia, SC and Miami, FL and generated maps for both cities. Then, through the Foursquare API, I pulled venue information for both cities.

First, there are more venues in Miami than in Columbia. This holds for both the number of venue categories and the absolute number of venues.
Second, when looking into the specific venue categories that are of particular interest to me, Miami also outperforms Columbia.

Lastly, I focused on the neighborhoods in Miami, examined the top venues in each neighborhood, and used k-means clustering to group the neighborhoods into 5 clusters. It is interesting to see that one cluster includes the majority of the neighborhoods, while the remaining clusters contain only 1 or 2 neighborhoods each. It appears that most neighborhoods in Miami are quite similar, with a few outliers.