# Coursera Capstone Final Project - Neighborhood Analysis 

## Background & Introduction

<p> This project will analyze northern Virginia using neighborhood clustering analysis. I moved to the town of Haymarket, which is part of Prince William County in Northern Virginia, and would like to explore the sorrounding area and find popular activities which are family friendly. I am going to use FourSquare and Neighborhood clustering analysis to explore Northern Virgnina, honing in on Prince William County. As part of this exercise, I am planning to confirm and/or dispel perceptions that I have about different parts of Northern Virginia. This analysis is useful for anyone who would like to learn more about Northern Virginia - whether they are new to the state or just wanting to explore their home state like me. This analysis can also be used as a template to explore another state.

## Data & Methodology

<p> The dataset that I am using is from Open Data Soft(https://public.opendatasoft.com/explore/dataset/us-zip-code-latitude-and-longitude/table/). This file contains US Cities, States, and their Longitude and Latitudes. I downloaded this data as an excel file and enhanced the data to include an identifier for Northern Virginia (NVA) and Prince William County (PWC). I also deleted columns I do not need. I will create subsets of this data to analyze Northern Virginia and Prince William County zipcodes. An issue that I will encounter is that Northern Virginia may be too broad for me to analyze and get meaningful results. 

When pulling in venue data from FourSquare, I will use a broader radius of 33,000 meters (roughly 20 miles). Unlike Manhattan or Toronto, Northern Virginia is a broader suburban area and people tend to drive to their destination. I will perform cluster analysis, using k-means clustering, to group Northern Virginia cities and towns into clusters. Because I am covering a broad area, I will use a higher cluster count - 10 clusters.

## Results

<p> I used 10 clusters in my k-means cluster analysis and the results made sense to me. Most of the popular venues for each cluster were aligned to my knowlede of Northern Virginia. I was delighted to discover new insights of places, particularly near to me, which I was unaware. I did not have to run a separate cluster analysis on Prince William County (PWC) as it naturally formed it's own cluster.
    
The cluster analysis that I peformed confirmed several perceptions that I had about Northern Virginia - which areas are heavy tourist areas due to their monuments (Alexandria and Arlington), which area is considered wine country (Middleburg / MarshalL), and that the area that I live has few diverse dining options, characterized by mostly fast food. 

Conversely, I also learned some things which I did not know and which will help me find fun activities for my family. The area that I live is characterized by Park venues beyond Massassa Bull Run (a famous Civil War battlefield). One cluster near me (Dumfries/Triangle) seems to have more international dining options (Greek, Japanese, French). Another cluster near me (Fredericksburg / Warrenton) seems to have some popular bakeries and coffee shops. Finally, I had one cluster made up of only one city and the top venue is "Scenic Lookout", something I definitely need to experience!

## Recommendation

<p> When I examine the cluster which contains Haymarket (Cluster 4), I was not surprised, but frustrated to see that it was characterized by fast food restaurants. This area has seen a tremendous amount of population and housing growth and this represents opportunity to add more diverse restaurant businesses to meet the needs of an expanding population.

## Conclusion

<p> I met my objectives in performing my neighborhood analysis using k-means clustering. My analysis generated new ideas of activities to try with my family, particularly in neighborhoods which are close to me.

## Project Code

In [1]:
import pandas as pd

<p> Read in table of US zip codes. I downloaded the data as an excel file from Open Data Soft, deleted the columns I did not need, and then saved as a .csv file. I uploaded the file as a data asset then added to my notebook via the "Insert to Code" and created a dataframe df_US.

In [2]:

import types
import pandas as pd
from botocore.client import Config
import ibm_boto3

def __iter__(self): return 0

# @hidden_cell
# The following code accesses a file in your IBM Cloud Object Storage. It includes your credentials.
# You might want to remove those credentials before you share the notebook.
client_7565fbc1252f4aec905db32b19ac504e = ibm_boto3.client(service_name='s3',
    ibm_api_key_id='VGi1oph6uOzEwxq9IoHR3oU9OqO9jKa6KmBT66NhhFso',
    ibm_auth_endpoint="https://iam.cloud.ibm.com/oidc/token",
    config=Config(signature_version='oauth'),
    endpoint_url='https://s3-api.us-geo.objectstorage.service.networklayer.com')

body = client_7565fbc1252f4aec905db32b19ac504e.get_object(Bucket='courseracapstone-donotdelete-pr-1zmnvyspu9rumd',Key='USZIPCODES2.csv')['Body']
# add missing __iter__ method, so pandas accepts body as file-like object
if not hasattr(body, "__iter__"): body.__iter__ = types.MethodType( __iter__, body )

df_US = pd.read_csv(body)
df_US.head()


Unnamed: 0,Zip,City,State,Latitude,Longitude,Timezone,NVA,PWC
0,67553,Liebenthal,KS,38.654948,-99.32062,-6,,
1,85743,Tucson,AZ,32.335122,-111.14888,-7,,
2,75016,Irving,TX,32.767268,-96.777626,-6,,
3,60401,Beecher,IL,41.350484,-87.62408,-6,,
4,80432,Como,CO,39.24344,-105.79431,-7,,


<p> Examine the shape of the raw table.

In [3]:
df_US.shape

(43191, 8)

<p> Create a subset dataframe of Virginia zipcodes.

In [4]:
df_VA = df_US[df_US['State'] =='VA'].reset_index(drop=True)
df_VA.head()

Unnamed: 0,Zip,City,State,Latitude,Longitude,Timezone,NVA,PWC
0,23181,West Point,VA,37.559878,-76.83018,-5,,
1,24440,Greenville,VA,37.996542,-79.15354,-5,,
2,23180,Water View,VA,37.710586,-76.63179,-5,,
3,23630,Hampton,VA,37.072658,-76.38992,-5,,
4,22701,Culpeper,VA,38.459521,-77.99875,-5,NVA,


In [5]:
df_VA.shape

(1275, 8)

<p> Create a dataframe for Northern Virginia cities and towns, df_NVA. Also create a dataframe for Prince William County cities and towns df_PWC.

In [6]:
df_NVA = df_VA[df_VA['NVA'] =='NVA'].reset_index(drop=True)
df_NVA.shape

(177, 8)

In [7]:
df_PWC = df_VA[df_VA['PWC'] =='PWC'].reset_index(drop=True)
df_PWC.shape

(23, 8)

## Perform Clustering Analysis

<p> Import remaining dependencies

In [8]:
import requests # library to handle requests
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation

!pip install geopy
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 
    
# tranforming json file into a pandas dataframe library
from pandas.io.json import json_normalize


! pip install folium==0.5.0
import folium # plotting library

print('Folium installed')
print('Libraries imported.')

Collecting folium==0.5.0
  Downloading folium-0.5.0.tar.gz (79 kB)
[K     |████████████████████████████████| 79 kB 4.4 MB/s eta 0:00:011
[?25hCollecting branca
  Downloading branca-0.4.1-py3-none-any.whl (24 kB)
Building wheels for collected packages: folium
  Building wheel for folium (setup.py) ... [?25ldone
[?25h  Created wheel for folium: filename=folium-0.5.0-py3-none-any.whl size=76240 sha256=753ae276daec27b3becb1479447d97ab23790cf6bf1a677e9721e45449f833ab
  Stored in directory: /tmp/wsuser/.cache/pip/wheels/b2/2f/2c/109e446b990d663ea5ce9b078b5e7c1a9c45cca91f377080f8
Successfully built folium
Installing collected packages: branca, folium
Successfully installed branca-0.4.1 folium-0.5.0
Folium installed
Libraries imported.


In [9]:
CLIENT_ID = 'YPMGHRXVI40BPTMEXTV5YXAVZYW5NXM3K21VBUZPAEDAUJOA' # your Foursquare ID
CLIENT_SECRET = 'ILLFATBXFUNTPQWRKVFWKB2EYUCH0UO3GJMDXFK3E14HCCML' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 30
print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentails:
CLIENT_ID: YPMGHRXVI40BPTMEXTV5YXAVZYW5NXM3K21VBUZPAEDAUJOA
CLIENT_SECRET:ILLFATBXFUNTPQWRKVFWKB2EYUCH0UO3GJMDXFK3E14HCCML


In [10]:
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

print('Libraries imported.')

Libraries imported.


<p> Use geopy library to get the latitude and longitude values of Haymarket, Viriginia. Define a user_agent and name it va_explorer.

In [11]:
address = 'Haymarket, Virginia'

geolocator = Nominatim(user_agent="va_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinates of Haymarket, Virginia are {}, {}.'.format(latitude, longitude))

The geograpical coordinates of Haymarket, Virginia are 38.8121398, -77.6368038.


<p> Create a map of Northern Virginia

In [12]:
# create map of Northern Virginia using latitude and longitude values
map_NVA = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, city in zip(df_NVA['Latitude'], df_NVA['Longitude'], df_NVA['City']):
    label = '{}'.format(city)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_NVA)  
    
map_NVA

<p> Create a funtion to explore the neighborhoods of Northern Virginia. Unlike Manhattan or Toronto, Northern Virginia is made up of several towns and cities, not walkable neighborhoods. Therefore, I selected a radius of 33,000 meters, which is approximately 20 miles.

In [13]:
def getNearbyVenues(names, latitudes, longitudes, radius=33000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [14]:
NVA_venues = getNearbyVenues(names=df_NVA['City'],
                                   latitudes=df_NVA['Latitude'],
                                   longitudes=df_NVA['Longitude']
                                  )

Culpeper
Alexandria
Mount Vernon
Alexandria
Alexandria
Fairfax
Merrifield
Arlington
Burke
Alexandria
Alexandria
Merrifield
Fairfax
Alexandria
Arlington
Woodbridge
Middleburg
Gainesville
Springfield
Fairfax
Manassas
Fredericksburg
Spotsylvania
Fredericksburg
Herndon
Springfield
Haymarket
Arlington
Falls Church
Fredericksburg
Herndon
Dulles
Manassas
Reston
Clifton Forge
Fredericksburg
Arlington
Nokesville
Alexandria
Arlington
Arlington
Alexandria
Fredericksburg
Springfield
Fredericksburg
Quantico
Herndon
Washington
Reston
Merrifield
Fort Belvoir
Arlington
Lorton
Alexandria
Bristow
Fairfax
Fairfax Station
Arlington
Arlington
Arlington
Nokesville
Purcellville
Springfield
Vienna
Vienna
Centreville
Annandale
Burke
Alexandria
Fairfax
Arlington
Vienna
Chantilly
Alexandria
Centreville
Alexandria
Arlington
Dulles
Reston
Manassas
Leesburg
Alexandria
Lorton
Herndon
Fairfax
Warrenton
Clifton
Falls Church
Marshall
Warrenton
West Mclean
Dulles
Manassas
Arlington
Chantilly
Arlington
Arlington
Alexandr

In [15]:
print(NVA_venues.shape)
NVA_venues.head()

(5310, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Culpeper,38.459521,-77.99875,It's About Thyme,38.473249,-77.995683,Mediterranean Restaurant
1,Culpeper,38.459521,-77.99875,Flavor On Main,38.472726,-77.996244,American Restaurant
2,Culpeper,38.459521,-77.99875,The Culpeper Cheese Company,38.473128,-77.995542,Cheese Shop
3,Culpeper,38.459521,-77.99875,Martin's,38.483629,-77.96544,Grocery Store
4,Culpeper,38.459521,-77.99875,Chick-fil-A,38.484767,-77.968726,Fast Food Restaurant


In [16]:
NVA_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Alexandria,660,660,660,660,660,660
Annandale,30,30,30,30,30,30
Arlington,960,960,960,960,960,960
Bristow,30,30,30,30,30,30
Burke,60,60,60,60,60,60
Centreville,90,90,90,90,90,90
Chantilly,90,90,90,90,90,90
Clifton,30,30,30,30,30,30
Clifton Forge,30,30,30,30,30,30
Culpeper,30,30,30,30,30,30


In [17]:
print('There are {} uniques categories.'.format(len(NVA_venues['Venue Category'].unique())))

There are 137 uniques categories.


<p> Analyze the neighborhoods:

In [18]:
# one hot encoding
NVA_onehot = pd.get_dummies(NVA_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
NVA_onehot['Neighborhood'] = NVA_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [NVA_onehot.columns[-1]] + list(NVA_onehot.columns[:-1])
NVA_onehot = NVA_onehot[fixed_columns]

NVA_onehot.head()

Unnamed: 0,Zoo,Accessories Store,Afghan Restaurant,American Restaurant,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Bagel Shop,Bakery,Bar,Baseball Stadium,Bed & Breakfast,Beer Garden,Bookstore,Breakfast Spot,Brewery,Bridge,Bubble Tea Shop,Burger Joint,Burrito Place,Café,Campground,Cheese Shop,Chinese Restaurant,Church,Clothing Store,Cocktail Bar,Coffee Shop,College Gym,Comfort Food Restaurant,Construction & Landscaping,Convenience Store,Cosmetics Shop,Cupcake Shop,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Distillery,Dog Run,Donut Shop,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Frozen Yogurt Shop,Furniture / Home Store,Gaming Cafe,Garden,Gas Station,Gastropub,German Restaurant,Golf Course,Golf Driving Range,Greek Restaurant,Grocery Store,Gun Range,Gym,Gym / Fitness Center,Harbor / Marina,Historic Site,History Museum,Hotel,Ice Cream Shop,Indian Restaurant,Irish Pub,Italian Restaurant,Japanese Restaurant,Lake,Library,Lingerie Store,Market,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant,Monument / Landmark,Motorcycle Shop,Movie Theater,Multiplex,Museum,Music Venue,National Park,Neighborhood,New American Restaurant,Park,Performing Arts Venue,Pet Store,Pharmacy,Pizza Place,Plaza,Pool,Portuguese Restaurant,Resort,Restaurant,Road,Salad Place,Sandwich Place,Scenic Lookout,Science Museum,Seafood Restaurant,Shopping Mall,Ski Area,Smoothie Shop,South American Restaurant,Spa,Spanish Restaurant,Sporting Goods Shop,Stables,State / Provincial Park,Steakhouse,Supermarket,Supplement Shop,Sushi Restaurant,Taco Place,Tea Room,Tex-Mex Restaurant,Thai Restaurant,Tourist Information Center,Trail,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Vineyard,Warehouse Store,Waterfront,Wine Bar,Wine Shop,Winery,Women's Store
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,Culpeper,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Culpeper,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Culpeper,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Culpeper,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Culpeper,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [19]:
NVA_onehot.shape

(5310, 137)

In [20]:
NVA_grouped = NVA_onehot.groupby('Neighborhood').mean().reset_index()
NVA_grouped

Unnamed: 0,Neighborhood,Zoo,Accessories Store,Afghan Restaurant,American Restaurant,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Bagel Shop,Bakery,Bar,Baseball Stadium,Bed & Breakfast,Beer Garden,Bookstore,Breakfast Spot,Brewery,Bridge,Bubble Tea Shop,Burger Joint,Burrito Place,Café,Campground,Cheese Shop,Chinese Restaurant,Church,Clothing Store,Cocktail Bar,Coffee Shop,College Gym,Comfort Food Restaurant,Construction & Landscaping,Convenience Store,Cosmetics Shop,Cupcake Shop,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Distillery,Dog Run,Donut Shop,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Frozen Yogurt Shop,Furniture / Home Store,Gaming Cafe,Garden,Gas Station,Gastropub,German Restaurant,Golf Course,Golf Driving Range,Greek Restaurant,Grocery Store,Gun Range,Gym,Gym / Fitness Center,Harbor / Marina,Historic Site,History Museum,Hotel,Ice Cream Shop,Indian Restaurant,Irish Pub,Italian Restaurant,Japanese Restaurant,Lake,Library,Lingerie Store,Market,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant,Monument / Landmark,Motorcycle Shop,Movie Theater,Multiplex,Museum,Music Venue,National Park,New American Restaurant,Park,Performing Arts Venue,Pet Store,Pharmacy,Pizza Place,Plaza,Pool,Portuguese Restaurant,Resort,Restaurant,Road,Salad Place,Sandwich Place,Scenic Lookout,Science Museum,Seafood Restaurant,Shopping Mall,Ski Area,Smoothie Shop,South American Restaurant,Spa,Spanish Restaurant,Sporting Goods Shop,Stables,State / Provincial Park,Steakhouse,Supermarket,Supplement Shop,Sushi Restaurant,Taco Place,Tea Room,Tex-Mex Restaurant,Thai Restaurant,Tourist Information Center,Trail,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Vineyard,Warehouse Store,Waterfront,Wine Bar,Wine Shop,Winery,Women's Store
0,Alexandria,0.0,0.0,0.0,0.042424,0.0,0.031818,0.012121,0.0,0.019697,0.0,0.00303,0.0,0.004545,0.0,0.0,0.0,0.0,0.033333,0.0,0.001515,0.0,0.0,0.001515,0.0,0.031818,0.0,0.0,0.0,0.0,0.043939,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.031818,0.004545,0.0,0.0,0.0,0.022727,0.001515,0.0,0.019697,0.0,0.031818,0.0,0.0,0.0,0.0,0.0,0.0,0.031818,0.031818,0.0,0.004545,0.018182,0.0,0.018182,0.004545,0.004545,0.021212,0.004545,0.0,0.031818,0.0,0.0,0.015152,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.121212,0.0,0.027273,0.0,0.0,0.012121,0.0,0.0,0.072727,0.010606,0.0,0.0,0.05303,0.034848,0.0,0.00303,0.0,0.0,0.0,0.00303,0.015152,0.0,0.0,0.006061,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.00303,0.0,0.013636,0.009091,0.0,0.0,0.0,0.0,0.0,0.0,0.001515,0.0,0.0,0.030303,0.0,0.006061,0.0,0.001515
1,Annandale,0.0,0.0,0.0,0.033333,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.033333,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.066667,0.033333,0.0,0.033333,0.033333,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.033333,0.0,0.0,0.0,0.033333,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.033333
2,Arlington,0.0,0.0,0.0,0.028125,0.011458,0.002083,0.0,0.0,0.0,0.0,0.023958,0.0,0.005208,0.0,0.0,0.0,0.0,0.003125,0.0,0.001042,0.0,0.0,0.0,0.0,0.009375,0.0,0.001042,0.0,0.0,0.08125,0.0,0.0,0.0,0.001042,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.029167,0.0,0.0,0.021875,0.0,0.0,0.0,0.0,0.001042,0.002083,0.001042,0.0,0.0,0.0,0.007292,0.0,0.0,0.023958,0.030208,0.0,0.001042,0.026042,0.0,0.023958,0.0,0.007292,0.0,0.013542,0.028125,0.009375,0.021875,0.0,0.051042,0.0,0.010417,0.0,0.0,0.0,0.002083,0.0,0.0,0.0,0.21875,0.0,0.001042,0.0,0.0,0.007292,0.0,0.001042,0.078125,0.032292,0.0,0.0,0.027083,0.033333,0.0,0.001042,0.0,0.0,0.0,0.002083,0.0,0.0,0.00625,0.022917,0.001042,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.002083,0.0,0.003125,0.027083,0.001042,0.0,0.0,0.0,0.004167,0.0,0.0,0.0,0.0,0.023958,0.0,0.022917,0.0,0.001042
3,Bristow,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.033333,0.0,0.1,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.033333,0.0,0.033333,0.0,0.033333,0.0,0.066667,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.0
4,Burke,0.0,0.0,0.0,0.033333,0.0,0.0,0.033333,0.0,0.016667,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.016667,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.033333,0.0,0.033333,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.066667,0.066667,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.066667,0.0,0.0,0.033333,0.0,0.0,0.0,0.033333,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.016667
5,Centreville,0.0,0.0,0.0,0.055556,0.0,0.0,0.011111,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.022222,0.0,0.011111,0.0,0.011111,0.0,0.0,0.011111,0.0,0.0,0.0,0.0,0.0,0.0,0.044444,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.011111,0.0,0.055556,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.044444,0.088889,0.0,0.022222,0.011111,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.077778,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011111,0.0,0.022222,0.0,0.0,0.0,0.033333,0.011111,0.0,0.0,0.022222,0.0,0.0,0.011111,0.0,0.0,0.0,0.011111,0.011111,0.0,0.0,0.022222,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.044444,0.0,0.022222,0.022222,0.0,0.0,0.022222,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.044444,0.0,0.011111
6,Chantilly,0.0,0.0,0.0,0.044444,0.0,0.0,0.011111,0.0,0.0,0.0,0.0,0.011111,0.0,0.0,0.011111,0.0,0.0,0.066667,0.0,0.011111,0.0,0.0,0.011111,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.022222,0.0,0.033333,0.0,0.033333,0.0,0.0,0.0,0.011111,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.022222,0.088889,0.0,0.022222,0.022222,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.033333,0.022222,0.0,0.011111,0.0,0.0,0.011111,0.0,0.011111,0.0,0.0,0.0,0.022222,0.0,0.022222,0.0,0.0,0.0,0.033333,0.0,0.0,0.011111,0.022222,0.011111,0.0,0.011111,0.0,0.0,0.0,0.011111,0.011111,0.0,0.0,0.022222,0.011111,0.0,0.011111,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.044444,0.0,0.0,0.033333,0.0,0.0,0.011111,0.0,0.0,0.0,0.011111,0.0,0.0,0.0,0.0,0.044444,0.0,0.011111
7,Clifton,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.033333,0.133333,0.0,0.033333,0.033333,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.066667,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.066667,0.033333,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0
8,Clifton Forge,0.033333,0.0,0.0,0.133333,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.033333,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.066667,0.033333,0.0,0.0,0.066667,0.033333,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Culpeper,0.0,0.033333,0.0,0.033333,0.0,0.0,0.0,0.0,0.033333,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.033333,0.0,0.066667,0.0,0.033333,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.033333,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.033333,0.0,0.033333,0.0,0.0,0.066667,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [21]:
NVA_grouped.shape

(42, 137)

<p> Write a function to sort the venues in descending order.

In [22]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [23]:
num_top_venues = 5

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = NVA_grouped['Neighborhood']

for ind in np.arange(NVA_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(NVA_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,Alexandria,Monument / Landmark,Park,Pizza Place,Coffee Shop,American Restaurant
1,Annandale,Pizza Place,Greek Restaurant,Taco Place,Brewery,Movie Theater
2,Arlington,Monument / Landmark,Coffee Shop,Park,Italian Restaurant,Plaza
3,Bristow,Brewery,Convenience Store,Fast Food Restaurant,Historic Site,Supermarket
4,Burke,Greek Restaurant,Grocery Store,Supermarket,Pizza Place,Movie Theater


<p> Perform clustering analysis, setting the number of clusters to 10

In [24]:
# set number of clusters
kclusters = 10

NVA_grouped_clustering = NVA_grouped.drop('Neighborhood', 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(NVA_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([5, 2, 5, 4, 2, 1, 1, 1, 0, 7], dtype=int32)

<p> Create a new dataframe that includes the cluster as well as the top 5 venues for each neighborhood.

In [25]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

NVA_merged = df_NVA

# merge NVA_grouped with df_NVA to add latitude/longitude for each neighborhood
NVA_merged = NVA_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='City')

NVA_merged.head() # check the last columns!

Unnamed: 0,Zip,City,State,Latitude,Longitude,Timezone,NVA,PWC,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,22701,Culpeper,VA,38.459521,-77.99875,-5,NVA,,7,Mexican Restaurant,Mediterranean Restaurant,Seafood Restaurant,Café,Burger Joint
1,22307,Alexandria,VA,38.774863,-77.0593,-5,NVA,,5,Monument / Landmark,Park,Pizza Place,Coffee Shop,American Restaurant
2,22121,Mount Vernon,VA,38.830912,-77.432252,-5,NVA,,1,Grocery Store,Italian Restaurant,American Restaurant,Fast Food Restaurant,Golf Course
3,22302,Alexandria,VA,38.829512,-77.08204,-5,NVA,,5,Monument / Landmark,Park,Pizza Place,Coffee Shop,American Restaurant
4,22305,Alexandria,VA,38.836779,-77.06418,-5,NVA,,5,Monument / Landmark,Park,Pizza Place,Coffee Shop,American Restaurant


<p> Visualize the clusters on a map

In [26]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(NVA_merged['Latitude'], NVA_merged['Longitude'], NVA_merged['City'], NVA_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

<p> Analyze the characteristics of each cluster.

In [27]:
NVA_merged.loc[NVA_merged['Cluster Labels'] == 0, NVA_merged.columns[[1] + list(range(5, NVA_merged.shape[1]))]]

Unnamed: 0,City,Timezone,NVA,PWC,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
34,Clifton Forge,-5,NVA,,0,American Restaurant,Gas Station,Sandwich Place,Resort,Zoo
50,Fort Belvoir,-5,NVA,,0,Park,American Restaurant,Historic Site,French Restaurant,Supermarket
56,Fairfax Station,-5,NVA,,0,American Restaurant,Greek Restaurant,Supermarket,Movie Theater,Grocery Store
113,Occoquan,-5,NVA,PWC,0,Historic Site,American Restaurant,Supermarket,Pool,Farmers Market


In [28]:
NVA_merged.loc[NVA_merged['Cluster Labels'] == 1, NVA_merged.columns[[1] + list(range(5, NVA_merged.shape[1]))]]

Unnamed: 0,City,Timezone,NVA,PWC,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
2,Mount Vernon,-5,NVA,,1,Grocery Store,Italian Restaurant,American Restaurant,Fast Food Restaurant,Golf Course
65,Centreville,-5,NVA,,1,Grocery Store,Italian Restaurant,Fast Food Restaurant,American Restaurant,Supermarket
72,Chantilly,-5,NVA,,1,Grocery Store,Brewery,Supermarket,Wine Shop,American Restaurant
74,Centreville,-5,NVA,,1,Grocery Store,Italian Restaurant,Fast Food Restaurant,American Restaurant,Supermarket
86,Clifton,-5,NVA,,1,Grocery Store,Park,American Restaurant,Italian Restaurant,Supermarket
94,Chantilly,-5,NVA,,1,Grocery Store,Brewery,Supermarket,Wine Shop,American Restaurant
100,Chantilly,-5,NVA,,1,Grocery Store,Brewery,Supermarket,Wine Shop,American Restaurant
116,Centreville,-5,NVA,,1,Grocery Store,Italian Restaurant,Fast Food Restaurant,American Restaurant,Supermarket


In [29]:
NVA_merged.loc[NVA_merged['Cluster Labels'] == 2, NVA_merged.columns[[1] + list(range(5, NVA_merged.shape[1]))]]

Unnamed: 0,City,Timezone,NVA,PWC,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
5,Fairfax,-5,NVA,,2,Supermarket,Greek Restaurant,Coffee Shop,Grocery Store,American Restaurant
6,Merrifield,-5,NVA,,2,Greek Restaurant,Wine Shop,Grocery Store,Supermarket,Taco Place
8,Burke,-5,NVA,,2,Greek Restaurant,Grocery Store,Supermarket,Pizza Place,Movie Theater
11,Merrifield,-5,NVA,,2,Greek Restaurant,Wine Shop,Grocery Store,Supermarket,Taco Place
12,Fairfax,-5,NVA,,2,Supermarket,Greek Restaurant,Coffee Shop,Grocery Store,American Restaurant
18,Springfield,-5,NVA,,2,Pizza Place,Taco Place,Supermarket,Greek Restaurant,Brewery
19,Fairfax,-5,NVA,,2,Supermarket,Greek Restaurant,Coffee Shop,Grocery Store,American Restaurant
24,Herndon,-5,NVA,,2,Brewery,Grocery Store,Taco Place,Greek Restaurant,Coffee Shop
25,Springfield,-5,NVA,,2,Pizza Place,Taco Place,Supermarket,Greek Restaurant,Brewery
28,Falls Church,-5,NVA,,2,Taco Place,American Restaurant,Pizza Place,Wine Shop,Coffee Shop


In [30]:
NVA_merged.loc[NVA_merged['Cluster Labels'] == 3, NVA_merged.columns[[1] + list(range(5, NVA_merged.shape[1]))]]

Unnamed: 0,City,Timezone,NVA,PWC,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
31,Dulles,-5,NVA,,3,Brewery,American Restaurant,Supermarket,Café,Vineyard
61,Purcellville,-5,NVA,,3,Brewery,Winery,American Restaurant,Pizza Place,Park
77,Dulles,-5,NVA,,3,Brewery,American Restaurant,Supermarket,Café,Vineyard
80,Leesburg,-5,NVA,,3,American Restaurant,Brewery,Sandwich Place,Wine Bar,Pizza Place
91,Dulles,-5,NVA,,3,Brewery,American Restaurant,Supermarket,Café,Vineyard
131,Purcellville,-5,NVA,,3,Brewery,Winery,American Restaurant,Pizza Place,Park
136,Leesburg,-5,NVA,,3,American Restaurant,Brewery,Sandwich Place,Wine Bar,Pizza Place
137,Leesburg,-5,NVA,,3,American Restaurant,Brewery,Sandwich Place,Wine Bar,Pizza Place
149,Dulles,-5,NVA,,3,Brewery,American Restaurant,Supermarket,Café,Vineyard
174,Leesburg,-5,NVA,,3,American Restaurant,Brewery,Sandwich Place,Wine Bar,Pizza Place


In [32]:
NVA_merged.loc[NVA_merged['Cluster Labels'] == 4, NVA_merged.columns[[1] + list(range(5, NVA_merged.shape[1]))]]

Unnamed: 0,City,Timezone,NVA,PWC,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
15,Woodbridge,-5,NVA,PWC,4,Fast Food Restaurant,Italian Restaurant,American Restaurant,Park,Grocery Store
17,Gainesville,-5,NVA,PWC,4,Fast Food Restaurant,Golf Course,Historic Site,Brewery,Grocery Store
20,Manassas,-5,NVA,PWC,4,Fast Food Restaurant,Grocery Store,Italian Restaurant,Park,American Restaurant
26,Haymarket,-5,NVA,PWC,4,Fast Food Restaurant,Historic Site,Brewery,Golf Course,Park
32,Manassas,-5,NVA,PWC,4,Fast Food Restaurant,Grocery Store,Italian Restaurant,Park,American Restaurant
37,Nokesville,-5,NVA,PWC,4,Brewery,Historic Site,Mexican Restaurant,Fast Food Restaurant,Convenience Store
45,Quantico,-5,NVA,PWC,4,Fast Food Restaurant,Italian Restaurant,Park,Greek Restaurant,Grocery Store
54,Bristow,-5,NVA,PWC,4,Brewery,Convenience Store,Fast Food Restaurant,Historic Site,Supermarket
60,Nokesville,-5,NVA,PWC,4,Brewery,Historic Site,Mexican Restaurant,Fast Food Restaurant,Convenience Store
79,Manassas,-5,NVA,PWC,4,Fast Food Restaurant,Grocery Store,Italian Restaurant,Park,American Restaurant


In [33]:
NVA_merged.loc[NVA_merged['Cluster Labels'] == 5, NVA_merged.columns[[1] + list(range(5, NVA_merged.shape[1]))]]

Unnamed: 0,City,Timezone,NVA,PWC,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
1,Alexandria,-5,NVA,,5,Monument / Landmark,Park,Pizza Place,Coffee Shop,American Restaurant
3,Alexandria,-5,NVA,,5,Monument / Landmark,Park,Pizza Place,Coffee Shop,American Restaurant
4,Alexandria,-5,NVA,,5,Monument / Landmark,Park,Pizza Place,Coffee Shop,American Restaurant
7,Arlington,-5,NVA,,5,Monument / Landmark,Coffee Shop,Park,Italian Restaurant,Plaza
9,Alexandria,-5,NVA,,5,Monument / Landmark,Park,Pizza Place,Coffee Shop,American Restaurant
10,Alexandria,-5,NVA,,5,Monument / Landmark,Park,Pizza Place,Coffee Shop,American Restaurant
13,Alexandria,-5,NVA,,5,Monument / Landmark,Park,Pizza Place,Coffee Shop,American Restaurant
14,Arlington,-5,NVA,,5,Monument / Landmark,Coffee Shop,Park,Italian Restaurant,Plaza
27,Arlington,-5,NVA,,5,Monument / Landmark,Coffee Shop,Park,Italian Restaurant,Plaza
36,Arlington,-5,NVA,,5,Monument / Landmark,Coffee Shop,Park,Italian Restaurant,Plaza


In [34]:
NVA_merged.loc[NVA_merged['Cluster Labels'] == 6, NVA_merged.columns[[1] + list(range(5, NVA_merged.shape[1]))]]

Unnamed: 0,City,Timezone,NVA,PWC,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
16,Middleburg,-5,NVA,,6,Winery,Brewery,Vineyard,Coffee Shop,American Restaurant
88,Marshall,-5,NVA,,6,Winery,Vineyard,American Restaurant,Coffee Shop,Golf Course
125,Marshall,-5,NVA,,6,Winery,Vineyard,American Restaurant,Coffee Shop,Golf Course
156,Middleburg,-5,NVA,,6,Winery,Brewery,Vineyard,Coffee Shop,American Restaurant


In [35]:
NVA_merged.loc[NVA_merged['Cluster Labels'] == 7, NVA_merged.columns[[1] + list(range(5, NVA_merged.shape[1]))]]

Unnamed: 0,City,Timezone,NVA,PWC,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,Culpeper,-5,NVA,,7,Mexican Restaurant,Mediterranean Restaurant,Seafood Restaurant,Café,Burger Joint
21,Fredericksburg,-5,NVA,,7,Coffee Shop,Fast Food Restaurant,Sushi Restaurant,Brewery,Supermarket
22,Spotsylvania,-5,NVA,,7,Fast Food Restaurant,Grocery Store,Farmers Market,Donut Shop,Market
23,Fredericksburg,-5,NVA,,7,Coffee Shop,Fast Food Restaurant,Sushi Restaurant,Brewery,Supermarket
29,Fredericksburg,-5,NVA,,7,Coffee Shop,Fast Food Restaurant,Sushi Restaurant,Brewery,Supermarket
35,Fredericksburg,-5,NVA,,7,Coffee Shop,Fast Food Restaurant,Sushi Restaurant,Brewery,Supermarket
42,Fredericksburg,-5,NVA,,7,Coffee Shop,Fast Food Restaurant,Sushi Restaurant,Brewery,Supermarket
44,Fredericksburg,-5,NVA,,7,Coffee Shop,Fast Food Restaurant,Sushi Restaurant,Brewery,Supermarket
85,Warrenton,-5,NVA,,7,Bakery,Brewery,Winery,Fast Food Restaurant,American Restaurant
89,Warrenton,-5,NVA,,7,Bakery,Brewery,Winery,Fast Food Restaurant,American Restaurant


In [37]:
NVA_merged.loc[NVA_merged['Cluster Labels'] == 8, NVA_merged.columns[[1] + list(range(5, NVA_merged.shape[1]))]]

Unnamed: 0,City,Timezone,NVA,PWC,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
108,Dumfries,-5,NVA,PWC,8,Greek Restaurant,Fast Food Restaurant,Japanese Restaurant,Bakery,French Restaurant
163,Triangle,-5,NVA,PWC,8,Japanese Restaurant,Greek Restaurant,Fast Food Restaurant,Hotel,National Park


In [38]:
NVA_merged.loc[NVA_merged['Cluster Labels'] == 9, NVA_merged.columns[[1] + list(range(5, NVA_merged.shape[1]))]]

Unnamed: 0,City,Timezone,NVA,PWC,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
47,Washington,-5,NVA,,9,Scenic Lookout,Winery,Trail,Coffee Shop,National Park
