# Segmenting and Clustering Neighborhoods in Irving
***

## Introduction/Business Problem

The City of Irving Texas, located in the DFW area between DFW Airport and Dallas, is a city surrounded by major highways and is an interface to neighboring areas. 
Understanding the area well I would like to draw conclusions from segmenting and clustering neighborhood data obtained from foursquare. 
The conclusions will be the best locations to put particular stores. 

Additionally, I would speculate that more high end stores would be in the North areas of Irving and fast foods & Mexican foods would be more common in the central and south. 

## Data

The data would be modeled similarly to the New York City and Toronto problems. However, Irving, and all of Texas, do not have Boroughs so we will only need neighborhood names. 
After initial looking Wikipedia does not have a list so I found a list on a different site `http://www.city-data.com/nbmaps/neigh-Irving-Texas.html#N5`. In the next sections I will scrub the data.

The pipeline following getting neighborhood names would be to use `geocoder` to get the Latitude and Longitudes respectively.

## Methodology

A list of Neighborhoods in Irving is not available nicely on the internet so we will have to scrub it

In [6]:
import imp
import numpy as np # library to handle data in a vectorized manner
import pandas as pd
import re
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe
import requests
import json
# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

# !conda install -c conda-forge geocoder
try:
    imp.find_module('geocoder')
except ImportError:
    !conda install -c conda-forge geocoder 
from geopy.geocoders import Nominatim
import geocoder

# !conda install -c conda-forge folium=0.5.0 --yes
try:
    imp.find_module('folium')
except ImportError:
    !conda install -c conda-forge folium=0.5.0 --yes
import folium # map rendering library

In [9]:
url ="http://www.city-data.com/nbmaps/neigh-Irving-Texas.html#N5"

body = requests.get(url)
text = body.text
start = text.find(">Neighborhoods:</h2>") + len(">Neighborhoods:</h2>") + 1
end = text.find(">Woodhaven</a>") + len(">Woodhaven</a>")
text = text[start:end]

text = re.sub("<.*?>", " ", text) # remove all tag elements
text = text.split(',') # split by comma
text = [x.strip() for x in text] # trim leading and trailing whitespace

df = pd.DataFrame(data = {'Neighborhood': text})
df.head(10)

Unnamed: 0,Neighborhood
0,Arts District
1,Barton Estates
2,Beverly Oaks
3,Broadmoor Hills
4,Cardinal Family Village
5,Club Townhomes
6,Cottonwood Valley
7,Country Club Place
8,Del Paseo
9,Downtown Heritage District


#### Adding Latitude and Longitude

In [10]:
df["Latitude"] = ""
df["Longitude"] = ""
df.head(10)

Unnamed: 0,Neighborhood,Latitude,Longitude
0,Arts District,,
1,Barton Estates,,
2,Beverly Oaks,,
3,Broadmoor Hills,,
4,Cardinal Family Village,,
5,Club Townhomes,,
6,Cottonwood Valley,,
7,Country Club Place,,
8,Del Paseo,,
9,Downtown Heritage District,,


In [11]:
for index, row in df.iterrows():
    # initialize your variable to None
    lat_lng_coords = None
    postal_code = row['Neighborhood']
    # loop until you get the coordinates
    while(lat_lng_coords is None):
        g = geocoder.google('{}, Irving, Texas'.format(postal_code))
        lat_lng_coords = g.latlng

    latitude = lat_lng_coords[0]
    longitude = lat_lng_coords[1]
    row["Latitude"] = latitude
    row["Longitude"] = longitude

In [12]:
pd.set_option('precision', 8)
df.head(10)

Unnamed: 0,Neighborhood,Latitude,Longitude
0,Arts District,32.848551,-96.966548
1,Barton Estates,32.824903,-96.988034
2,Beverly Oaks,32.833081,-96.927017
3,Broadmoor Hills,32.876134,-97.000122
4,Cardinal Family Village,32.84782,-96.954465
5,Club Townhomes,32.823383,-96.939296
6,Cottonwood Valley,32.861251,-96.965205
7,Country Club Place,32.863181,-96.949431
8,Del Paseo,32.832752,-96.960506
9,Downtown Heritage District,32.814701,-96.947804


### 1. Exploring the Dataset

Use geopy library to get the latitude and longitude values of Irving.

In [14]:
address = 'Irving, TX'

geolocator = Nominatim()
location = geolocator.geocode(address)
latitude_irving = location.latitude
longitude_irving = location.longitude
print('The geograpical coordinate of Irving are {}, {}.'.format(latitude, longitude))

The geograpical coordinate of Irving are 32.8482674, -96.946411.


Create a map of Irving

In [48]:
# create map of Irving using latitude and longitude values
map_irving = folium.Map(location=[latitude_irving, longitude_irving], zoom_start=12)

# add markers to map
for lat, lng, label in zip(df['Latitude'], df['Longitude'], df['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_irving)  
    
map_irving

Foursquare Credentials

In [18]:
CLIENT_ID = 'I1WYE1Z151WOQFKLAYX054S0RWGV50EHKBCQDTAIJIHIX2PQ' # your Foursquare ID
CLIENT_SECRET = 'O45ZLZKWMDTDAFIDZRKKOJFMLEXNCJOOUWIBQ3LDT0MSX33J' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

Exploring first neighborhood in dataframe

In [19]:
df.loc[0, 'Neighborhood']

'Arts District'

Get the neighborhood's latitude and longitude values.

In [20]:
neighborhood_latitude = df.loc[0, 'Latitude'] # neighborhood latitude value
neighborhood_longitude = df.loc[0, 'Longitude'] # neighborhood longitude value

neighborhood_name = df.loc[0, 'Neighborhood'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Arts District are 32.8485508, -96.9665476.


#### Now, let's get the top 100 venues that are in The Beaches within a radius of 750 meters.

First, let's create the GET request URL.

In [82]:
LIMIT = 100 # limit of number of venues returned by Foursquare API
radius = 750 # define radius
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)

In [83]:
results = requests.get(url).json()

In [84]:
# function that extracts the category of the venue
# Taken from the foursquare lab in coursera
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

Now we are ready to clean the json and structure it into a pandas dataframe.

In [85]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

Unnamed: 0,name,categories,lat,lng
0,Irving Arts Center,Performing Arts Venue,32.85055132,-96.9609867
1,Subway,Sandwich Place,32.85202562,-96.97052586
2,Lee Park,Gym,32.84772067,-96.96495587
3,Lee Park Recreation Center,Gym / Fitness Center,32.84761591,-96.96433701
4,bountiful baskets co-op,Fruit & Vegetable Store,32.84512958,-96.96586319


In [86]:
print('{} venue(s) were returned by Foursquare.'.format(nearby_venues.shape[0]))

5 venue(s) were returned by Foursquare.


### 2. Explore Neighborhoods in Irving

In [87]:
def getNearbyVenues(names, latitudes, longitudes, radius=750):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [88]:
irving_venues = getNearbyVenues(names=df['Neighborhood'],
                                   latitudes=df['Latitude'],
                                   longitudes=df['Longitude']
                                  )

Arts District
Barton Estates
Beverly Oaks
Broadmoor Hills
Cardinal Family Village
Club Townhomes
Cottonwood Valley
Country Club Place
Del Paseo
Downtown Heritage District
Espanita
Fairway Vista
Fox Glen
Garden Oaks
Grauwyler Heights
Hackberry Creek
Hillcrest Oaks
Hospital District
Hospital District South
Irving Heights
Irving Lake
Lakeside Landing
Lamar-Brown
Las Brisas Hills
Las Colinas
Macarthur Commons
Mandalay Place
Nichols Way
North Irving
Northgate Heights
Oaks on the Ridge
Pecan Estates
Plymouth Park
Quail Run
Revere Place
SONG
Sherwood Forest
South Irving
The Collections
Timberlake
Townlake II
Townlake III
Trinity Oaks
University Hills
University Park
Valley Ranch
Windsor Ridge
Woodhaven


In [89]:
print(irving_venues.shape)
irving_venues.head()

(651, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Arts District,32.8485508,-96.9665476,Irving Arts Center,32.85055132,-96.9609867,Performing Arts Venue
1,Arts District,32.8485508,-96.9665476,Subway,32.85202562,-96.97052586,Sandwich Place
2,Arts District,32.8485508,-96.9665476,Lee Park,32.84772067,-96.96495587,Gym
3,Arts District,32.8485508,-96.9665476,Lee Park Recreation Center,32.84761591,-96.96433701,Gym / Fitness Center
4,Arts District,32.8485508,-96.9665476,bountiful baskets co-op,32.84512958,-96.96586319,Fruit & Vegetable Store


Let's check how many venues were returned for each neighborhood

In [90]:
irving_venues.groupby('Neighborhood').size().reset_index(name='counts')

Unnamed: 0,Neighborhood,counts
0,Arts District,5
1,Barton Estates,15
2,Beverly Oaks,5
3,Broadmoor Hills,4
4,Cardinal Family Village,13
5,Club Townhomes,7
6,Cottonwood Valley,23
7,Country Club Place,27
8,Del Paseo,19
9,Downtown Heritage District,20


### 3. Analyze Each Neighborhood

In [91]:
# one hot encoding
irving_onehot = pd.get_dummies(irving_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
irving_onehot['Neighborhood'] = irving_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [irving_onehot.columns[-1]] + list(irving_onehot.columns[:-1])
irving_onehot = irving_onehot[fixed_columns]

irving_onehot.head()

Unnamed: 0,Neighborhood,Accessories Store,American Restaurant,Art Gallery,Asian Restaurant,Assisted Living,Athletics & Sports,Auto Garage,Auto Workshop,Automotive Shop,...,Theater,Thrift / Vintage Store,Trail,Train Station,Turkish Restaurant,Vape Store,Video Game Store,Video Store,Vietnamese Restaurant,Wings Joint
0,Arts District,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Arts District,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Arts District,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Arts District,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Arts District,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [92]:
irving_onehot.shape

(651, 134)

#### Next, let's group rows by neighborhood and by taking the mean of the frequency of occurrence of each category

In [93]:
irving_grouped = irving_onehot.groupby('Neighborhood').mean().reset_index()
irving_grouped.head()

Unnamed: 0,Neighborhood,Accessories Store,American Restaurant,Art Gallery,Asian Restaurant,Assisted Living,Athletics & Sports,Auto Garage,Auto Workshop,Automotive Shop,...,Theater,Thrift / Vintage Store,Trail,Train Station,Turkish Restaurant,Vape Store,Video Game Store,Video Store,Vietnamese Restaurant,Wings Joint
0,Arts District,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Barton Estates,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.06666667,0.0,0.0
2,Beverly Oaks,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Broadmoor Hills,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Cardinal Family Village,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.07692308,0.0,0.0


#### Let's print each neighborhood along with the top 5 most common venues

In [94]:
num_top_venues = 5

for hood in irving_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = irving_grouped[irving_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Arts District----
                     venue  freq
0    Performing Arts Venue   0.2
1                      Gym   0.2
2     Gym / Fitness Center   0.2
3  Fruit & Vegetable Store   0.2
4           Sandwich Place   0.2


----Barton Estates----
                 venue  freq
0          Gas Station  0.13
1   Seafood Restaurant  0.13
2  Peruvian Restaurant  0.07
3          Video Store  0.07
4          Flea Market  0.07


----Beverly Oaks----
                  venue  freq
0   Rental Car Location   0.4
1    Mexican Restaurant   0.2
2  Fast Food Restaurant   0.2
3        Sandwich Place   0.2
4          Optical Shop   0.0


----Broadmoor Hills----
                 venue  freq
0          Golf Course  0.25
1       Shipping Store  0.25
2         Soccer Field  0.25
3  Fried Chicken Joint  0.25
4         Optical Shop  0.00


----Cardinal Family Village----
                   venue  freq
0      Convenience Store  0.23
1      Indian Restaurant  0.15
2          Grocery Store  0.08
3               Phar

#### Let's put that into a *pandas* dataframe

In [95]:
# Sorts venues in descending order
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [96]:
# Dataframe for the top 10 venues for each neighborhood
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = irving_grouped['Neighborhood']

for ind in np.arange(irving_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(irving_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Arts District,Fruit & Vegetable Store,Gym / Fitness Center,Performing Arts Venue,Gym,Sandwich Place,French Restaurant,Food Truck,Food Court,Food,Flea Market
1,Barton Estates,Seafood Restaurant,Gas Station,Pizza Place,Event Service,Mexican Restaurant,Sandwich Place,Martial Arts Dojo,Flea Market,BBQ Joint,Peruvian Restaurant
2,Beverly Oaks,Rental Car Location,Sandwich Place,Mexican Restaurant,Fast Food Restaurant,Flea Market,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food
3,Broadmoor Hills,Fried Chicken Joint,Golf Course,Shipping Store,Soccer Field,Wings Joint,Food,French Restaurant,Food Truck,Food Court,Flea Market
4,Cardinal Family Village,Convenience Store,Indian Restaurant,Donut Shop,Pharmacy,Grocery Store,Performing Arts Venue,Gym / Fitness Center,Lounge,Video Store,Burger Joint
5,Club Townhomes,Construction & Landscaping,Discount Store,Smoke Shop,Cosmetics Shop,Assisted Living,Food,Food Court,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant
6,Cottonwood Valley,American Restaurant,Nail Salon,Cocktail Bar,Supplement Shop,Hotel,Liquor Store,Spa,Pool,Resort,Restaurant
7,Country Club Place,Golf Course,Hotel,Sandwich Place,Bakery,Plaza,Bank,Pizza Place,Convenience Store,Fast Food Restaurant,Cocktail Bar
8,Del Paseo,Mexican Restaurant,Fast Food Restaurant,Bank,Pizza Place,Ice Cream Shop,Rental Car Location,Convenience Store,Park,Paper / Office Supplies Store,Music Store
9,Downtown Heritage District,Fast Food Restaurant,Plaza,Bank,Pizza Place,Park,Cosmetics Shop,Construction & Landscaping,Coffee Shop,Chinese Restaurant,Seafood Restaurant


### 4.  Cluster Neighborhoods

Run k-means to cluster the neighborhood into 5 clusters.

In [97]:
# set number of clusters
kclusters = 5

irving_grouped_clustering = irving_grouped.drop('Neighborhood', 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(irving_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([3, 1, 3, 2, 1, 1, 2, 2, 3, 3], dtype=int32)

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [110]:
irving_merged = df

# add clustering labels
irving_merged['Cluster Labels'] = kmeans.labels_

# merge irving_grouped with irving_data to add latitude/longitude for each neighborhood
irving_merged = irving_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

irving_merged.head() # check the last columns!

Unnamed: 0,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Arts District,32.848551,-96.966548,3,Fruit & Vegetable Store,Gym / Fitness Center,Performing Arts Venue,Gym,Sandwich Place,French Restaurant,Food Truck,Food Court,Food,Flea Market
1,Barton Estates,32.824903,-96.988034,1,Seafood Restaurant,Gas Station,Pizza Place,Event Service,Mexican Restaurant,Sandwich Place,Martial Arts Dojo,Flea Market,BBQ Joint,Peruvian Restaurant
2,Beverly Oaks,32.833081,-96.927017,3,Rental Car Location,Sandwich Place,Mexican Restaurant,Fast Food Restaurant,Flea Market,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food
3,Broadmoor Hills,32.876134,-97.000122,2,Fried Chicken Joint,Golf Course,Shipping Store,Soccer Field,Wings Joint,Food,French Restaurant,Food Truck,Food Court,Flea Market
4,Cardinal Family Village,32.84782,-96.954465,1,Convenience Store,Indian Restaurant,Donut Shop,Pharmacy,Grocery Store,Performing Arts Venue,Gym / Fitness Center,Lounge,Video Store,Burger Joint


Finally, let's visualize the resulting clusters

In [106]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=12)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(irving_merged['Latitude'], irving_merged['Longitude'], irving_merged['Neighborhood'], irving_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### 5. Examine the Clusters

#### Cluster 1

In [114]:
irving_merged.loc[irving_merged['Cluster Labels'] == 0]

Unnamed: 0,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
22,Lamar-Brown,32.808363,-96.982661,0,Park,Restaurant,Construction & Landscaping,Pharmacy,Wings Joint,Flea Market,French Restaurant,Food Truck,Food Court,Food
40,Townlake II,32.819717,-97.005496,0,Nail Salon,Construction & Landscaping,Train Station,Park,Wings Joint,Food,Fried Chicken Joint,French Restaurant,Food Truck,Food Court
41,Townlake III,32.819717,-97.005496,0,Nail Salon,Construction & Landscaping,Train Station,Park,Wings Joint,Food,Fried Chicken Joint,French Restaurant,Food Truck,Food Court
43,University Hills,32.850043,-96.9397,0,Motorcycle Shop,Park,Business Service,Pet Store,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food
44,University Park,32.849676,-96.933661,0,Motorcycle Shop,Business Service,Convenience Store,Park,Playground,Hotel,Hookah Bar,Donut Shop,Dumpling Restaurant,Event Service


#### Cluster 2

In [115]:
irving_merged.loc[irving_merged['Cluster Labels'] == 1]

Unnamed: 0,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Barton Estates,32.824903,-96.988034,1,Seafood Restaurant,Gas Station,Pizza Place,Event Service,Mexican Restaurant,Sandwich Place,Martial Arts Dojo,Flea Market,BBQ Joint,Peruvian Restaurant
4,Cardinal Family Village,32.84782,-96.954465,1,Convenience Store,Indian Restaurant,Donut Shop,Pharmacy,Grocery Store,Performing Arts Venue,Gym / Fitness Center,Lounge,Video Store,Burger Joint
5,Club Townhomes,32.823383,-96.939296,1,Construction & Landscaping,Discount Store,Smoke Shop,Cosmetics Shop,Assisted Living,Food,Food Court,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant
17,Hospital District,32.833789,-96.954465,1,Indian Restaurant,Pizza Place,Fast Food Restaurant,Breakfast Spot,Convenience Store,Rental Car Location,Ice Cream Shop,Pharmacy,Paper / Office Supplies Store,Music Store
18,Hospital District South,32.833789,-96.954465,1,Indian Restaurant,Pizza Place,Fast Food Restaurant,Breakfast Spot,Convenience Store,Rental Car Location,Ice Cream Shop,Pharmacy,Paper / Office Supplies Store,Music Store
24,Las Colinas,32.896988,-96.953122,1,Sandwich Place,Burger Joint,Salad Place,Gym,Indian Restaurant,Mediterranean Restaurant,Wings Joint,Nail Salon,Pizza Place,Pharmacy
25,Macarthur Commons,32.851141,-96.957821,1,Indian Restaurant,Gym / Fitness Center,Grocery Store,Asian Restaurant,Convenience Store,Burger Joint,Performing Arts Venue,Sandwich Place,Wings Joint,French Restaurant
30,Oaks on the Ridge,32.863692,-97.003145,1,Gas Station,Diner,Intersection,Asian Restaurant,Pizza Place,Gym / Fitness Center,Food,Fried Chicken Joint,French Restaurant,Food Truck
35,SONG,32.859302,-97.000122,1,Pizza Place,Bagel Shop,Dumpling Restaurant,Diner,Convenience Store,Coffee Shop,Gas Station,Himalayan Restaurant,Japanese Restaurant,Video Store
36,Sherwood Forest,32.818112,-96.984004,1,Jewelry Store,Mexican Restaurant,Pizza Place,Construction & Landscaping,Gas Station,Ice Cream Shop,Frozen Yogurt Shop,Italian Restaurant,Dumpling Restaurant,Event Service


#### Cluster 3

In [116]:
irving_merged.loc[irving_merged['Cluster Labels'] == 2]

Unnamed: 0,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,Broadmoor Hills,32.876134,-97.000122,2,Fried Chicken Joint,Golf Course,Shipping Store,Soccer Field,Wings Joint,Food,French Restaurant,Food Truck,Food Court,Flea Market
6,Cottonwood Valley,32.861251,-96.965205,2,American Restaurant,Nail Salon,Cocktail Bar,Supplement Shop,Hotel,Liquor Store,Spa,Pool,Resort,Restaurant
7,Country Club Place,32.863181,-96.949431,2,Golf Course,Hotel,Sandwich Place,Bakery,Plaza,Bank,Pizza Place,Convenience Store,Fast Food Restaurant,Cocktail Bar
10,Espanita,32.84476,-96.996764,2,Clothing Store,Mexican Restaurant,Fast Food Restaurant,Sandwich Place,Shoe Store,Chinese Restaurant,Department Store,Video Store,Video Game Store,American Restaurant
11,Fairway Vista,32.863621,-96.954129,2,Golf Course,American Restaurant,Nail Salon,Cocktail Bar,Gym,Breakfast Spot,Coffee Shop,Shopping Mall,Pet Store,Pharmacy
12,Fox Glen,32.854543,-96.947082,2,Pool,Gym,Park,Food,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Wings Joint,Frozen Yogurt Shop
15,Hackberry Creek,32.909963,-96.971918,2,Playground,Food Court,Pizza Place,Shopping Mall,Pool,Hookah Bar,Food Truck,Indian Restaurant,Diner,Discount Store
21,Lakeside Landing,32.8653,-96.980647,2,Gas Station,Hookah Bar,Fried Chicken Joint,Baseball Field,Park,Lounge,Pizza Place,Fast Food Restaurant,Field,Flea Market
23,Las Brisas Hills,32.860659,-96.981876,2,Park,Gas Station,Fried Chicken Joint,Indian Restaurant,Baseball Field,Athletics & Sports,Hookah Bar,Food Court,Frozen Yogurt Shop,French Restaurant
26,Mandalay Place,32.875718,-96.969904,2,Pool,Fast Food Restaurant,Playground,Baseball Field,Home Service,College Cafeteria,Photography Studio,American Restaurant,Event Service,Dumpling Restaurant


#### Cluster 4

In [117]:
irving_merged.loc[irving_merged['Cluster Labels'] == 3]

Unnamed: 0,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Arts District,32.848551,-96.966548,3,Fruit & Vegetable Store,Gym / Fitness Center,Performing Arts Venue,Gym,Sandwich Place,French Restaurant,Food Truck,Food Court,Food,Flea Market
2,Beverly Oaks,32.833081,-96.927017,3,Rental Car Location,Sandwich Place,Mexican Restaurant,Fast Food Restaurant,Flea Market,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food
8,Del Paseo,32.832752,-96.960506,3,Mexican Restaurant,Fast Food Restaurant,Bank,Pizza Place,Ice Cream Shop,Rental Car Location,Convenience Store,Park,Paper / Office Supplies Store,Music Store
9,Downtown Heritage District,32.814701,-96.947804,3,Fast Food Restaurant,Plaza,Bank,Pizza Place,Park,Cosmetics Shop,Construction & Landscaping,Coffee Shop,Chinese Restaurant,Seafood Restaurant
13,Garden Oaks,32.818711,-96.973261,3,Donut Shop,Mexican Restaurant,Discount Store,Gym / Fitness Center,Clothing Store,Pizza Place,Event Service,Fast Food Restaurant,Field,Flea Market
14,Grauwyler Heights,32.832323,-96.930306,3,Rental Car Location,Grocery Store,Discount Store,Fast Food Restaurant,Locksmith,Sandwich Place,Mexican Restaurant,Wings Joint,French Restaurant,Food Truck
16,Hillcrest Oaks,32.841461,-96.96789,3,Mexican Restaurant,Gym,Park,Brazilian Restaurant,Discount Store,Bank,Supermarket,Theater,Fruit & Vegetable Store,Pet Store
19,Irving Heights,32.824348,-96.922255,3,Fast Food Restaurant,Sandwich Place,Rental Service,Flea Market,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food,Wings Joint
28,North Irving,32.823833,-96.926127,3,Mexican Restaurant,Park,Sandwich Place,Fast Food Restaurant,Food,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Wings Joint
31,Pecan Estates,32.819249,-96.956625,3,Ice Cream Shop,Mexican Restaurant,Dessert Shop,Fast Food Restaurant,Plaza,Skate Park,Food,Fried Chicken Joint,Convenience Store,Event Service


#### Cluster 5

In [118]:
irving_merged.loc[irving_merged['Cluster Labels'] == 4]

Unnamed: 0,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
20,Irving Lake,32.791088,-96.965205,4,Lake,Wings Joint,Food,Fruit & Vegetable Store,Frozen Yogurt Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Flea Market


## Results

Initially, having the radius for the Foursquare API at 500 meters yielded a single neighborhood having nothing in its proximity. Increasing the radius to 1000 meters had some overlap and the clusters became homogeneous to mostly a single cluster. Settling on a radius of 750 meters created better clusters. 

Superficially looking at the clusters we have an outlier to the very south, cluster 5. Cluster 3 occupies the north half and cluster 4 occupies the south half. Cluster 2 is spread around while cluster 1 are far East and South West. 

#### Cluster 1
Have *Nail Salon* and *Motorcycle Shop* as most common 

#### Cluster 2
Have *Indian Restaurant* as most common 

#### Cluster 3
Have *Golf Course* and other outdoor activites as most common

#### Cluster 4
Have *Mexican Restaurant* and other restaurants as most common

#### Cluster 5
Have *Lake* as most common

## Discussion

## Conclusion

Based off of the previous **Results** and **Discussion** sections I have presented an outline of possible locations a new establishment can go based off of the norm of a particular area. However, I would want to also venture that if someone places a *Nail Salon* in cluster 1 there would be a lot of competition. To extend this report we can do further analysis on the longevity and income of venues to determine the optimal location. 

My initial speculation was also correct. Instead of high end stores there exists more *Golf Course* and other outdoor activites as most common in the North half of Irving while Fast Foods / Mexican restaurants were most common in the south half. 