<h1 align=center><font size=5>Coursera Capstone - The Tech Migration to Dallas–Fort Worth metroplex</font></h1>

<h1>Introduction - Business Problem </h1>

The Dallas–Fort Worth metroplex has quickly become a hub for tech companies, start-ups, and the technology departments of banking and financial services organizations, with companies relocating from denser, higher-cost areas such as the Bay Area and the NY Metro area. This migration is not only driving growth and innovation; it is also affecting the real estate market. The objective of this project is to cluster neighborhoods in the Dallas–Fort Worth metroplex so that tech workers moving from the Bay Area or other tech metro areas across the US can make an informed decision about which neighborhood to choose for their future homes.

## Who would be interested in this project?

A few of my colleagues recently moved from the NY Metro area to the Dallas–Fort Worth metroplex, and many more are planning to move there in the future. This project recommends neighborhoods/zipcodes based on nearby facilities, helping my colleagues choose a neighborhood/zipcode that suits their lifestyle.

<h1>Data</h1>

I will explore, segment, and cluster the neighborhoods in the Dallas–Fort Worth metroplex by Zipcode. The Wikipedia page https://en.wikipedia.org/wiki/Dallas%E2%80%93Fort_Worth_metroplex#Dallas%E2%80%93Plano%E2%80%93Irving_metropolitan_division has the information we need to identify the major cities in the Dallas–Fort Worth metroplex. Based on this analysis, the important cities in the metroplex are Dallas, Plano, Irving, Fort Worth, Arlington, and Grapevine.

After exploring numerous websites, I found https://public.opendatasoft.com/explore/dataset/us-zip-code-latitude-and-longitude/ , which provides the zipcodes with their latitude and longitude coordinates for Dallas, Plano, Irving, Fort Worth, Arlington, and Grapevine. I downloaded the data from this website as a CSV file, uploaded the file to this project, and converted it to a pandas DataFrame.

The dataset covers 7 cities in the Dallas–Fort Worth metro area: Arlington, Dallas, Fort Worth, Grapevine, Irving, Lake Dallas, and Plano. Together these 7 cities contain 121 unique zipcodes, each with a distinct latitude and longitude. The dataframe built from the CSV file consists of the columns Zip, City, State, Latitude, and Longitude.
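The loading cell was removed when this notebook was shared, so the following is only a minimal sketch of how such a CSV could be read and filtered into the dataframe described above. The semicolon delimiter, the helper name `load_zip_data`, and the inline sample are assumptions; the two Dallas rows are copied from the preview shown later in this notebook, and the last row is a made-up placeholder that the filter should drop.

```python
import io

import pandas as pd

# Cities of interest, as listed in the text above
CITIES = ['Arlington', 'Dallas', 'Fort Worth', 'Grapevine',
          'Irving', 'Lake Dallas', 'Plano']


def load_zip_data(source, cities=CITIES, state='TX', sep=';'):
    """Read the Zip code CSV and keep only the requested cities (hypothetical helper)."""
    df = pd.read_csv(source, sep=sep)
    # keep rows for the chosen state and cities only
    df = df[(df['State'] == state) & (df['City'].isin(cities))]
    return df[['Zip', 'City', 'State', 'Latitude', 'Longitude']].reset_index(drop=True)


# Inline sample standing in for the downloaded file (assumed layout)
sample_csv = io.StringIO(
    'Zip;City;State;Latitude;Longitude\n'
    '75294;Dallas;TX;32.767268;-96.777626\n'
    '75202;Dallas;TX;32.77988;-96.80502\n'
    '0;Placeholder City;ZZ;0.0;0.0\n'
)
df_sample = load_zip_data(sample_csv)
print(df_sample)
```

With the real download, `source` would be the path of the CSV file instead of the in-memory sample.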

I will also use the Foursquare API to explore these neighborhoods/zipcodes in the Dallas–Fort Worth metro area. I will use the **explore** endpoint to get the most common venue categories in each neighborhood, and then use these categories as features to group the neighborhoods into clusters. I will use the *k*-means clustering algorithm to complete this task. Finally, I will use the Folium library to visualize the neighborhoods in the Dallas–Fort Worth metro area and their emerging clusters.


# Methodology

After exploring the data and identifying its sources, we will apply the K-Means machine learning technique to create clusters of Zip codes that offer similar facilities. The following sections cover the exploratory data analysis and the clustering steps:

In [2]:
# The code was removed by Watson Studio for sharing.

Unnamed: 0,Zip,City,State,Latitude,Longitude
0,75294,Dallas,TX,32.767268,-96.777626
1,75255,Dallas,TX,32.669783,-96.614921
2,75252,Dallas,TX,32.998132,-96.79088
3,75202,Dallas,TX,32.77988,-96.80502
4,75270,Dallas,TX,32.78133,-96.80198


### Plot Zip Codes on Folium Map

In [3]:
!conda install -c conda-forge folium=0.5.0 --yes # comment out this line if folium is already installed
import folium # map rendering library

# center the map on the Dallas–Fort Worth metroplex
map_dallas = folium.Map(location=[32.7079, -96.9209], zoom_start=10)

# add one labeled marker per Zip code
for lat, lng, city, zip_code in zip(df_data_1['Latitude'], df_data_1['Longitude'], df_data_1['City'], df_data_1['Zip']):
    label = folium.Popup('{}, {}'.format(city, zip_code), parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_dallas)
map_dallas


## Define Foursquare Credentials and Version

In [4]:
import os # read credentials from the environment
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analysis
import requests # library to handle requests

# Credentials are read from environment variables so that they are not stored in the notebook.
# The variable names FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET are placeholders; use whatever names you have set.
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID'] # Foursquare ID
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET'] # Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # limit of number of venues returned by Foursquare API
print('Foursquare credentials loaded.')

Foursquare credentials loaded.


## Explore Neighborhoods in Dallas–Fort Worth metroplex based on Zipcode

### Defining a function to get all neighborhood data using Foursquare API

In [5]:
def getNearbyVenues(Zip, latitudes, longitudes, radius=500):
    """Return a dataframe with one row per venue found within `radius` meters of each Zip code."""
    venues_list = []
    for zip_code, lat, lng in zip(Zip, latitudes, longitudes):
        print(zip_code)

        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            lat,
            lng,
            radius,
            LIMIT)

        # make the GET request and keep only the list of recommended items
        results = requests.get(url).json()['response']['groups'][0]['items']

        # keep only the relevant information for each nearby venue;
        # venues without a category are skipped so the [0] lookup cannot fail
        venues_list.append([(
            zip_code,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results if v['venue']['categories']])

    # flatten the per-Zip lists into a single dataframe
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Zip',
                  'Zip Latitude',
                  'Zip Longitude',
                  'Venue',
                  'Venue Latitude',
                  'Venue Longitude',
                  'Venue Category']

    return nearby_venues
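To make the nested indexing in `getNearbyVenues` easier to follow, here is a small sketch that applies the same extraction to a hand-written dictionary instead of a live request. The payload only mimics the keys the function relies on (`response`, `groups`, `items`, `venue`) and is not real API output: the Starbucks entry reuses values from the result table in this notebook, and the second entry is made up to show a venue without a category. The helper name `extract_venues` is hypothetical.

```python
def extract_venues(zip_code, lat, lng, payload):
    """Flatten one explore-style payload into one tuple per venue."""
    items = payload['response']['groups'][0]['items']
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories', [])
        if not categories:
            continue  # a venue without a category cannot be one-hot encoded later
        rows.append((
            zip_code, lat, lng,
            venue['name'],
            venue['location']['lat'],
            venue['location']['lng'],
            categories[0]['name'],  # only the first category is kept
        ))
    return rows


# Hand-written payload with the assumed shape (not real API output)
payload = {
    'response': {
        'groups': [{
            'items': [
                {'venue': {'name': 'Starbucks',
                           'location': {'lat': 32.766497, 'lng': -96.774145},
                           'categories': [{'name': 'Coffee Shop'}]}},
                {'venue': {'name': 'Made-up venue without a category',
                           'location': {'lat': 0.0, 'lng': 0.0},
                           'categories': []}},
            ]
        }]
    }
}

rows = extract_venues(75294, 32.767268, -96.777626, payload)
print(rows)
```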


### Create new dataframe *dallas_venues* by using the function *getNearbyVenues*.

In [6]:
dallas_venues = getNearbyVenues(Zip=df_data_1['Zip'],
                                   latitudes=df_data_1['Latitude'],
                                   longitudes=df_data_1['Longitude']
                                  )


75294
75255
75252
75202
75270
75220
75234
75215
75231
75251
75214
75210
75246
75247
75207
75212
75245
75204
75223
75287
75205
75230
75254
75217
75219
75226
75065
75228
75233
75227
75211
75218
75203
75229
75209
75201
75221
75237
75249
75236
75248
75225
75208
75243
75224
75216
75238
75232
75240
75241
75244
75253
75206
75235
75075
75094
75074
75024
75093
75025
75023
75086
75026
76025
75016
75059
75062
75063
75039
75060
75038
75061
76107
76179
76137
76345
76177
76129
76114
76103
76118
76110
76120
76115
76148
76102
76123
76153
76111
76112
76135
76134
76109
76105
76108
76116
76133
76131
76106
76104
76140
76119
76126
76155
76132
76013
76004
76012
76016
76002
76018
76017
76010
76001
76014
76015
76003
76011
76006
76051
76099


#### Check the resulting dataframe

In [7]:
print(dallas_venues.shape)
dallas_venues.head()

(1235, 7)


Unnamed: 0,Zip,Zip Latitude,Zip Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,75294,32.767268,-96.777626,Starbucks,32.766497,-96.774145,Coffee Shop
1,75294,32.767268,-96.777626,Enterprise Rent-A-Car,32.765043,-96.773478,Rental Car Location
2,75294,32.767268,-96.777626,Cedars Open Studios,32.766217,-96.781942,Art Gallery
3,75255,32.669783,-96.614921,Sid's Food Mart,32.669854,-96.614021,Deli / Bodega
4,75255,32.669783,-96.614921,Compressors Unlimited International LLC,32.66672,-96.61378,Home Service


#### Validate the number of venues returned for each neighborhood

In [8]:
dallas_venues.groupby('Zip').count()

Zip,Zip Latitude,Zip Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
75016,3,3,3,3,3,3
75023,7,7,7,7,7,7
75025,9,9,9,9,9,9
75038,2,2,2,2,2,2
75039,23,23,23,23,23,23
75059,14,14,14,14,14,14
75060,4,4,4,4,4,4
75061,2,2,2,2,2,2
75062,1,1,1,1,1,1
75063,11,11,11,11,11,11


#### We will find out how many unique categories there are among all the returned venues

In [9]:
print('There are {} unique categories.'.format(len(dallas_venues['Venue Category'].unique())))

There are 216 unique categories.


## Analyze Each Neighborhood

In [10]:
# one hot encoding
dallas_onehot = pd.get_dummies(dallas_venues[['Venue Category']], prefix="", prefix_sep="")

# add the Zip column back to the dataframe
dallas_onehot['Zip'] = dallas_venues['Zip'] 

# move the Zip column to the first position
fixed_columns = [dallas_onehot.columns[-1]] + list(dallas_onehot.columns[:-1])
dallas_onehot = dallas_onehot[fixed_columns]

dallas_onehot.head()

Unnamed: 0,Zip,Adult Boutique,American Restaurant,Antique Shop,Aquarium,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,...,Vegetarian / Vegan Restaurant,Veterinarian,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Weight Loss Center,Wine Shop,Wings Joint,Women's Store
0,75294,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,75294,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,75294,0,0,0,0,1,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,75255,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,75255,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


#### We will examine the size of the dataframe

In [11]:
dallas_onehot.shape

(1235, 217)

#### Group rows by Zip code and take the mean of the frequency of occurrence of each category

In [12]:
dallas_grouped = dallas_onehot.groupby('Zip').mean().reset_index()
dallas_grouped

Unnamed: 0,Zip,Adult Boutique,American Restaurant,Antique Shop,Aquarium,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,...,Vegetarian / Vegan Restaurant,Veterinarian,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Weight Loss Center,Wine Shop,Wings Joint,Women's Store
0,75016,0.0,0.000000,0.000000,0.0000,0.333333,0.000000,0.00000,0.00000,0.0,...,0.0,0.0,0.00,0.000000,0.00000,0.0,0.0,0.000000,0.0000,0.0
1,75023,0.0,0.000000,0.000000,0.0000,0.000000,0.000000,0.00000,0.00000,0.0,...,0.0,0.0,0.00,0.000000,0.00000,0.0,0.0,0.000000,0.0000,0.0
2,75025,0.0,0.000000,0.000000,0.0000,0.000000,0.000000,0.00000,0.00000,0.0,...,0.0,0.0,0.00,0.111111,0.00000,0.0,0.0,0.000000,0.0000,0.0
3,75038,0.0,0.000000,0.000000,0.0000,0.000000,0.000000,0.00000,0.00000,0.0,...,0.0,0.0,0.00,0.000000,0.00000,0.0,0.0,0.000000,0.0000,0.0
4,75039,0.0,0.086957,0.000000,0.0000,0.000000,0.000000,0.00000,0.00000,0.0,...,0.0,0.0,0.00,0.000000,0.00000,0.0,0.0,0.000000,0.0000,0.0
5,75059,0.0,0.071429,0.000000,0.0000,0.000000,0.000000,0.00000,0.00000,0.0,...,0.0,0.0,0.00,0.000000,0.00000,0.0,0.0,0.000000,0.0000,0.0
6,75060,0.0,0.000000,0.000000,0.0000,0.000000,0.000000,0.00000,0.00000,0.0,...,0.0,0.0,0.00,0.250000,0.00000,0.0,0.0,0.000000,0.0000,0.0
7,75061,0.0,0.000000,0.000000,0.0000,0.000000,0.000000,0.00000,0.00000,0.0,...,0.0,0.0,0.00,0.000000,0.00000,0.0,0.0,0.000000,0.0000,0.0
8,75062,0.0,0.000000,0.000000,0.0000,0.000000,0.000000,0.00000,0.00000,0.0,...,0.0,0.0,0.00,0.000000,0.00000,0.0,0.0,0.000000,0.0000,0.0
9,75063,0.0,0.181818,0.000000,0.0000,0.000000,0.000000,0.00000,0.00000,0.0,...,0.0,0.0,0.00,0.000000,0.00000,0.0,0.0,0.000000,0.0000,0.0
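The reason the grouped values can be read as frequencies is that the mean of a 0/1 indicator column equals the share of a Zip code's venues that fall in that category. The sketch below shows this on a toy table; the five rows reuse the Zip codes and categories from the `dallas_venues` preview above, and the `toy_` names are only for illustration.

```python
import pandas as pd

# Toy venue table: Zip 75294 has three venues, Zip 75255 has two
toy_venues = pd.DataFrame({
    'Zip': [75294, 75294, 75294, 75255, 75255],
    'Venue Category': ['Coffee Shop', 'Rental Car Location', 'Art Gallery',
                       'Deli / Bodega', 'Home Service'],
})

# one indicator column per category (cast to float so the result does not depend on the pandas version)
toy_onehot = pd.get_dummies(toy_venues[['Venue Category']], prefix='', prefix_sep='').astype(float)
toy_onehot.insert(0, 'Zip', toy_venues['Zip'])

# the mean of the indicators is the share of each category within a Zip code
toy_grouped = toy_onehot.groupby('Zip').mean().reset_index()
print(toy_grouped.round(2))
```

Each row of `toy_grouped` sums to 1, so a Zip code with only a few venues gets a few large values, while a Zip code with many venues gets many small ones.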


#### *dallas_grouped* size

In [18]:
dallas_grouped.shape

(111, 217)

#### We will print each Zip code along with the top 5 most common venues

In [13]:
num_top_venues = 5

for hood in dallas_grouped['Zip']:
    print(hood)
    temp = dallas_grouped[dallas_grouped['Zip'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

75016
                 venue  freq
0          Art Gallery  0.33
1  Rental Car Location  0.33
2          Coffee Shop  0.33
3       Adult Boutique  0.00
4    Other Repair Shop  0.00


75023
               venue  freq
0    Thai Restaurant  0.43
1  Indian Restaurant  0.14
2           Pharmacy  0.14
3               Park  0.14
4             Bakery  0.14


75025
               venue  freq
0  Electronics Store  0.11
1  Health Food Store  0.11
2               Bank  0.11
3           Pharmacy  0.11
4    Doctor's Office  0.11


75038
                        venue  freq
0  Construction & Landscaping   0.5
1            Insurance Office   0.5
2              Adult Boutique   0.0
3           Other Repair Shop   0.0
4               Movie Theater   0.0


75039
                 venue  freq
0  American Restaurant  0.09
1            Gastropub  0.09
2   Tex-Mex Restaurant  0.09
3          Music Venue  0.09
4  Japanese Restaurant  0.04


75059
                  venue  freq
0  Gym / Fitness Center  0.21
1     

### We will put the data into a *pandas* dataframe

##### We will write a function to sort the venues in descending order.

In [14]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
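As a quick check of what this function returns, the sketch below calls it on a single hand-built row. The function body is repeated so that the example runs on its own; the frequencies for Thai Restaurant and Bakery are taken from the 75023 output above, and the row is otherwise an assumption made for illustration.

```python
import pandas as pd


def return_most_common_venues(row, num_top_venues):
    # same logic as the function above: skip the leading Zip entry, sort the rest
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]


# one row in the layout of dallas_grouped: Zip first, then one frequency per category
toy_row = pd.Series({
    'Zip': 75023,
    'Adult Boutique': 0.0,
    'Bakery': 0.14,
    'Thai Restaurant': 0.43,
})

top = return_most_common_venues(toy_row, 2)
print(list(top))
```

Note that categories with a frequency of 0 are still returned once the non-zero ones run out. For example, Zip code 75038 has only 2 venues in the count table above, so the remaining entries in its top-10 row are categories that do not occur there at all.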

#### We will create a new data frame and display the top 10 venues for each neighborhood

In [15]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Zip']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # positions after the 3rd use the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Zip'] = dallas_grouped['Zip']

for ind in np.arange(dallas_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(dallas_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Zip,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,75016,Art Gallery,Coffee Shop,Rental Car Location,Women's Store,Dry Cleaner,Fondue Restaurant,Flower Shop,Fast Food Restaurant,Farmers Market,Fabric Shop
1,75023,Thai Restaurant,Bakery,Pharmacy,Indian Restaurant,Park,Driving School,Flower Shop,Fast Food Restaurant,Farmers Market,Fabric Shop
2,75025,Gas Station,Bank,Pharmacy,Fast Food Restaurant,Health Food Store,Video Store,Electronics Store,Doctor's Office,Grocery Store,Entertainment Service
3,75038,Construction & Landscaping,Insurance Office,Driving School,Fondue Restaurant,Flower Shop,Fast Food Restaurant,Farmers Market,Fabric Shop,Event Space,Ethiopian Restaurant
4,75039,American Restaurant,Music Venue,Tex-Mex Restaurant,Gastropub,Pizza Place,Japanese Restaurant,Deli / Bodega,Rental Car Location,Salad Place,Cajun / Creole Restaurant


# Cluster Neighborhoods

### Run *k-means* to cluster the neighborhoods into 5 clusters

In [16]:
# import k-means from clustering stage
from sklearn.cluster import KMeans

# set number of clusters
kclusters = 5

dallas_grouped_clustering = dallas_grouped.drop(columns='Zip')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(dallas_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([2, 2, 2, 3, 2, 2, 2, 4, 2, 2], dtype=int32)

### We will create a new dataframe that includes the Cluster as well as the top 10 venues for each Zip Code

In [17]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

dallas_merged = df_data_1

# join neighborhoods_venues_sorted onto df_data_1 so each Zip code keeps its City and latitude/longitude and gains its cluster label and top venues
dallas_merged = dallas_merged.join(neighborhoods_venues_sorted.set_index('Zip'), on='Zip')

dallas_merged.head() # check the last columns!

Unnamed: 0,Zip,City,State,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,75294,Dallas,TX,32.767268,-96.777626,2.0,Art Gallery,Coffee Shop,Rental Car Location,Women's Store,Dry Cleaner,Fondue Restaurant,Flower Shop,Fast Food Restaurant,Farmers Market,Fabric Shop
1,75255,Dallas,TX,32.669783,-96.614921,2.0,Mexican Restaurant,Home Service,Deli / Bodega,Dry Cleaner,Fondue Restaurant,Flower Shop,Fast Food Restaurant,Farmers Market,Fabric Shop,Event Space
2,75252,Dallas,TX,32.998132,-96.79088,2.0,Home Service,Juice Bar,Sandwich Place,Cosmetics Shop,Department Store,Health & Beauty Service,Coffee Shop,New American Restaurant,Gym / Fitness Center,Shipping Store
3,75202,Dallas,TX,32.77988,-96.80502,2.0,Hotel,Sandwich Place,Plaza,Bar,Coffee Shop,Convenience Store,Gift Shop,History Museum,Nightclub,Liquor Store
4,75270,Dallas,TX,32.78133,-96.80198,2.0,Coffee Shop,Hotel,Sandwich Place,Cocktail Bar,Mexican Restaurant,Café,Convenience Store,Gym,Sports Bar,Taco Place
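The cluster labels appear as `2.0` rather than `2` because `join` keeps every row of `df_data_1`, and Zip codes that have no match in `neighborhoods_venues_sorted` receive `NaN`, which forces the column to a float type. The sketch below reproduces this on a toy example; which Zip code is missing here is an arbitrary choice made for illustration, not a statement about the real data.

```python
import pandas as pd

# three Zip codes in the location table, but only two with a cluster label
toy_zips = pd.DataFrame({'Zip': [75294, 75255, 75245],
                         'City': ['Dallas', 'Dallas', 'Dallas']})
toy_clusters = pd.DataFrame({'Zip': [75294, 75255],
                             'Cluster Labels': [2, 2]})

# left join: unmatched Zip codes get NaN, so the integer labels become floats
toy_merged = toy_zips.join(toy_clusters.set_index('Zip'), on='Zip')
print(toy_merged)

# dropping the unmatched rows allows the labels to be cast back to integers
toy_clean = toy_merged.dropna().astype({'Cluster Labels': int})
print(toy_clean)
```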


### Finally we will visualize the resulting Clusters

In [57]:
# Drop rows with NaN values (for example, Zip codes that received no cluster label)
dallas_merged_1 = dallas_merged.dropna()

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# create map
map_clusters = folium.Map(location=[32.7079, -96.9209], zoom_start=11)

# set color scheme for the clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, poi, cluster in zip(dallas_merged_1['Latitude'], dallas_merged_1['Longitude'], dallas_merged_1['Zip'], dallas_merged_1['Cluster Labels']):
    cluster = int(cluster) # labels are stored as floats after the join
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)

map_clusters

# Examine Clusters

### We will examine each Cluster and determine the discriminating venue categories that distinguish each Cluster.

## Cluster 1

In [49]:
dallas_merged_1.loc[dallas_merged_1['Cluster Labels'] == 0]

Unnamed: 0,Zip,City,State,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
46,75238,Dallas,TX,32.873926,-96.70922,0.0,Athletics & Sports,Women's Store,Dry Cleaner,Food,Fondue Restaurant,Flower Shop,Fast Food Restaurant,Farmers Market,Fabric Shop,Event Space
109,76002,Arlington,TX,32.632349,-97.0963,0.0,Athletics & Sports,Women's Store,Dry Cleaner,Food,Fondue Restaurant,Flower Shop,Fast Food Restaurant,Farmers Market,Fabric Shop,Event Space


## Cluster 2

In [50]:
dallas_merged_1.loc[dallas_merged_1['Cluster Labels'] == 1]

Unnamed: 0,Zip,City,State,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
29,75227,Dallas,TX,32.77003,-96.69,1.0,Mexican Restaurant,Dry Cleaner,Food,Fondue Restaurant,Flower Shop,Fast Food Restaurant,Farmers Market,Fabric Shop,Event Space,Ethiopian Restaurant
81,76110,Fort Worth,TX,32.706331,-97.33787,1.0,Mexican Restaurant,Lawyer,Park,Women's Store,Dumpling Restaurant,Fondue Restaurant,Flower Shop,Fast Food Restaurant,Farmers Market,Fabric Shop
103,76155,Fort Worth,TX,32.830932,-97.04778,1.0,Mexican Restaurant,Food Court,Food Truck,Dance Studio,Deli / Bodega,Food,Fondue Restaurant,Flower Shop,Fast Food Restaurant,Farmers Market
113,76001,Arlington,TX,32.634203,-97.14403,1.0,Mexican Restaurant,Dance Studio,Dry Cleaner,Food,Fondue Restaurant,Flower Shop,Fast Food Restaurant,Farmers Market,Fabric Shop,Event Space


## Cluster 3

In [51]:
dallas_merged_1.loc[dallas_merged_1['Cluster Labels'] == 2]

Unnamed: 0,Zip,City,State,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,75294,Dallas,TX,32.767268,-96.777626,2.0,Art Gallery,Coffee Shop,Rental Car Location,Women's Store,Dry Cleaner,Fondue Restaurant,Flower Shop,Fast Food Restaurant,Farmers Market,Fabric Shop
1,75255,Dallas,TX,32.669783,-96.614921,2.0,Mexican Restaurant,Home Service,Deli / Bodega,Dry Cleaner,Fondue Restaurant,Flower Shop,Fast Food Restaurant,Farmers Market,Fabric Shop,Event Space
2,75252,Dallas,TX,32.998132,-96.790880,2.0,Home Service,Juice Bar,Sandwich Place,Cosmetics Shop,Department Store,Health & Beauty Service,Coffee Shop,New American Restaurant,Gym / Fitness Center,Shipping Store
3,75202,Dallas,TX,32.779880,-96.805020,2.0,Hotel,Sandwich Place,Plaza,Bar,Coffee Shop,Convenience Store,Gift Shop,History Museum,Nightclub,Liquor Store
4,75270,Dallas,TX,32.781330,-96.801980,2.0,Coffee Shop,Hotel,Sandwich Place,Cocktail Bar,Mexican Restaurant,Café,Convenience Store,Gym,Sports Bar,Taco Place
5,75220,Dallas,TX,32.867977,-96.863060,2.0,Pizza Place,Grocery Store,Gym,Mobile Phone Shop,Electronics Store,Restaurant,Gas Station,Mexican Restaurant,Chinese Restaurant,Fast Food Restaurant
6,75234,Dallas,TX,32.925975,-96.883220,2.0,Pizza Place,Mexican Restaurant,Breakfast Spot,Fast Food Restaurant,BBQ Joint,Big Box Store,Chinese Restaurant,Shoe Store,Bank,Thrift / Vintage Store
7,75215,Dallas,TX,32.761030,-96.770350,2.0,Convenience Store,BBQ Joint,Fried Chicken Joint,Food,Home Service,Deli / Bodega,Electronics Store,Fondue Restaurant,Flower Shop,Dance Studio
8,75231,Dallas,TX,32.874317,-96.747640,2.0,Football Stadium,Athletics & Sports,Tennis Court,Baseball Field,Dumpling Restaurant,Food,Fondue Restaurant,Flower Shop,Fast Food Restaurant,Farmers Market
9,75251,Dallas,TX,32.919104,-96.774970,2.0,Hotel,Gym / Fitness Center,American Restaurant,Deli / Bodega,Restaurant,Residential Building (Apartment / Condo),Arts & Crafts Store,Coffee Shop,Gym,Japanese Restaurant


## Cluster 4

In [52]:
dallas_merged_1.loc[dallas_merged_1['Cluster Labels'] == 3]

Unnamed: 0,Zip,City,State,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
10,75214,Dallas,TX,32.825628,-96.74872,3.0,Construction & Landscaping,Spa,Dry Cleaner,Fondue Restaurant,Flower Shop,Fast Food Restaurant,Farmers Market,Fabric Shop,Event Space,Ethiopian Restaurant
70,75038,Irving,TX,32.872386,-96.98524,3.0,Construction & Landscaping,Insurance Office,Driving School,Fondue Restaurant,Flower Shop,Fast Food Restaurant,Farmers Market,Fabric Shop,Event Space,Ethiopian Restaurant
78,76114,Fort Worth,TX,32.781329,-97.40099,3.0,Boxing Gym,Construction & Landscaping,Dry Cleaner,Food,Fondue Restaurant,Flower Shop,Fast Food Restaurant,Farmers Market,Fabric Shop,Event Space
93,76105,Fort Worth,TX,32.724831,-97.26992,3.0,Construction & Landscaping,Dry Cleaner,Food,Fondue Restaurant,Flower Shop,Fast Food Restaurant,Farmers Market,Fabric Shop,Event Space,Ethiopian Restaurant
101,76119,Fort Worth,TX,32.691033,-97.26479,3.0,Construction & Landscaping,Dry Cleaner,Food,Fondue Restaurant,Flower Shop,Fast Food Restaurant,Farmers Market,Fabric Shop,Event Space,Ethiopian Restaurant


## Cluster 5

In [53]:
dallas_merged_1.loc[dallas_merged_1['Cluster Labels'] == 4]

Unnamed: 0,Zip,City,State,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
39,75236,Dallas,TX,32.685533,-96.91746,4.0,American Restaurant,Women's Store,Dry Cleaner,Food,Fondue Restaurant,Flower Shop,Fast Food Restaurant,Farmers Market,Fabric Shop,Event Space
71,75061,Irving,TX,32.826729,-96.9614,4.0,Convenience Store,Park,Driving School,Fondue Restaurant,Flower Shop,Fast Food Restaurant,Farmers Market,Fabric Shop,Event Space,Ethiopian Restaurant
92,76109,Fort Worth,TX,32.699565,-97.37808,4.0,Home Service,Park,Women's Store,Driving School,Flower Shop,Fast Food Restaurant,Farmers Market,Fabric Shop,Event Space,Ethiopian Restaurant
104,76132,Fort Worth,TX,32.670345,-97.4143,4.0,Fast Food Restaurant,Women's Store,Dry Cleaner,Food,Fondue Restaurant,Flower Shop,Farmers Market,Fabric Shop,Event Space,Ethiopian Restaurant
112,76010,Arlington,TX,32.723382,-97.08498,4.0,American Restaurant,Fast Food Restaurant,Park,Chinese Restaurant,Women's Store,Dry Cleaner,Fondue Restaurant,Flower Shop,Farmers Market,Fabric Shop
116,76003,Arlington,TX,32.741685,-97.225324,4.0,Fast Food Restaurant,Park,Women's Store,Driving School,Fondue Restaurant,Flower Shop,Farmers Market,Fabric Shop,Event Space,Ethiopian Restaurant
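Reading five tables side by side can be tedious, so one possible way to compare the clusters is to summarize, per cluster, how many Zip codes it holds and which category most often appears as the 1st most common venue. The sketch below does this on seven rows copied from the cluster tables above; it is an illustration on a small sample, not a summary of the full result, and the column names `n_zips` and `top_first_venue` are my own.

```python
import pandas as pd

# a few rows copied from the Cluster 1, Cluster 2 and Cluster 4 tables above
sample = pd.DataFrame({
    'Zip': [75238, 76002, 75227, 76110, 75214, 75038, 76114],
    'Cluster Labels': [0, 0, 1, 1, 3, 3, 3],
    '1st Most Common Venue': [
        'Athletics & Sports', 'Athletics & Sports',
        'Mexican Restaurant', 'Mexican Restaurant',
        'Construction & Landscaping', 'Construction & Landscaping', 'Boxing Gym',
    ],
})

# per cluster: number of Zip codes and the most frequent 1st venue category
summary = sample.groupby('Cluster Labels')['1st Most Common Venue'].agg(
    n_zips='count',
    top_first_venue=lambda s: s.value_counts().idxmax(),
)
print(summary)
```

On the full `dallas_merged_1` dataframe the same `groupby` would apply unchanged, since it uses the same column names.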
