# Potential Venues Based on Zip Codes in Houston

#### In this project, we analyze existing venues across Houston zip codes. The question we are trying to answer is: if someone is looking to open a venue, where would we recommend that they open it?


##### For this project we need the following data:
<ul>
    <li> Zip Codes in Houston </li>
    <li> Latitudes and Longitudes of Zip Codes </li>
</ul>

#### We will use the data available at http://www.geonames.org/postalcode-search.html?q=Houston&country=US&adminCode1=TX

#### Using the BeautifulSoup4 package, we will extract the table from the website.
#### The table will consist of 3 columns
<ul>
    <li> Zip Code </li>
    <li> Latitude </li>
    <li> Longitude </li>
</ul>

#### Let's import required packages

In [65]:
import numpy as np
import pandas as pd
from bs4 import BeautifulSoup
import requests

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe


import matplotlib.cm as cm
import matplotlib.colors as colors

#### Let's read in the page that contains the table

In [66]:
source = requests.get('http://www.geonames.org/postalcode-search.html?q=Houston&country=US&adminCode1=TX').text
soup = BeautifulSoup(source,'lxml')
print(soup.prettify())

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd ">
<html>
 <head>
  <title>
   Postal Codes Texas, United States
  </title>
  <link href="http://www.geonames.org/opensearch-description.xml" rel="search" title="geonames" type="application/opensearchdescription+xml"/>
  <link href="/geonames.ico" rel="shortcut icon"/>
  <link href="/geonames.css" rel="StyleSheet" type="text/css"/>
 </head>
 <body>
  <table cellpadding="0" cellspacing="0" id="topmenutable">
   <tr>
    <td class="topmenu">
     <a href="/" title="GeoName Home">
      GeoNames Home
     </a>
     |
     <a href="/postal-codes/" title="Postal Codes">
      Postal Codes
     </a>
     |
     <a href="/export/" title="Database Dump and Webservice API">
      Download / Webservice
     </a>
     |
     <a href="/about.html" title="About GeoNames">
      About
     </a>
    </td>
    <td class="topsearch">
     <form action="/servlet/geonames" class="topsearch" method="get" n

#### Let's read the table in the page and turn it into a data frame

In [67]:
data = []
table = soup.find('table',{'class':'restable'})

rows = table.find_all('tr')
for row in rows:
    cols = row.find_all('td')
    cols = [ele.text.strip() for ele in cols]
    data.append([ele for ele in cols if ele])
    
data = data[1:]
Houston = []
latlong = data[1::2]
latlong
zip_code = []
for ele in data:
    if len(ele) > 2:
        zip_code.append(ele[2])

Houston_zip = []
for ele in range(len(zip_code)):
    Houston_zip.append(int(zip_code[ele]))
    
Houston_zip  # One part of our data is good to go


Houston_latitude = []
Houston_longitude = []


for ele in range(len(latlong)):
    for elem in latlong[ele]:
        k = elem.split('/')
        Houston_latitude.append(float(k[0]))
        Houston_longitude.append(float(k[1]))
        
# Now we have all three columns required
Houston_data = pd.DataFrame({'Zip Code': Houston_zip,'Latitude' : Houston_latitude,'Longitude':Houston_longitude},
                             columns=['Zip Code','Latitude','Longitude'])

Houston_data

Unnamed: 0,Zip Code,Latitude,Longitude
0,77002,29.759,-95.359
1,77092,29.832,-95.472
2,77005,29.718,-95.426
3,77006,29.741,-95.392
4,77007,29.774,-95.403
5,77019,29.752,-95.405
6,77024,29.770,-95.520
7,77025,29.689,-95.434
8,77027,29.740,-95.446
9,77030,29.704,-95.401
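
The loop above splits each coordinate cell on `/` to get the latitude and longitude. As a minimal sketch of that step in isolation, assuming each cell is a string shaped like `29.759/-95.359` (the helper name `parse_lat_lng` is hypothetical and not part of the original notebook):

```python
def parse_lat_lng(cell):
    """Split a 'lat/lng' string into a (latitude, longitude) tuple of floats."""
    lat_text, lng_text = cell.split('/')
    return float(lat_text), float(lng_text)


# Sample cells built from the first two rows of the table above.
cells = ['29.759/-95.359', '29.832/-95.472']
coordinates = [parse_lat_lng(cell) for cell in cells]
print(coordinates)
```

Keeping the parsing in one small function makes it easier to spot a malformed cell, since `float` raises a `ValueError` as soon as a value is not numeric.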


#### Based on the above table, Houston has 188 zip codes

### Let's create the map of Houston

In [68]:
#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim

address = 'Houston, TX'

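geolocator = Nominatim(user_agent='houston_explorer')  # recent geopy versions require a user_agent; this name is an arbitrary example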
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinate of Houston are {},{}.'.format(latitude,longitude))



The geographical coordinate of Houston are 29.7589382,-95.3676974.


In [69]:
map_Houston = folium.Map(location=[latitude,longitude],zoom_start=10)
map_Houston

#### Now let's also show the different zip codes and their locations on the map

In [70]:
for lat, lng, zip_code in zip(Houston_data['Latitude'], Houston_data['Longitude'], Houston_data['Zip Code']):
    label = '{}'.format(zip_code)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=3,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_Houston)  
    
map_Houston

#### In the next step we cluster the zip codes into 15 clusters (the choice of k = 15 is arbitrary)

In [95]:
num_clusters = 15
kmeans = KMeans(n_clusters=num_clusters,random_state=0).fit(Houston_data[['Latitude','Longitude']])
Houston_data['Cluster_label'] = kmeans.labels_
Lat = []
Long = []
for i in range(len(kmeans.cluster_centers_)):
    Lat.append(kmeans.cluster_centers_[i][0])
    Long.append(kmeans.cluster_centers_[i][1])

cluster_id = ['Cluster {}'.format(i) for i in range(num_clusters)]
centers = pd.DataFrame({'Cluster ID':cluster_id,'Cluster Center Lat':Lat,'Cluster Center Long':Long},columns=['Cluster ID','Cluster Center Lat','Cluster Center Long'])


Houston_data[Houston_data.Cluster_label==4]
Houston_data

Unnamed: 0,Zip Code,Latitude,Longitude,Cluster_label
0,77002,29.759,-95.359,7
1,77092,29.832,-95.472,3
2,77005,29.718,-95.426,11
3,77006,29.741,-95.392,7
4,77007,29.774,-95.403,7
5,77019,29.752,-95.405,7
6,77024,29.770,-95.520,5
7,77025,29.689,-95.434,11
8,77027,29.740,-95.446,11
9,77030,29.704,-95.401,7
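
Since the choice of k is arbitrary, one common way to compare candidate values is the within-cluster sum of squared distances, which scikit-learn exposes as the `inertia_` attribute of a fitted `KMeans` model. The sketch below computes that quantity by hand for a handful of labeled points. It is only an illustration: the helper name `inertia` is hypothetical, the centroids are recomputed from the sample points rather than taken from the fitted model, and latitude and longitude are treated as plain Euclidean coordinates, as the clustering above does.

```python
def inertia(points, labels):
    """Sum of squared distances from each point to its cluster centroid."""
    # Group the points by cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    total = 0.0
    for members in clusters.values():
        # The centroid is the coordinate-wise mean of the cluster members.
        center_lat = sum(p[0] for p in members) / len(members)
        center_lng = sum(p[1] for p in members) / len(members)
        total += sum((p[0] - center_lat) ** 2 + (p[1] - center_lng) ** 2
                     for p in members)
    return total


# First four rows of the table above, with their assigned cluster labels.
points = [(29.759, -95.359), (29.832, -95.472), (29.718, -95.426), (29.741, -95.392)]
labels = [7, 3, 11, 7]
print(inertia(points, labels))
```

Plotting this value for several values of k and looking for the point where it stops dropping quickly is one possible way to replace the arbitrary choice with a data-driven one.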


#### And we visualize the resulting clusters

In [72]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=9)

# set color scheme for the clusters
x = np.arange(num_clusters)
ys = [i+x+(i*x)**2 for i in range(num_clusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]




# add markers to the map
markers_colors = []
for lat, lon, zip_code, cluster in zip(Houston_data['Latitude'], Houston_data['Longitude'],Houston_data['Zip Code'], Houston_data['Cluster_label']):
    label = folium.Popup(str(zip_code) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=2,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

#### Now, let's get the top 100 venues in each cluster (if they exist) within a radius of about 5 miles of the cluster center.


#### Define Foursquare Credentials and Version


In [73]:
CLIENT_ID = 'INZNX14GS3IMP5RBM5UO552GMEZTOFP12GOUAHZO10AIPLPU' # your Foursquare ID
CLIENT_SECRET = 'ZOPPSDLYKDHPNC0K5BZQR2VUZSHHINGQSSEXNQGBI5140DIN' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: INZNX14GS3IMP5RBM5UO552GMEZTOFP12GOUAHZO10AIPLPU
CLIENT_SECRET:ZOPPSDLYKDHPNC0K5BZQR2VUZSHHINGQSSEXNQGBI5140DIN


First, let's create the GET request URL for cluster 0. Name your URL **url**.

In [74]:
LIMIT = 100
radius = 5000 * 1.64  # 8200 m, roughly 5 miles (1 mile is about 1609 m)
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    centers.loc[0, 'Cluster Center Lat'],   # latitude of the cluster 0 center
    centers.loc[0, 'Cluster Center Long'],  # longitude of the cluster 0 center
    radius, 
    LIMIT)

Send the GET request and examine the results

In [75]:
results = requests.get(url).json()
results
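
The `ll` parameter has to be a numeric `latitude,longitude` pair, so passing a label such as `Cluster 0` makes the API reject the request. A minimal sketch of a URL builder that fails early in that case is shown here. It uses only the endpoint and query parameters already used in this notebook; the function name `build_explore_url` and the placeholder credentials are hypothetical.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = 'https://api.foursquare.com/v2/venues/explore'


def build_explore_url(lat, lng, radius_m, limit, client_id, client_secret, version):
    """Build the explore URL; lat and lng must be numbers, not labels."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        # float() raises ValueError if a non-numeric value slips in.
        'll': '{},{}'.format(float(lat), float(lng)),
        'radius': radius_m,
        'limit': limit,
    }
    # safe=',' keeps the comma inside ll unescaped.
    return EXPLORE_ENDPOINT + '?' + urlencode(params, safe=',')


# Cluster 0 center as it appears in the venue table later in this notebook.
example_url = build_explore_url(29.813875, -95.238875, 8200, 100,
                                'YOUR_CLIENT_ID', 'YOUR_CLIENT_SECRET', '20180605')
print(example_url)
```

Building the query string with `urlencode` also avoids hand-formatting mistakes when a value contains characters that need escaping.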

We know that all the information is in the *items* key. Before we proceed, let's borrow the **get_category_type** function from the Foursquare lab.

In [76]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

Now we are ready to clean the json and structure it into a *pandas* dataframe.

In [77]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues =nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues

#### These are the venues within roughly 5 miles of the center of cluster 0

## Explore other clusters in Houston

#### Let's create a function to repeat the same process for all the clusters

In [None]:
def getNearbyVenues(names, latitudes, longitudes, radius=5000*1.64):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
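
The function above indexes straight into `["response"]['groups'][0]['items']`, which raises an exception when a request fails or a venue has no category. A more defensive extraction step could look like the sketch below. The helper name `extract_venues` and both sample payloads are hypothetical; they only mimic the response shape used in this notebook.

```python
def extract_venues(payload, name, lat, lng):
    """Return one tuple per venue, or an empty list if the response has no groups."""
    groups = payload.get('response', {}).get('groups', [])
    if not groups:
        return []
    rows = []
    for item in groups[0].get('items', []):
        venue = item['venue']
        categories = venue.get('categories', [])
        # Fall back to None when a venue carries no category.
        category = categories[0]['name'] if categories else None
        rows.append((name, lat, lng, venue['name'],
                     venue['location']['lat'], venue['location']['lng'], category))
    return rows


# Hypothetical payloads shaped like a failed and a successful response.
error_payload = {'meta': {'code': 400}, 'response': {}}
ok_payload = {'response': {'groups': [{'items': [
    {'venue': {'name': 'BONFIRE WINGS',
               'location': {'lat': 29.793568, 'lng': -95.193373},
               'categories': [{'name': 'Cajun / Creole Restaurant'}]}},
]}]}}

print(extract_venues(error_payload, 'Cluster 0', 29.813875, -95.238875))
print(extract_venues(ok_payload, 'Cluster 0', 29.813875, -95.238875))
```

With this shape, a failed request for one cluster contributes no rows instead of stopping the whole loop.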

### Now let's start exploring cluster 0 through cluster 14

In [78]:
Houston_venues = getNearbyVenues(names=centers['Cluster ID'],
                                latitudes = centers['Cluster Center Lat'],
                                longitudes = centers['Cluster Center Long'])

Houston_venues.head()

Cluster 0
Cluster 1
Cluster 2
Cluster 3
Cluster 4
Cluster 5
Cluster 6
Cluster 7
Cluster 8
Cluster 9
Cluster 10
Cluster 11
Cluster 12
Cluster 13
Cluster 14


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Cluster 0,29.813875,-95.238875,BONFIRE WINGS,29.793568,-95.193373,Cajun / Creole Restaurant
1,Cluster 0,29.813875,-95.238875,Pappasito's Cantina,29.773364,-95.229224,Mexican Restaurant
2,Cluster 0,29.813875,-95.238875,Pappadeaux Seafood Kitchen,29.7691,-95.21705,Seafood Restaurant
3,Cluster 0,29.813875,-95.238875,Saltgrass Steak House,29.770529,-95.226547,Steakhouse
4,Cluster 0,29.813875,-95.238875,Pappa's Bar B Q,29.76989,-95.21717,BBQ Joint


Let's check how many venues were returned for each cluster

In [79]:
Houston_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Cluster 0,100,100,100,100,100,100
Cluster 1,3,3,3,3,3,3
Cluster 10,100,100,100,100,100,100
Cluster 11,100,100,100,100,100,100
Cluster 12,100,100,100,100,100,100
Cluster 13,2,2,2,2,2,2
Cluster 14,100,100,100,100,100,100
Cluster 2,100,100,100,100,100,100
Cluster 3,100,100,100,100,100,100
Cluster 4,8,8,8,8,8,8


#### We will drop clusters with fewer than 100 venues

In [80]:
Houston_venues.drop(Houston_venues[Houston_venues['Neighborhood']=='Cluster 1'].index,inplace=True)
Houston_venues.drop(Houston_venues[Houston_venues['Neighborhood']=='Cluster 4'].index,inplace=True)
Houston_venues.drop(Houston_venues[Houston_venues['Neighborhood']=='Cluster 8'].index,inplace=True)
Houston_venues.drop(Houston_venues[Houston_venues['Neighborhood']=='Cluster 13'].index,inplace=True)
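
The cell above names the sparse clusters by hand. The same rule, "keep only clusters that returned at least 100 venues", can also be derived from the counts so that the list does not need to be maintained manually. The following is a plain-Python sketch of that logic on toy data; the helper name `keep_full_clusters` is hypothetical and the threshold is lowered to 2 so the example stays readable.

```python
from collections import Counter


def keep_full_clusters(rows, min_venues=100):
    """Keep only rows whose cluster has at least min_venues rows."""
    counts = Counter(cluster for cluster, _venue in rows)
    return [row for row in rows if counts[row[0]] >= min_venues]


# Toy (cluster, venue) pairs; 'Cluster 1' has too few venues and is dropped.
rows = [('Cluster 0', 'A'), ('Cluster 0', 'B'), ('Cluster 1', 'C')]
print(keep_full_clusters(rows, min_venues=2))
```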

In [81]:
Houston_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Cluster 0,29.813875,-95.238875,BONFIRE WINGS,29.793568,-95.193373,Cajun / Creole Restaurant
1,Cluster 0,29.813875,-95.238875,Pappasito's Cantina,29.773364,-95.229224,Mexican Restaurant
2,Cluster 0,29.813875,-95.238875,Pappadeaux Seafood Kitchen,29.7691,-95.21705,Seafood Restaurant
3,Cluster 0,29.813875,-95.238875,Saltgrass Steak House,29.770529,-95.226547,Steakhouse
4,Cluster 0,29.813875,-95.238875,Pappa's Bar B Q,29.76989,-95.21717,BBQ Joint


In [82]:
Houston_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Cluster 0,100,100,100,100,100,100
Cluster 10,100,100,100,100,100,100
Cluster 11,100,100,100,100,100,100
Cluster 12,100,100,100,100,100,100
Cluster 14,100,100,100,100,100,100
Cluster 2,100,100,100,100,100,100
Cluster 3,100,100,100,100,100,100
Cluster 5,100,100,100,100,100,100
Cluster 6,100,100,100,100,100,100
Cluster 7,100,100,100,100,100,100


#### Let's find out how many unique categories can be curated from all the returned venues

In [83]:
print('There are {} unique categories.'.format(len(Houston_venues['Venue Category'].unique())))

There are 181 unique categories.


## Analyze Each Cluster


In [84]:
# one hot encoding
Houston_onehot = pd.get_dummies(Houston_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
Houston_onehot['Neighborhood'] = Houston_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [Houston_onehot.columns[-1]] + list(Houston_onehot.columns[:-1])
Houston_onehot = Houston_onehot[fixed_columns]

Houston_onehot.head()

Unnamed: 0,Neighborhood,Accessories Store,African Restaurant,Airport Service,American Restaurant,Antique Shop,Arcade,Argentinian Restaurant,Art Gallery,Art Museum,...,Toy / Game Store,Trail,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wings Joint,Women's Store,Yoga Studio
0,Cluster 0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Cluster 0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Cluster 0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Cluster 0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Cluster 0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


And let's examine the new dataframe size.

In [86]:
Houston_onehot.shape

(1100, 182)

#### Next, let's group rows by neighborhood and take the mean of the frequency of occurrence of each category

In [87]:
Houston_grouped = Houston_onehot.groupby('Neighborhood').mean().reset_index()
Houston_grouped

Unnamed: 0,Neighborhood,Accessories Store,African Restaurant,Airport Service,American Restaurant,Antique Shop,Arcade,Argentinian Restaurant,Art Gallery,Art Museum,...,Toy / Game Store,Trail,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wings Joint,Women's Store,Yoga Studio
0,Cluster 0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.02,0.01,0.01,0.0,0.01,0.0,0.0
1,Cluster 10,0.0,0.0,0.0,0.03,0.0,0.01,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0
2,Cluster 11,0.01,0.01,0.0,0.04,0.0,0.0,0.01,0.0,0.0,...,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0
3,Cluster 12,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.0
4,Cluster 14,0.01,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,...,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0
5,Cluster 2,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.01,...,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Cluster 3,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,...,0.0,0.02,0.0,0.0,0.02,0.0,0.03,0.0,0.0,0.01
7,Cluster 5,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,...,0.0,0.0,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.0
8,Cluster 6,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0
9,Cluster 7,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.01,0.02,...,0.0,0.02,0.0,0.0,0.04,0.0,0.03,0.0,0.0,0.01
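
Taking the mean of one-hot columns within a group gives, for each category, the share of that cluster's venues that fall in the category. The sketch below reproduces the idea in plain Python on toy data so the arithmetic is easy to check; the helper name `category_frequencies` is hypothetical.

```python
from collections import Counter, defaultdict


def category_frequencies(rows):
    """Map each cluster to {category: share of that cluster's venues}."""
    by_cluster = defaultdict(Counter)
    for cluster, category in rows:
        by_cluster[cluster][category] += 1
    return {
        cluster: {cat: n / sum(counts.values()) for cat, n in counts.items()}
        for cluster, counts in by_cluster.items()
    }


# Toy (cluster, category) pairs; the real input has one pair per venue.
rows = [
    ('Cluster 0', 'Mexican Restaurant'),
    ('Cluster 0', 'Mexican Restaurant'),
    ('Cluster 0', 'Steakhouse'),
    ('Cluster 0', 'BBQ Joint'),
    ('Cluster 2', 'Hotel'),
]
frequencies = category_frequencies(rows)
print(frequencies['Cluster 0']['Mexican Restaurant'])
```

Here two of the four venues in the toy `Cluster 0` are Mexican restaurants, so the frequency is 0.5, the same number a grouped mean over one-hot columns would give.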


#### Let's confirm the new size


In [89]:
Houston_grouped.shape

(11, 182)

#### Let's print each neighborhood along with the top 20 most common venues


In [91]:
num_top_venues = 20

for hood in Houston_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = Houston_grouped[Houston_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Cluster 0----
                   venue  freq
0     Mexican Restaurant  0.15
1     Seafood Restaurant  0.05
2         Discount Store  0.05
3         Sandwich Place  0.05
4               Pharmacy  0.04
5         Ice Cream Shop  0.04
6              BBQ Joint  0.04
7     Chinese Restaurant  0.03
8       Department Store  0.03
9                   Park  0.03
10  Gym / Fitness Center  0.03
11          Burger Joint  0.03
12   American Restaurant  0.03
13           Pizza Place  0.03
14  Fast Food Restaurant  0.02
15            Steakhouse  0.02
16     Convenience Store  0.02
17            Donut Shop  0.02
18            Taco Place  0.02
19   Fried Chicken Joint  0.02


----Cluster 10----
                        venue  freq
0                 Coffee Shop  0.06
1          Mexican Restaurant  0.05
2        Fast Food Restaurant  0.05
3                Burger Joint  0.05
4            Asian Restaurant  0.04
5                       Hotel  0.03
6                    Pharmacy  0.03
7            Sushi Res

#### Let's put that into a *pandas* dataframe

First, let's write a function to sort the venues in descending order.

In [92]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

Now let's create the new dataframe and display the top 20 venues for each neighborhood.

In [94]:
num_top_venues = 20

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = Houston_grouped['Neighborhood']

for ind in np.arange(Houston_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(Houston_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,Cluster 0,Mexican Restaurant,Discount Store,Sandwich Place,Seafood Restaurant,Ice Cream Shop,BBQ Joint,Pharmacy,Pizza Place,Burger Joint,...,American Restaurant,Park,Gym / Fitness Center,Department Store,Video Game Store,Taco Place,Fast Food Restaurant,Fried Chicken Joint,Steakhouse,Grocery Store
1,Cluster 10,Coffee Shop,Mexican Restaurant,Burger Joint,Fast Food Restaurant,Asian Restaurant,Hotel,Pharmacy,BBQ Joint,Sushi Restaurant,...,American Restaurant,Fried Chicken Joint,Italian Restaurant,Cajun / Creole Restaurant,Deli / Bodega,Pizza Place,Donut Shop,Grocery Store,Vietnamese Restaurant,Liquor Store
2,Cluster 11,Burger Joint,Ice Cream Shop,American Restaurant,Sushi Restaurant,Steakhouse,Clothing Store,Café,Dessert Shop,Mexican Restaurant,...,Smoke Shop,Bakery,Shopping Mall,Seafood Restaurant,Cajun / Creole Restaurant,Fried Chicken Joint,Deli / Bodega,Department Store,Jewelry Store,Electronics Store
3,Cluster 12,Mexican Restaurant,Hotel,Burger Joint,Seafood Restaurant,Sandwich Place,Rental Car Location,Discount Store,Pharmacy,Donut Shop,...,Cajun / Creole Restaurant,Furniture / Home Store,Electronics Store,Flea Market,Ice Cream Shop,Grocery Store,American Restaurant,Fast Food Restaurant,Brewery,Chinese Restaurant
4,Cluster 14,Seafood Restaurant,Mexican Restaurant,American Restaurant,Fast Food Restaurant,Italian Restaurant,Coffee Shop,Pharmacy,Grocery Store,Smoothie Shop,...,Breakfast Spot,Cajun / Creole Restaurant,Steakhouse,Jewelry Store,Pizza Place,Park,Donut Shop,Bar,Bookstore,Toy / Game Store
5,Cluster 2,Hotel,Cocktail Bar,Coffee Shop,Mexican Restaurant,American Restaurant,Beer Garden,Ice Cream Shop,Theater,Plaza,...,Sandwich Place,Steakhouse,Restaurant,Historic Site,Lounge,Concert Hall,New American Restaurant,Dessert Shop,Italian Restaurant,Church
6,Cluster 3,Burger Joint,Coffee Shop,BBQ Joint,Sports Bar,Bar,Italian Restaurant,Taco Place,Beer Garden,Mexican Restaurant,...,Fast Food Restaurant,Gym / Fitness Center,Wine Bar,Seafood Restaurant,Park,Cocktail Bar,Donut Shop,Brewery,Pizza Place,Chinese Restaurant
7,Cluster 5,Grocery Store,Mexican Restaurant,Park,Fast Food Restaurant,Burger Joint,Pizza Place,Pub,Asian Restaurant,Cajun / Creole Restaurant,...,Latin American Restaurant,Vietnamese Restaurant,Mediterranean Restaurant,Noodle House,Liquor Store,Café,Department Store,Bookstore,Middle Eastern Restaurant,Seafood Restaurant
8,Cluster 6,Sushi Restaurant,Grocery Store,Mexican Restaurant,Coffee Shop,Science Museum,Bakery,American Restaurant,Park,Cocktail Bar,...,Cosmetics Shop,Spa,Liquor Store,Sandwich Place,Sporting Goods Shop,Furniture / Home Store,Chinese Restaurant,Breakfast Spot,Pizza Place,Burger Joint
9,Cluster 7,Pizza Place,Coffee Shop,Park,Bar,Cocktail Bar,Mexican Restaurant,Vietnamese Restaurant,American Restaurant,Brewery,...,Beer Garden,Wine Bar,Concert Hall,Cajun / Creole Restaurant,Baseball Stadium,Burger Joint,Steakhouse,Breakfast Spot,Theater,Art Museum
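
The column names above rely on a three-element suffix list plus a fallback to `th`, which works for 1 through 20 but would label a 21st column as `21th`. A small helper that handles the general case could look like this; the name `ordinal` is hypothetical and the sketch is independent of pandas.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1st, 2nd, 3rd, 4th, 11th, 21st."""
    if 10 <= n % 100 <= 20:
        # 11th, 12th and 13th are the exceptions to the last-digit rule.
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)


columns = ['Neighborhood'] + ['{} Most Common Venue'.format(ordinal(i)) for i in range(1, 21)]
print(columns[1], columns[11], columns[20])
```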


## Conclusion

Based on the table above and the map of clusters, anyone who wants to establish
a venue can decide which cluster, and which zip codes within it, would suit the
venue best.

For example, if you are a business person who wants to open a Mexican restaurant,
the table gives strong evidence that clusters 0, 9, and 12 are the best clusters
in Houston for it. The zip codes in these clusters can be found in the
**Houston_data** table.
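
To make this kind of lookup repeatable for any venue category, the clusters can be ranked by the frequency of that category. The sketch below does this on a small dictionary; the helper name `rank_clusters` is hypothetical, and the frequencies are copied from the printed output for clusters 0 and 10 only, so it is an illustration rather than the full ranking.

```python
def rank_clusters(frequencies, category):
    """Sort clusters by how common the category is, most common first."""
    scored = [(freqs.get(category, 0.0), cluster)
              for cluster, freqs in frequencies.items()]
    return [cluster for _score, cluster in sorted(scored, reverse=True)]


# Frequencies taken from the per-cluster printout for two clusters only.
frequencies = {
    'Cluster 0': {'Mexican Restaurant': 0.15, 'Seafood Restaurant': 0.05},
    'Cluster 10': {'Coffee Shop': 0.06, 'Mexican Restaurant': 0.05},
}
print(rank_clusters(frequencies, 'Mexican Restaurant'))
```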