# THE BATTLE OF THE NEIGHBORHOODS

### Applied Data Science Capstone Project by IBM/Coursera

## Table of Contents

* [INTRODUCTION: Description & Background](#introduction)
* [DATA: Source, Acquisition & Cleaning](#data)
* [METHODOLOGY](#methodology)
    * [Foursquare](#foursquare)
    * [Distance](#distance)
* [ANALYSIS & DISCUSSION](#analysis)
* [CONCLUSION](#conclusion)

## INTRODUCTION <a name="introduction"> </a>

### Description and Background

In this project, we'll explore a business opportunity for a Korean fried chicken and bar restaurant in the neighborhoods around Austin, Texas. The fried-chicken-and-bar format has very wide appeal in Korea across all age groups and is particularly popular among young college students. Korean fried chicken and bar restaurants are also very popular in places like Los Angeles and San Jose, where there are large Korean communities.

We'll explore many of the neighborhoods surrounding downtown Austin as candidate locations for the business. Rather than looking for a location away from other food establishments, we'll look to set up the business in a neighborhood where there is a concentration of restaurants.

* Opportunity: Korean Fried Chicken and Bar
* Location: A neighborhood near Austin, Texas
* Target Audience: The Korean community, though we believe it will have mass appeal across all age groups and among non-Koreans.


## DATA <a name="data"> </a>

### Sources to Explore

A list of neighborhoods in Austin, Texas will be obtained from Wikipedia. The data will be cleaned up and processed through geopy's Nominatim geocoder to extract the latitude and longitude of each neighborhood.

Next, the data will be processed through the Foursquare API to obtain venues around each neighborhood. We'll analyze the resulting data to find the neighborhood with the highest concentration of fast food, restaurant, fried chicken, and bar businesses and use it as the location choice.

We'll also obtain and analyze data on population size, growth, and demographics in Austin for additional support of the business opportunity thesis.

Data Sources:

1. Wikipedia, neighborhoods around Austin, Texas: https://en.wikipedia.org/wiki/List_of_Austin_neighborhoods
2. World Population Review: http://worldpopulationreview.com/us-cities/austin-population/
3. Austin Asian Chamber of Commerce: http://www.austinasianchamber.org/asian-data






## Data Acquisition and Cleaning <a name="dataprep"> </a>

In [1]:
import numpy as np    # library to handle data in a vectorized manner

import pandas as pd    # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

# Matplotlib and associated plotting modules
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import matplotlib.colors as colors
%matplotlib inline

import requests
import json    # library to handle JSON files
from pandas import json_normalize    # top-level import; the pandas.io.json path was deprecated in pandas 1.0

# uncomment this line if you haven't completed the Foursquare API lab
#!conda install -c conda-forge geopy --yes 
import geopy as gp
from geopy import distance
from geopy.geocoders import Nominatim    # convert an address into latitude and longitude values
gp.geocoders.options.default_user_agent = "agentpython"

# import k-means from clustering stage
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs    # the samples_generator submodule was removed in scikit-learn 0.24

# uncomment this line if you haven't completed the Foursquare API lab
#!conda install -c conda-forge folium=0.5.0 --yes 
import folium    # map rendering library

print('Libraries imported.')

Libraries imported.


In [2]:
# Neighborhoods near Austin, Texas
# Clean up the data to keep only the neighborhood name column
url = 'https://en.wikipedia.org/wiki/List_of_Austin_neighborhoods'
df = pd.read_html(url, header=0)[0]
df.rename(columns={'Name':'Neighborhood'},inplace=True)
df.drop(df.columns[1],axis=1,inplace=True)

print(df)
print()
print(df.info())


                Neighborhood
0               Bryker Woods
1            Caswell Heights
2            Downtown Austin
3                  Eastwoods
4                    Hancock
5                   Heritage
6                  Hyde Park
7               Judges' Hill
8         Lower Waller Creek
9           North University
10           Oakmont Heights
11               Old Enfield
12          Old Pecan Street
13           Old West Austin
14           Original Austin
15  Original West University
16         Pemberton Heights
17                  Ridgelea
18                  Ridgetop
19                  Rosedale
20               Shoal Crest
21             West Downtown

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 22 entries, 0 to 21
Data columns (total 1 columns):
Neighborhood    22 non-null object
dtypes: object(1)
memory usage: 256.0+ bytes
None


In [3]:

# Retrieve location information using the Nominatim geocoder
alat = []
alng = []
aloc = []
aneigh = []

geolocator = Nominatim(user_agent="agent")    # create the geocoder once and reuse it
for neigh in df['Neighborhood']:
    location = geolocator.geocode(neigh+' Texas')
    
    if pd.isnull(location):
        df.drop(df[df['Neighborhood'] == neigh].index,inplace=True)
    elif (int(location.latitude) != 30) or (int(location.longitude) != -97):
        df.drop(df[df['Neighborhood'] == neigh].index,inplace=True)        
    else:
        alat.append(location.latitude)
        alng.append(location.longitude)
        aloc.append(location.address)
        aneigh.append(neigh)
#         print('{:.8f} | {:.8f} | {} |{}'.format(location.latitude, location.longitude, neigh, location.address))

# Convert list to Series to be added to the main df
alat = pd.Series(alat, name='Latitude')
alng = pd.Series(alng, name='Longitude')
aloc = pd.Series(aloc, name='Address')
aneigh = pd.Series(aneigh, name='Neighborhood')

# Reset index for df since 1 or more rows were dropped for lack of data
df.reset_index(drop=True, inplace=True)

# Create DF for the Series
austin = pd.concat([df, alat, alng, aloc],axis=1)
austin

Unnamed: 0,Neighborhood,Latitude,Longitude,Address
0,Bryker Woods,30.305246,-97.754585,"Bryker Woods, Austin, Travis County, Texas, 78..."
1,Downtown Austin,30.268054,-97.744764,"Downtown, Austin, Travis County, Texas, 78701,..."
2,Eastwoods,30.290562,-97.731418,"Eastwoods Neighborhood Park, 3001, Hancock, Au..."
3,Hyde Park,30.304412,-97.730448,"Hyde Park, Austin, Travis County, Texas, 78751..."
4,North University,30.284151,-97.731956,"University of Texas at Austin, 1, East 23rd St..."
5,Oakmont Heights,30.311989,-97.75407,"Oakmont Heights, Austin, Travis County, Texas,..."
6,Old Enfield,30.284864,-97.759106,"Enfield, Austin, Travis County, Texas, 78703, USA"
7,Old West Austin,30.296822,-97.754851,"Old West Austin, Austin, Travis County, Texas,..."
8,Original Austin,30.273778,-97.72048,The Original New Orleans Po-Boy and Gumbo Shop...
9,Shoal Crest,30.297465,-97.747888,"Shoal Crest Avenue, Old West Austin, Austin, T..."
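
The geocoding cell keeps a result only when `int(latitude) == 30` and `int(longitude) == -97`. The same test can be written as an explicit bounding box, which makes the accepted range visible and easy to tighten. The sketch below is illustrative: `AUSTIN_BBOX` and `in_bbox` are hypothetical names, and the default bounds simply mirror the integer check. Note that a coordinate filter cannot catch name mismatches; in the table above, "Original Austin" resolved to "The Original New Orleans Po-Boy and Gumbo Shop", which still falls inside the box.

```python
# Bounds equivalent to int(lat) == 30 and int(lng) == -97.
# The dictionary and function names are hypothetical, not part of the notebook.
AUSTIN_BBOX = {"lat_min": 30.0, "lat_max": 31.0, "lng_min": -98.0, "lng_max": -97.0}


def in_bbox(lat, lng, bbox=AUSTIN_BBOX):
    """Return True when (lat, lng) falls inside the bounding box."""
    lat_ok = bbox["lat_min"] <= lat < bbox["lat_max"]
    # int() truncates toward zero, so int(lng) == -97 means -98 < lng <= -97.
    lng_ok = bbox["lng_min"] < lng <= bbox["lng_max"]
    return lat_ok and lng_ok


print(in_bbox(30.305246, -97.754585))  # Bryker Woods, from the table above
print(in_bbox(29.7604, -95.3698))      # a point well outside the Austin area
```

Passing a tighter `bbox` would reject matches elsewhere in Travis County without changing the loop.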


In [4]:
print(austin.info())
print()
print('Shape: ',austin.shape)


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 4 columns):
Neighborhood    10 non-null object
Latitude        10 non-null float64
Longitude       10 non-null float64
Address         10 non-null object
dtypes: float64(2), object(2)
memory usage: 400.0+ bytes
None

Shape:  (10, 4)


## Map of Austin Neighborhoods

In [5]:
print('The dataframe has {} neighborhoods. Shape {}'
    .format(len(austin['Neighborhood'].unique()),austin.shape[0]))

# Get geo code for the address
address = 'Austin Texas'
geolocator = Nominatim(user_agent="agent2")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('Geographical coordinates of {} are {}, {}.'.format(address, latitude, longitude))


# create map of Austin using latitude and longitude values
map_austin = folium.Map(location=[latitude, longitude], zoom_start=12) 

# add markers to map
for lat, lng, label in zip(austin['Latitude'], 
                           austin['Longitude'], 
                           austin['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_austin)  
    
map_austin

The dataframe has 10 neighborhoods. Shape 10
Geographical coordinates of Austin Texas are 30.2711286, -97.7436995.


## METHODOLOGY<a name="methodology"> </a>

## Foursquare<a name="foursquare"> </a>

### Austin, Texas Neighborhoods

In [6]:
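# Foursquare API credentials were defined in this cell and removed before publishing.
# CLIENT_ID, CLIENT_SECRET, and VERSION must be set here for the cells below to run.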
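
The cells that follow reference `CLIENT_ID`, `CLIENT_SECRET`, and `VERSION`, which are not defined in the published notebook. One way to supply them without committing secrets is to read them from environment variables. This is a minimal sketch under stated assumptions: the function name, the environment variable names, and the default version date are all hypothetical.

```python
import os


def load_foursquare_credentials(env=None):
    """Read Foursquare credentials from environment variables.

    The variable names FOURSQUARE_CLIENT_ID, FOURSQUARE_CLIENT_SECRET, and
    FOURSQUARE_VERSION are hypothetical; rename them to match your setup.
    """
    env = os.environ if env is None else env
    required = ("FOURSQUARE_CLIENT_ID", "FOURSQUARE_CLIENT_SECRET")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # The v2 API takes a YYYYMMDD version date; this default value is an assumption.
    version = env.get("FOURSQUARE_VERSION", "20180605")
    return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"], version
```

With the variables set in the shell, the notebook cell would reduce to `CLIENT_ID, CLIENT_SECRET, VERSION = load_foursquare_credentials()`.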



In [7]:
austin.loc[0,'Neighborhood']
neighborhood_latitude = austin.loc[0, 'Latitude']
neighborhood_longitude = austin.loc[0, 'Longitude']
neighborhood_name = austin.loc[0, 'Neighborhood'] 

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

# Set limits
LIMIT = 100
radius = 500
urlfs = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)

urlfs # display URL
results = requests.get(urlfs).json()
results

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']
    
venues = results['response']['groups'][0]['items']    
nearby_venues = json_normalize(venues)    # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

Latitude and longitude values of Bryker Woods are 30.305246, -97.7545846.


Unnamed: 0,name,categories,lat,lng
0,Olive & June,Italian Restaurant,30.30745,-97.751046
1,Salon Hush,Cosmetics Shop,30.307783,-97.752403
2,Austin Flower Company,Flower Shop,30.307787,-97.751224
3,Tiny Boxwoods,American Restaurant,30.306058,-97.749789
4,Brykerwood Veterinary Clinic,Veterinarian,30.305978,-97.749611


In [8]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

21 venues were returned by Foursquare.


### 2. Explore Neighborhoods in Austin Texas

In [9]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighbourhood Latitude', 
                  'Neighbourhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues

# Nearby venues
austin_venues = getNearbyVenues(names=austin['Neighborhood'],
                                   latitudes=austin['Latitude'],
                                   longitudes=austin['Longitude']
                                  )

print(austin_venues.shape)
print('There are {} unique categories.'.format(len(austin_venues['Venue Category'].unique())))

# austin_venues.head()
austin_venues.groupby('Neighborhood').count()


Bryker Woods
Downtown Austin
Eastwoods
Hyde Park
North University
Oakmont Heights
Old Enfield
Old West Austin
Original Austin
Shoal Crest
(244, 7)
There are 112 unique categories.


Neighborhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Bryker Woods,21,21,21,21,21,21
Downtown Austin,100,100,100,100,100,100
Eastwoods,20,20,20,20,20,20
Hyde Park,20,20,20,20,20,20
North University,21,21,21,21,21,21
Oakmont Heights,14,14,14,14,14,14
Old Enfield,5,5,5,5,5,5
Old West Austin,6,6,6,6,6,6
Original Austin,20,20,20,20,20,20
Shoal Crest,17,17,17,17,17,17
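
`getNearbyVenues` indexes `v['venue']['categories'][0]` directly, which would raise an `IndexError` for a venue that has an empty category list. The sketch below shows one defensive way to flatten the response items. The helper names and the sample payload are hypothetical; the payload only mimics the keys the notebook code accesses.

```python
def venue_category(venue):
    """Return the first category name of a venue dict, or None if it has none."""
    categories = venue.get("categories") or []
    return categories[0]["name"] if categories else None


def flatten_items(name, lat, lng, items):
    """Turn a list of response items into row tuples for one neighborhood."""
    rows = []
    for item in items:
        v = item["venue"]
        rows.append((name, lat, lng,
                     v["name"],
                     v["location"]["lat"],
                     v["location"]["lng"],
                     venue_category(v)))
    return rows


# Made-up items shaped like the fields used in the notebook.
sample_items = [
    {"venue": {"name": "Venue A", "location": {"lat": 30.30, "lng": -97.75},
               "categories": [{"name": "Bar"}]}},
    {"venue": {"name": "Venue B", "location": {"lat": 30.31, "lng": -97.76},
               "categories": []}},
]
for row in flatten_items("Example Hood", 30.3, -97.75, sample_items):
    print(row)
```

Rows with a `None` category can then be dropped or kept explicitly before the one-hot encoding step.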


### 3. Analyze Each Neighborhood

In [10]:
# one hot encoding
austin_onehot = pd.get_dummies(austin_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
austin_onehot['Neighborhood'] = austin_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [austin_onehot.columns[-1]] + list(austin_onehot.columns[:-1])
austin_onehot = austin_onehot[fixed_columns]
austin_onehot.head()
austin_onehot.shape
austin_grouped = austin_onehot.groupby('Neighborhood').mean().reset_index()
austin_grouped
austin_grouped.shape

num_top_venues = 10
for hood in austin_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = austin_grouped[austin_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')
    

----Bryker Woods----
                  venue  freq
0    Italian Restaurant  0.10
1   Rental Car Location  0.10
2                  Park  0.10
3             Gift Shop  0.05
4        Rental Service  0.05
5           Pizza Place  0.05
6        Clothing Store  0.05
7  Fast Food Restaurant  0.05
8           Flower Shop  0.05
9              Pharmacy  0.05


----Downtown Austin----
                venue  freq
0         Coffee Shop  0.07
1               Hotel  0.07
2                 Bar  0.06
3  Italian Restaurant  0.04
4          Restaurant  0.04
5             Gay Bar  0.04
6        Cocktail Bar  0.04
7              Lounge  0.04
8          Steakhouse  0.03
9    Sushi Restaurant  0.03


----Eastwoods----
                   venue  freq
0         Sandwich Place  0.15
1                    Pub  0.10
2                    ATM  0.05
3             Sports Bar  0.05
4      Convenience Store  0.05
5           Concert Hall  0.05
6  College Arts Building  0.05
7            Coffee Shop  0.05
8             Pl
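
The `freq` values printed above are the mean of each one-hot column per neighborhood, which is the share of that neighborhood's venues falling in each category. The plain-Python sketch below reproduces that calculation on made-up data to show what `groupby('Neighborhood').mean()` is computing; the function name and the sample rows are illustrative only.

```python
from collections import Counter, defaultdict


def category_frequencies(rows):
    """Compute each category's share of venues per neighborhood.

    rows is an iterable of (neighborhood, category) pairs.
    """
    counts = defaultdict(Counter)
    for hood, category in rows:
        counts[hood][category] += 1
    return {
        hood: {cat: n / sum(counter.values()) for cat, n in counter.items()}
        for hood, counter in counts.items()
    }


# Made-up venues: four in neighborhood "A", two in neighborhood "B".
sample_rows = [
    ("A", "Bar"), ("A", "Bar"), ("A", "Park"), ("A", "Cafe"),
    ("B", "Bar"), ("B", "Hotel"),
]
freqs = category_frequencies(sample_rows)
print(freqs["A"])  # Bar makes up half of A's venues
print(freqs["B"])
```

Because the values are shares rather than counts, a neighborhood with only a handful of venues can show a high frequency from a single venue.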

### Convert into pandas dataframe

In [11]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = austin_grouped['Neighborhood']

for ind in np.arange(austin_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(austin_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Bryker Woods,Park,Rental Car Location,Italian Restaurant,Clothing Store,Flower Shop,Recreation Center,Rental Service,Cosmetics Shop,Bridal Shop,Pharmacy
1,Downtown Austin,Hotel,Coffee Shop,Bar,Gay Bar,Restaurant,Cocktail Bar,Lounge,Italian Restaurant,Burger Joint,Sushi Restaurant
2,Eastwoods,Sandwich Place,Pub,Bubble Tea Shop,History Museum,Mexican Restaurant,Convenience Store,Concert Hall,Park,Performing Arts Venue,College Arts Building
3,Hyde Park,Italian Restaurant,Cheese Shop,Bus Station,Lighthouse,Pool,Mexican Restaurant,Salon / Barbershop,Food Truck,Bed & Breakfast,Bakery
4,North University,History Museum,Fast Food Restaurant,Performing Arts Venue,Coffee Shop,Pool,Sandwich Place,Fountain,Football Stadium,Food Truck,Outdoor Sculpture


### 4. Cluster Neighborhoods with KMeans and Map

In [12]:
# set number of clusters
kclusters = 6
austin_grouped_clustering = austin_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(austin_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 1, 3, 1, 3, 0, 2, 4, 1, 5], dtype=int32)
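
With 10 neighborhoods and `kclusters = 6`, it is worth checking how many neighborhoods land in each cluster before reading much into the groupings. A small sketch using the label array printed above:

```python
from collections import Counter

# Cluster labels copied from the kmeans.labels_ output above.
labels = [0, 1, 3, 1, 3, 0, 2, 4, 1, 5]

sizes = Counter(labels)
singletons = sorted(k for k, n in sizes.items() if n == 1)

print(dict(sorted(sizes.items())))  # {0: 2, 1: 3, 2: 1, 3: 2, 4: 1, 5: 1}
print(singletons)                   # [2, 4, 5]
```

Three of the six clusters contain a single neighborhood, so a smaller `kclusters` may be worth trying as a comparison.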

In [13]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
austin_merged = austin

# join the sorted venue table onto austin so each neighborhood keeps its latitude/longitude
austin_merged = austin_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')
austin_merged.head() # check the last columns!

Unnamed: 0,Neighborhood,Latitude,Longitude,Address,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Bryker Woods,30.305246,-97.754585,"Bryker Woods, Austin, Travis County, Texas, 78...",0,Park,Rental Car Location,Italian Restaurant,Clothing Store,Flower Shop,Recreation Center,Rental Service,Cosmetics Shop,Bridal Shop,Pharmacy
1,Downtown Austin,30.268054,-97.744764,"Downtown, Austin, Travis County, Texas, 78701,...",1,Hotel,Coffee Shop,Bar,Gay Bar,Restaurant,Cocktail Bar,Lounge,Italian Restaurant,Burger Joint,Sushi Restaurant
2,Eastwoods,30.290562,-97.731418,"Eastwoods Neighborhood Park, 3001, Hancock, Au...",3,Sandwich Place,Pub,Bubble Tea Shop,History Museum,Mexican Restaurant,Convenience Store,Concert Hall,Park,Performing Arts Venue,College Arts Building
3,Hyde Park,30.304412,-97.730448,"Hyde Park, Austin, Travis County, Texas, 78751...",1,Italian Restaurant,Cheese Shop,Bus Station,Lighthouse,Pool,Mexican Restaurant,Salon / Barbershop,Food Truck,Bed & Breakfast,Bakery
4,North University,30.284151,-97.731956,"University of Texas at Austin, 1, East 23rd St...",3,History Museum,Fast Food Restaurant,Performing Arts Venue,Coffee Shop,Pool,Sandwich Place,Fountain,Football Stadium,Food Truck,Outdoor Sculpture


In [14]:
# Create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=12)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(austin_merged['Latitude'], austin_merged['Longitude'], austin_merged['Neighborhood'], austin_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### 5. Examine Clusters

In [15]:
# Cluster 0
austin_merged.loc[austin_merged['Cluster Labels'] == 0, austin_merged.columns[[0] + list(range(5,austin_merged.shape[1]))]]
# print(austin_merged)

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Bryker Woods,Park,Rental Car Location,Italian Restaurant,Clothing Store,Flower Shop,Recreation Center,Rental Service,Cosmetics Shop,Bridal Shop,Pharmacy
5,Oakmont Heights,Café,Rental Service,Bridal Shop,Gift Shop,Pharmacy,Sandwich Place,Recreation Center,Asian Restaurant,Rental Car Location,Music Store


In [16]:
# Cluster 1
austin_merged.loc[austin_merged['Cluster Labels'] == 1, austin_merged.columns[[0] + list(range(5, austin_merged.shape[1]))]]


Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Downtown Austin,Hotel,Coffee Shop,Bar,Gay Bar,Restaurant,Cocktail Bar,Lounge,Italian Restaurant,Burger Joint,Sushi Restaurant
3,Hyde Park,Italian Restaurant,Cheese Shop,Bus Station,Lighthouse,Pool,Mexican Restaurant,Salon / Barbershop,Food Truck,Bed & Breakfast,Bakery
8,Original Austin,Cocktail Bar,Park,Convenience Store,Bar,Southern / Soul Food Restaurant,Bagel Shop,Cajun / Creole Restaurant,Café,Lighthouse,Bakery


In [17]:
# Cluster 2
austin_merged.loc[austin_merged['Cluster Labels'] == 2, austin_merged.columns[[0] + list(range(5, austin_merged.shape[1]))]]


Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
6,Old Enfield,Italian Restaurant,American Restaurant,Grocery Store,Bakery,Food,Fast Food Restaurant,Comedy Club,Concert Hall,Convenience Store,Cosmetics Shop


In [18]:
# Cluster 3
austin_merged.loc[austin_merged['Cluster Labels'] == 3, austin_merged.columns[[0] + list(range(5, austin_merged.shape[1]))]]


Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Eastwoods,Sandwich Place,Pub,Bubble Tea Shop,History Museum,Mexican Restaurant,Convenience Store,Concert Hall,Park,Performing Arts Venue,College Arts Building
4,North University,History Museum,Fast Food Restaurant,Performing Arts Venue,Coffee Shop,Pool,Sandwich Place,Fountain,Football Stadium,Food Truck,Outdoor Sculpture


In [19]:
# Cluster 4
austin_merged.loc[austin_merged['Cluster Labels'] == 4, austin_merged.columns[[0] + list(range(5, austin_merged.shape[1]))]]


Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
7,Old West Austin,Park,Electronics Store,Tanning Salon,Food Truck,Shop & Service,Yoga Studio,College Gym,Comedy Club,Concert Hall,Convenience Store


In [20]:
# Cluster 5
austin_merged.loc[austin_merged['Cluster Labels'] == 5, austin_merged.columns[[0] + list(range(5, austin_merged.shape[1]))]]


Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
9,Shoal Crest,Yoga Studio,Sporting Goods Shop,Men's Store,Cupcake Shop,Credit Union,New American Restaurant,Convenience Store,Outdoor Sculpture,Burrito Place,Bookstore


## Distances of Selected Neighborhoods<a name="distance"> </a>

In [28]:
loc_university = (30.2849,-97.7341)  # from Google
loc_downtown = (30.268054,-97.744764)
loc_originalaustin = (30.273778,-97.720480)
loc_hydepark =(30.304412,-97.730448)

dist_downtown_oa = distance.distance(loc_downtown, loc_originalaustin).miles
dist_downtown_hyde = distance.distance(loc_downtown, loc_hydepark).miles
dist_oa_hyde = distance.distance(loc_originalaustin, loc_hydepark).miles

dist_university_downtown = distance.distance(loc_university,loc_downtown).miles
dist_university_oa = distance.distance(loc_university, loc_originalaustin).miles
dist_university_hyde = distance.distance(loc_university, loc_hydepark).miles


print("From Downtown Austin to Original Austin is {:.2f} miles".format(dist_downtown_oa))
print("From Downtown Austin to Hyde Park is {:.2f} miles".format(dist_downtown_hyde))
print("From Original Austin to Hyde Park is {:.2f} miles".format(dist_oa_hyde))
print()
print("From University of Texas Austin to Downtown Austin is {:.2f} miles".format(dist_university_downtown))
print("From University of Texas Austin to Original Austin is {:.2f} miles".format(dist_university_oa))
print("From University of Texas Austin to Hyde Park is {:.2f} miles".format(dist_university_hyde))




From Downtown Austin to Original Austin is 1.50 miles
From Downtown Austin to Hyde Park is 2.65 miles
From Original Austin to Hyde Park is 2.19 miles

From University of Texas Austin to Downtown Austin is 1.32 miles
From University of Texas Austin to Original Austin is 1.12 miles
From University of Texas Austin to Hyde Park is 1.36 miles
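
The distances above come from geopy's `distance.distance`, which measures along the Earth's surface between two coordinate pairs, not along streets. As a cross-check that does not depend on geopy, the haversine formula gives a great-circle approximation on a spherical Earth. This is a sketch: the function name and the radius constant are choices made here, and a spherical model differs slightly from geopy's ellipsoidal result.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; spherical approximation


def haversine_miles(p1, p2):
    """Great-circle distance in miles between two (lat, lng) points in degrees."""
    lat1, lng1 = map(math.radians, p1)
    lat2, lng2 = map(math.radians, p2)
    dlat = lat2 - lat1
    dlng = lng2 - lng1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlng / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Coordinates copied from the distance cell above.
loc_downtown = (30.268054, -97.744764)
loc_originalaustin = (30.273778, -97.720480)
print("{:.2f} miles".format(haversine_miles(loc_downtown, loc_originalaustin)))
```

For points this close together, the result should agree closely with the figures printed above.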


## ANALYSIS AND DISCUSSION <a name="analysis"> </a>

### Foursquare Analysis

#### Original Austin

* About 1.1 miles (straight-line distance) from the University of Texas at Austin
* Ample food and bar establishments among the most common venues in Foursquare
* Probably lower rent than Downtown Austin
* 1.5 miles from Downtown Austin, close enough to attract the downtown after-work crowd

#### Downtown Austin

* Downtown Austin has many food-related venues and bars, so it would also be a great choice.
* Downtown after-work business crowd
* Year-round downtown business crowd

### Population and Demographics in Austin, Texas
(Source: datausa.io)
* The total population of Austin is approaching 1.0 million and growing at 3.0% per year as of 2019.
* Approximately 10% of the Austin population is Asian, which represents about 100K people.
* Approximately 9% of the Asian population in Austin is Korean, which represents about 9K people.
* White: 45%+
* Hispanic: 30%+



## CONCLUSION <a name="conclusion"> </a>

Original Austin seems to be the ideal neighborhood with its high concentration of food and bar businesses already established. 

Original Austin is close enough to Downtown Austin (1.5 miles) and the University of Texas at Austin (1.1 miles) to attract crowds from these adjacent neighborhoods.

Austin offers a growing community, with a population reaching 1.0 million in 2019, an annual growth rate of 3.0%, and a relatively young crowd with a median age of 33.

The Korean population in Austin is about 9K, which should generate ample demand. Assuming, nominally, that another 1% of the total Austin population (about 10K people) converts, the potential customer base grows to roughly 20K, which is a very sizable opportunity.
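
The customer-base estimate can be reproduced from the rounded figures quoted in this report (1.0 million residents, 10% Asian, 9% of the Asian population Korean, and a nominal 1% conversion of the total population). The function below is a back-of-the-envelope sketch; its name and parameters are illustrative, and the inputs are the report's approximations rather than exact census values.

```python
def market_estimate(population, asian_share, korean_share_of_asian, conversion_rate):
    """Return (korean_population, converted_customers, combined_total)."""
    korean = round(population * asian_share * korean_share_of_asian)
    converted = round(population * conversion_rate)
    return korean, converted, korean + converted


korean, converted, total = market_estimate(1_000_000, 0.10, 0.09, 0.01)
print(korean, converted, total)  # 9000 10000 19000
```

The combined figure of 19,000 is what the text rounds to roughly 20K.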

As mentioned at the outset of this project, Korean fried chicken and bar is very popular in Korea among all age groups, from college students up. It is also very popular in Los Angeles and San Jose, which have large Korean communities and very diverse populations, and we believe it will be very popular in Austin as well.

Taken together, the analysis of the Original Austin neighborhood and the Austin population growth and demographics data provide a good foundation for starting the Korean fried chicken and bar business in Austin, Texas.

The next phase of the business case would be to design the Korean fried chicken and bar itself: style, menu, services, size, financials, and so on. All of these are beyond the scope of this project.


## THE END