# Capstone Project - The Battle of the Neighborhoods (Week 1)
### Applied Data Science Capstone by IBM/Coursera

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## Introduction: Business Problem <a name="introduction"></a>

Approximately 6 million people visit the National Parks in the state of Utah every year. The number of visitors increased 17% between 2017 and 2018, from 5.6 million to 6.6 million. Of all the National Parks in Utah, Zion National Park is the most popular, with 4.3 million visitors in 2018. **Zion National Park** is located in **Washington County**, in the southwest corner of the state.

According to Chefspencil.com, Mexican cuisine is the most popular ethnic cuisine in 27 of the 50 states (54%).
https://www.chefspencil.com/most-popular-ethnic-cuisines-in-america/

![image.png](attachment:image.png)

The goal of this project is two-fold.  The first goal is to find the best location to open a new **Mexican restaurant** in **Washington County, Utah.**

We will use data science techniques to identify a few of the most promising neighborhoods based on these criteria. The advantages of each area will then be clearly described so that stakeholders can choose the best possible final location.

## Data <a name="data"></a>

Based on the definition of our problem, the factors that will influence our decision are:
* number of existing restaurants in the neighborhood (any type of restaurant)
* number of and distance to Mexican restaurants in the neighborhood, if any
* distance of neighborhood from city center
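
The two distance-based factors above need a way to measure distance between coordinate pairs. The following is a minimal sketch, assuming that great-circle (haversine) distance is an acceptable approximation at county scale. The function name `haversine_km` and the mean Earth radius of 6371 km are choices made for this example, not part of the original notebook.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption for this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Example: two illustrative points one degree of latitude apart.
print(round(haversine_km(37.0, -113.0, 38.0, -113.0), 1))
```

The same function could be applied to each neighborhood and venue pair to obtain the distances to existing Mexican restaurants and to the city center.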

We acquired the list of all cities and ZIP codes, with latitude and longitude, from the website:
https://www.unitedstateszipcodes.org/zip-code-database/

To narrow down our area of focus, we used visitor statistics from the **State of Utah’s Travel and Tourism Industry** study performed by the Kem C. Gardner Policy Institute at the University of Utah.
https://travel.utah.gov/wp-content/uploads/2019-TTtrifold-Updated.pdf

The number of restaurants in every neighborhood, along with their type and location, will be obtained using the **Foursquare API**.

In [1]:
##!pip install conda
!pip install geopy
!pip install folium

##import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analysis

import json # library to handle JSON files

from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe (top-level import, pandas >= 1.0)

##import matplotlib.cm as cm
##import matplotlib.colors as colors

##from sklearn.cluster import KMeans

import folium # map rendering library


df_zipsall = pd.read_csv('zip_code_database.csv') 
df_zipsall.shape



(42632, 15)

In [2]:
df_zipsall.head()

| | zip | type | decommissioned | primary_city | acceptable_cities | unacceptable_cities | state | county | timezone | area_codes | world_region | country | latitude | longitude | irs_estimated_population_2015 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 501 | UNIQUE | 0 | Holtsville | | I R S Service Center | NY | Suffolk County | America/New_York | 631 | | US | 40.81 | -73.04 | 562 |
| 1 | 544 | UNIQUE | 0 | Holtsville | | Irs Service Center | NY | Suffolk County | America/New_York | 631 | | US | 40.81 | -73.04 | 0 |
| 2 | 601 | STANDARD | 0 | Adjuntas | | Colinas Del Gigante, Jard De Adjuntas, Urb San... | PR | Adjuntas Municipio | America/Puerto_Rico | 787939 | | US | 18.16 | -66.72 | 0 |
| 3 | 602 | STANDARD | 0 | Aguada | | Alts De Aguada, Bo Guaniquilla, Comunidad Las ... | PR | Aguada Municipio | America/Puerto_Rico | 787939 | | US | 18.38 | -67.18 | 0 |
| 4 | 603 | STANDARD | 0 | Aguadilla | Ramey | Bda Caban, Bda Esteves, Bo Borinquen, Bo Ceiba... | PR | Aguadilla Municipio | America/Puerto_Rico | 787 | | US | 18.43 | -67.15 | 0 |


In [3]:
df_zipsall.drop(['acceptable_cities', 'unacceptable_cities', 'timezone'], axis=1, inplace=True)

In [4]:
df_zipsall.rename(columns={"decommissioned": "decomm", "primary_city": "city", "irs_estimated_population_2015": "population2015"}, inplace=True)

In [5]:
# keep only the ZIP codes located in Washington County, Utah
df_zipsut = df_zipsall[(df_zipsall['state'] == 'UT') & (df_zipsall['county'] == 'Washington County')].reset_index(drop=True)


df_zipsut

| | zip | type | decomm | city | state | county | area_codes | world_region | country | latitude | longitude | population2015 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 84722 | STANDARD | 0 | Central | UT | Washington County | 435 | | US | 37.39 | -113.63 | 470 |
| 1 | 84725 | PO BOX | 0 | Enterprise | UT | Washington County | 435 | | US | 37.52 | -113.75 | 1718 |
| 2 | 84733 | STANDARD | 0 | Gunlock | UT | Washington County | 435 | | US | 37.24 | -113.79 | 106 |
| 3 | 84737 | STANDARD | 0 | Hurricane | UT | Washington County | 435 | | US | 37.04 | -113.21 | 15120 |
| 4 | 84738 | STANDARD | 0 | Ivins | UT | Washington County | 435 | | US | 37.18 | -113.71 | 6810 |
| 5 | 84745 | STANDARD | 0 | La Verkin | UT | Washington County | 435 | | US | 37.23 | -113.24 | 3880 |
| 6 | 84746 | PO BOX | 0 | Leeds | UT | Washington County | 435 | | US | 37.23 | -113.36 | 964 |
| 7 | 84757 | STANDARD | 0 | New Harmony | UT | Washington County | 435 | | US | 37.46 | -113.26 | 1110 |
| 8 | 84763 | PO BOX | 0 | Rockville | UT | Washington County | 435 | | US | 37.15 | -113.05 | 168 |
| 9 | 84765 | STANDARD | 0 | Santa Clara | UT | Washington County | 435 | | US | 37.13 | -113.65 | 6600 |


In [6]:
address = 'Springdale, Utah'

geolocator = Nominatim(user_agent="tor_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of {} are {}, {}.'.format(address, latitude, longitude))

The geographical coordinates of Springdale, Utah are 37.182802800000005, -113.00212093620803.
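
`geocode` returns `None` when Nominatim finds no match, in which case the attribute access in the cell above would fail with an `AttributeError`. Below is a hedged sketch of a guard. `coordinates_for` is a hypothetical helper name, and the geocoding callable is passed in as an argument so the example can run without network access.

```python
from types import SimpleNamespace


def coordinates_for(address, geocode):
    """Return (latitude, longitude) for an address, or raise if nothing was found.

    `geocode` is any callable with the shape of geolocator.geocode.
    """
    location = geocode(address)
    if location is None:
        raise ValueError("No geocoding result for {!r}".format(address))
    return location.latitude, location.longitude


# Stand-in geocoder for illustration only; in the notebook this would be
# geolocator.geocode. The coordinates are rounded from the output above.
def fake_geocode(address):
    known = {"Springdale, Utah": SimpleNamespace(latitude=37.1828, longitude=-113.0021)}
    return known.get(address)


print(coordinates_for("Springdale, Utah", fake_geocode))
```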


In [7]:
map_utah = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, label in zip(df_zipsut['latitude'], df_zipsut['longitude'], df_zipsut['city']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_utah)  
    
map_utah

In [8]:
CLIENT_ID = 'FTSFPNKDGW1UEYRWM5IUOEDCEETWICHPW0AJKFF4RKPZQ0JT' # your Foursquare ID
CLIENT_SECRET = 'XNOIIIQFCLCN4ZWP25KOJHKLL35CY0HAX4ZARSH12BNUS53C' # your Foursquare Secret
VERSION = '20180831' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: FTSFPNKDGW1UEYRWM5IUOEDCEETWICHPW0AJKFF4RKPZQ0JT
CLIENT_SECRET:XNOIIIQFCLCN4ZWP25KOJHKLL35CY0HAX4ZARSH12BNUS53C
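
Hard-coding and printing API credentials exposes them to anyone who reads the notebook. One possible alternative is sketched here, under the assumption that the keys are stored in environment variables named `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` (hypothetical names chosen for this example). The `mask` helper is likewise illustrative.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Read the Foursquare credentials from environment variables (names are assumptions)."""
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError("Missing environment variable: {}".format(missing)) from None


def mask(secret, visible=4):
    """Show only the first few characters of a secret when logging it."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)


# Demonstration with placeholder values rather than real keys.
demo_env = {"FOURSQUARE_CLIENT_ID": "demo-client-id", "FOURSQUARE_CLIENT_SECRET": "demo-secret"}
client_id, client_secret = load_foursquare_credentials(demo_env)
print("CLIENT_ID:", mask(client_id))
```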


In [9]:
neighborhood_latitude = latitude
neighborhood_longitude = longitude

neighborhood_name = address

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Springdale, Utah are 37.182802800000005, -113.00212093620803.


In [20]:
LIMIT = 100 # maximum number of venues returned by the Foursquare API
radius = 40250 # search radius in meters (about 25 miles)

# create URL
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)
url # display URL

'https://api.foursquare.com/v2/venues/explore?&client_id=FTSFPNKDGW1UEYRWM5IUOEDCEETWICHPW0AJKFF4RKPZQ0JT&client_secret=XNOIIIQFCLCN4ZWP25KOJHKLL35CY0HAX4ZARSH12BNUS53C&v=20180831&ll=37.182802800000005,-113.00212093620803&radius=40250&limit=100'
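
String formatting works here, but the values are not URL-escaped. Below is a minimal sketch of the same query built with the standard library's `urlencode`. `build_explore_url` is a hypothetical helper, and the credentials shown are placeholders.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Assemble the venues/explore URL with properly escaped query parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }
    return EXPLORE_ENDPOINT + "?" + urlencode(params)


print(build_explore_url("YOUR_ID", "YOUR_SECRET", "20180831", 37.1828, -113.0021, 40250, 100))
```

Passing the same dictionary as `params=` to `requests.get` would achieve the same escaping without building the string by hand.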

In [21]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5f51aa44edbb5d0711f3b4d6'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': '$-$$$$', 'key': 'price'},
    {'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'Springdale',
  'headerFullLocation': 'Springdale',
  'headerLocationGranularity': 'city',
  'totalResults': 103,
  'suggestedBounds': {'ne': {'lat': 37.54505316225037,
    'lng': -112.54828781481179},
   'sw': {'lat': 36.82055243774964, 'lng': -113.45595405760427}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '5810d27e38fa6ecbf569acb1',
       'name': "King's Landing Bistro",
       'location': {'address': '1515 Zion Park Blvd',
        'lat': 37.17992239203187,
        'lng': -113.00623292298661,
        'labeledLatLngs': [{'la
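
The response carries its own status under `meta.code`, so it can be checked before the venue list is read. The sketch below uses a hypothetical helper, `extract_items`, exercised on a trimmed sample that mirrors the structure of the response shown above.

```python
def extract_items(results):
    """Return the recommended venue items, raising if the API reported an error."""
    code = results.get("meta", {}).get("code")
    if code != 200:
        raise RuntimeError("Foursquare request failed with code {}".format(code))
    groups = results.get("response", {}).get("groups", [])
    return groups[0]["items"] if groups else []


# Trimmed sample with the same nesting as the real response.
sample = {
    "meta": {"code": 200},
    "response": {
        "groups": [
            {"name": "recommended", "items": [{"venue": {"name": "King's Landing Bistro"}}]}
        ]
    },
}
print(len(extract_items(sample)))
```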

In [22]:
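def get_category_type(row):
    """Return the name of the first category of a venue row, or None if it has none."""
    try:
        categories_list = row['categories']
    except KeyError:
        # rows produced by json_normalize use the flattened column name
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']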
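
To see what the helper returns, it can be called on hand-built rows. The sketch below repeats the helper so that it runs on its own. The sample rows are illustrative, with plain dictionaries standing in for DataFrame rows.

```python
def get_category_type(row):
    """Return the name of the first category of a venue row, or None if it has none."""
    try:
        categories_list = row["categories"]
    except KeyError:
        categories_list = row["venue.categories"]
    if len(categories_list) == 0:
        return None
    return categories_list[0]["name"]


# Illustrative rows: flattened key, plain key, and a venue with no category.
rows = [
    {"venue.categories": [{"name": "American Restaurant"}]},
    {"categories": [{"name": "Coffee Shop"}]},
    {"venue.categories": []},
]
print([get_category_type(row) for row in rows])
```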

In [23]:
venues = results['response']['groups'][0]['items']
venues

[{'reasons': {'count': 0,
   'items': [{'summary': 'This spot is popular',
     'type': 'general',
     'reasonName': 'globalInteractionReason'}]},
  'venue': {'id': '5810d27e38fa6ecbf569acb1',
   'name': "King's Landing Bistro",
   'location': {'address': '1515 Zion Park Blvd',
    'lat': 37.17992239203187,
    'lng': -113.00623292298661,
    'labeledLatLngs': [{'label': 'display',
      'lat': 37.17992239203187,
      'lng': -113.00623292298661},
     {'label': 'entrance', 'lat': 37.179909, 'lng': -113.006275}],
    'distance': 485,
    'postalCode': '84767',
    'cc': 'US',
    'city': 'Springdale',
    'state': 'UT',
    'country': 'United States',
    'formattedAddress': ['1515 Zion Park Blvd',
     'Springdale, UT 84767',
     'United States']},
   'categories': [{'id': '4bf58dd8d48988d14e941735',
     'name': 'American Restaurant',
     'pluralName': 'American Restaurants',
     'shortName': 'American',
     'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/default

In [24]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues


| | name | categories | lat | lng |
|---|---|---|---|---|
| 0 | King's Landing Bistro | American Restaurant | 37.179922 | -113.006233 |
| 1 | Zion National Park | National Park | 37.199275 | -112.989049 |
| 2 | Zion National Park Visitor Center | Tourist Information Center | 37.200271 | -112.986939 |
| 3 | Desert Pearl Inn | Resort | 37.190510 | -112.994420 |
| 4 | Deep Creek Coffee | Coffee Shop | 37.188994 | -113.000009 |
| ... | ... | ... | ... | ... |
| 95 | Silver Reef Museum | Museum | 37.253132 | -113.368179 |
| 96 | Colorado City Municipal Airport | Airport | 36.995964 | -113.003738 |
| 97 | Edge Of The World Brewery | Brewery | 36.991281 | -112.973656 |
| 98 | Dixie Nat'l Forest @ Silver Reef | Trail | 37.272723 | -113.388718 |
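
With the venues in tabular form, the first two factors from the Data section (the number of restaurants and the number of Mexican restaurants) reduce to filtering on the category column. Below is a dependency-free sketch that uses the nine rows visible in the table above. The substring matching rules are assumptions made for this example, and the counts apply to this excerpt only, not to the full result set.

```python
from collections import Counter

# (name, category) pairs copied from the rows visible in the table above.
visible_rows = [
    ("King's Landing Bistro", "American Restaurant"),
    ("Zion National Park", "National Park"),
    ("Zion National Park Visitor Center", "Tourist Information Center"),
    ("Desert Pearl Inn", "Resort"),
    ("Deep Creek Coffee", "Coffee Shop"),
    ("Silver Reef Museum", "Museum"),
    ("Colorado City Municipal Airport", "Airport"),
    ("Edge Of The World Brewery", "Brewery"),
    ("Dixie Nat'l Forest @ Silver Reef", "Trail"),
]


def is_restaurant(category):
    """Treat any category containing the word 'restaurant' as a restaurant (assumed rule)."""
    return "restaurant" in category.lower()


def is_mexican(category):
    """Flag Mexican restaurants by category name (assumed rule)."""
    return "mexican" in category.lower()


category_counts = Counter(category for _, category in visible_rows)
restaurants = [name for name, category in visible_rows if is_restaurant(category)]
mexican = [name for name, category in visible_rows if is_mexican(category)]

print(len(restaurants), len(mexican))
```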
