This notebook will be used for the IBM Applied Data Science capstone project.

Author:  Greg Kale

## Description of the problem and background

The Village of Lombard's community development committee has requested an analysis of the village's restaurants and how they compare with those in Downers Grove and Oak Brook.  The committee wants to attract new dining options and needs to understand how the quantity, quality, and variety of Lombard's restaurants compare with those of the neighboring suburbs.  The village will use the resulting data and insights to decide which kinds of new restaurants to recruit to Lombard.

The Village of Lombard is located in DuPage County, Illinois, in the western suburbs of Chicago.  Lombard has numerous restaurants in Yorktown mall, in downtown Lombard, and at various locations throughout the village.  Oak Brook has its own mall, and both Oak Brook and Downers Grove have numerous dining choices along the I-88 corridor.

## Description of the data and how it will be used to solve the problem

The following data sources will be used to solve the problem:

1) The Foursquare API will be used to obtain data on the restaurants in Lombard, Oak Brook, and Downers Grove. This data will be used to analyze the restaurants and to build rankings and groupings by category and customer rating for each of the three suburbs.

2) Data from datausa.io will be used to obtain demographic data for each suburb, including population, income, number of employees, and age. This data will complement the Foursquare data to give a fuller picture of each suburb.

## Import the different Python libraries

In [1]:
import pandas as pd 
import numpy as np
import requests # library to handle requests

!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
##from IPython.display import Image 
##from IPython.core.display import HTML 
    
# function for transforming JSON into a pandas dataframe
from pandas.io.json import json_normalize

!conda install -c conda-forge folium=0.5.0 --yes
import folium # plotting library

#print('Folium installed')
print('Libraries imported.')

Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/Python36

  added / updated specs: 
    - geopy


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    geopy-1.20.0               |             py_0          57 KB  conda-forge
    geographiclib-1.50         |             py_0          34 KB  conda-forge
    certifi-2019.9.11          |           py36_0         147 KB  conda-forge
    ca-certificates-2019.9.11  |       hecc5488_0         144 KB  conda-forge
    openssl-1.1.1d             |       h516909a_0         2.1 MB  conda-forge
    ------------------------------------------------------------
                                           Total:         2.5 MB

The following NEW packages will be INSTALLED:

    geographiclib:   1.50-py_0         conda-forge
    geopy:           1.20.0-py_0       conda-forge

The following packages will be UPDATED:

    cer

In [2]:
print ("Hello Capstone Project Course!")

Hello Capstone Project Course!


## Start of Data Import that will be used for the analysis

In [3]:
#### FourSquare API Details

CLIENT_ID = 'UQHJMBESKS5PAIR2TGRBNCTGWDQOJULHX5QMKD2NODE3BCP4'
CLIENT_SECRET = 'MPCRFK4NLCD0KSN31IJXBTRP0JKWQB2NG5CRCQG4XDH25LQB'
VERSION = '20180604'
LIMIT = 100
##https://api.foursquare.com/v2/venues/search?
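
Hard-coding API keys in a shared notebook exposes them to anyone who can read it. Below is a minimal sketch of one alternative, assuming the keys are stored in environment variables. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `load_foursquare_credentials` are hypothetical and are not part of the Foursquare API.

```python
import os


def load_foursquare_credentials(env=None):
    """Return (client_id, client_secret) read from environment variables.

    The variable names below are hypothetical; use whatever names are
    set in your own environment.
    """
    env = os.environ if env is None else env
    client_id = env.get('FOURSQUARE_CLIENT_ID')
    client_secret = env.get('FOURSQUARE_CLIENT_SECRET')
    if not client_id or not client_secret:
        raise RuntimeError('Foursquare credentials are not set in the environment')
    return client_id, client_secret
```

In the notebook this could be used as `CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials()`, so the keys never appear in a cell or in its output.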


In [4]:
#### details for Lombard Il
address = 'Lombard, IL'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

41.8864687 -88.0201536


In [5]:
search_query = ''
radius = 500
category='4d4b7105d754a06374d81259'
town = 'Lombard, Il'
intent_match = 'browse'
print(search_query + ' .... OK!')

 .... OK!


In [6]:
### Foursquare API call for Lombard Il

url = 'https://api.foursquare.com/v2/venues/search?client_id={}&categoryId={}&client_secret={}&v={}&near={}&limit={}&intent={}'.format(CLIENT_ID, category, CLIENT_SECRET, VERSION, town, LIMIT, intent_match)
url

'https://api.foursquare.com/v2/venues/search?client_id=UQHJMBESKS5PAIR2TGRBNCTGWDQOJULHX5QMKD2NODE3BCP4&categoryId=4d4b7105d754a06374d81259&client_secret=MPCRFK4NLCD0KSN31IJXBTRP0JKWQB2NG5CRCQG4XDH25LQB&v=20180604&near=Lombard, Il&limit=100&intent=browse'

In [7]:
results = requests.get(url).json()
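
The request URL above is assembled with `str.format`, which leaves characters such as the space and comma in the town name unencoded. Below is a minimal sketch of building the same query string with the standard library's `urllib.parse.urlencode`. The helper name `build_search_url` is hypothetical; the endpoint and parameter names are the ones used in the cell above.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = 'https://api.foursquare.com/v2/venues/search'


def build_search_url(client_id, client_secret, version, category, town, limit, intent):
    """Build a venue search URL with every parameter percent-encoded."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'categoryId': category,
        'near': town,
        'limit': limit,
        'intent': intent,
    }
    return SEARCH_ENDPOINT + '?' + urlencode(params)


# placeholder credentials, for illustration only
example_url = build_search_url('ID', 'SECRET', '20180604',
                               '4d4b7105d754a06374d81259',
                               'Lombard, IL', 100, 'browse')
```

The same helper can be called once per suburb instead of repeating the format string in three cells. `requests.get` also accepts a `params` dictionary directly and performs the encoding itself.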


In [8]:
# assign relevant part of JSON to venues
venues = results['response']['venues']

# transform venues into a dataframe for Lombard data
lombard_restuarants_df = json_normalize(venues)


In [9]:
len(lombard_restuarants_df)

50

In [10]:
lombard_restuarants_df.head()

Unnamed: 0,categories,delivery.id,delivery.provider.icon.name,delivery.provider.icon.prefix,delivery.provider.icon.sizes,delivery.provider.name,delivery.url,hasPerk,id,location.address,...,location.crossStreet,location.formattedAddress,location.labeledLatLngs,location.lat,location.lng,location.postalCode,location.state,name,referralId,venuePage.id
0,"[{'id': '4bf58dd8d48988d14e941735', 'name': 'A...",901493.0,/delivery_provider_grubhub_20180129.png,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",grubhub,https://www.grubhub.com/restaurant/yard-house-...,False,5b5fab74535d6f002cb34600,2301 Fountain Square Dr,...,,"[2301 Fountain Square Dr, Lombard, IL 60148, U...","[{'label': 'display', 'lat': 41.84360794431711...",41.843608,-87.992135,60148,IL,Yard House,v-1573417046,
1,"[{'id': '4bf58dd8d48988d16e941735', 'name': 'F...",,,,,,,False,4f342242e4b0935810b898c8,717 E Butterfield Rd,...,,"[717 E Butterfield Rd, Lombard, IL 60148, Unit...","[{'label': 'display', 'lat': 41.8393057, 'lng'...",41.839306,-87.998574,60148,IL,Chick-fil-A,v-1573417046,
2,"[{'id': '50327c8591d4c4b30a586d5d', 'name': 'B...",751857.0,/delivery_provider_grubhub_20180129.png,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",grubhub,https://www.grubhub.com/restaurant/rock-bottom...,False,4acf5c42f964a52038d320e3,94 Yorktown Shopping Ctr,...,at Highland Ave,"[94 Yorktown Shopping Ctr (at Highland Ave), L...","[{'label': 'display', 'lat': 41.83818474154692...",41.838185,-88.010584,60148,IL,Rock Bottom Restaurant & Brewery,v-1573417046,
3,"[{'id': '4bf58dd8d48988d1c4941735', 'name': 'R...",,,,,,,False,5480e64e498ebc7397d9e678,455 Butterfield Rd,...,,"[455 Butterfield Rd, Lombard, IL 60148, United...","[{'label': 'display', 'lat': 41.836676, 'lng':...",41.836676,-88.00469,60148,IL,Miller's Ale House - Chicago Lombard,v-1573417046,
4,"[{'id': '4bf58dd8d48988d14c941735', 'name': 'W...",442875.0,/delivery_provider_grubhub_20180129.png,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",grubhub,https://www.grubhub.com/restaurant/buffalo-wil...,False,4a5c841af964a52044bc1fe3,207 E Roosevelt Rd,...,btwn Highland Ave & Main St,[207 E Roosevelt Rd (btwn Highland Ave & Main ...,"[{'label': 'display', 'lat': 41.85971712599237...",41.859717,-88.013846,60148,IL,Buffalo Wild Wings,v-1573417046,


In [11]:
#### details for Downers Grove Il
address = 'Downers Grove, IL'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

search_query = ''
radius = 500
category='4d4b7105d754a06374d81259'
town = 'Downers Grove, IL'
intent_match = 'browse'
print(search_query + ' .... OK!')


41.7938195 -88.010376
 .... OK!


In [12]:
### Foursquare API call for Downers Grove, IL

url = 'https://api.foursquare.com/v2/venues/search?client_id={}&categoryId={}&client_secret={}&v={}&near={}&limit={}&intent={}'.format(CLIENT_ID, category, CLIENT_SECRET, VERSION, town, LIMIT, intent_match)
url

'https://api.foursquare.com/v2/venues/search?client_id=UQHJMBESKS5PAIR2TGRBNCTGWDQOJULHX5QMKD2NODE3BCP4&categoryId=4d4b7105d754a06374d81259&client_secret=MPCRFK4NLCD0KSN31IJXBTRP0JKWQB2NG5CRCQG4XDH25LQB&v=20180604&near=Downers Grove, IL&limit=100&intent=browse'

In [13]:
results = requests.get(url).json()

In [14]:
# assign relevant part of JSON to venues
venues = results['response']['venues']

# transform venues into a dataframe for Downers Grove data
downers_grove_restuarants_df = json_normalize(venues)


In [15]:
len(downers_grove_restuarants_df)

50

In [16]:
downers_grove_restuarants_df.head()

Unnamed: 0,categories,delivery.id,delivery.provider.icon.name,delivery.provider.icon.prefix,delivery.provider.icon.sizes,delivery.provider.name,delivery.url,hasPerk,id,location.address,...,location.crossStreet,location.formattedAddress,location.labeledLatLngs,location.lat,location.lng,location.postalCode,location.state,name,referralId,venuePage.id
0,"[{'id': '4bf58dd8d48988d1e0931735', 'name': 'C...",,,,,,,False,54d0cce9498e164146718e99,1450 Butterfield Road,...,Finley,"[1450 Butterfield Road (Finley), Downers Grove...","[{'label': 'display', 'lat': 41.83440387789383...",41.834404,-88.021539,60515,IL,Starbucks,v-1573417051,
1,"[{'id': '4bf58dd8d48988d16f941735', 'name': 'H...",,,,,,,False,4a293d31f964a52071951fe3,1500 Butterfield Rd,...,at Finley Rd,"[1500 Butterfield Rd (at Finley Rd), Downers G...","[{'label': 'display', 'lat': 41.83439379624065...",41.834394,-88.022488,60515,IL,Portillo's,v-1573417051,
2,"[{'id': '4bf58dd8d48988d124941735', 'name': 'O...",,,,,,,False,4c3df62e0596c928729e8378,3113 Woodcreek Dr,...,,"[3113 Woodcreek Dr, Downers Grove, IL 60515, U...","[{'label': 'display', 'lat': 41.82882655274317...",41.828827,-88.035032,60515,IL,FTD Inc,v-1573417051,
3,"[{'id': '4bf58dd8d48988d14e941735', 'name': 'A...",1188290.0,/delivery_provider_grubhub_20180129.png,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",grubhub,https://www.grubhub.com/restaurant/hooters-130...,False,4b50c2bef964a5203f3127e3,1303 Butterfield Rd,...,btwn Finley Rd & Highland Ave,[1303 Butterfield Rd (btwn Finley Rd & Highlan...,"[{'label': 'display', 'lat': 41.83356014761201...",41.83356,-88.019341,60515,IL,Hooters,v-1573417051,
4,"[{'id': '4bf58dd8d48988d16a941735', 'name': 'B...",1250619.0,/delivery_provider_grubhub_20180129.png,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",grubhub,https://www.grubhub.com/restaurant/panera-brea...,False,4ba78b22f964a520f99b39e3,1400 Butterfield Rd,...,,"[1400 Butterfield Rd, Downers Grove, IL 60515,...","[{'label': 'display', 'lat': 41.8357171, 'lng'...",41.835717,-88.018103,60515,IL,Panera Bread,v-1573417051,


In [17]:
### details for Oak Brook Il
address = 'Oak Brook, IL'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

search_query = ''
radius = 500
category='4d4b7105d754a06374d81259'
town = 'Oak Brook, IL'
intent_match = 'browse'
print(search_query + ' .... OK!')


41.8328085 -87.9289504
 .... OK!


In [18]:
### Foursquare API call for Oak Brook Il

url = 'https://api.foursquare.com/v2/venues/search?client_id={}&categoryId={}&client_secret={}&v={}&near={}&limit={}&intent={}'.format(CLIENT_ID, category, CLIENT_SECRET, VERSION, town, LIMIT, intent_match)
url

'https://api.foursquare.com/v2/venues/search?client_id=UQHJMBESKS5PAIR2TGRBNCTGWDQOJULHX5QMKD2NODE3BCP4&categoryId=4d4b7105d754a06374d81259&client_secret=MPCRFK4NLCD0KSN31IJXBTRP0JKWQB2NG5CRCQG4XDH25LQB&v=20180604&near=Oak Brook, IL&limit=100&intent=browse'

In [19]:
results = requests.get(url).json()

In [20]:
# assign relevant part of JSON to venues
venues = results['response']['venues']

# transform venues into a dataframe for Oak Brook data
oak_brook_restuarants_df = json_normalize(venues)


In [21]:
len(oak_brook_restuarants_df)

50

In [22]:
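# fill missing postal codes with the Oak Brook zip code; assign the result back
# rather than calling fillna(inplace=True) on a column selection
oak_brook_restuarants_df['location.postalCode'] = oak_brook_restuarants_df['location.postalCode'].fillna('60523')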


In [23]:
oak_brook_restuarants_df

Unnamed: 0,categories,delivery.id,delivery.provider.icon.name,delivery.provider.icon.prefix,delivery.provider.icon.sizes,delivery.provider.name,delivery.url,hasPerk,id,location.address,...,location.crossStreet,location.formattedAddress,location.labeledLatLngs,location.lat,location.lng,location.postalCode,location.state,name,referralId,venuePage.id
0,"[{'id': '4bf58dd8d48988d110941735', 'name': 'I...",,,,,,,False,4b367ab0f964a520663625e3,240 Oakbrook Ctr,...,btwn 16th St & 22nd St,"[240 Oakbrook Ctr (btwn 16th St & 22nd St), Oa...","[{'label': 'display', 'lat': 41.84988917571001...",41.849889,-87.950669,60523,IL,Maggiano's Little Italy,v-1573417056,551163459.0
1,"[{'id': '4bf58dd8d48988d14e941735', 'name': 'A...",,,,,,,False,4b131cc0f964a520449423e3,2020 Spring Rd,...,at 22nd St,"[2020 Spring Rd (at 22nd St), Oak Brook, IL 60...","[{'label': 'display', 'lat': 41.84980175400769...",41.849802,-87.949731,60523,IL,The Cheesecake Factory,v-1573417056,
2,"[{'id': '4bf58dd8d48988d1c4941735', 'name': 'R...",,,,,,,False,5119346fe4b0665cf26a3c52,1 Oakbrook Ctr,...,,"[1 Oakbrook Ctr, Oak Brook, IL 60523, United S...","[{'label': 'display', 'lat': 41.84952630378556...",41.849526,-87.954087,60523,IL,Macy's Marketplace,v-1573417056,
3,"[{'id': '4bf58dd8d48988d114951735', 'name': 'B...",,,,,,,False,4aaa76d6f964a5200e5620e3,297 Oakbrook Ctr,...,at 22nd St.,"[297 Oakbrook Ctr (at 22nd St.), Oak Brook, IL...","[{'label': 'display', 'lat': 41.84887326308261...",41.848873,-87.951565,60523,IL,Barnes & Noble,v-1573417056,
4,"[{'id': '4bf58dd8d48988d14e941735', 'name': 'A...",447541.0,/delivery_provider_grubhub_20180129.png,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",grubhub,https://www.grubhub.com/restaurant/the-clubhou...,False,4c8672ced4e237045d4e8b88,298 Oakbrook Ctr,...,btwn 16th St & 22nd St,"[298 Oakbrook Ctr (btwn 16th St & 22nd St), Oa...","[{'label': 'display', 'lat': 41.8486054, 'lng'...",41.848605,-87.951604,60523,IL,The Clubhouse,v-1573417056,
5,"[{'id': '4bf58dd8d48988d16c941735', 'name': 'B...",1197094.0,/delivery_provider_grubhub_20180129.png,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",grubhub,https://www.grubhub.com/restaurant/shake-shack...,False,5c0163c4588e36002c53831e,1950 Spring Rd.,...,,"[1950 Spring Rd., Oak Brook, IL 60523, United ...","[{'label': 'display', 'lat': 41.85194317004867...",41.851943,-87.948873,60523,IL,Shake Shack,v-1573417056,
6,"[{'id': '4bf58dd8d48988d16a941735', 'name': 'B...",847872.0,/delivery_provider_grubhub_20180129.png,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",grubhub,https://www.grubhub.com/restaurant/corner-bake...,False,4c127525a1010f47c1e14818,"240 Oak Brook Ctr,",...,,"[240 Oak Brook Ctr,, Oak Brook, IL 60523, Unit...","[{'label': 'display', 'lat': 41.849896, 'lng':...",41.849896,-87.951121,60523,IL,Corner Bakery Cafe,v-1573417056,
7,"[{'id': '4def73e84765ae376e57713a', 'name': 'P...",,,,,,,False,582bcbf3b4f96227bc841dd9,523 Oakbrook Ctr,...,,"[523 Oakbrook Ctr, Oak Brook, IL 60523, United...","[{'label': 'display', 'lat': 41.85196765881677...",41.851968,-87.95239,60523,IL,Nando's Peri-Peri,v-1573417056,
8,"[{'id': '4bf58dd8d48988d1cc941735', 'name': 'S...",311222.0,/delivery_provider_grubhub_20180129.png,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",grubhub,https://www.grubhub.com/restaurant/wildfire-23...,False,4b4000a1f964a52007b425e3,232 Oakbrook Ctr,...,btwn 16th St & 22nd St,"[232 Oakbrook Ctr (btwn 16th St & 22nd St), Oa...","[{'label': 'display', 'lat': 41.84913384779693...",41.849134,-87.95054,60523,IL,Wildfire,v-1573417056,
9,"[{'id': '56aa371ce4b08b9a8d57356c', 'name': 'B...",810240.0,/delivery_provider_grubhub_20180129.png,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",grubhub,https://www.grubhub.com/restaurant/old-town-po...,False,527d3fb511d27facc942d0bf,8 Oakbrook Ctr,...,btwn 16th & 22nd St,"[8 Oakbrook Ctr (btwn 16th & 22nd St), Oak Bro...","[{'label': 'display', 'lat': 41.84902650081165...",41.849027,-87.950098,60523,IL,Old Town Pour House,v-1573417056,76970326.0


In [24]:
### Combine all 3 suburb dataframes into one restaurant Data Frame for analysis and to get details

restuarants_df = pd.concat([lombard_restuarants_df, downers_grove_restuarants_df, oak_brook_restuarants_df], ignore_index=True, sort =False)


In [25]:
len(restuarants_df)

150

In [26]:
### Remove columns we don't care about

restuarants_df.drop(['delivery.id', 'delivery.provider.icon.name', 'delivery.provider.icon.prefix'], axis=1, inplace=True) 
restuarants_df.drop(['delivery.provider.icon.sizes', 'delivery.provider.name', 'delivery.url', 'hasPerk'], axis=1, inplace=True) 
restuarants_df.drop(['location.crossStreet', 'location.country', 'location.formattedAddress', 'location.labeledLatLngs'], axis=1, inplace=True) 
restuarants_df.drop(['venuePage.id', 'location.cc', 'location.state', 'referralId'], axis=1, inplace=True) 


In [27]:
### Get the category for each restaurant
restuarants_df['categories'] = restuarants_df['categories'].astype(str)
restuarants_df['categories2'] = restuarants_df['categories'].str.split(',').str[1]
restuarants_df['category'] = restuarants_df['categories2'].str.split(':').str[1]
restuarants_df.drop(['categories', 'categories2'], axis=1, inplace=True) 

restuarants_df['category'] = restuarants_df['category'].str.replace("'", "")
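
The cell above recovers the category by converting the `categories` list to a string and splitting it on `,` and `:`. That depends on the order of the keys and on the name containing neither character. Below is a minimal sketch that reads the name from the list of dictionaries directly. `primary_category` is a hypothetical helper, and treating the first list entry as the venue's main category is an assumption.

```python
def primary_category(categories):
    """Return the name of the first category, or None if the list is empty."""
    if not categories:
        return None
    return categories[0].get('name')


# abbreviated sample in the shape returned by the venue search
sample_categories = [{'id': '4bf58dd8d48988d14e941735', 'name': 'American Restaurant'}]
sample_name = primary_category(sample_categories)
```

If applied before the column is cast to `str`, this could be used as `restuarants_df['categories'].apply(primary_category)`.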



In [28]:
### set the Dataframe index to the unique ID provided by FourSquare

restuarants_df.set_index('id', inplace=True)


In [29]:
restuarants_df.head()

Unnamed: 0_level_0,location.address,location.city,location.lat,location.lng,location.postalCode,name,category
id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
5b5fab74535d6f002cb34600,2301 Fountain Square Dr,Lombard,41.843608,-87.992135,60148,Yard House,American Restaurant
4f342242e4b0935810b898c8,717 E Butterfield Rd,Lombard,41.839306,-87.998574,60148,Chick-fil-A,Fast Food Restaurant
4acf5c42f964a52038d320e3,94 Yorktown Shopping Ctr,Lombard,41.838185,-88.010584,60148,Rock Bottom Restaurant & Brewery,Brewery
5480e64e498ebc7397d9e678,455 Butterfield Rd,Lombard,41.836676,-88.00469,60148,Miller's Ale House - Chicago Lombard,Restaurant
4a5c841af964a52044bc1fe3,207 E Roosevelt Rd,Lombard,41.859717,-88.013846,60148,Buffalo Wild Wings,Wings Joint


In [30]:
### based on data review - need to clean up typos in the data and confine the data to my three suburbs

### 1) Remove Hinsdale restaurant
### 2) Rename restaurants with city name of Dowers Grove to Downers Grove
### 3) Rename restaurants with city name of Oakbrook to Oak Brook
### 4) Change postal code 60516 to 60515 so Downers Grove uses a single postal code

restuarants_df.drop(restuarants_df[restuarants_df['location.city'] == 'Hinsdale'].index, inplace=True) 

for index, row in restuarants_df.iterrows():
    
    if row['location.city'] == 'Dowers Grove':
        restuarants_df.at[index, 'location.city'] = 'Downers Grove'
    
    if row['location.city'] == 'Oakbrook':
        restuarants_df.at[index, 'location.city'] = 'Oak Brook'
        
    if row['location.postalCode'] == '60516':
        restuarants_df.at[index, 'location.postalCode'] = '60515'
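
The corrections in the loop above are simple value substitutions, so they can also be written as lookup tables. Below is a minimal sketch of that idea in plain Python. `CITY_FIXES`, `POSTAL_FIXES`, and `clean_location` are hypothetical names, and the mappings contain only the corrections listed in the cell above.

```python
# corrections taken from the cleanup cell above
CITY_FIXES = {'Dowers Grove': 'Downers Grove', 'Oakbrook': 'Oak Brook'}
POSTAL_FIXES = {'60516': '60515'}


def clean_location(city, postal_code):
    """Return (city, postal_code) with known typos and variants corrected."""
    return CITY_FIXES.get(city, city), POSTAL_FIXES.get(postal_code, postal_code)
```

With pandas, the same dictionaries can be passed to `Series.replace`, for example `restuarants_df['location.city'].replace(CITY_FIXES)`, which avoids the explicit `iterrows` loop.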
    
 



In [31]:
### verify data groups look good by city and postal code and that we have the correct numbers

df_counts = restuarants_df.groupby(['location.city','location.postalCode']).count()
df_counts

Unnamed: 0_level_0,Unnamed: 1_level_0,location.address,location.lat,location.lng,name,category
location.city,location.postalCode,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Downers Grove,60515,46,50,50,50,50
Lombard,60148,50,50,50,50,50
Oak Brook,60523,43,49,49,49,49


In [32]:
### add venue detail columns needed for analysis
restuarants_df['likes.count'] = 0.0
restuarants_df['price.message'] = ''
restuarants_df['price.tier'] = ''
restuarants_df['rating'] = 0.0
restuarants_df['ratingSignals'] = 0
restuarants_df['reasons.count'] = 0
restuarants_df['specials.count'] = 0
restuarants_df['stats.tipCount'] = 0
restuarants_df['tips.count'] = 0

In [33]:
restuarants_df.head()

Unnamed: 0_level_0,location.address,location.city,location.lat,location.lng,location.postalCode,name,category,likes.count,price.message,price.tier,rating,ratingSignals,reasons.count,specials.count,stats.tipCount,tips.count
id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1
5b5fab74535d6f002cb34600,2301 Fountain Square Dr,Lombard,41.843608,-87.992135,60148,Yard House,American Restaurant,0.0,,,0.0,0,0,0,0,0
4f342242e4b0935810b898c8,717 E Butterfield Rd,Lombard,41.839306,-87.998574,60148,Chick-fil-A,Fast Food Restaurant,0.0,,,0.0,0,0,0,0,0
4acf5c42f964a52038d320e3,94 Yorktown Shopping Ctr,Lombard,41.838185,-88.010584,60148,Rock Bottom Restaurant & Brewery,Brewery,0.0,,,0.0,0,0,0,0,0
5480e64e498ebc7397d9e678,455 Butterfield Rd,Lombard,41.836676,-88.00469,60148,Miller's Ale House - Chicago Lombard,Restaurant,0.0,,,0.0,0,0,0,0,0
4a5c841af964a52044bc1fe3,207 E Roosevelt Rd,Lombard,41.859717,-88.013846,60148,Buffalo Wild Wings,Wings Joint,0.0,,,0.0,0,0,0,0,0


In [34]:
restuarants_detail_df = pd.DataFrame()

for index, row in restuarants_df.iterrows():
##    index = '5b5fab74535d6f002cb34600'    
    
    ### ignore the Yorktown mall entry, which is listed as a restaurant
    if index != '4d681c2dc406f04d23b0f64c':

        venue_url = 'https://api.foursquare.com/v2/venues/{}?client_id={}&client_secret={}&v={}'.format(index, CLIENT_ID, CLIENT_SECRET, VERSION)
        venue_results = requests.get(venue_url).json()
    
#        print("look number: ", index)
#        print(venue_results)

        venue_details = venue_results['response']['venue']

        restuarants_detail_df = json_normalize(venue_details)
    
        columns_list = restuarants_detail_df.columns

        restuarants_detail_df.set_index('id', inplace=True)
        
        ### had to implement logic for each column, as some of these fields are not returned for every restaurant.
        if ('likes.count' in columns_list):
            restuarants_df.at[index, 'likes.count'] = restuarants_detail_df['likes.count']   
        
        if ('price.message' in columns_list):
            restuarants_df.at[index, 'price.message'] = restuarants_detail_df['price.message'] 
        
        if ('price.tier' in columns_list):
            restuarants_df.at[index, 'price.tier'] = restuarants_detail_df['price.tier'] 
        
        if ('rating' in columns_list):
            restuarants_df.at[index, 'rating'] = restuarants_detail_df['rating'] 
        
        if ('ratingSignals' in columns_list):
            restuarants_df.at[index, 'ratingSignals'] = restuarants_detail_df['ratingSignals'] 
       
        if ('reasons.count' in columns_list):
            restuarants_df.at[index, 'reasons.count'] = restuarants_detail_df['reasons.count'] 
        
        if ('specials.count' in columns_list):
            restuarants_df.at[index, 'specials.count'] = restuarants_detail_df['specials.count'] 

        if ('stats.tipCount' in columns_list):
            restuarants_df.at[index, 'stats.tipCount'] = restuarants_detail_df['stats.tipCount'] 

        if ('tips.count' in columns_list):
            restuarants_df.at[index, 'tips.count'] = restuarants_detail_df['tips.count'] 
    
        restuarants_detail_df = pd.DataFrame()
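
The loop above checks each column individually because Foursquare omits some fields for some venues. Below is a minimal sketch of the same idea applied to the raw `venue_details` dictionary, reading each dotted field with a fallback value. `get_dotted`, `extract_details`, and `DETAIL_DEFAULTS` are hypothetical names. The defaults mirror the placeholder values set in the earlier cell, and the sample venue is illustrative rather than a real API response.

```python
# defaults mirror the placeholder columns added to restuarants_df earlier
DETAIL_DEFAULTS = {
    'likes.count': 0.0,
    'price.message': '',
    'price.tier': '',
    'rating': 0.0,
    'ratingSignals': 0,
    'reasons.count': 0,
    'specials.count': 0,
    'stats.tipCount': 0,
    'tips.count': 0,
}


def get_dotted(record, dotted_key, default=None):
    """Follow a key such as 'likes.count' through nested dictionaries."""
    value = record
    for part in dotted_key.split('.'):
        if not isinstance(value, dict) or part not in value:
            return default
        value = value[part]
    return value


def extract_details(venue):
    """Return every detail field, using the default when a field is missing."""
    return {key: get_dotted(venue, key, default)
            for key, default in DETAIL_DEFAULTS.items()}


# illustrative venue with several fields missing
sample_venue = {'likes': {'count': 20}, 'price': {'tier': 2, 'message': 'Moderate'}, 'rating': 8.0}
details = extract_details(sample_venue)
```

Because this reads scalar values rather than one-row columns, it may also reduce the need for the list-to-value conversion performed in the next cell.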
                

In [35]:
#### convert price message and tier from their current list of values to single values to standardize them
for index, row in restuarants_df.iterrows():

    price_list = row['price.message']
    price_tier = row['price.tier']
    
    if len(price_list) != 0:
        
        if ('Cheap' in price_list[0]):
            restuarants_df.at[index, 'price.message'] = 'Cheap'

        if ('Expensive' in price_list[0]):
            restuarants_df.at[index, 'price.message'] = 'Expensive'
    
        if ('Moderate' in price_list[0]):
            restuarants_df.at[index, 'price.message'] = 'Moderate'
    
        if ('Very Expensive' in price_list[0]):
            restuarants_df.at[index, 'price.message'] = 'Very Expensive'

    if len(price_tier) != 0:
        restuarants_df.at[index, 'price.tier'] = price_tier[0]
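
Because `Expensive` is a substring of `Very Expensive`, the order of the checks in the cell above matters: the `Very Expensive` test runs last and overwrites the earlier match. Below is a minimal sketch that makes the ordering explicit by testing the longest label first. `PRICE_LABELS` and `normalize_price_message` are hypothetical names.

```python
# longest label first, so 'Very Expensive' is not mistaken for 'Expensive'
PRICE_LABELS = ('Very Expensive', 'Expensive', 'Moderate', 'Cheap')


def normalize_price_message(message):
    """Return the standard price label found in message, or '' if none matches."""
    for label in PRICE_LABELS:
        if label in message:
            return label
    return ''
```

Returning an empty string for unmatched text keeps the later cell, which fills empty values with `Moderate`, working unchanged.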




In [36]:
#### if a restaurant does not have a price message or tier then set it to moderate
for index, row in restuarants_df.iterrows():

    if row['price.message'] == '':
        restuarants_df.at[index, 'price.message'] = 'Moderate'

    if row['price.tier'] == '':
        restuarants_df.at[index, 'price.tier'] = '2'


In [37]:
restuarants_df.head()

Unnamed: 0_level_0,location.address,location.city,location.lat,location.lng,location.postalCode,name,category,likes.count,price.message,price.tier,rating,ratingSignals,reasons.count,specials.count,stats.tipCount,tips.count
id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1
5b5fab74535d6f002cb34600,2301 Fountain Square Dr,Lombard,41.843608,-87.992135,60148,Yard House,American Restaurant,20.0,Moderate,2,8.0,25,0,0,4,4
4f342242e4b0935810b898c8,717 E Butterfield Rd,Lombard,41.839306,-87.998574,60148,Chick-fil-A,Fast Food Restaurant,103.0,Cheap,1,8.8,139,1,0,23,23
4acf5c42f964a52038d320e3,94 Yorktown Shopping Ctr,Lombard,41.838185,-88.010584,60148,Rock Bottom Restaurant & Brewery,Brewery,149.0,Moderate,2,7.9,229,1,0,48,48
5480e64e498ebc7397d9e678,455 Butterfield Rd,Lombard,41.836676,-88.00469,60148,Miller's Ale House - Chicago Lombard,Restaurant,60.0,Moderate,2,7.6,117,1,0,51,51
4a5c841af964a52044bc1fe3,207 E Roosevelt Rd,Lombard,41.859717,-88.013846,60148,Buffalo Wild Wings,Wings Joint,62.0,Moderate,2,7.2,97,1,0,17,17


In [38]:
restuarants_df.tail()

Unnamed: 0_level_0,location.address,location.city,location.lat,location.lng,location.postalCode,name,category,likes.count,price.message,price.tier,rating,ratingSignals,reasons.count,specials.count,stats.tipCount,tips.count
id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1
57c47ef6cd10e8433fb43101,2121 Butterfield Rd,Oak Brook,41.850946,-87.972186,60523,Skippy's Gyros,Greek Restaurant,10.0,Moderate,2,6.9,16,0,0,2,2
58b30e3772714f3fd61596c3,,Oak Brook,41.852173,-87.952061,60523,cilantro taco grill,Taco Place,7.0,Cheap,1,7.5,11,0,0,2,2
4b9119cff964a520d9a333e3,2060 York Rd,Oak Brook,41.848973,-87.930094,60523,Jason's Deli,Deli / Bodega,50.0,Cheap,1,8.1,75,1,0,23,23
58cd42638f0be437372cf20c,Oakbrook Mall - The District,Oak Brook,41.852179,-87.952053,60523,Stan's Donuts & Coffee,Coffee Shop,21.0,Cheap,1,7.3,27,0,0,0,0
4b7edcc0f964a5200d0530e3,1401 W 22nd St,Oak Brook,41.846124,-87.952806,60523,1401 West,Restaurant,4.0,Moderate,2,0.0,0,0,0,0,0


In [39]:
#### pull in the suburb demographic data downloaded from the Data USA (datausa.io) web site.
import types
import pandas as pd
from botocore.client import Config
import ibm_boto3

def __iter__(self): return 0

# @hidden_cell
# The following code accesses a file in your IBM Cloud Object Storage. It includes your credentials.
# You might want to remove those credentials before you share the notebook.
client_cf3349256d5a45c5b75c7e6628f8db04 = ibm_boto3.client(service_name='s3',
    ibm_api_key_id='iE2F4Dp04DJwKmG73isv3Scl-9Qe9ibW0fYxTgl0iqLS',
    ibm_auth_endpoint="https://iam.ng.bluemix.net/oidc/token",
    config=Config(signature_version='oauth'),
    endpoint_url='https://s3-api.us-geo.objectstorage.service.networklayer.com')

body = client_cf3349256d5a45c5b75c7e6628f8db04.get_object(Bucket='applieddatasciencecapstoneproject-donotdelete-pr-twmmi9qsoebtro',Key='DataIOSuburbDemographics.csv')['Body']
# add missing __iter__ method, so pandas accepts body as file-like object
if not hasattr(body, "__iter__"): body.__iter__ = types.MethodType( __iter__, body )

df_data_1 = pd.read_csv(body)
df_data_1.head()


Unnamed: 0,zipCode,Suburb,Populatatoin,MedianAge,MedianHouseholdIncome,PovertyRate,NumberofEmployees,MedianPropertyValue,color,latitude,longitude
0,60148,Lombard,43776,39.1,73145,5.86%,23278,244900,red,41.886469,-88.020154
1,60523,Oak Brook,8108,56.7,132500,3.40%,3266,784700,blue,41.832808,-87.92895
2,60515,Downers Grove,49649,43.1,85546,5.39%,25356,340200,green,41.793819,-88.010376
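
The `PovertyRate` column is read as text such as `5.86%`, so it cannot be compared or averaged numerically as is. Below is a minimal sketch of a conversion to a fraction. `percent_to_fraction` is a hypothetical helper, and it assumes every value ends in a percent sign as in the rows above.

```python
def percent_to_fraction(text):
    """Convert a string such as '5.86%' to the fraction 0.0586."""
    return float(text.strip().rstrip('%')) / 100.0
```

It could be applied as `df_data_1['PovertyRate'].apply(percent_to_fraction)`.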


### This section is the analysis of the different restaurants in the 3 suburbs
### Goal is to compare restaurants in Lombard against Oak Brook and Downers Grove and determine the types of restaurants that Lombard should pursue to compete with them

In [40]:
### isolate just the Lombard restaurants

lombard_restuarants_df = restuarants_df[(restuarants_df['location.postalCode'] == '60148')]


In [69]:
### check out the groupings of Lombard restaurants by category, price and average rating

lombard_restuarants_grouped_df = lombard_restuarants_df.groupby(['category', 'price.message']).agg({'rating': np.mean, 'location.postalCode': np.size})

lombard_restuarants_grouped_df.rename(columns={'rating': 'avg.rating'}, inplace=True)
lombard_restuarants_grouped_df.rename(columns={'location.postalCode': 'count'}, inplace=True)

lombard_restuarants_grouped_df


Unnamed: 0_level_0,Unnamed: 1_level_0,avg.rating,count
category,price.message,Unnamed: 2_level_1,Unnamed: 3_level_1
American Restaurant,Expensive,8.55,2
American Restaurant,Moderate,7.933333,3
American Restaurant,Very Expensive,7.5,1
Arcade,Moderate,6.7,1
BBQ Joint,Moderate,8.25,2
Brewery,Moderate,7.9,1
Burrito Place,Cheap,7.7,1
Burrito Place,Moderate,7.7,1
Chinese Restaurant,Cheap,6.9,1
Chinese Restaurant,Moderate,7.9,1


In [42]:
### isolate just the Oak Brook restaurants

oakbrook_restuarants_df = restuarants_df[(restuarants_df['location.postalCode'] == '60523')]


In [43]:
### check out the groupings of Oak Brook restaurants by category, price and average rating

oakbrook_restuarants_grouped_df = oakbrook_restuarants_df.groupby(['category', 'price.message']).agg({'rating': np.mean, 'location.postalCode': np.size})

oakbrook_restuarants_grouped_df.rename(columns={'rating': 'avg.rating'}, inplace=True)
oakbrook_restuarants_grouped_df.rename(columns={'location.postalCode': 'count'}, inplace=True)

oakbrook_restuarants_grouped_df


Unnamed: 0_level_0,Unnamed: 1_level_0,avg.rating,count
category,price.message,Unnamed: 2_level_1,Unnamed: 3_level_1
American Restaurant,Expensive,8.7,2
American Restaurant,Moderate,8.2,2
Asian Restaurant,Moderate,6.9,1
BBQ Joint,Moderate,8.9,1
Bakery,Cheap,7.5,1
Bakery,Moderate,6.8,1
Bar,Moderate,7.5,1
Beer Bar,Moderate,7.9,1
Bookstore,Moderate,8.6,1
Breakfast Spot,Cheap,8.7,1


In [44]:
### isolate just the Downers Grove restaurants

downersgrove_restuarants_df = restuarants_df[(restuarants_df['location.postalCode'] == '60515')]


In [45]:
### check out the groupings of Downers Grove restaurants by category, price and average rating

downersgrove_restuarants_grouped_df = downersgrove_restuarants_df.groupby(['category', 'price.message']).agg({'rating': np.mean, 'location.postalCode': np.size})

downersgrove_restuarants_grouped_df.rename(columns={'rating': 'avg.rating'}, inplace=True)
downersgrove_restuarants_grouped_df.rename(columns={'location.postalCode': 'count'}, inplace=True)

downersgrove_restuarants_grouped_df

Unnamed: 0_level_0,Unnamed: 1_level_0,avg.rating,count
category,price.message,Unnamed: 2_level_1,Unnamed: 3_level_1
American Restaurant,Moderate,7.45,4
BBQ Joint,Moderate,3.85,2
Bakery,Moderate,7.75,2
Bar,Moderate,7.8,1
Brazilian Restaurant,Expensive,8.8,1
Breakfast Spot,Moderate,8.55,2
Burger Joint,Moderate,7.866667,3
Chinese Restaurant,Cheap,6.7,1
Coffee Shop,Cheap,7.58,5
Diner,Cheap,7.7,1


In [46]:
#### review restaurants by category, price, and avg rating by suburb - visually review data to determine where Lombard can improve

restuarants_grouped_df = restuarants_df.groupby(['category', 'price.message', 'location.city']).agg({'rating': np.mean, 'location.postalCode': np.size})



In [47]:
len(restuarants_grouped_df)

99

In [48]:
### raise the max number of rows displayed so all of the grouped data can be reviewed
pd.set_option('display.max_rows', 500)


In [81]:
### look at groupings of restaurants by category, suburb, and average rating
restuarants_df.groupby(['category', 'location.city']).agg({'rating': np.mean})


category,location.city,rating
American Restaurant,Downers Grove,7.45
American Restaurant,Lombard,8.066667
American Restaurant,Oak Brook,8.45
Arcade,Lombard,6.7
Asian Restaurant,Oak Brook,6.9
BBQ Joint,Downers Grove,3.85
BBQ Joint,Lombard,8.25
BBQ Joint,Oak Brook,8.9
Bakery,Downers Grove,7.75
Bakery,Oak Brook,7.15


In [78]:
### look at restaurant counts by category and zip code
restuarants_df.groupby(['category', 'location.postalCode']).agg({'location.postalCode': np.size})



category,location.postalCode,location.postalCode
American Restaurant,60148,6
American Restaurant,60515,4
American Restaurant,60523,4
Arcade,60148,1
Asian Restaurant,60523,1
BBQ Joint,60148,2
BBQ Joint,60515,2
BBQ Joint,60523,1
Bakery,60515,2
Bakery,60523,2


In [49]:
restuarants_grouped_df.head(100)


category,price.message,location.city,rating,location.postalCode
American Restaurant,Expensive,Lombard,8.55,2
American Restaurant,Expensive,Oak Brook,8.7,2
American Restaurant,Moderate,Downers Grove,7.45,4
American Restaurant,Moderate,Lombard,7.933333,3
American Restaurant,Moderate,Oak Brook,8.2,2
American Restaurant,Very Expensive,Lombard,7.5,1
Arcade,Moderate,Lombard,6.7,1
Asian Restaurant,Moderate,Oak Brook,6.9,1
BBQ Joint,Moderate,Downers Grove,3.85,2
BBQ Joint,Moderate,Lombard,8.25,2


## Analysis after reviewing restaurant groupings by category
### The following restaurant categories should be investigated by Lombard


### Categories in which Lombard has no restaurants
Asian,
Bakery,
Bar,
Brazilian,
Breakfast Spot,
Burger Joint,
Cafe,
Diner,
Hot Dog Joint,
Ice Cream Shop,
French Restaurant,
Indian,
Portuguese,
Salad,
Taco Place,
Vegetarian / Vegan Restaurant

### Categories in which Lombard has few restaurants
Italian,
Pizza,
Coffee Shop
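
The lists above came from a visual review of the grouped output. As a cross-check, the same gaps could be derived programmatically. The following is a minimal sketch, not the method used in this notebook: `sample_df` is a made-up stand-in for `restuarants_df` (only the `category` and `location.city` column names come from the notebook), and treating "few" as "fewer venues than the best-served neighboring suburb" is an assumption.

```python
import pandas as pd

# Toy stand-in for restuarants_df; these rows are illustrative, not the notebook's data.
sample_df = pd.DataFrame({
    'category': ['Bakery', 'Bakery', 'Pizza Place', 'Pizza Place',
                 'Pizza Place', 'BBQ Joint', 'BBQ Joint'],
    'location.city': ['Oak Brook', 'Downers Grove', 'Lombard', 'Oak Brook',
                      'Oak Brook', 'Lombard', 'Downers Grove'],
})

# Count venues per category and suburb; combinations with no venues become 0.
counts = pd.crosstab(sample_df['category'], sample_df['location.city'])

# Every suburb column except Lombard.
other_suburbs = [city for city in counts.columns if city != 'Lombard']

# Categories found in at least one neighboring suburb but absent from Lombard.
no_lombard = (counts['Lombard'] == 0) & (counts[other_suburbs].sum(axis=1) > 0)
missing_in_lombard = counts[no_lombard].index.tolist()

# Assumed definition of "few": Lombard has some venues, but fewer than
# the neighboring suburb with the most venues in that category.
fewer = (counts['Lombard'] > 0) & (counts['Lombard'] < counts[other_suburbs].max(axis=1))
lagging_in_lombard = counts[fewer].index.tolist()

print(missing_in_lombard)
print(lagging_in_lombard)
```

Running the same steps on `restuarants_df` instead of `sample_df` would give lists that can be compared against the manual review above. Note that raw counts do not adjust for each suburb's population.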

In [51]:
import folium  # map rendering library; needed for folium.Map and folium.CircleMarker below
from folium import plugins


In [52]:
address = 'Lombard, IL'

geolocator = Nominatim(user_agent="lombard_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Lombard are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Lombard are 41.8864687, -88.0201536.


In [53]:
# take an explicit copy so the in-place edits below do not trigger SettingWithCopyWarning
df_cords = restuarants_df[['location.lat', 'location.lng']].copy()

In [54]:
df_cords.reset_index(inplace=True)

In [55]:
df_cords.drop(['id'], axis=1, inplace=True) 


In [56]:
df_cords.head()

Unnamed: 0,location.lat,location.lng
0,41.843608,-87.992135
1,41.839306,-87.998574
2,41.838185,-88.010584
3,41.836676,-88.00469
4,41.859717,-88.013846


In [57]:
restuarants_df.head(1)

id,location.address,location.city,location.lat,location.lng,location.postalCode,name,category,likes.count,price.message,price.tier,rating,ratingSignals,reasons.count,specials.count,stats.tipCount,tips.count
5b5fab74535d6f002cb34600,2301 Fountain Square Dr,Lombard,41.843608,-87.992135,60148,Yard House,American Restaurant,20.0,Moderate,2,8.0,25,0,0,4,4


In [68]:
# create map of Lombard, Oak Brook, and Downers Grove using latitude and longitude values for the different venues
map_all = folium.Map(location=[latitude, longitude], zoom_start=12)


# add Lombard markers to map (red outline)
for lat, lng, borough, neighborhood in zip(lombard_restuarants_df['location.lat'], lombard_restuarants_df['location.lng'], lombard_restuarants_df['location.postalCode'], lombard_restuarants_df['category']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='red',
##        color={'red', 'Blue', 'Green'},##restuarants_df['location.postalCode'],
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_all)  

# add Oak Brook markers to map (blue outline)
for lat, lng, borough, neighborhood in zip(oakbrook_restuarants_df['location.lat'], oakbrook_restuarants_df['location.lng'], oakbrook_restuarants_df['location.postalCode'], oakbrook_restuarants_df['category']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
##        color={'red', 'Blue', 'Green'},##restuarants_df['location.postalCode'],
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_all) 

# add Downers Grove markers to map (green outline)
for lat, lng, borough, neighborhood in zip(downersgrove_restuarants_df['location.lat'], downersgrove_restuarants_df['location.lng'], downersgrove_restuarants_df['location.postalCode'], downersgrove_restuarants_df['category']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='green',
##        color={'red', 'Blue', 'Green'},##restuarants_df['location.postalCode'],
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_all)  

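# legend for the three marker colors (straight quotes are required for valid HTML attributes)
legend_html = '''
    <div style="position: fixed; 
     bottom: 50px; left: 50px; width: 160px; height: 120px; 
     border:2px solid grey; z-index:9999; font-size:14px; 
     ">&nbsp; Cool Legend <br> \
     &nbsp; Downers Grove &nbsp; <i class="fa fa-map-marker fa-2x" 
                  style="color:green"></i><br> 
     &nbsp; Lombard &nbsp; <i class="fa fa-map-marker fa-2x" 
                  style="color:red"></i><br> 
     &nbsp; Oak Brook &nbsp; <i class="fa fa-map-marker fa-2x" 
                  style="color:blue"></i> 
    </div>
    '''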

map_all.get_root().html.add_child(folium.Element(legend_html))

map_all



In [82]:
# create map of Lombard using latitude and longitude values for the different venues
map_lombard = folium.Map(location=[latitude, longitude], zoom_start=12)


# add markers to map
for lat, lng, borough, neighborhood in zip(lombard_restuarants_df['location.lat'], lombard_restuarants_df['location.lng'], lombard_restuarants_df['location.postalCode'], lombard_restuarants_df['category']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='red',
##        color={'red', 'Blue', 'Green'},##restuarants_df['location.postalCode'],
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_lombard)  

map_lombard



In [83]:
# create map of Oak Brook using latitude and longitude values for the different venues
map_oakbrook = folium.Map(location=[latitude, longitude], zoom_start=12)


# add markers to map
for lat, lng, borough, neighborhood in zip(oakbrook_restuarants_df['location.lat'], oakbrook_restuarants_df['location.lng'], oakbrook_restuarants_df['location.postalCode'], oakbrook_restuarants_df['category']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
##        color={'red', 'Blue', 'Green'},##restuarants_df['location.postalCode'],
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_oakbrook)  

map_oakbrook


In [84]:
# create map of Downers Grove using latitude and longitude values for the different venues
map_downsgrove = folium.Map(location=[latitude, longitude], zoom_start=12)


# add markers to map
for lat, lng, borough, neighborhood in zip(downersgrove_restuarants_df['location.lat'], downersgrove_restuarants_df['location.lng'], downersgrove_restuarants_df['location.postalCode'], downersgrove_restuarants_df['category']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='green',
##        color={'red', 'Blue', 'Green'},##restuarants_df['location.postalCode'],
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_downsgrove)  

map_downsgrove