# The Chicken Wars
## Where Could American Chicken Sandwich Restaurants Thrive in Paris?
#### The Business problem is that if restaurants like Popeye's of Chick-fil-A decide to expand in to Paris, where would be the best place for them to set up shop?
To complete this analysis, I will use the population data from https://www.citypopulation.de/en/france/cityofparis/, I will use area size of each arrondissement from https://en.wikipedia.org/wiki/Arrondissements_of_Paris, and I will use the foursqure api to pull the data on competing restaurants in the areas

Download necessary tools

In [1]:
#import data analyzation tools
import pandas as pd
import numpy as np

#json tools
import json
from pandas.io.json import json_normalize

!conda install -c anaconda beautifulsoup4 -y

#import scraping tools
import requests
from urllib.request import urlopen
from bs4 import BeautifulSoup

#import visual tools
import matplotlib.cm as cm
import matplotlib.colors as colors
import folium

!conda install -c conda-forge geopy -y

#import geocoder tools
from geopy.geocoders import Nominatim

#import kmean tools
from sklearn.cluster import KMeans

print('All tools imported')

Solving environment: done


  current version: 4.5.11
  latest version: 4.7.12

Please update conda by running

    $ conda update -n base -c defaults conda



## Package Plan ##

  environment location: /home/jupyterlab/conda/envs/python

  added / updated specs: 
    - beautifulsoup4


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    soupsieve-1.9.3            |           py36_0          60 KB  anaconda
    openssl-1.1.1              |       h7b6447c_0         5.0 MB  anaconda
    certifi-2019.9.11          |           py36_0         154 KB  anaconda
    beautifulsoup4-4.8.1       |           py36_0         153 KB  anaconda
    ------------------------------------------------------------
                                           Total:         5.4 MB

The following NEW packages will be INSTALLED:

    soupsieve:      1.9.3-py36_0      anaconda   

The following packages will be UPDATED

### Data Gathering
Here we will scrape the population data and put it in a dataframe

In [49]:
# use BSoup to scrape initial data
wlink = 'https://www.citypopulation.de/en/france/cityofparis/'
raw_page = urlopen(wlink).read().decode('utf-8')
soup = BeautifulSoup(raw_page, 'html.parser')
new_table = soup.body.table.tbody

# use for loop to go through the website to get the specific data
i = 0
data = []
data2 = []
for i in range(19):
    data2 = []
    b = new_table.find_all('tr')[i]
    c = b.find_all('span')[0].string
    data2.append(c)
    e = b.find_all('span')[1].string
    data2.append(e)
    for j in range(2,8):
            a = new_table.find_all('tr')[i]
            d = a.find_all('td')[j].string
            d = int(d.replace(',', ''))
            data2.append(d)
    data.append(data2)
    
# assign the columns to the data we gathered
columns = ['Arrondissement #', 'Arr. Name', 'Population: 1975', 'Population: 1982', 'Population: 1990', 'Population: 1999', 'Population: 2007', 'Population: 2016']
df = pd.DataFrame(data, columns=columns)
df.head(19)

Unnamed: 0,Arrondissement #,Arr. Name,Population: 1975,Population: 1982,Population: 1990,Population: 1999,Population: 2007,Population: 2016
0,Paris 1er Arrondissement,Louvre,22793,18509,18360,16888,17915,16252
1,Paris 2e Arrondissement,Bourse,26328,21203,20738,19585,21745,20260
2,Paris 3e Arrondissement,Temple,41706,36094,35102,34248,34576,34788
3,Paris 4e Arrondissement,Hôtel-de-Ville,40466,33990,32226,30675,28572,27487
4,Paris 5e Arrondissement,Panthéon,67668,62173,61222,58849,62664,59108
5,Paris 6e Arrondissement,Luxembourg,56331,48905,47891,44919,45332,40916
6,Paris 7e Arrondissement,Palais-Bourbon,74250,67461,62939,56985,57410,52512
7,Paris 8e Arrondissement,Élysée,52999,46403,40814,39314,39165,36453
8,Paris 9e Arrondissement,Opéra,70270,64134,58019,55838,58632,59629
9,Paris 10e Arrondissement,Enclos-St-Laurent,94046,86970,90083,89612,93373,91932


In [50]:
# we will not append columns that will tell if the population has increased since 1975 and since 1999
df['Pop. Increase since 1975'] = '0';
df['Pop. Increase since 1999'] = '0';
for i in df.index:
    pop1 = df.iloc[i, 7] - df.iloc[i, 2]
    if pop1 > 0:
        df.iloc[i, 8] = 'Yes'
    else:
        df.iloc[i, 8] = 'No'
        
for i in df.index:
    pop1 = df.iloc[i, 7] - df.iloc[i, 5]
    if pop1 > 0:
        df.iloc[i, 9] = 'Yes'
    else:
        df.iloc[i, 9] = 'No'
df.head(19)

Unnamed: 0,Arrondissement #,Arr. Name,Population: 1975,Population: 1982,Population: 1990,Population: 1999,Population: 2007,Population: 2016,Pop. Increase since 1975,Pop. Increase since 1999
0,Paris 1er Arrondissement,Louvre,22793,18509,18360,16888,17915,16252,No,No
1,Paris 2e Arrondissement,Bourse,26328,21203,20738,19585,21745,20260,No,Yes
2,Paris 3e Arrondissement,Temple,41706,36094,35102,34248,34576,34788,No,Yes
3,Paris 4e Arrondissement,Hôtel-de-Ville,40466,33990,32226,30675,28572,27487,No,No
4,Paris 5e Arrondissement,Panthéon,67668,62173,61222,58849,62664,59108,No,Yes
5,Paris 6e Arrondissement,Luxembourg,56331,48905,47891,44919,45332,40916,No,No
6,Paris 7e Arrondissement,Palais-Bourbon,74250,67461,62939,56985,57410,52512,No,No
7,Paris 8e Arrondissement,Élysée,52999,46403,40814,39314,39165,36453,No,No
8,Paris 9e Arrondissement,Opéra,70270,64134,58019,55838,58632,59629,No,Yes
9,Paris 10e Arrondissement,Enclos-St-Laurent,94046,86970,90083,89612,93373,91932,No,Yes


In [51]:
#here we will isolate the columns that have both an increase
df2 = df[(df['Pop. Increase since 1975'] == 'Yes') & (df['Pop. Increase since 1999'] == 'Yes')].reset_index(drop=True)
df2.head(5)

Unnamed: 0,Arrondissement #,Arr. Name,Population: 1975,Population: 1982,Population: 1990,Population: 1999,Population: 2007,Population: 2016,Pop. Increase since 1975,Pop. Increase since 1999
0,Paris 12e Arrondissement,Reuilly,140900,138015,130257,136591,142425,141494,Yes,Yes
1,Paris 13e Arrondissement,Gobelins,163313,170818,171098,171533,179213,181552,Yes,Yes
2,Paris 15e Arrondissement,Vaugirard,231301,225596,223940,225362,232247,233484,Yes,Yes
3,Paris 19e Arrondissement,Buttes-Chaumont,144357,162649,165062,172730,184038,186393,Yes,Yes


In [52]:
# since we only need the most recent population data, we will isolate jus the number, name, and 2016 population
df3 = df2[['Arrondissement #', 'Arr. Name', 'Population: 2016']].reset_index(drop=True)
df3.head()

Unnamed: 0,Arrondissement #,Arr. Name,Population: 2016
0,Paris 12e Arrondissement,Reuilly,141494
1,Paris 13e Arrondissement,Gobelins,181552
2,Paris 15e Arrondissement,Vaugirard,233484
3,Paris 19e Arrondissement,Buttes-Chaumont,186393


In [53]:
#searched the locations of each arrondissement, and then put their locations on a the df3 dataframe
df3['Latitude'] = '0';
df3['Longitude'] = '0';
df3.iloc[0, 3] = 48.8378
df3.iloc[0, 4] = 02.3862
df3.iloc[1, 3] = 48.8262
df3.iloc[1, 4] = 02.3599
df3.iloc[2, 3] = 48.8422
df3.iloc[2, 4] = 02.2928
df3.iloc[3, 3] = 48.8761
df3.iloc[3, 4] = 02.3758
df3.head()

Unnamed: 0,Arrondissement #,Arr. Name,Population: 2016,Latitude,Longitude
0,Paris 12e Arrondissement,Reuilly,141494,48.8378,2.3862
1,Paris 13e Arrondissement,Gobelins,181552,48.8262,2.3599
2,Paris 15e Arrondissement,Vaugirard,233484,48.8422,2.2928
3,Paris 19e Arrondissement,Buttes-Chaumont,186393,48.8761,2.3758


In [54]:
#get the city center cooridinates for Paris
address = 'Paris, France'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('Central coordinates for Paris, France {}, {}.'.format(latitude, longitude))

Central coordinates for Paris, France 48.8566101, 2.3514992.


In [55]:
#Mapped Paris with markers of the 4 arrondissements 
map_paris = folium.Map(location=[latitude, longitude], zoom_start=13)

#place neighborhoods on map based with circle locaters
for lat, lng, label in zip(df3['Latitude'], df3['Longitude'], df3['Arr. Name']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=7,
        popup=label,
        color='red',
        fill=True,
        fill_color='#3166cc',
        fill_opacity=0.55,
        parse_html=False).add_to(map_paris)  
    
map_paris

In [56]:
#load Foursquare API data needed for analysis
CLIENT_ID =  # your Foursquare ID
CLIENT_SECRET =  # your Foursquare Secret
VERSION = '20180604'
LIMIT = 300

### Plotting the map of competing restaurants in Arr. 12:

In [57]:
#initial dataframe for Arr. 12
latitude = df3.iloc[0, 3]
longitude = df3.iloc[0, 4]
radius = 1000
venue_possibilities = ['Snack Place', 'Fried Chicken Joint', 'Wings Joint', 'Southern / Soul Food Restaurant', 'Fast Food Restaurant', 'American Restaurant']
search_query = venue_possibilities[0]
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
results = requests.get(url).json()
# assign relevant part of JSON to venues
venues = results['response']['venues']
# tranform venues into a dataframe
dataframe = json_normalize(venues)
#for loop for rest
for i in range(0,6):
    search_query = venue_possibilities[i]
    url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
    results = requests.get(url).json()
    # assign relevant part of JSON to venues
    venues = results['response']['venues']
    # tranform venues into a dataframe
    df4 = json_normalize(venues)
    dataframe.append(df4, sort=True)

df_12 = dataframe

venues_map_12 = folium.Map(location=[latitude, longitude], zoom_start=15) # generate map centred around Ecco

# add Ecco as a red circle mark
folium.features.CircleMarker(
        [latitude, longitude],
        radius=10,
        popup='Ecco',
        fill=True,
        color='red',
        fill_color='red',
        fill_opacity=0.6
    ).add_to(venues_map_12)

# add the trending venues as blue circle markers
for lat, lng, label in zip(dataframe['location.lat'], dataframe['location.lng'], dataframe['name']):
    folium.features.CircleMarker(
            [lat, lng],
            radius=5,
            poup=label,
            fill=True,
            color='blue',
            fill_color='blue',
            fill_opacity=0.6
        ).add_to(venues_map_12)

venues_map_12

### Plotting the map of competing restaurants in Arr. 13:

In [58]:
#initial dataframe for 13
latitude = df3.iloc[1, 3]
longitude = df3.iloc[1, 4]
venue_possibilities = ['Snack Place', 'Fried Chicken Joint', 'Wings Joint', 'Southern / Soul Food Restaurant', 'Fast Food Restaurant', 'American Restaurant']
search_query = venue_possibilities[0]
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
results = requests.get(url).json()
# assign relevant part of JSON to venues
venues = results['response']['venues']
# tranform venues into a dataframe
dataframe = json_normalize(venues)
#for loop for rest
for i in range(0,6):
    search_query = venue_possibilities[i]
    url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
    results = requests.get(url).json()
    # assign relevant part of JSON to venues
    venues = results['response']['venues']
    # tranform venues into a dataframe
    df4 = json_normalize(venues)
    dataframe.append(df4, sort=True)

df_13 = dataframe

venues_map_13 = folium.Map(location=[latitude, longitude], zoom_start=15) # generate map centred around Ecco

# add Ecco as a red circle mark
folium.features.CircleMarker(
        [latitude, longitude],
        radius=10,
        popup='Ecco',
        fill=True,
        color='red',
        fill_color='red',
        fill_opacity=0.6
    ).add_to(venues_map_13)

# add the trending venues as blue circle markers
for lat, lng, label in zip(dataframe['location.lat'], dataframe['location.lng'], dataframe['name']):
    folium.features.CircleMarker(
            [lat, lng],
            radius=5,
            poup=label,
            fill=True,
            color='blue',
            fill_color='blue',
            fill_opacity=0.6
        ).add_to(venues_map_13)

venues_map_13


### Plotting the map of competing restaurants in Arr. 15:

In [59]:
#initial dataframe for 15
latitude = df3.iloc[2, 3]
longitude = df3.iloc[2, 4]
venue_possibilities = ['Snack Place', 'Fried Chicken Joint', 'Wings Joint', 'Southern / Soul Food Restaurant', 'Fast Food Restaurant', 'American Restaurant']
search_query = venue_possibilities[0]
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
results = requests.get(url).json()
# assign relevant part of JSON to venues
venues = results['response']['venues']
# tranform venues into a dataframe
dataframe = json_normalize(venues)
#for loop for rest
for i in range(0,6):
    search_query = venue_possibilities[i]
    url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
    results = requests.get(url).json()
    # assign relevant part of JSON to venues
    venues = results['response']['venues']
    # tranform venues into a dataframe
    df4 = json_normalize(venues)
    dataframe.append(df4, sort=True)

df_15 = dataframe

venues_map_15 = folium.Map(location=[latitude, longitude], zoom_start=15) # generate map centred around Ecco

# add Ecco as a red circle mark
folium.features.CircleMarker(
        [latitude, longitude],
        radius=10,
        popup='Ecco',
        fill=True,
        color='red',
        fill_color='red',
        fill_opacity=0.6
    ).add_to(venues_map_15)

# add the trending venues as blue circle markers
for lat, lng, label in zip(dataframe['location.lat'], dataframe['location.lng'], dataframe['name']):
    folium.features.CircleMarker(
            [lat, lng],
            radius=5,
            poup=label,
            fill=True,
            color='blue',
            fill_color='blue',
            fill_opacity=0.6
        ).add_to(venues_map_15)

venues_map_15


### Plotting the map of competing restaurants in Arr. 19:

In [60]:
#initial dataframe for 19
latitude = df3.iloc[3, 3]
longitude = df3.iloc[3, 4]
venue_possibilities = ['Snack Place', 'Fried Chicken Joint', 'Wings Joint', 'Southern / Soul Food Restaurant', 'Fast Food Restaurant', 'American Restaurant']
search_query = venue_possibilities[0]
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
results = requests.get(url).json()
# assign relevant part of JSON to venues
venues = results['response']['venues']
# tranform venues into a dataframe
dataframe = json_normalize(venues)
#for loop for rest
for i in range(0,6):
    search_query = venue_possibilities[i]
    url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
    results = requests.get(url).json()
    # assign relevant part of JSON to venues
    venues = results['response']['venues']
    # tranform venues into a dataframe
    df4 = json_normalize(venues)
    dataframe.append(df4, sort=True)

df_19 = dataframe

venues_map_19 = folium.Map(location=[latitude, longitude], zoom_start=15) # generate map centred around Ecco

# add Ecco as a red circle mark
folium.features.CircleMarker(
        [latitude, longitude],
        radius=10,
        popup='Ecco',
        fill=True,
        color='red',
        fill_color='red',
        fill_opacity=0.6
    ).add_to(venues_map_19)

# add the trending venues as blue circle markers
for lat, lng, label in zip(dataframe['location.lat'], dataframe['location.lng'], dataframe['name']):
    folium.features.CircleMarker(
            [lat, lng],
            radius=5,
            poup=label,
            fill=True,
            color='blue',
            fill_color='blue',
            fill_opacity=0.6
        ).add_to(venues_map_19)

venues_map_19


In [61]:
#reorganize data into a new dataframe, and calculate density data
data = {'Arr.': [12,13,15,19], 
        'Size (sq. km.)': [6.377, 7.146, 8.502, 6.786],
        'Population': [df3.iloc[0,2],df3.iloc[1,2],df3.iloc[2,2],df3.iloc[3,2]],
        '# of Restaurants': [df_12.shape[0],df_13.shape[0],df_15.shape[0],df_19.shape[0]],
        'Pop. Density': [df3.iloc[0,2]/6.377, df3.iloc[1,2]/7.146,df3.iloc[2,2]/5.621, df3.iloc[3,2]/8.502],
        'Restaurant Density': [df_12.shape[0]/6.377, df_13.shape[0]/7.146, df_15.shape[0]/5.621, df_19.shape[0]/8.502]
       }
df5 = pd.DataFrame(data)
df5.head()

Unnamed: 0,Arr.,Size (sq. km.),Population,# of Restaurants,Pop. Density,Restaurant Density
0,12,6.377,141494,26,22188.176258,4.077152
1,13,7.146,181552,47,25406.101315,6.577106
2,15,8.502,233484,44,41537.804661,7.827789
3,19,6.786,186393,37,21923.429781,4.351917


In [62]:
#implement a choice factor
df5['Choice Factor'] = '0';
for i in range (0,4):
    df5.iloc[i,6] = df5.iloc[i,4]/df5.iloc[i,5]

df5.head()

Unnamed: 0,Arr.,Size (sq. km.),Population,# of Restaurants,Pop. Density,Restaurant Density,Choice Factor
0,12,6.377,141494,26,22188.176258,4.077152,5442.08
1,13,7.146,181552,47,25406.101315,6.577106,3862.81
2,15,8.502,233484,44,41537.804661,7.827789,5306.45
3,19,6.786,186393,37,21923.429781,4.351917,5037.65


In [47]:
#fin