# **Introduction**
>*This analysis of the Paris arrondissements draws on data pulled from the internet, including Foursquare data, to give travelers, researchers, and other interested parties neighborhood statistics such as density, population, landmarks, and locations. It was initially created to pull information on the 20 arrondissements of Paris, but data on each arrondissement's 4 quarters proved more relevant and detailed for exploration at the venue level.*

# **Data Requirements and Descriptions**

### **Install and Import Libraries**

>*Set up the initial environment. It may be necessary to install the packages below with pip before importing them <br>(some may not be needed, depending on prior usage and installations).*

>*For this project, I have chosen the following libraries and modules, which are used throughout the analysis.*

In [1]:
#!pip install geopandas geopy geocoder geoplot folium urllib3

import numpy as np
import pandas as pd

import os

import bs4
import csv
import json
import geopy
import folium
import requests
import geocoder
import geopandas

import matplotlib.pyplot as plt
from geopy import geocoders
from bs4 import BeautifulSoup
import geopandas as gpd
from shapely.geometry import Point, Polygon
from geopy.geocoders import Nominatim

### 80 Parisian Neighborhoods or *'Quartiers'* within 20 Parisian *'Arrondissements'*.  


>As a starting point, we scrape the list of neighborhoods from the Wikipedia page 'https://en.wikipedia.org/wiki/Quarters_of_Paris'.  
>
>This data is the beginning of our neighborhood dataset.<br>
>
>*Use "requests.get" to fetch the HTML, which is then parsed with pandas into a list of DataFrames.*
>
>Once displayed, the table has 80 rows of data (4 quarters × 20 arrondissements), each with features detailing: <br>
>**Arrondissement Number** and 'called' **Name, Quarter Number, Quarter Name, Population** (1999), and **Area** in hectares
>

In [2]:
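from io import StringIO

url = 'https://en.wikipedia.org/wiki/Quarters_of_Paris'
data = requests.get(url)
data.raise_for_status()  # stop early if the page could not be fetched

# Wrap the HTML in StringIO: recent pandas versions deprecate passing a raw HTML string
df_list = pd.read_html(StringIO(data.text))

df = df_list[0]  # the first table on the page lists the 80 quarters
df.head()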

Unnamed: 0,Arrondissement(Districts),Quartiers(Quarters),Quartiers(Quarters).1,Population in1999[3],Area(hectares)[3],Map
0,"1st arrondissement(Called ""du Louvre"")",1st,Saint-Germain-l'Auxerrois,1672,86.9,
1,"1st arrondissement(Called ""du Louvre"")",2nd,Les Halles,8984,41.2,
2,"1st arrondissement(Called ""du Louvre"")",3rd,Palais-Royal,3195,27.4,
3,"1st arrondissement(Called ""du Louvre"")",4th,Place-Vendôme,3044,26.9,
4,"2nd arrondissement(Called ""de la Bourse"")",5th,Gaillon,1345,18.8,
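The 4 × 20 structure described above can be checked before going further. The sketch below is a minimal example under stated assumptions: `check_quarters` is a hypothetical helper, not part of the notebook, and it takes a plain list of arrondissement labels (such as the values of the `Arrondissement(Districts)` column) rather than the DataFrame itself.

```python
from collections import Counter

def check_quarters(labels, per_arrondissement=4):
    """Return the labels whose number of quarters differs from the expected count."""
    counts = Counter(labels)
    return {label: n for label, n in counts.items() if n != per_arrondissement}

# Illustrative labels only: one complete arrondissement and one incomplete one
sample = ['1st arrondissement'] * 4 + ['2nd arrondissement'] * 3
problems = check_quarters(sample)
print(problems)
```

With the scraped table, `check_quarters(df['Arrondissement(Districts)'].tolist())` should return an empty dictionary if all 80 rows are present.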


>*We also need to add the columns "Arrond", "Aname", "Place", "Anum", and "Qnum" to hold the data split out of the first column "Arrondissement(Districts)" and the quarter number from "Quartiers(Quarters)".*

In [11]:
# Label before the parenthesis, e.g. "1st arrondissement"
df['Arrond'] = df['Arrondissement(Districts)'].str.split('(').str[0]

# Name the arrondissement is "called", i.e. the text between the double quotes
df['Aname'] = df['Arrondissement(Districts)'].str.split('"').str[1]
df['Place'] = df['Aname']

# Leading digits of the ordinals ("1st" -> "1", "22nd" -> "22"), kept as strings
df['Anum'] = df['Arrond'].str.extract(r'(\d+)', expand=False)
df['Qnum'] = df['Quartiers(Quarters)'].str.extract(r'(\d+)', expand=False)

df.head()

Unnamed: 0,Arrondissement(Districts),Quartiers(Quarters),Quartiers(Quarters).1,Population in1999[3],Area(hectares)[3],Map,Arrond,Aname,Place,Anum,Qnum
0,"1st arrondissement(Called ""du Louvre"")",1st,Saint-Germain-l'Auxerrois,1672,86.9,,1st arrondissement,du Louvre,du Louvre,1,1
1,"1st arrondissement(Called ""du Louvre"")",2nd,Les Halles,8984,41.2,,1st arrondissement,du Louvre,du Louvre,1,2
2,"1st arrondissement(Called ""du Louvre"")",3rd,Palais-Royal,3195,27.4,,1st arrondissement,du Louvre,du Louvre,1,3
3,"1st arrondissement(Called ""du Louvre"")",4th,Place-Vendôme,3044,26.9,,1st arrondissement,du Louvre,du Louvre,1,4
4,"2nd arrondissement(Called ""de la Bourse"")",5th,Gaillon,1345,18.8,,2nd arrondissement,de la Bourse,de la Bourse,2,5
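The splitting of the `Arrondissement(Districts)` label can also be expressed as a single regular expression. This is only a sketch: `parse_arrondissement` and the returned keys are hypothetical names chosen for illustration, and the pattern assumes every label has the form shown in the table above.

```python
import re

# <number><ordinal suffix> arrondissement(Called "<name>")
LABEL = re.compile(r'^(?P<arrond>(?P<num>\d+)\w+ arrondissement)\(Called "(?P<name>[^"]+)"\)$')

def parse_arrondissement(label):
    """Split a label into its arrondissement text, number, and 'called' name."""
    match = LABEL.match(label)
    if match is None:
        raise ValueError('Unexpected label format: {!r}'.format(label))
    return {
        'Arrond': match['arrond'],
        'Anum': int(match['num']),
        'Aname': match['name'],
    }

parsed = parse_arrondissement('1st arrondissement(Called "du Louvre")')
print(parsed)
```

A strict pattern like this raises on any label that deviates from the expected format, which makes changes to the Wikipedia table easier to notice than silent mis-splits.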


>*We need to generate an address for each Quartier in order to pull its geo-coordinates into our data, so that venue data can then be pulled from Foursquare to give end users the venue-specific and street-level information they need.*

#### **Generate the Postal Codes for each Arrondissement**
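Postal codes are assigned per arrondissement within Paris: the prefix 75 is followed by the arrondissement number padded to three digits. For example, the 10th arrondissement, home to the Canal St. Martin, has the postal code 75010.   

>*The postal code of an arrondissement follows the simple formula: Postcode = {75000 + "arrondissement number"}. Note that the code in this notebook adds the arrondissement number to 75100 instead, which yields the INSEE commune codes 75101 to 75120 that appear in the Postcode and IsoCode columns.*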

In [12]:
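# 75100 + n gives the INSEE commune code (75101-75120) seen in the outputs below;
# use 75000 instead for the postal code described above (75001-75020)
post = pd.DataFrame(index=df.index, columns=['Post', 'Quartiers', 'Postcode'])
post['Post'] = 75100 
post['Postcode'] = post['Post'].astype(float) + df['Anum'].astype(float)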
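The postal code formula can be sketched in plain Python. The function names `postal_code` and `insee_code` are hypothetical, introduced here only to contrast the 75000-based postal code with the 75100-based INSEE commune code.

```python
def postal_code(arrondissement):
    """Postal code of a Paris arrondissement: 75000 + number (75001 to 75020)."""
    if not 1 <= arrondissement <= 20:
        raise ValueError('Paris has arrondissements 1 to 20')
    return str(75000 + arrondissement)

def insee_code(arrondissement):
    """INSEE commune code of a Paris arrondissement: 75100 + number (75101 to 75120)."""
    if not 1 <= arrondissement <= 20:
        raise ValueError('Paris has arrondissements 1 to 20')
    return str(75100 + arrondissement)

print(postal_code(10), insee_code(10))
```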

>The data frame on the individual Quartiers is created with the following elements:
    
    Qrts (Quartier Number)
    Quartier
    Pop (Population)
    Density (Population / Area, in inhabitants per hectare)
    Latitude
    Longitude
    Postcode
    Anum (Arrondissement Number)
    Arrondissement
    City
    Country
    
    

In [13]:
arronds = pd.DataFrame(df, columns=['Quartier', 'Qrts', 'Pop', 'Density', 'Latitude', 'Longitude', 'Postcode', 'Anum', 'Arrondissement', 'City', 'Country'])
arronds['Arrondissement'] = df['Arrondissement(Districts)']
arronds['Quartier'] = df['Quartiers(Quarters).1']
arronds['Qrts'] = df['Quartiers(Quarters)']
arronds['City'] = 'Paris'
arronds['Country'] = 'FR'
arronds['Pop'] = df['Population in1999[3]']
arronds['Density']  = arronds['Pop'].astype(float) / df['Area(hectares)[3]'].astype(float)
arronds['Postcode'] = post['Postcode'].astype(str)
arronds['Postcode'] = arronds['Postcode'].str[:5]

# arronds keeps the default 0-79 index inherited from df, so no reset_index call is needed
arronds.head()

Unnamed: 0,Quartier,Qrts,Pop,Density,Latitude,Longitude,Postcode,Anum,Arrondissement,City,Country
0,Saint-Germain-l'Auxerrois,1st,1672,19.240506,,,75101,1,"1st arrondissement(Called ""du Louvre"")",Paris,FR
1,Les Halles,2nd,8984,218.058252,,,75101,1,"1st arrondissement(Called ""du Louvre"")",Paris,FR
2,Palais-Royal,3rd,3195,116.605839,,,75101,1,"1st arrondissement(Called ""du Louvre"")",Paris,FR
3,Place-Vendôme,4th,3044,113.159851,,,75101,1,"1st arrondissement(Called ""du Louvre"")",Paris,FR
4,Gaillon,5th,1345,71.542553,,,75102,2,"2nd arrondissement(Called ""de la Bourse"")",Paris,FR
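The Density column is population divided by area in hectares. As a small worked example using the first row of the table (1672 inhabitants on 86.9 hectares), the sketch below also converts to inhabitants per square kilometre. The helper names are hypothetical.

```python
def density_per_hectare(population, area_hectares):
    """Inhabitants per hectare."""
    if area_hectares <= 0:
        raise ValueError('area must be positive')
    return population / area_hectares

def density_per_km2(population, area_hectares):
    """Inhabitants per square kilometre (1 km² = 100 hectares)."""
    return density_per_hectare(population, area_hectares) * 100

# Saint-Germain-l'Auxerrois: 1672 inhabitants, 86.9 hectares
d = density_per_hectare(1672, 86.9)
print(round(d, 6), round(density_per_km2(1672, 86.9), 1))
```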


>*The df_addy DataFrame is created to hold the address and location information.*

In [14]:
column_names = ['Qrts', 'Qnum', 'Quartier', 'addressline1', 'addressline', 'town', 'IsoCode', 'Lat', 'Long', 'Error', 'formatted_address', 'location_type']
df_addy = pd.DataFrame(arronds.Quartier,  columns=column_names)
df_addy['Quartier'] = arronds['Quartier'].map(str)
df_addy['Qrts'] = arronds['Qrts']
df_addy['addressline1'] = arronds['Arrondissement'].map(str)
df_addy['town'] = arronds['City'].map(str) 
df_addy['state'] = arronds['Country'].map(str)
df_addy['IsoCode'] = arronds['Postcode']

def removeNonAscii(addy):
    """Keep only printable ASCII characters (codes 32-125)."""
    return "".join(i for i in addy if 31 < ord(i) < 126)

# Label before the parenthesis, e.g. "1st arrondissement"
df_addy['addressline'] = df_addy['addressline1'].str.split('(').str[0]
df_addy['Add'] = df_addy['Quartier'] + ', ' + df_addy['Qrts'] + ' Quartier' + ',  ' +  df_addy['addressline'] + ', ' + ' Paris, FR  ' + df_addy['IsoCode']
df_addy.to_csv('addresses.csv')

In [15]:
df_addy['coordinates'] = df_addy['Quartier'] + ' Paris, FR'

In [16]:
add = df_addy['coordinates']
addy = pd.DataFrame(df_addy, columns = ['Qnum', 'Qrts', 'Quartier', 'addressline', 'IsoCode', 'Add', 'coordinates'])
addy['Qnum'] = df['Qnum']
addy['Qrts'] = df_addy['Qrts']
addy['Quartier'] = df_addy['Quartier']
addy['coordinates'] = add
addy.head()

Unnamed: 0,Qnum,Qrts,Quartier,addressline,IsoCode,Add,coordinates
0,1,1st,Saint-Germain-l'Auxerrois,1st arrondissement,75101,"Saint-Germain-l'Auxerrois, 1st Quartier, 1st ...","Saint-Germain-l'Auxerrois Paris, FR"
1,2,2nd,Les Halles,1st arrondissement,75101,"Les Halles, 2nd Quartier, 1st arrondissement,...","Les Halles Paris, FR"
2,3,3rd,Palais-Royal,1st arrondissement,75101,"Palais-Royal, 3rd Quartier, 1st arrondissemen...","Palais-Royal Paris, FR"
3,4,4th,Place-Vendôme,1st arrondissement,75101,"Place-Vendôme, 4th Quartier, 1st arrondisseme...","Place-Vendôme Paris, FR"
4,5,5th,Gaillon,2nd arrondissement,75102,"Gaillon, 5th Quartier, 2nd arrondissement, P...","Gaillon Paris, FR"


In [17]:
addy.tail()

Unnamed: 0,Qnum,Qrts,Quartier,addressline,IsoCode,Add,coordinates
75,76,76th,Combat,19th arrondissement,75119,"Combat, 76th Quartier, 19th arrondissement, ...","Combat Paris, FR"
76,77,77th,Belleville,20th arrondissement,75120,"Belleville, 77th Quartier, 20th arrondissemen...","Belleville Paris, FR"
77,78,78th,Saint-Fargeau,20th arrondissement,75120,"Saint-Fargeau, 78th Quartier, 20th arrondisse...","Saint-Fargeau Paris, FR"
78,79,79th,Père-Lachaise,20th arrondissement,75120,"Père-Lachaise, 79th Quartier, 20th arrondisse...","Père-Lachaise Paris, FR"
79,80,80th,Charonne,20th arrondissement,75120,"Charonne, 80th Quartier, 20th arrondissement,...","Charonne Paris, FR"


>*Use the geocoder to pull the longitude and latitude*

In [10]:
location = [x for x in addy['coordinates'].unique().tolist()
            if isinstance(x, str)]

latitude = []
longitude = []
qrts = []

# Create the geocoder once and reuse it for every request
geolocator = Nominatim(user_agent="paris_explorer")

for i, address in enumerate(location):
    # Assumes the 'coordinates' strings are unique, so position i matches row i of addy
    qrt = addy.Qrts[i]
    qrts.append(qrt)
    try:
        loc = geolocator.geocode(address)
        lat, lng = loc.latitude, loc.longitude  # AttributeError if no match (loc is None)
        print('Geo Coordinates: {}, {}, {}.'.format(lat, lng, qrt))
    except Exception:
        lat, lng = np.nan, np.nan
    latitude.append(lat)
    longitude.append(lng)

df_ = pd.DataFrame({'location': location,
                    'location_qrts': qrts,
                    'location_latitude': latitude,
                    'location_longitude': longitude,
                    })


Geo Coordinates: 48.860211199999995, 2.3362988847682233, 1st.
Geo Coordinates: 48.8624659, 2.3460086, 2nd.
Geo Coordinates: 48.863584700000004, 2.3362042200938715, 3rd.
Geo Coordinates: 48.867463400000005, 2.329428116825194, 4th.
Geo Coordinates: 48.869135150000005, 2.332908770335507, 5th.
Geo Coordinates: 48.86885895, 2.3393625582679, 6th.
Geo Coordinates: 48.8680539, 2.344592949731121, 7th.
Geo Coordinates: 48.8706233, 2.3487498, 8th.
Geo Coordinates: 48.8654414, 2.3561316, 9th.
Geo Coordinates: 48.8643317, 2.3626111049585248, 10th.
Geo Coordinates: 48.859571349999996, 2.3625762007242033, 11th.
Geo Coordinates: 48.862699750000004, 2.354135471358302, 12th.
Geo Coordinates: 48.85845555, 2.3517023379560156, 13th.
Geo Coordinates: 48.8555813, 2.3583593578227955, 14th.
Geo Coordinates: 48.85157155, 2.364795174126021, 15th.
Geo Coordinates: 48.85293705, 2.3500501225000026, 16th.
Geo Coordinates: 48.84792605, 2.355269043333334, 17th.
Geo Coordinates: 48.8432224, 2.3595089570948424, 18th.
Ge
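Geocoding 80 addresses against a public service benefits from a pause between requests and from tolerating addresses that return no match. The sketch below is one possible shape for that loop, not the notebook's method: `geocode_all` is a hypothetical helper that accepts any callable returning an object with `latitude` and `longitude` attributes (or `None`), and the demo uses a stand-in geocoder so that it runs without network access.

```python
import math
import time

def geocode_all(addresses, geocode, delay_seconds=1.0):
    """Geocode each address, recording NaN coordinates when a lookup fails."""
    rows = []
    for address in addresses:
        try:
            loc = geocode(address)
        except Exception:
            loc = None  # treat a failed request like a missing match
        if loc is None:
            rows.append((address, math.nan, math.nan))
        else:
            rows.append((address, loc.latitude, loc.longitude))
        time.sleep(delay_seconds)  # pause between requests
    return rows

class FakeLocation:
    """Stand-in for a geocoder result, used only for this demo."""
    def __init__(self, latitude, longitude):
        self.latitude = latitude
        self.longitude = longitude

def fake_geocode(address):
    # Returns a fixed point for one known address and None for anything else
    return FakeLocation(48.86, 2.34) if address == 'Les Halles Paris, FR' else None

rows = geocode_all(['Les Halles Paris, FR', 'Nowhere'], fake_geocode, delay_seconds=0)
print(rows)
```

In the notebook, the callable could be `Nominatim(user_agent="paris_explorer").geocode`, with the default one-second delay kept in place.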

# **Methodology** 

>*This section is the main component of the report: it describes the exploratory data analysis, any inferential statistical testing that was performed, and which machine learning methods were used and why.*


In [224]:
addy['Lat'] = df_['location_latitude']
addy['Lon'] = df_['location_longitude']

addy.head()

Unnamed: 0,Qnum,Qrts,Quartier,addressline,IsoCode,Add,coordinates,Lat,Lon
0,1,1st,Saint-Germain-l'Auxerrois,1st arrondissement,75101,"Saint-Germain-l'Auxerrois, 1st Quartier, 1st ...","Saint-Germain-l'Auxerrois Paris, FR",48.860211,2.336299
1,2,2nd,Les Halles,1st arrondissement,75101,"Les Halles, 2nd Quartier, 1st arrondissement,...","Les Halles Paris, FR",48.862466,2.346009
2,3,3rd,Palais-Royal,1st arrondissement,75101,"Palais-Royal, 3rd Quartier, 1st arrondissemen...","Palais-Royal Paris, FR",48.863585,2.336204
3,4,4th,Place-Vendôme,1st arrondissement,75101,"Place-Vendôme, 4th Quartier, 1st arrondisseme...","Place-Vendôme Paris, FR",48.867463,2.329428
4,5,5th,Gaillon,2nd arrondissement,75102,"Gaillon, 5th Quartier, 2nd arrondissement, P...","Gaillon Paris, FR",48.869135,2.332909


In [227]:
parcoord = pd.DataFrame(arronds[['Quartier', 'Anum','Latitude', 'Longitude', 'Postcode']])
parcoord['Qrts'] = addy['Qrts']
parcoord['Anum'] = arronds['Anum'].astype(str) +  'e'
parcoord['Quartier'] = arronds['Quartier']
parcoord['Postcode'] = addy['IsoCode']
parcoord['Address'] = addy['Add']
parcoord['Latitude'] = addy['Lat']
parcoord['Longitude'] = addy['Lon']
parcoord['Landmark'] = addy['Quartier']
parcoord['Density'] = arronds['Density']
parcoord['Population'] = arronds['Pop']
parcoord.head()                               

Unnamed: 0,Quartier,Anum,Latitude,Longitude,Postcode,Qrts,Address,Landmark,Density,Population
0,Saint-Germain-l'Auxerrois,1e,48.860211,2.336299,75101,1st,"Saint-Germain-l'Auxerrois, 1st Quartier, 1st ...",Saint-Germain-l'Auxerrois,19.240506,1672
1,Les Halles,1e,48.862466,2.346009,75101,2nd,"Les Halles, 2nd Quartier, 1st arrondissement,...",Les Halles,218.058252,8984
2,Palais-Royal,1e,48.863585,2.336204,75101,3rd,"Palais-Royal, 3rd Quartier, 1st arrondissemen...",Palais-Royal,116.605839,3195
3,Place-Vendôme,1e,48.867463,2.329428,75101,4th,"Place-Vendôme, 4th Quartier, 1st arrondisseme...",Place-Vendôme,113.159851,3044
4,Gaillon,2e,48.869135,2.332909,75102,5th,"Gaillon, 5th Quartier, 2nd arrondissement, P...",Gaillon,71.542553,1345


In [228]:
column_names = ['Qrts', 'Density', 'Population']
dens = pd.DataFrame(parcoord, columns = column_names)
                  
dens.head()

Unnamed: 0,Qrts,Density,Population
0,1st,19.240506,1672
1,2nd,218.058252,8984
2,3rd,116.605839,3195
3,4th,113.159851,3044
4,5th,71.542553,1345


In [282]:
from branca.colormap import linear

colormap = linear.YlGn_09.scale(
    dens.Density.min().astype(float),
    dens.Density.max().astype(float))

print(colormap(5.0))

colormap

#ffffe5ff


In [283]:
dens_dict = dens.set_index('Qrts')['Density']

dens_dict.iloc[67]  # positional lookup, since the index holds labels such as '68th'

321.85776487663276

In [354]:
color_dict = {key: colormap(dens_dict[key]) for key in dens_dict.keys()}
color_dict


{'1st': '#ffffe3ff',
 '2nd': '#89ce80ff',
 '3rd': '#dff3a7ff',
 '4th': '#e0f3a8ff',
 '5th': '#f7fcb9ff',
 '6th': '#ddf2a6ff',
 '7th': '#93d284ff',
 '8th': '#258745ff',
 '9th': '#3ba458ff',
 '10th': '#339a51ff',
 '11th': '#7ac77aff',
 '12th': '#1e8041ff',
 '13th': '#92d283ff',
 '14th': '#68bf71ff',
 '15th': '#a0d889ff',
 '16th': '#e3f4aaff',
 '17th': '#a1d889ff',
 '18th': '#82ca7dff',
 '19th': '#4eb264ff',
 '20th': '#84cb7eff',
 '21st': '#90d182ff',
 '22nd': '#dbf1a4ff',
 '23rd': '#44ad5eff',
 '24th': '#abdd8dff',
 '25th': '#c4e799ff',
 '26th': '#f9fdc4ff',
 '27th': '#bee596ff',
 '28th': '#acdd8eff',
 '29th': '#fcfed2ff',
 '30th': '#d9f0a3ff',
 '31st': '#f3fbb6ff',
 '32nd': '#c0e697ff',
 '33rd': '#40ab5dff',
 '34th': '#f8fdbfff',
 '35th': '#86cc7eff',
 '36th': '#004c2cff',
 '37th': '#7ac77aff',
 '38th': '#31974fff',
 '39th': '#0c723bff',
 '40th': '#278946ff',
 '41st': '#004529ff',
 '42nd': '#0a703aff',
 '43rd': '#006435ff',
 '44th': '#046c38ff',
 '45th': '#6ec274ff',
 '46th': '#278946ff

In [357]:
parcoord.head(1)

Unnamed: 0,Quartier,Anum,Latitude,Longitude,Postcode,Qrts,Address,Landmark,Density,Population
0,Saint-Germain-l'Auxerrois,1e,48.860211,2.336299,75101,1st,"Saint-Germain-l'Auxerrois, 1st Quartier, 1st ...",Saint-Germain-l'Auxerrois,19.240506,1672


In [359]:
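import folium
from folium.plugins import MarkerCluster 

# Center the map on the mean of the geocoded quartier coordinates
m = folium.Map(location=[df_['location_latitude'].mean(), df_['location_longitude'].mean()], zoom_start=12)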

In [639]:
import folium

address = 'Paris, France'

geolocator = Nominatim(user_agent='my-application')
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude

print('Geo coordinates of Paris, FR {}, {}.'.format(latitude, longitude))

AttributeError: 'Location' object has no attribute 'geometry'

In [635]:
paris_map = folium.Map(location=[latitude, longitude], zoom_start=12)
pmap_df = dens.merge(df_, left_on='Qrts', right_on='location_qrts')

# One circle per quartier, colored by its population density via color_dict
for lat, lng, qrts in zip(df_['location_latitude'], df_['location_longitude'], df_['location_qrts']):
    if pd.isna(lat) or pd.isna(lng):
        continue  # skip quartiers that could not be geocoded
    folium.CircleMarker(
        location=[lat, lng],
        radius=10,
        popup=folium.Popup(str(qrts), parse_html=True),
        color=color_dict[qrts],
        fill=True,
        fill_color=color_dict[qrts],
        fill_opacity=0.4,
    ).add_to(paris_map)
  
paris_map

In [636]:
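# Read the Foursquare credentials from environment variables rather than hardcoding them
# (the variable names FOURSQUARE_CLIENT_ID / FOURSQUARE_CLIENT_SECRET are a local convention)
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID']
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET']
VERSION = '20180604'

LIMIT = 30

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)

Your credentials: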
CLIENT_ID: U4KD030Y4D1W4KHOW2TIKATHHUT4Z1NS2XR1T5FHYHTMCMPZ


In [561]:
radius = 1000
LIMIT = 1000

venues = []
Quartier = addy['Quartier']
names = addy['Quartier']

for lat, lng, name in zip(df_['location_latitude'], df_['location_longitude'], names): 
    url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        lat,
        lng,
        radius, 
        LIMIT
    )
    results = requests.get(url).json()['response']['groups'][0]['items']
    venues.append([(
            name, 
            lat, 
            lng, 
            location,
            venue['venue']['name'], 
            venue['venue']['location']['lat'],
            venue['venue']['location']['lng'],
            venue['venue']['location'],
            venue['venue']['categories'][0]['name']) for venue in results])
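The request URL above is assembled with a long format string. As an alternative sketch, the same query can be built from a dictionary with the standard library, which also takes care of escaping. `build_explore_url` is a hypothetical helper, and the credential values in the demo are placeholders.

```python
from urllib.parse import urlencode

EXPLORE_URL = 'https://api.foursquare.com/v2/venues/explore'

def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Build the venues/explore URL with the same parameters as the loop above."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }
    return EXPLORE_URL + '?' + urlencode(params)

# Placeholder credentials, coordinates of the first quartier
url = build_explore_url('ID', 'SECRET', '20180604', 48.860211, 2.336299, 1000, 30)
print(url)
```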
    
    
    

In [562]:
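# Flatten the per-quartier lists of venue tuples into one row per venue
venues_df = pd.DataFrame([item for venue in venues for item in venue])
# Caution: each tuple holds the venue's lat before its lng, so the column labelled
# 'VenueLongitude' contains latitudes and 'VenueLatitude' contains longitudes
venues_df.columns = ['Quartier', 'Latitude', 'Longitude', 'Location', 'VenueName', 'VenueLongitude', 'VenueLatitude', 'VenueLocation','VenueCategory']
df_na = venues_df['Quartier'] != 'Na'
venues_df = venues_df[df_na]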
print(venues_df.shape)
venues_df.head()

(7801, 9)


Unnamed: 0,Quartier,Latitude,Longitude,Location,VenueName,VenueLongitude,VenueLatitude,VenueLocation,VenueCategory
0,Saint-Germain-l'Auxerrois,48.860211,2.336299,"(Paris, Île-de-France, France métropolitaine, ...",Cour Carrée du Louvre,48.86036,2.338543,"{'address': 'Rue de Rivoli', 'crossStreet': 'P...",Pedestrian Plaza
1,Saint-Germain-l'Auxerrois,48.860211,2.336299,"(Paris, Île-de-France, France métropolitaine, ...",Musée du Louvre,48.860847,2.33644,"{'address': 'Rue de Rivoli', 'crossStreet': 'P...",Art Museum
2,Saint-Germain-l'Auxerrois,48.860211,2.336299,"(Paris, Île-de-France, France métropolitaine, ...",La Vénus de Milo (Vénus de Milo),48.859943,2.337234,"{'address': 'Musée du Louvre', 'crossStreet': ...",Exhibit
3,Saint-Germain-l'Auxerrois,48.860211,2.336299,"(Paris, Île-de-France, France métropolitaine, ...",Cour Napoléon,48.861172,2.335088,"{'address': 'Place du Carrousel', 'lat': 48.86...",Plaza
4,Saint-Germain-l'Auxerrois,48.860211,2.336299,"(Paris, Île-de-France, France métropolitaine, ...",Pont des Arts,48.858565,2.337635,"{'address': 'Pont des Arts', 'lat': 48.8585646...",Bridge


In [563]:
venues_df.tail()

Unnamed: 0,Quartier,Latitude,Longitude,Location,VenueName,VenueLongitude,VenueLatitude,VenueLocation,VenueCategory
7796,Charonne,48.854744,2.385356,"(Paris, Île-de-France, France métropolitaine, ...",Virtus,48.850193,2.37806,"{'address': '29 rue de Cotte', 'lat': 48.85019...",French Restaurant
7797,Charonne,48.854744,2.385356,"(Paris, Île-de-France, France métropolitaine, ...",Come a Casa,48.858585,2.382223,"{'address': '7 rue Pache', 'lat': 48.858585, '...",Italian Restaurant
7798,Charonne,48.854744,2.385356,"(Paris, Île-de-France, France métropolitaine, ...",Rivoluzione - Cantine Italienne,48.855417,2.374692,"{'address': '24 rue des Taillandiers', 'lat': ...",Italian Restaurant
7799,Charonne,48.854744,2.385356,"(Paris, Île-de-France, France métropolitaine, ...",Neoness Paris Bastille,48.854032,2.373111,"{'address': '4/6 passage Louis Philippe', 'lat...",Gym / Fitness Center
7800,Charonne,48.854744,2.385356,"(Paris, Île-de-France, France métropolitaine, ...",Comestibles & Marchand de Vins,48.861343,2.378424,"{'address': '3 rue du Général Renault', 'lat':...",Restaurant


In [564]:
venues_df.groupby(['Quartier']).mean(numeric_only=True).tail()


Unnamed: 0_level_0,Latitude,Longitude,VenueLongitude,VenueLatitude
Quartier,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
Sorbonne,48.849123,2.345325,48.849615,2.345539
Val-de-Grâce,48.842213,2.343882,48.843326,2.346032
Vivienne,48.868859,2.339363,48.868143,2.339405
École-Militaire,48.851848,2.304756,48.853671,2.304556
Épinettes,48.893751,2.319856,48.889632,2.320333


In [567]:
venues_df.groupby(['VenueCategory']).count()

Unnamed: 0_level_0,Quartier,Latitude,Longitude,Location,VenueName,VenueLongitude,VenueLatitude,VenueLocation
VenueCategory,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
Accessories Store,1,1,1,1,1,1,1,1
Afghan Restaurant,2,2,2,2,2,2,2,2
African Restaurant,19,19,19,19,19,19,19,19
Alsatian Restaurant,5,5,5,5,5,5,5,5
American Restaurant,13,13,13,13,13,13,13,13
Arepa Restaurant,2,2,2,2,2,2,2,2
Argentinian Restaurant,10,10,10,10,10,10,10,10
Art Gallery,58,58,58,58,58,58,58,58
Art Museum,69,69,69,69,69,69,69,69
Arts & Crafts Store,5,5,5,5,5,5,5,5


In [568]:
print('There are {} unique categories.'.format(len(venues_df['VenueCategory'].unique())))

There are 294 unique categories.


In [571]:
venues_df['VenueCategory'].unique()[:]

array(['Pedestrian Plaza', 'Art Museum', 'Exhibit', 'Plaza', 'Bridge',
       'Historic Site', 'Theater', 'Cosmetics Shop', 'Italian Restaurant',
       'Church', 'Coffee Shop', 'Hotel', 'Café', 'Garden', 'Bar',
       'Museum', 'Park', 'Spa', 'Shoe Store', 'Chinese Restaurant',
       'Tea Room', 'Sandwich Place', 'Udon Restaurant', 'Wine Bar',
       'Cheese Shop', 'Restaurant', 'Furniture / Home Store',
       'Cocktail Bar', 'Ramen Restaurant', 'Fountain',
       'French Restaurant', 'Pastry Shop', 'Japanese Restaurant',
       'Art Gallery', 'Bistro', 'Bakery', 'Korean Restaurant',
       'Thai Restaurant', 'Tapas Restaurant', 'Hotel Bar', 'Bookstore',
       'Wine Shop', 'Toy / Game Store', 'Clothing Store', 'Movie Theater',
       "Women's Store", 'Ice Cream Shop', 'Beer Store',
       'Lebanese Restaurant', 'Szechuan Restaurant',
       'Peruvian Restaurant', 'Souvenir Shop', 'Gastropub',
       'Gym / Fitness Center', 'Greek Restaurant', 'Sushi Restaurant',
       'Fish & Chip

In [573]:
'Furniture / Home Store' in venues_df['VenueCategory'].unique()
'Beach Bar' in venues_df['VenueCategory'].unique()

True

In [588]:
# paris_hot is the one-hot encoded venue table built with pd.get_dummies in cell In [624] below
paris_hot_grouped = paris_hot.groupby(['Quartier']).sum().reset_index()

print(paris_hot_grouped.shape)
paris_hot_grouped

(80, 295)


Unnamed: 0,Quartier,Accessories Store,Afghan Restaurant,African Restaurant,Alsatian Restaurant,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,...,Video Game Store,Vietnamese Restaurant,Vineyard,Water Park,Wine Bar,Wine Shop,Women's Store,Yoga Studio,Zoo,Zoo Exhibit
0,Amérique,0,0,0,0,0,0,0,1,1,...,0,0,0,0,0,0,0,0,0,0
1,Archives,0,0,0,0,0,0,0,5,2,...,0,1,0,0,3,1,0,0,0,0
2,Arsenal,0,0,0,0,0,0,0,2,1,...,0,0,0,0,2,1,0,0,0,0
3,Arts-et-Métiers,0,0,0,1,0,0,0,2,2,...,0,1,0,0,2,1,1,0,0,0
4,Auteuil,0,0,0,0,0,0,0,0,1,...,0,1,0,0,0,0,0,0,0,0
5,Batignolles,0,0,0,0,0,0,0,1,0,...,0,1,0,0,6,0,0,0,0,0
6,Bel-Air,0,0,0,0,0,0,0,0,0,...,0,1,0,0,1,1,0,0,0,0
7,Belleville,0,0,2,0,0,0,0,0,0,...,0,1,0,0,2,1,0,0,0,0
8,Bercy,0,0,0,0,0,0,0,0,0,...,0,1,0,0,1,0,0,0,0,0
9,Bonne-Nouvelle,0,0,0,0,0,0,0,0,0,...,0,0,0,0,7,1,2,0,0,0


>*Find the number of quartiers with Creperies, Coffee Shops, and Chinese Restaurants, and compare the counts to the population density of each quartier*

In [592]:
len(paris_hot_grouped[paris_hot_grouped['Creperie'] > 0])

46

In [599]:
len(paris_hot_grouped[paris_hot_grouped['Coffee Shop'] > 0])

65

In [617]:
len(paris_hot_grouped[paris_hot_grouped['Chinese Restaurant'] > 0])

33

In [618]:
paris_crepes = paris_hot_grouped[['Quartier','Creperie','Coffee Shop', 'Chinese Restaurant']]

In [619]:
paris_crepes_group = paris_crepes.groupby(['Quartier']).sum().reset_index()
paris_crepes_group.head()

Unnamed: 0,Quartier,Creperie,Coffee Shop,Chinese Restaurant
0,Amérique,0,0,1
1,Archives,2,5,0
2,Arsenal,2,2,0
3,Arts-et-Métiers,1,6,2
4,Auteuil,0,0,0


In [623]:
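# Caution: paris_crepes_group is sorted alphabetically by Quartier, while parcoord is in
# quarter-number order. Assigning columns by index therefore pairs each quartier name with
# another quartier's counts (row 0 below shows the counts for Amérique).
par_d = pd.DataFrame(paris_hot_grouped, columns = ['Quartier', 'Creperies', 'Cafes', 'Chinese', 'Density (pop per hectare)'])
par_d['Quartier'] = parcoord['Quartier']
par_d['Creperies'] = paris_crepes_group['Creperie']
par_d['Cafes'] = paris_crepes_group['Coffee Shop']
par_d['Chinese'] = paris_crepes_group['Chinese Restaurant']
par_d['Density (pop per hectare)'] = parcoord['Density'].astype(int)
par_d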

Unnamed: 0,Quartier,Creperies,Cafes,Chinese,Density (pop per hectare)
0,Saint-Germain-l'Auxerrois,0,0,1,19
1,Les Halles,2,5,0,218
2,Palais-Royal,2,2,0,116
3,Place-Vendôme,1,6,2,113
4,Gaillon,0,0,0,71
5,Vivienne,2,1,0,119
6,Mail,1,0,1,208
7,Bonne-Nouvelle,1,2,2,340
8,Arts-et-Métiers,1,4,2,300
9,Enfants-Rouges,1,0,0,314
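Because the grouped venue counts are ordered alphabetically by quartier while the density table follows the quarter numbering, pairing the two by key rather than by row position is safer. The sketch below shows the idea with `DataFrame.merge`. The two small frames hold illustrative values only, not the notebook's data.

```python
import pandas as pd

# Illustrative values only: density in quarter-number order, counts in alphabetical order
density = pd.DataFrame({'Quartier': ['Les Halles', 'Archives', 'Amérique'],
                        'Density': [218, 300, 90]})
counts = pd.DataFrame({'Quartier': ['Amérique', 'Archives', 'Les Halles'],
                       'Creperie': [0, 2, 1]})

# Join on the quartier name so the row order of either table does not matter;
# validate raises if a quartier name appears more than once on either side
merged = density.merge(counts, on='Quartier', how='left', validate='one_to_one')
print(merged)
```

In the notebook, the same call could join `parcoord[['Quartier', 'Density']]` with `paris_crepes_group` on `'Quartier'`, assuming the quartier names are spelled identically in both tables.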


In [624]:
paris_hot = pd.get_dummies(venues_df[['VenueCategory']], prefix="", prefix_sep="")

paris_hot['Quartier'] = venues_df['Quartier'] 

fixed_columns = [paris_hot.columns[-1]] + list(paris_hot.columns[:-1])
paris_hot = paris_hot[fixed_columns]

print(paris_hot.shape)

paris_hot.head()

(7801, 295)


Unnamed: 0,Quartier,Accessories Store,Afghan Restaurant,African Restaurant,Alsatian Restaurant,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,...,Video Game Store,Vietnamese Restaurant,Vineyard,Water Park,Wine Bar,Wine Shop,Women's Store,Yoga Studio,Zoo,Zoo Exhibit
0,Saint-Germain-l'Auxerrois,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Saint-Germain-l'Auxerrois,0,0,0,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0
2,Saint-Germain-l'Auxerrois,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Saint-Germain-l'Auxerrois,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Saint-Germain-l'Auxerrois,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0



# **Discussion**

>*This section discusses the observations noted and any recommendations that can be made based on the results.*

# **Conclusion**
>*This section concludes the report.*