# Battle of the Neighborhoods - Week 1

This notebook is the final project for the 'Applied Data Science Capstone' course.
I am a Lebanese student at one of France's engineering schools. I will arrive in Paris in a few weeks, so I will study how the city's population is spread across its "Arrondissements".
I will take data about the arrondissements of Paris from a Wikipedia page, analyze it with pandas, numpy and other libraries, then move on to visualization and draw conclusions.

This project may interest any student who, like me, is moving to Paris and wants to understand the city's structure and population spread.

In [51]:
# Let us start by importing pandas, numpy, Beautiful Soup and requests
import pandas as pd
import numpy as np
from bs4 import BeautifulSoup
import requests

In [52]:
# Let us now get the data
link = "https://en.wikipedia.org/wiki/Arrondissements_of_Paris"
source = requests.get(link).text
soup = BeautifulSoup(source, 'html.parser')  # the page is HTML, so use an HTML parser rather than 'xml'
table = soup.find('table', {"class":'wikitable sortable'})

In [53]:
# Now we create an empty dataframe using pandas. The columns will be: Arrondissement, Name, Area, Population, Density
df = pd.DataFrame(columns = ['Arrondissement', 'Name', 'Area (Km^2)', 'Population', 'Density (per Km^2)'])

In [65]:
# Filling the data frame: read the <td> cells of each table row and keep the first five
for tr_cell in table.find_all('tr'):
    row_data = []
    for td_cell in tr_cell.find_all('td'):
        row_data.append(td_cell.text.strip())
    # Rows with fewer than five <td> cells are skipped
    if len(row_data) >= 5:
        df.loc[len(df)] = row_data[0:5]
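
The row-filtering rule in this loop (keep only rows with at least five `td` cells) can be tried offline on a small hand-written table. The sketch below is not the notebook's method: it swaps Beautiful Soup for the standard library's `html.parser` so that it runs without network access or extra packages. The sample markup and the `CellCollector` class are hypothetical and only serve the illustration.

```python
from html.parser import HTMLParser

# Hypothetical sample markup, not the real Wikipedia table.
SAMPLE_HTML = """
<table>
  <tr><th>Arrondissement</th><th>Name</th><th>Area</th><th>Population</th><th>Density</th></tr>
  <tr><td>5th</td><td>Panthéon</td><td>2.541 km2</td><td>59631</td><td>23477</td></tr>
  <tr><td>short</td><td>row</td></tr>
</table>
"""


class CellCollector(HTMLParser):
    """Collect the text of <td> cells, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []       # one list of cell texts per <tr>
        self._in_td = False  # True while the parser is inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td" and self.rows:
            self._in_td = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so append to the current cell.
        if self._in_td:
            self.rows[-1][-1] += data


parser = CellCollector()
parser.feed(SAMPLE_HTML)

# Same rule as the notebook loop: keep rows with at least five <td> cells,
# strip whitespace, and keep only the first five cells.
kept_rows = [
    [cell.strip() for cell in row][:5]
    for row in parser.rows
    if len(row) >= 5
]
print(kept_rows)
```

The header row contains only `th` cells and the last row has two `td` cells, so both are dropped and only the Panthéon row remains.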

In [66]:
# Let us now check our dataframe
df

Unnamed: 0,Arrondissement,Name,Area (Km^2),Population,Density (per Km^2)
0,,"Louvre, Bourse, Temple, Hôtel-de-Ville",5.59 km2 (2.16 sq mi),100196,17924
1,,Panthéon,2.541 km2 (0.981 sq mi),59631,23477
2,,Luxembourg,2.154 km2 (0.832 sq mi),41976,19524
3,,Palais-Bourbon,4.088 km2 (1.578 sq mi),52193,12761
4,,Élysée,3.881 km2 (1.498 sq mi),37368,9631
5,,Opéra,2.179 km2 (0.841 sq mi),60071,27556
6,,Entrepôt,2.892 km2 (1.117 sq mi),90836,31431
7,11th (XIe) R,Popincourt,3.666 km2 (1.415 sq mi),147470,40183
8,12th (XIIe) R,Reuilly,16.324 km2 (6.303 sq mi)¹6.377 km2 (2.462 sq mi)²,141287,"8,657¹21,729²"
9,13th (XIIIe) L,Gobelins,7.146 km2 (2.759 sq mi),183399,25650
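
The scraped cells are still strings: areas carry units such as `5.59 km2 (2.16 sq mi)`, and some cells hold two figures with footnote markers, such as `8,657¹21,729²`. Before any numeric comparison they need to be converted. Here is a minimal sketch of one way to do it with the standard `re` module; the helper name `first_number` is hypothetical, and keeping only the first figure of a two-figure cell is an assumption that should be checked against the table's footnotes.

```python
import re


def first_number(text):
    """Return the first number found in a scraped cell as a float.

    Commas are treated as thousands separators and removed first.
    Only ASCII digits are matched, so footnote markers such as the
    superscript "¹" end the number instead of being read as digits.
    """
    match = re.search(r"[0-9]+(?:\.[0-9]+)?", text.replace(",", ""))
    if match is None:
        raise ValueError(f"no number found in {text!r}")
    return float(match.group())


# Sample cells copied from the table above.
print(first_number("5.59 km2 (2.16 sq mi)"))
print(first_number("100196"))
print(first_number("8,657¹21,729²"))
```

With pandas, the same helper could be applied to a whole column, for example `df['Population'].map(first_number)`.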


In [56]:
# Selecting the first 7 rows, which cover the 1st to the 10th arrondissement (the 1st to 4th share one row)
df_paris = df.head(7)
df_paris

Unnamed: 0,Arrondissement,Name,Area (Km^2),Population,Density (per Km^2)
0,Paris Centre 1st (Ier) / 2nd (IIe) / 3rd (IIIe...,"Louvre, Bourse, Temple, Hôtel-de-Ville",5.59 km2 (2.16 sq mi),100196,17924
1,5th (Ve) L,Panthéon,2.541 km2 (0.981 sq mi),59631,23477
2,6th (VIe) L,Luxembourg,2.154 km2 (0.832 sq mi),41976,19524
3,7th (VIIe) L,Palais-Bourbon,4.088 km2 (1.578 sq mi),52193,12761
4,8th (VIIIe) R,Élysée,3.881 km2 (1.498 sq mi),37368,9631
5,9th (IXe) R,Opéra,2.179 km2 (0.841 sq mi),60071,27556
6,10th (Xe) R,Entrepôt,2.892 km2 (1.117 sq mi),90836,31431


Now that we have our data frame, we will closely examine the differences in population density between the arrondissements.

In [57]:
# Let us import matplotlib.pyplot
import matplotlib.pyplot as plt

In [73]:
# We will replace the long arrondissement labels with short codes to make our work easier
data = {'Arrondissement': ['0','5','6','7','8','9','10'],
        'Name': df_paris['Name'],
        'Area': df_paris['Area (Km^2)'],
        'Population': df_paris['Population'],
        'Density (per Km^2)': df_paris['Density (per Km^2)']}

data_paris = pd.DataFrame(data)

In [74]:
# Now we have created a new and final dataframe, where Arrondissement 0 means the 1st, 2nd, 3rd and 4th arrondissements together
data_paris

Unnamed: 0,Arrondissement,Name,Area,Population,Density (per Km^2)
0,0,"Louvre, Bourse, Temple, Hôtel-de-Ville",5.59 km2 (2.16 sq mi),100196,17924
1,5,Panthéon,2.541 km2 (0.981 sq mi),59631,23477
2,6,Luxembourg,2.154 km2 (0.832 sq mi),41976,19524
3,7,Palais-Bourbon,4.088 km2 (1.578 sq mi),52193,12761
4,8,Élysée,3.881 km2 (1.498 sq mi),37368,9631
5,9,Opéra,2.179 km2 (0.841 sq mi),60071,27556
6,10,Entrepôt,2.892 km2 (1.117 sq mi),90836,31431


We can see that population density varies considerably from one arrondissement to another, from 9,631 per km² in the 8th (Élysée) to 31,431 per km² in the 10th (Entrepôt).
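
As a quick sanity check, density can be recomputed as population divided by area and compared with the listed column. The sketch below uses plain Python with the values copied by hand from the final table; the dictionary name `rows` is only for illustration. The recomputed figures agree with the listed ones only approximately, so small differences are to be expected.

```python
# (area in km², population, listed density per km²), copied from the table above.
# Key "0" stands for the 1st to 4th arrondissements together.
rows = {
    "0": (5.59, 100196, 17924),
    "5": (2.541, 59631, 23477),
    "6": (2.154, 41976, 19524),
    "7": (4.088, 52193, 12761),
    "8": (3.881, 37368, 9631),
    "9": (2.179, 60071, 27556),
    "10": (2.892, 90836, 31431),
}

# Recompute density as population / area and print it next to the listed value.
for code, (area, population, listed) in rows.items():
    computed = population / area
    print(f"{code:>2}: computed {computed:8.0f}, listed {listed}")

# Arrondissement codes with the lowest and highest listed density.
lowest = min(rows, key=lambda code: rows[code][2])
highest = max(rows, key=lambda code: rows[code][2])
print("lowest:", lowest, "highest:", highest)
```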