# Objectives
Analyse the number of cyclists in Paris

* Quantify the rise in the number of cyclists in Paris
    * Get data on traffic and accidents
    * Bike lane construction
* Correlate accidents with time of day, road conditions, and gender
* Is the increase in car traffic leading to more bike accidents?
* Is the increase in bike lanes helping to decrease bike accidents?
* Is the increase in bike traffic leading to more bike lanes? If so, in which areas?

# Data that we might need (to scrape)
* Public investment (bike lane construction, public incentives to buy bikes)
* Car traffic in Paris
* Number of bicycles sold in Paris (we might have info about sales of electric bikes in Paris)
* Average salary in Paris
* Average bike prices

In [1]:
import pandas as pd

In [4]:
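# Import the accidents dataset; it loads without issues using read_csv's default comma separator
# low_memory=False parses the file in one pass so that column dtypes are inferred consistently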
accidents = pd.read_csv('https://www.data.gouv.fr/en/datasets/r/3d5f2317-5afd-4a9f-a9c5-bd4fe0113f39', low_memory=False)

Unnamed: 0,identifiant accident,date,mois,jour,heure,departement,commune,lat,lon,en agglomeration,...,existence securite,usage securite,obstacle fixe heurte,obstacle mobile heurte,localisation choc,manoeuvre avant accident,identifiant vehicule,type autres vehicules,manoeuvre autres vehicules,nombre autres vehicules
0,200500000030,2005-01-13,01 - janvier,3 - jeudi,19.0,62,62331,50.3,2.84,oui,...,,,,Véhicule,Côté gauche,Changeant de file à gauche,200500000030B02,Transport en commun,Dépassant à gauche,1.0
1,200500000034,2005-01-19,01 - janvier,2 - mercredi,10.0,62,62022,0.0,0.0,non,...,,,,Véhicule,Avant,Sans changement de direction,200500000034B02,"VU seul 1,5T <= PTAC <= 3,5T avec ou sans remo...",Tournant à gauche,1.0
2,200500000078,2005-01-26,01 - janvier,2 - mercredi,13.0,2,2173,0.0,0.0,non,...,Casque,Non,,Véhicule,Avant,Sans changement de direction,200500000078B02,VL seul,Tournant à gauche,1.0
3,200500000093,2005-01-03,01 - janvier,0 - lundi,13.0,2,2810,49.255,3.094,oui,...,,,,Véhicule,Avant gauche,Manœuvre d’évitement,200500000093B02,VL seul,Manœuvre d’évitement,1.0
4,200500000170,2005-01-29,01 - janvier,5 - samedi,18.0,76,76196,0.0,0.0,non,...,Autre,Oui,,Véhicule,Arrière,"Même sens, même file",200500000170A01,"VU seul 1,5T <= PTAC <= 3,5T avec ou sans remo...","Même sens, même file",1.0
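One of the objectives is to relate accidents to the time of day. A minimal sketch of a first pass, assuming the `heure` and `jour` columns look like the preview above. `accidents_sample` is a hand-copied stand-in for the full `accidents` dataframe, and counting unique `identifiant accident` values assumes (not verified here) that one accident can span several rows:

```python
import pandas as pd

# Rows copied from the preview above; the real dataframe is `accidents`.
accidents_sample = pd.DataFrame({
    "identifiant accident": [200500000030, 200500000034, 200500000078,
                             200500000093, 200500000170],
    "jour": ["3 - jeudi", "2 - mercredi", "2 - mercredi", "0 - lundi", "5 - samedi"],
    "heure": [19.0, 10.0, 13.0, 13.0, 18.0],
})

# Number of distinct accidents per hour of day, ordered by hour.
accidents_per_hour = (
    accidents_sample.dropna(subset=["heure"])
    .groupby("heure")["identifiant accident"]
    .nunique()
)

# Number of rows per weekday, ordered by the numeric prefix in `jour`.
accidents_per_day = accidents_sample["jour"].value_counts().sort_index()

print(accidents_per_hour)
print(accidents_per_day)
```

On the full dataset the same two aggregations could be plotted as bar charts to see peak hours and weekdays.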


In [12]:
# Import the bike lanes dataset
# This file uses ';' as its field separator instead of a comma, so read_csv with
# the default separator has trouble parsing it.
# Skipping the rows that fail to parse (error_bad_lines) only hides the problem:
# those rows are dropped and the resulting dataframe is still badly parsed.
# Passing sep=';' solves it.
bike_lanes = pd.read_csv('https://www.data.gouv.fr/en/datasets/r/1211e838-4b77-4ee4-9567-03d78d55f0bf', sep=';')
bike_lanes

Unnamed: 0,Typologie,Aménagement bidirectionnel,Régime de vitesse,Sens vélo,Voie,Arrondissement,Bois,Longueur du tronçon en m,Longueur du tronçon en km,Position aménagement,Circulation générale interdite,Piste,Couloir bus,Continuité cyclable,Réseau cyclable,Date de livraison,geo_shape,geo_point_2d
0,Couloirs de bus ouverts aux vélos,Non,Voie 50,Sens de circulation générale,BOULEVARD DE SEBASTOPOL,4.0,Non,152.999231,0.152999,,,,Marqué,,,,"{""type"": ""LineString"", ""coordinates"": [[2.3494...","48.8612701514,2.34981742995"
1,Pistes cyclables,Non,Voie 50,Sens de circulation générale,BOULEVARD VINCENT AURIOL,13.0,Non,14.263221,0.014263,Latéral,,Niveau chaussée,,,,2019-09-15,"{""type"": ""LineString"", ""coordinates"": [[2.3671...","48.8345003893,2.36725670179"
2,Couloirs de bus ouverts aux vélos,Oui,Voie 50,Sens de circulation générale,BOULEVARD DE L HOPITAL,13.0,Non,17.421704,0.017422,,,,Protégé,,,2008-12-31,"{""type"": ""LineString"", ""coordinates"": [[2.3619...","48.84004249,2.36196441128"
3,Couloirs de bus ouverts aux vélos,Oui,Voie 50,Sens de circulation générale,PLACE VALHUBERT,13.0,Non,7.371715,0.007372,,,,Protégé,,,2008-12-31,"{""type"": ""LineString"", ""coordinates"": [[2.3648...","48.8437491231,2.36486970584"
4,Bandes cyclables,Non,Voie 50,Sens de circulation générale,AVENUE D ITALIE,13.0,Non,172.387831,0.172388,,,,,,,2005-12-31,"{""type"": ""LineString"", ""coordinates"": [[2.3586...","48.8213113974,2.35888981211"
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
11949,Pistes cyclables,Non,Voie 50,Sens de circulation générale,Avenue des Champs-Elysées,8.0,Non,8.093520,0.008094,Latéral,Non,Niveau chaussée,,Passage piéton,REVe,2018-12-31,"{""type"": ""LineString"", ""coordinates"": [[2.3051...","48.8707118641,2.30508325736"
11950,Bandes cyclables,,Zone 30,Contresens,Rue Gérando,9.0,,24.791335,0.024791,,,,,Traversée de carrefour,,2018-05-31,"{""type"": ""LineString"", ""coordinates"": [[2.3456...","48.8823341253,2.34571978805"
11951,Couloirs de bus ouverts aux vélos,Non,Voie 50,Sens de circulation générale,RUE DE LA CONVENTION,15.0,Non,128.784477,0.128784,,,,Marqué,,,,"{""type"": ""LineString"", ""coordinates"": [[2.2947...","48.8376085136,2.29549533312"
11952,Autres itinéraires cyclables (ex : Aires piéto...,,Zone 30,Contresens,Rue Hélène Brion,13.0,,233.651642,0.233652,,,,,,,2020-03-31,"{""type"": ""LineString"", ""coordinates"": [[2.3807...","48.8286149251,2.38210139819"
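The separator problem above can be caught before loading by inspecting the header line. The following is only a sketch: `guess_separator` is a hypothetical helper, not part of pandas, and `raw_text` is a small in-memory stand-in for the downloaded file that reuses a few column names and values from the preview.

```python
import io

import pandas as pd


def guess_separator(header_line, candidates=(",", ";", "\t", "|")):
    """Return the candidate separator that occurs most often in the header line."""
    counts = {sep: header_line.count(sep) for sep in candidates}
    return max(counts, key=counts.get)


# Stand-in for the real file; only three columns are kept for brevity.
raw_text = (
    "Typologie;Voie;Arrondissement\n"
    "Pistes cyclables;BOULEVARD VINCENT AURIOL;13\n"
    "Bandes cyclables;AVENUE D ITALIE;13\n"
)

separator = guess_separator(raw_text.splitlines()[0])
bike_lanes_sample = pd.read_csv(io.StringIO(raw_text), sep=separator)

print(repr(separator), bike_lanes_sample.shape)
```

Counting characters in the header is a rough heuristic. It can be misled by headers that contain quoted commas, so it is worth checking the resulting column list by eye.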


In [13]:
# Import the traffic dataset
# Same separator issue as the bike lanes dataset: read_csv raises no error with
# the default separator, but the resulting dataframe is badly parsed.
# Passing sep=';' solves it.
traffic = pd.read_csv('https://www.data.gouv.fr/en/datasets/r/237382af-0e7a-4ef8-9508-b3e9e78adcfd', sep=';')

In [14]:
traffic

Unnamed: 0,Identifiant du compteur,Nom du compteur,Identifiant du site de comptage,Nom du site de comptage,Comptage horaire,Date et heure de comptage,Date d'installation du site de comptage,Lien vers photo du site de comptage,Coordonnées géographiques
0,100003096-SC,97 avenue Denfert Rochereau SO-NE,100003096,97 avenue Denfert Rochereau SO-NE,1,2019-08-01T04:00:00+02:00,2012-02-22,https://www.eco-visio.net/Photos/100003096/157...,"48.83511,2.33338"
1,100003096-SC,97 avenue Denfert Rochereau SO-NE,100003096,97 avenue Denfert Rochereau SO-NE,0,2019-08-01T06:00:00+02:00,2012-02-22,https://www.eco-visio.net/Photos/100003096/157...,"48.83511,2.33338"
2,100003096-SC,97 avenue Denfert Rochereau SO-NE,100003096,97 avenue Denfert Rochereau SO-NE,0,2019-08-01T03:00:00+02:00,2012-02-22,https://www.eco-visio.net/Photos/100003096/157...,"48.83511,2.33338"
3,100003096-SC,97 avenue Denfert Rochereau SO-NE,100003096,97 avenue Denfert Rochereau SO-NE,6,2019-08-01T07:00:00+02:00,2012-02-22,https://www.eco-visio.net/Photos/100003096/157...,"48.83511,2.33338"
4,100003096-SC,97 avenue Denfert Rochereau SO-NE,100003096,97 avenue Denfert Rochereau SO-NE,21,2019-08-01T08:00:00+02:00,2012-02-22,https://www.eco-visio.net/Photos/100003096/157...,"48.83511,2.33338"
...,...,...,...,...,...,...,...,...,...
762305,100063173-SC,74 Boulevard Ornano S-N,100063173,74 Boulevard Ornano S-N,0,2020-09-17T07:00:00+02:00,2020-07-22,https://www.eco-visio.net/Photos/100063173/159...,"48.896825,2.345648"
762306,100063173-SC,74 Boulevard Ornano S-N,100063173,74 Boulevard Ornano S-N,0,2020-09-17T10:00:00+02:00,2020-07-22,https://www.eco-visio.net/Photos/100063173/159...,"48.896825,2.345648"
762307,100063173-SC,74 Boulevard Ornano S-N,100063173,74 Boulevard Ornano S-N,0,2020-09-17T13:00:00+02:00,2020-07-22,https://www.eco-visio.net/Photos/100063173/159...,"48.896825,2.345648"
762308,100063173-SC,74 Boulevard Ornano S-N,100063173,74 Boulevard Ornano S-N,0,2020-09-17T15:00:00+02:00,2020-07-22,https://www.eco-visio.net/Photos/100063173/159...,"48.896825,2.345648"
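To quantify the rise in cyclists, the hourly counts can be summed per month. A minimal sketch, assuming the column names shown in the preview. `traffic_sample` reuses only the rows visible above and stands in for the full `traffic` dataframe. `utc=True` is used because the timestamps carry UTC offsets, which may differ between summer and winter.

```python
import pandas as pd

# Rows copied from the preview above; the real dataframe is `traffic`.
traffic_sample = pd.DataFrame({
    "Identifiant du compteur": ["100003096-SC"] * 5 + ["100063173-SC"] * 4,
    "Comptage horaire": [1, 0, 0, 6, 21, 0, 0, 0, 0],
    "Date et heure de comptage": [
        "2019-08-01T04:00:00+02:00", "2019-08-01T06:00:00+02:00",
        "2019-08-01T03:00:00+02:00", "2019-08-01T07:00:00+02:00",
        "2019-08-01T08:00:00+02:00", "2020-09-17T07:00:00+02:00",
        "2020-09-17T10:00:00+02:00", "2020-09-17T13:00:00+02:00",
        "2020-09-17T15:00:00+02:00",
    ],
})

# Parse to timezone-aware timestamps, then express them in Paris local time.
timestamps = pd.to_datetime(traffic_sample["Date et heure de comptage"], utc=True)
local_time = timestamps.dt.tz_convert("Europe/Paris")

# Total bikes counted per calendar month (label format YYYY-MM).
month = local_time.dt.strftime("%Y-%m").rename("month")
monthly_counts = traffic_sample.groupby(month)["Comptage horaire"].sum()

print(monthly_counts)
```

The preview shows counters installed at different dates (2012 and 2020), so raw monthly totals mix growth in cycling with growth in the number of counters. Comparing the same counters over time, or averaging per counter, would probably be a fairer measure.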


## Columns that we don't need from accidents
* Circulation (143 non-missing values)
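Columns like this one can be found by counting non-missing values per column. The sketch below uses a toy dataframe. The `Circulation` values and the `min_non_missing` threshold are placeholders chosen for illustration, not taken from the real data.

```python
import numpy as np
import pandas as pd

# Toy stand-in for `accidents`: `Circulation` is almost entirely missing.
accidents_sample = pd.DataFrame({
    "identifiant accident": [200500000030, 200500000034, 200500000078, 200500000093],
    "heure": [19.0, 10.0, 13.0, 13.0],
    "Circulation": [np.nan, np.nan, np.nan, "placeholder"],
})

# DataFrame.count() returns the number of non-missing values per column.
non_missing = accidents_sample.count()

# Hypothetical threshold: drop columns with fewer non-missing values than this.
min_non_missing = 2
sparse_columns = non_missing[non_missing < min_non_missing].index.tolist()

accidents_trimmed = accidents_sample.drop(columns=sparse_columns)

print(non_missing)
print(sparse_columns)
```

On the real dataframe, a threshold expressed as a share of `len(accidents)` may be easier to reason about than an absolute count.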


### We can use pd.merge to merge the postal code dataframe with the main dataframe
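A minimal sketch of such a merge, assuming the main dataframe is `accidents` and is joined on its `commune` code. The `postal_codes` dataframe, its column names (`code_commune`, `code_postal`), and its values are hypothetical placeholders, since the postal code dataset is not loaded in this notebook.

```python
import pandas as pd

# Stand-in for `accidents`, with commune codes copied from the preview.
accidents_sample = pd.DataFrame({
    "identifiant accident": [200500000030, 200500000034, 200500000078],
    "commune": [62331, 62022, 2173],
})

# Hypothetical lookup table; the postal codes are made-up placeholders.
postal_codes = pd.DataFrame({
    "code_commune": [62331, 62022],
    "code_postal": ["00001", "00002"],
})

merged = pd.merge(
    accidents_sample,
    postal_codes,
    how="left",              # keep every accident, even without a match
    left_on="commune",
    right_on="code_commune",
    validate="many_to_one",  # fail if a commune code repeats in the lookup
    indicator=True,          # adds a `_merge` column showing the match status
)

print(merged)
```

The `_merge` column makes unmatched accidents easy to spot (`left_only`), which helps catch key mismatches such as leading zeros lost when commune codes are read as integers.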