# Road Traffic Accidents in Switzerland

Our project goal is to scrape all traffic accidents from the accident map at http://map.donneesaccidents.ch/

## Data scraping strategy

Accessing http://map.donneesaccidents.ch/ redirects to:<br>
https://map.geo.admin.ch/?topic=vu&lang=fr&bgLayer=ch.swisstopo.pixelkarte-grau&layers=ch.astra.unfaelle-personenschaeden_alle&layers_timestamp=&catalogNodes=1318


Postman parses the following parameters:
<code>
topic:vu
lang:en
bgLayer:ch.swisstopo.pixelkarte-grau
layers:ch.astra.unfaelle-personenschaeden_alle
layers_timestamp:
catalogNodes:1318
</code>

The most important parameter is layers:ch.astra.unfaelle-personenschaeden_alle.<br>
It names the layer that holds every geolocated dot for "Accidents with personal injury", which is the data layer selected here.
<img src="layer_selector.png">
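The parameter list above can also be recovered without Postman. The following is a minimal sketch using only the Python standard library; the helper name `query_parameters` is ours and is not part of the project's `Scripts` package.

```python
from urllib.parse import urlsplit, parse_qs

# URL reached after the redirect described above
MAP_URL = (
    "https://map.geo.admin.ch/?topic=vu&lang=fr"
    "&bgLayer=ch.swisstopo.pixelkarte-grau"
    "&layers=ch.astra.unfaelle-personenschaeden_alle"
    "&layers_timestamp=&catalogNodes=1318"
)


def query_parameters(url):
    """Return the query string of `url` as a dict with one value per name."""
    query = urlsplit(url).query
    # keep_blank_values=True keeps empty parameters such as layers_timestamp
    parsed = parse_qs(query, keep_blank_values=True)
    return {name: values[0] for name, values in parsed.items()}


params = query_parameters(MAP_URL)
for name, value in params.items():
    print(f"{name}:{value}")
```

Keeping blank values matters here because `layers_timestamp` is present in the URL but empty.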

Selecting all kinds of accidents returns the following:<br>
<img src="layer_selector_all.png">
with the layer parameters:<br>
layers:<br>
    &nbsp;ch.astra.unfaelle-personenschaeden_alle,<br>
    &nbsp;ch.astra.unfaelle-personenschaeden_getoetete,<br>
    &nbsp;ch.astra.unfaelle-personenschaeden_fussgaenger,<br>
    &nbsp;ch.astra.unfaelle-personenschaeden_fahrraeder,<br>
    &nbsp;ch.astra.unfaelle-personenschaeden_motorraeder<br>
layers_timestamp:,,,,<br>
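The two parameters above can be assembled programmatically. This is a small sketch under one assumption inferred from the values shown: `layers_timestamp` carries one empty slot per layer, separated by commas. The function name `map_query` is hypothetical.

```python
from urllib.parse import urlencode

LAYERS = [
    "ch.astra.unfaelle-personenschaeden_alle",
    "ch.astra.unfaelle-personenschaeden_getoetete",
    "ch.astra.unfaelle-personenschaeden_fussgaenger",
    "ch.astra.unfaelle-personenschaeden_fahrraeder",
    "ch.astra.unfaelle-personenschaeden_motorraeder",
]


def map_query(layers):
    """Build the map viewer query parameters for the given layers."""
    return {
        "topic": "vu",
        "lang": "en",
        "bgLayer": "ch.swisstopo.pixelkarte-grau",
        "layers": ",".join(layers),
        # Assumption: one empty timestamp per layer, as in the observed ",,,,"
        "layers_timestamp": ",".join([""] * len(layers)),
        "catalogNodes": "1318",
    }


query = map_query(LAYERS)
# safe="," leaves the commas unescaped so the URL stays readable
print("https://map.geo.admin.ch/?" + urlencode(query, safe=","))
```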

Next we want all the data for each layer. Selecting a dot on the map makes the page query the server for the related data.
To retrieve everything, we select all the entries on the map at once, which is done by Ctrl-clicking over the whole area.

This sends one query per "layers" value:
<code>
geometry:443999.04209536605,39001.6733318335,870499.0420953662,303001.67333183356
geometryFormat:geojson
geometryType:esriGeometryEnvelope
imageDisplay:1536,759,96
lang:en
layers:all:<i>LAYER_PARAM</i>
mapExtent:269999.04209536605,9501.673331833561,1037999.042095366,389001.67333183356
returnGeometry:true
tolerance:5
</code><br>
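The query above differs between layers only in its `layers` value, so the parameter set can be generated from a template. The sketch below only builds the parameters; the endpoint that receives them is not given in this write-up, so no request is sent. `identify_params` is a hypothetical helper name.

```python
LAYERS = [
    "ch.astra.unfaelle-personenschaeden_alle",
    "ch.astra.unfaelle-personenschaeden_getoetete",
    "ch.astra.unfaelle-personenschaeden_fussgaenger",
    "ch.astra.unfaelle-personenschaeden_fahrraeder",
    "ch.astra.unfaelle-personenschaeden_motorraeder",
]


def identify_params(layer):
    """Return the observed query parameters with LAYER_PARAM filled in."""
    return {
        "geometry": "443999.04209536605,39001.6733318335,"
                    "870499.0420953662,303001.67333183356",
        "geometryFormat": "geojson",
        "geometryType": "esriGeometryEnvelope",
        "imageDisplay": "1536,759,96",
        "lang": "en",
        "layers": "all:" + layer,
        "mapExtent": "269999.04209536605,9501.673331833561,"
                     "1037999.042095366,389001.67333183356",
        "returnGeometry": "true",
        "tolerance": "5",
    }


# One parameter set per layer, mirroring the one-query-per-layer behaviour
queries = [identify_params(layer) for layer in LAYERS]
for q in queries:
    print(q["layers"])
```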
However, a single query of this kind does not return all the dots on the map. Clicking the "load more results" button on the 'accidents with fatalities' layer sends:
<code>
geometry:443999.04209536605,39001.6733318335,870499.0420953662,303001.67333183356
geometryFormat:geojson
geometryType:esriGeometryEnvelope
imageDisplay:1536,759,96
lang:en
layers:all:ch.astra.unfaelle-personenschaeden_getoetete
mapExtent:136199.04209536605,-28148.32666816644,1134599.042095366,465201.67333183356
<b>offset:200</b>
returnGeometry:true
tolerance:5
</code>
Pressing "load more" until no further results are available ends at offset=1200 (for a total of 1337 objects), i.e. the data entries are loaded 200 at a time.
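The "load more" behaviour suggests a simple pagination loop: request a page, raise the offset by 200, and stop when a page comes back short. The sketch below illustrates that loop with a stand-in for the HTTP call, since the real request code lives in `Scripts.helpers` and is not shown here. The names `fetch_all` and `make_fake_fetch` are hypothetical, and the stopping rule (a page with fewer than 200 records is the last one) is an assumption.

```python
PAGE_SIZE = 200  # step observed in the offset parameter


def fetch_all(fetch_page, layer, page_size=PAGE_SIZE):
    """Collect all records of `layer` by paging through increasing offsets.

    `fetch_page(layer, offset)` must return the list of records for one page.
    """
    records = []
    offset = 0
    while True:
        page = fetch_page(layer, offset)
        records.extend(page)
        # Assumption: a page shorter than page_size means nothing is left
        if len(page) < page_size:
            break
        offset += page_size
    return records


def make_fake_fetch(total):
    """Build a stand-in for the HTTP call that serves `total` dummy records."""
    requested_offsets = []

    def fetch_page(layer, offset):
        requested_offsets.append(offset)
        last = min(offset + PAGE_SIZE, total)
        return [{"layer": layer, "n": n} for n in range(offset, last)]

    return fetch_page, requested_offsets


fake_fetch, offsets = make_fake_fetch(total=1337)
records = fetch_all(fake_fetch, "ch.astra.unfaelle-personenschaeden_getoetete")
print(len(records), offsets[-1])
```

With 1337 dummy records the last requested offset is 1200, which matches what was observed in the browser.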

## JSON Data scraping

In [1]:
import requests
import json

import numpy as np
import pandas as pd

import matplotlib.pyplot as plt

from Scripts.helpers import *
from Scripts.plots import *


import pprint
#from bs4 import BeautifulSoup

In [2]:
pp = pprint.PrettyPrinter(indent=4)

In [3]:
#import raw data
data = import_data(all_data = True)

Processing layer : ch.astra.unfaelle-personenschaeden_alle
Layer processed : 90600 records

Processing layer : ch.astra.unfaelle-personenschaeden_getoetete
Layer processed : 1343 records

Processing layer : ch.astra.unfaelle-personenschaeden_fussgaenger
Layer processed : 11738 records

Processing layer : ch.astra.unfaelle-personenschaeden_fahrraeder
Layer processed : 18104 records

Processing layer : ch.astra.unfaelle-personenschaeden_motorraeder
Layer processed : 19676 records

Whole dataset processed : 141461 records



In [4]:
#translate data from german
json_data_preprocessed = preprocess_data(data)

In [5]:
print("Data entry example after clean and reformat:\n")
json_data_preprocessed[0]

Data entry example after clean and reformat:



{'accidentday_fr': 'mardi / 17h-18h / mars 2013',
 'accidenttype_fr': 'dérapage ou perte de maîtrise',
 'accidenttypecode': 0,
 'accidentyear': 2013,
 'bbox': [599334.0, 210608.0, 599334.0, 210608.0],
 'canton': 'BE',
 'coordinates': [[599334.0, 210608.0]],
 'featureId': 'DA1679BAB84B02BEE0430A8394271E0D',
 'fsocommunecode': '0310',
 'geometryType': 'Feature',
 'id': 'DA1679BAB84B02BEE0430A8394271E0D',
 'label': 'Schleuder- oder Selbstunfall',
 'layerBodId': 'ch.astra.unfaelle-personenschaeden_alle',
 'layerName': 'Accidents avec dommages corporels',
 'roadtype_fr': 'route principale',
 'roadtypecode': 432,
 'severitycategory_fr': 'accident avec blessés légers',
 'severitycategorycode': 'ULV',
 'type': 'Feature'}

In [6]:
df = pd.DataFrame.from_dict(json_data_preprocessed)
df.set_index('id', inplace=True)
df.head()

| id | accidentday_fr | accidenttype_fr | accidenttypecode | accidentyear | bbox | canton | coordinates | featureId | fsocommunecode | geometryType | label | layerBodId | layerName | roadtype_fr | roadtypecode | severitycategory_fr | severitycategorycode | type |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DA1679BAB84B02BEE0430A8394271E0D | mardi / 17h-18h / mars 2013 | dérapage ou perte de maîtrise | 0 | 2013 | [599334.0, 210608.0, 599334.0, 210608.0] | BE | [[599334.0, 210608.0]] | DA1679BAB84B02BEE0430A8394271E0D | 310 | Feature | Schleuder- oder Selbstunfall | ch.astra.unfaelle-personenschaeden_alle | Accidents avec dommages corporels | route principale | 432 | accident avec blessés légers | ULV | Feature |
| DA1683A9406C0114E0430A8394277AA8 | mardi / 17h-18h / mars 2013 | dérapage ou perte de maîtrise | 0 | 2013 | [596996.0, 198498.0, 596996.0, 198498.0] | BE | [[596996.0, 198498.0]] | DA1683A9406C0114E0430A8394277AA8 | 351 | Feature | Schleuder- oder Selbstunfall | ch.astra.unfaelle-personenschaeden_alle | Accidents avec dommages corporels | autoroute | 430 | accident avec blessés légers | ULV | Feature |
| DA249317CD4E01D0E0430A839427E8A4 | mardi / 13h-14h / mars 2013 | accident impliquant des piétons | 8 | 2013 | [604459.0, 202727.0, 604459.0, 202727.0] | BE | [[604459.0, 202727.0]] | DA249317CD4E01D0E0430A839427E8A4 | 352 | Feature | Fussgängerunfall | ch.astra.unfaelle-personenschaeden_alle | Accidents avec dommages corporels | route secondaire | 433 | accident avec blessés légers | ULV | Feature |
| DA795D17EC3A02D8E0430A8394278BF3 | mardi / 11h-12h / mars 2013 | accident impliquant des piétons | 8 | 2013 | [585172.0, 220301.0, 585172.0, 220301.0] | BE | [[585172.0, 220301.0]] | DA795D17EC3A02D8E0430A8394278BF3 | 371 | Feature | Fussgängerunfall | ch.astra.unfaelle-personenschaeden_alle | Accidents avec dommages corporels | route secondaire | 433 | accident avec blessés légers | ULV | Feature |
| DA7961683A7F01F4E0430A839427DE79 | mardi / 18h-19h / mars 2013 | dérapage ou perte de maîtrise | 0 | 2013 | [731330.0, 255478.0, 731330.0, 255478.0] | SG | [[731330.0, 255478.0]] | DA7961683A7F01F4E0430A839427DE79 | 3424 | Feature | Schleuder- oder Selbstunfall | ch.astra.unfaelle-personenschaeden_alle | Accidents avec dommages corporels | autoroute | 430 | accident avec blessés légers | ULV | Feature |


In [7]:
for feature in df : 
    print("plotting feature {}".format(feature))
    plot_feature(df, feature)
print("Done plotting")

plotting feature accidentday_fr
plotting feature accidenttype_fr
plotting feature accidenttypecode
plotting feature accidentyear
plotting feature bbox
plotting feature canton
plotting feature coordinates
plotting feature featureId
plotting feature fsocommunecode
plotting feature geometryType
plotting feature label
plotting feature layerBodId
plotting feature layerName
plotting feature roadtype_fr
plotting feature roadtypecode
plotting feature severitycategory_fr
plotting feature severitycategorycode
plotting feature type
Done plotting


# Data analysis

1) Accidents over time<br>
2) Correlation between the number/type of accidents and the location (e.g. drunk driving in Valais)<br>
3) Track anomalies (start/end of a series of accidents) and try to find their cause<br>
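For the first item, one possible starting point is to split the `accidentday_fr` field into its parts so accidents can be counted per hour, month or year. The sketch below assumes every value follows the `weekday / HHh-HHh / month year` pattern seen in the five rows displayed earlier; this has not been checked against the full dataset. The helper `split_accident_day` is hypothetical.

```python
from collections import Counter


def split_accident_day(value):
    """Split e.g. 'mardi / 17h-18h / mars 2013' into its components."""
    weekday, hours, month_year = [part.strip() for part in value.split("/")]
    month, year = month_year.rsplit(" ", 1)
    start_hour = int(hours.split("h")[0])  # '17h-18h' -> 17
    return {
        "weekday": weekday,
        "start_hour": start_hour,
        "month": month,
        "year": int(year),
    }


# accidentday_fr values of the five rows shown in df.head()
samples = [
    "mardi / 17h-18h / mars 2013",
    "mardi / 17h-18h / mars 2013",
    "mardi / 13h-14h / mars 2013",
    "mardi / 11h-12h / mars 2013",
    "mardi / 18h-19h / mars 2013",
]

per_hour = Counter(split_accident_day(v)["start_hour"] for v in samples)
for hour, count in sorted(per_hour.items()):
    print(f"{hour:02d}h: {count}")
```

The same function could be applied to the `accidentday_fr` column of `df` to obtain separate weekday, hour, month and year columns.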