# Research proposal TIL Programming: 
### Working from home and mobility patterns





## Project Group 20

Members and student numbers: 

- Stijn Brons - 5403707
- Philippe Sturm - 5168163
- Naut Linssen - 5032563
- Pelle Boucher - 4885236
- Teun van Wingerden - 5063108


# 1. Introduction
Over the past decade, the Netherlands has witnessed a notable increase in remote work, which has significantly altered various aspects of mobility and commuting patterns. The objective of this research is to examine the impact of remote work on mobility trends, transportation modes and their long-term implications for urban planning and environmental outcomes.


# 2. Research Objective

*Requires data modeling and quantitative research in Transport, Infrastructure & Logistics*

*The primary objective of this research is to quantify and understand the impact of the increase in remote work over the past 10 years on mobility in the Netherlands.*

# 3. Sub-Objectives
- **Commuting Patterns**: To assess how remote work has influenced commuting patterns and reduced the frequency of commuting trips in the past decade.
- **Transport Modes**: To assess how remote work has affected the use of different transport modes (cars, public transport, bicycles) over the past decade.
- **Traffic Congestion and Public Transport**: To investigate how the rise of remote work has impacted traffic congestion and crowding in public transport, especially during peak hours in urban areas.
- **Emissions and Air Quality**: To analyze how remote work has influenced transport-related emissions and air quality in cities.
- **Long-term Trends**: To project long-term trends in remote work adoption and their potential effect on future mobility demand and urban infrastructure planning.

# 4. Research questions
The research questions for this project are:

- **RQ1**: How has the frequency and pattern of commuting changed due to the rise of remote work over the past decade?
- **RQ2**: To what extent has the use of different transport modes (cars, public transport, bicycles) been affected by remote work in the last 10 years?
- **RQ3**: How has remote work influenced traffic congestion and crowding in public transport during peak hours in urban centers?
- **RQ4**: How has the increase in remote work affected transport-related emissions and air quality in Dutch cities?
- **RQ5**: What long-term trends are projected in remote work adoption, and how will they influence future mobility and urban infrastructure planning?


# Contribution Statement

*Be specific. Some of the tasks can be coding (expect everyone to do this), background research, conceptualisation, visualisation, data analysis, data modelling*

**Author 1**: Pelle Boucher - Implementing data in the code, implementing text in the notebook

**Author 2**: Teun van Wingerden - Writing text

**Author 3**: Stijn Brons - Setting up the notebook

**Author 4**: Philippe Sturm - Data Research

**Author 5**: Naut Linssen - Data Research

# Data Used

## Data Sources

All data will be collected from the **CBS Statline Database**. This will be the primary data source for mobility, transport usage, and environmental data over the past 10 years in the Netherlands. (Available at [CBS Statline](https://opendata.cbs.nl)).

- **Mobility and Transport**: Data on trips, modes of transport, travel motives, age, and gender. (Available at [Mobiliteitstrend](https://opendata.cbs.nl/statline/#/CBS/nl/dataset/84755NED/table?ts=1727864298786)).

- **Emissions and Air Quality**: Data on CO2 emissions, particulate matter levels, and other environmental metrics.

- **Working from home**: Data on people working from home, by age group, education level, working hours, and position in the household. (Available at [Working from home](https://opendata.cbs.nl/statline/#/CBS/nl/dataset/83258NED/table?ts=1728062945121)).


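The data are imported from CSV files exported from CBS Statline. For each file, the encoding is first detected with `chardet`, after which the file is read into a pandas DataFrame: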

In [50]:
import pandas as pd
import chardet
import plotly.express as px

# --- MOBILITY AND TRANSPORT ---
dataset_path_mat = 'Mobiliteitstrend__per_rit_en_motief_02102024_122208.csv'

# Detect the file encoding from the first 100,000 bytes
with open(dataset_path_mat, 'rb') as rawdata:
    result = chardet.detect(rawdata.read(100000))

result



{'encoding': 'UTF-8-SIG', 'confidence': 1.0, 'language': ''}

In [43]:
df_mat = pd.read_csv(dataset_path_mat, delimiter=';', encoding='UTF-8-SIG')
df_mat

| Unnamed: 0 | Geslacht | Leeftijd | Vervoerwijzen | Reismotieven | Marges | Perioden | Ritten per persoon per dag (gemiddeld) (aantal) | Afgelegde afstand per rit (gemiddeld) (km) | Reisduur per rit (Minuten) |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Totaal mannen en vrouwen | Totaal | Totaal | Totaal | Waarde | 1999* | 3.41 | 10.96 | 23.72 |
| 1 | Totaal mannen en vrouwen | Totaal | Totaal | Totaal | Waarde | 2003* | 3.30 | 11.04 | 23.65 |
| 2 | Totaal mannen en vrouwen | Totaal | Totaal | Totaal | Waarde | 2007* | 3.20 | 11.07 | 23.88 |
| 3 | Totaal mannen en vrouwen | Totaal | Totaal | Totaal | Waarde | 2011* | 3.09 | 11.21 | 23.76 |
| 4 | Totaal mannen en vrouwen | Totaal | Totaal | Totaal | Waarde | 2015* | 3.03 | 11.36 | 24.14 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1435 | Totaal mannen en vrouwen | 60 tot 65 jaar | Overige vervoerwijze | Overige reismotieven | Waarde | 2019* | . | . | . |
| 1436 | Totaal mannen en vrouwen | 60 tot 65 jaar | Overige vervoerwijze | Overige reismotieven | Waarde | 2020* | . | . | . |
| 1437 | Totaal mannen en vrouwen | 60 tot 65 jaar | Overige vervoerwijze | Overige reismotieven | Waarde | 2021* | . | . | . |
| 1438 | Totaal mannen en vrouwen | 60 tot 65 jaar | Overige vervoerwijze | Overige reismotieven | Waarde | 2022* | . | . | . |
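
The detect-then-read pattern used here recurs for every CBS export in this notebook. As a possible simplification, the sketch below wraps the read step in one helper. It is an assumption, not part of the original notebook: the name `load_cbs_csv` is hypothetical, and instead of calling `chardet` again it defaults to the `UTF-8-SIG` encoding and `;` delimiter reported above.

```python
import pandas as pd


def load_cbs_csv(path, encoding="utf-8-sig", delimiter=";"):
    """Read a semicolon-delimited CBS Statline CSV export into a DataFrame.

    Hypothetical helper: the defaults mirror the encoding that chardet
    reported and the delimiter used elsewhere in this notebook.
    """
    return pd.read_csv(path, delimiter=delimiter, encoding=encoding)
```

With such a helper, each dataset could be loaded in a single line, for example `df_mat = load_cbs_csv(dataset_path_mat)`. If a future export uses a different encoding, it can be passed explicitly.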


In [64]:
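# Keep only the overall totals so that each transport mode yields one line
df_mat_totals = df_mat[(df_mat["Geslacht"] == "Totaal mannen en vrouwen") &
                       (df_mat["Leeftijd"] == "Totaal") &
                       (df_mat["Reismotieven"] == "Totaal") &
                       (df_mat["Marges"] == "Waarde")]
px.line(df_mat_totals, x="Perioden",
        y="Ritten per persoon per dag (gemiddeld) (aantal)",
        color="Vervoerwijzen")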

In [44]:
# --- EMISSIONS AND AIR QUALITY ---
dataset_path_eaq = 'Emissies_naar_lucht__Nederland_totaal_04102024_195121.csv'

# Detect the file encoding from the first 100,000 bytes
with open(dataset_path_eaq, 'rb') as rawdata:
    result = chardet.detect(rawdata.read(100000))

result

{'encoding': 'UTF-8-SIG', 'confidence': 1.0, 'language': ''}

In [46]:
df_eaq = pd.read_csv(dataset_path_eaq, delimiter=";", encoding="UTF-8-SIG")
df_eaq

| Unnamed: 0 | Emissiebronnen | Emissies naar lucht | Perioden | Emissies naar lucht (mln kg) |
|---|---|---|---|---|
| 0 | Vervoer | Kooldioxide (CO2) | 1990 | 30000.0 |
| 1 | Vervoer | Kooldioxide (CO2) | 1995 | 31800.0 |
| 2 | Vervoer | Kooldioxide (CO2) | 2000 | 35600.0 |
| 3 | Vervoer | Kooldioxide (CO2) | 2001 | 36100.0 |
| 4 | Vervoer | Kooldioxide (CO2) | 2002 | 36600.0 |
| ... | ... | ... | ... | ... |
| 2075 | Railverkeer; totaal | Koolmonoxide (CO) | 2019 | 0.2 |
| 2076 | Railverkeer; totaal | Koolmonoxide (CO) | 2020 | 0.2 |
| 2077 | Railverkeer; totaal | Koolmonoxide (CO) | 2021 | 0.2 |
| 2078 | Railverkeer; totaal | Koolmonoxide (CO) | 2022 | 0.2 |


In [38]:
# --- WORKING FROM HOME ---
dataset_path_wfh = 'Werkzame_beroepsbevolking__thuiswerken_04102024_193141.csv'

# Detect the file encoding from the first 100,000 bytes
with open(dataset_path_wfh, 'rb') as rawdata:
    result = chardet.detect(rawdata.read(100000))

result

{'encoding': 'UTF-8-SIG', 'confidence': 1.0, 'language': ''}

In [41]:
df_wfh = pd.read_csv(dataset_path_wfh, delimiter=";", encoding="UTF-8-SIG")
df_wfh

| Unnamed: 0 | Geslacht | Thuiswerker | Positie werkkring | Persoonskenmerken | Perioden | Werkzame beroepsbevolking (x 1 000) |
|---|---|---|---|---|---|---|
| 0 | Totaal mannen en vrouwen | Totaal | Totaal | Totaal personen | 2013 | 8266 |
| 1 | Totaal mannen en vrouwen | Totaal | Totaal | Totaal personen | 2014 | 8214 |
| 2 | Totaal mannen en vrouwen | Totaal | Totaal | Totaal personen | 2015 | 8294 |
| 3 | Totaal mannen en vrouwen | Totaal | Totaal | Totaal personen | 2016 | 8403 |
| 4 | Totaal mannen en vrouwen | Totaal | Totaal | Totaal personen | 2017 | 8579 |
| ... | ... | ... | ... | ... | ... | ... |
| 1619 | Totaal mannen en vrouwen | Geen thuiswerker | Totaal | Positie: overig lid | 2016 | 119 |
| 1620 | Totaal mannen en vrouwen | Geen thuiswerker | Totaal | Positie: overig lid | 2017 | 132 |
| 1621 | Totaal mannen en vrouwen | Geen thuiswerker | Totaal | Positie: overig lid | 2018 | 156 |
| 1622 | Totaal mannen en vrouwen | Geen thuiswerker | Totaal | Positie: overig lid | 2019 | 167 |


In [82]:
df_thuiswerker = df_wfh[(df_wfh['Thuiswerker'] == 'Thuiswerker') & 
                        (df_wfh['Positie werkkring'] == 'Totaal') & 
                        (df_wfh['Persoonskenmerken'] == 'Totaal personen')]


px.line(df_thuiswerker,
        x="Perioden",
        y="Werkzame beroepsbevolking (x 1 000)")
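
The plot above shows absolute counts of home workers, but the size of the working population itself changes over time, so a share may be easier to compare across years. The following is a minimal sketch, assuming the `Thuiswerker` and `Totaal` rows can be paired by year. The `toy_wfh` frame and the `wfh_share_pct` name are illustrative only; the `Totaal` values are taken from the table above, while the `Thuiswerker` values are made-up placeholders, not CBS figures.

```python
import pandas as pd

value_col = "Werkzame beroepsbevolking (x 1 000)"

# Toy frame with the same column names as df_wfh.
# 'Totaal' values come from the table above; the 'Thuiswerker'
# values are made-up placeholders, NOT real CBS figures.
toy_wfh = pd.DataFrame({
    "Thuiswerker": ["Totaal", "Totaal", "Thuiswerker", "Thuiswerker"],
    "Perioden": [2013, 2014, 2013, 2014],
    value_col: [8266, 8214, 2000, 2100],
})

# One row per year, one column per 'Thuiswerker' category
wide = toy_wfh.pivot(index="Perioden", columns="Thuiswerker", values=value_col)

# Share of home workers in the total working population, in percent
wfh_share = (wide["Thuiswerker"] / wide["Totaal"] * 100).rename("wfh_share_pct")
print(wfh_share.round(1))
```

On the real data, the same pivot would presumably be applied after filtering `Positie werkkring` and `Persoonskenmerken` to their totals, as in the cell above.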

# Data Pipeline

## 5. Intended Data Pipeline

### Data Collection
The primary data source for this research will be the CBS Statline database, which provides a wide range of data on mobility, transport, commuting patterns, and environmental factors.

### Data Cleaning
Data from CBS will be pre-processed to remove incomplete records, normalize values across different datasets, and ensure consistency in time-series data.
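
As an illustration of what this cleaning step could look like, the sketch below handles two quirks visible in the raw tables: years with a trailing `*` (e.g. `2015*`) and missing values encoded as `.`. The function name `clean_cbs` and the toy frame `raw` are hypothetical and only mimic the structure of the mobility dataset.

```python
import pandas as pd

trips_col = "Ritten per persoon per dag (gemiddeld) (aantal)"

# Toy rows mimicking the raw mobility table: years carry a trailing '*'
# and missing values are encoded as '.'
raw = pd.DataFrame({
    "Perioden": ["2015*", "2019*", "2020*"],
    trips_col: ["3.03", ".", "."],
})


def clean_cbs(df, value_cols):
    """Return a copy with integer years and numeric value columns."""
    out = df.copy()
    # Strip the trailing '*' marker and convert the year to an integer
    out["Perioden"] = out["Perioden"].astype(str).str.rstrip("*").astype(int)
    for col in value_cols:
        # '.' cannot be parsed as a number and therefore becomes NaN
        out[col] = pd.to_numeric(out[col], errors="coerce")
    return out


clean = clean_cbs(raw, [trips_col])

# Drop records that have no value at all
complete = clean.dropna(subset=[trips_col])
```

Whether incomplete records should be dropped or kept as `NaN` depends on the analysis; for time-series plots, keeping the gaps visible may be preferable.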

### Data Integration
Different datasets (e.g., mobility data, emissions data, public transport usage data) will be integrated using geographic and temporal references to allow for comprehensive analysis.
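
A minimal sketch of such an integration, assuming each dataset has already been reduced to one row per year: the frames can then be joined on the shared `Perioden` column. The column names `trips_per_day` and `wfh_share_pct` are hypothetical, and all numbers are placeholders rather than CBS figures.

```python
import pandas as pd

# Toy yearly series; all numbers are placeholders, NOT CBS figures
trips = pd.DataFrame({
    "Perioden": [2018, 2019, 2020],
    "trips_per_day": [2.8, 2.7, 2.3],
})
wfh = pd.DataFrame({
    "Perioden": [2019, 2020, 2021],
    "wfh_share_pct": [35.0, 45.0, 50.0],
})

# Inner join keeps only the years present in both datasets;
# validate raises an error if a year occurs more than once on either side
merged = trips.merge(wfh, on="Perioden", how="inner", validate="one_to_one")
print(merged)
```

If regional data are added later, the geographic key (for example a municipality column) could be included in `on` alongside `Perioden`.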

### Data Analysis
Quantitative analysis will be performed to answer each research question:
- Time-series analysis of mobility trends over the past decade.
- Statistical comparisons of transport mode usage in urban vs. rural areas.
- Congestion analysis using data on peak-hour traffic volumes.
- Air quality and emissions analysis in cities before and after the rise of remote work.
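
To make the time-series part more concrete, the sketch below computes year-over-year changes and a Pearson correlation between two yearly series. It assumes an integrated table with one row per year; the column names and all values are illustrative placeholders, not results.

```python
import pandas as pd

# Placeholder yearly values, NOT CBS figures
data = pd.DataFrame({
    "Perioden": [2018, 2019, 2020, 2021],
    "trips_per_day": [2.8, 2.7, 2.3, 2.4],
    "wfh_share_pct": [34.0, 35.0, 45.0, 48.0],
}).set_index("Perioden")

# Year-over-year change of each series, in percent
yoy = data.pct_change() * 100

# Pearson correlation between trips per day and the share of home workers
corr = data["trips_per_day"].corr(data["wfh_share_pct"])

print(yoy.round(1))
print(f"Pearson correlation: {corr:.2f}")
```

With only a handful of yearly observations, such a correlation is indicative at best and does not by itself show that remote work caused the change in trips.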

### Visualization
The results will be visualized through:
- Line graphs showing changes in commuting frequency and transport mode usage.
- Heatmaps depicting traffic congestion and emissions levels across regions.
- Geographic visualizations comparing urban and rural mobility trends.


### Time and Spatial Scale

- **Time Scale**: The study will focus on a 10-year period, from 2013 to 2023, with a special focus on the years following the COVID-19 pandemic (2020–2023) due to the rapid adoption of remote work.

- **Spatial Scale**: The geographical boundary will cover the entirety of the Netherlands, with a particular focus on urban centers (e.g., Amsterdam, Rotterdam, Utrecht) compared to rural areas. This will allow for regional comparison and identification of mobility trends in different parts of the country.
