Skip to content

A graph based strategic transport planning dataset, aimed at creating the next generation of deep graph neural networks for transfer learning. Based on simulation results of the Four Step Model in PTV Visum.

nikita6187/TransportPlanningDataset

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

31 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Strategic Transport Planning Dataset

A graph based strategic transport planning dataset, aimed at creating the next generation of deep graph neural networks in transfer learning situations. Based on simulation results of the Four Step Model in PTV Visum.

Details of the work as well as results can be found in the thesis "Development of a Deep Learning Surrogate for the Four-Step Transportation Model".

drawing

Example of prediction of a city's congestion level, using only socioeconomic and network information as input

State of the Art

Dataset Model Metric Test Performance Publication
Classification Simple - Private Transport GCNII Average F1 0.87 Makarov, 2021
Classification Simple - Public Transport GCNII Average F1 0.78 Makarov, 2021
Classification Hard - Private Transport GCNII MAE >=10 118.5 Makarov, 2021
Classification Hard - Public Transport GCNII MAE >=10 131.8 Makarov, 2021
Regression - Private Transport GCNII MAE >=10 135.4* Makarov, 2021
Regression - Public Transport GCNII MAE >=10 141.2* Makarov, 2021

If you reach a better performance, please issue a pull request with the new values!

* Regression model used only augmented dataset for training data

"MAE >=10" is the mean absolute error, but only applied to test samples where the target has at least 10 units/h. This is done as the lower values can be predicted accurately and are less interesting for the domain. An implementation can be found in the examples and details in the thesis above.

Dataset Overview

Dataset Example Training Samples Validation Samples Test Samples Task Primary Metric Graph Sizes Private Transport Public Transport Input & Output Transformed
1_classification_simple Example 6398 1600* 2000* Classification - 3 classes Average F1 15 - 80 original nodes Yes Yes Yes
2_classification_hard Example 16393 1600* 2000* Classification - 51 classes MAE >=10 15 - 80 original nodes Yes Yes Yes
3_regression Example 16393 1600* 2000* Regression MAE >=10 15 - 80 original nodes Yes Yes Yes

* Identical validation and test datasets

All of the data is pickled and compressed with pbz2, with the datasets being ready to be used in PyTorch Geometric. Please see the examples on how to use the data.

Problem Setting

The aim of strategic transport planning is to do long term predictions of a given city, based purely on the underlying transport network and socioecenomic data. The socioeconomic data of every household is grouped into so called zones. To solve the problem the model needs to both understand how socioeconomic data creates demand as well as it is applied to the supply side of the network. In practise, this is usually solved by the 4 Step Model, however it has a number of downsides including requiring a large amount of manual work to calibrate and slow prediction speed.

This problem is a great task for deep graph neural networks. However, there is no large public dataset available, so we propose to generate data using in a surrogate model setup, using the 4 Step Model as the baseline. The problem is a transfer learning for GNNs. Within this project, the aim is to create a proof of concept, with all generated cities having between 15 and 80 nodes, and 3-10 zones. Details can be found in thesis.

Data Generation

Augmented Dataset

  • Extract random subnetworks from processed OpenStreetMaps and procedural generation for socioeconomic data
  • 6398 training samples, 1600 validation samples, 2000 test samples

Synthetic Dataset

  • Procedural generation for both network and socioeconomic data
  • +9995 training samples

The dataset 1_classification_simple uses exclusively the augmented dataset. Both 2_classification_hard and 3_regression use the augmented dataset for training, validation and test sets, but also add the synthetic dataset for additional training samples. All problems use identical validation and test datasets, with the targets transformed to the respective task.

Data Transformation

  • The datasets presented have all input features standardized and filtered as needed
  • As most common GNNs can only do simulatenous predictions for nodes, all edges are transformed into nodes, thus requiring masking during training
    • Irrelevant node/edge features are then set to 0

All details can be found in the thesis above.

Open Challenges for GNNs

If you need inspiration for what to focus on to improve GNNs, here are some open challenges with details in the thesis above:

  • Recurrent GNNs - dynamic depth based on size of input graph
  • Overcoming the bottleneck of GNNs - seen severely on this dataset
  • Seeing whether we can automatically correct nodes propagating errors

Issues & Dataset Requests

If you find any issues with the data or want to get a specific version of the data, please raise an issue over at the top.

Citing

If you create any new work based on this dataset, and it is only using the data:

@misc{makarov2021,
    author = {Makarov, Nikita and Narayanan, Santhanakrishnan and Antoniou, Constantinos},
    institution = {Transportation Systems Engineering},
    school = {Technical University of Munich},
    title = {Development of a Deep Learning Surrogate for the Four-Step Transportation Model},
    year = 2021,
    url = {https://github.com/nikita68/TransportPlanningDataset}
}

If you also refer to the previous baselines, please also cite:

@mastersthesis{makarov2021,
    author = {Makarov, Nikita},
    institution = {Transportation Systems Engineering},
    school = {Technical University of Munich},
    title = {Development of a Deep Learning Surrogate for the Four-Step Transportation Model},
    year = 2021,
    url = {https://mediatum.ub.tum.de/doc/1638691/dwz10x0l0w38xdklv9zkrprqs.pdf}
}

License & Acknowledgements

About

A graph based strategic transport planning dataset, aimed at creating the next generation of deep graph neural networks for transfer learning. Based on simulation results of the Four Step Model in PTV Visum.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages