
For "CartaGenie: Context-Driven Synthesis of City-Scale Mobile Network Traffic Snapshots" at PerCom 2022



This repo contains a dataset of synthetic mobile traffic for 5 cities in Germany generated using publicly available context data for those cities as input to the pre-trained CartaGenie model. We also include the input context data as part of the dataset.


  • context/:
    • Each city has a folder with data for all contextual attributes (conditions) used: population, POI, land use and sea conditions.
  • synthetic-traffic/: synthetic traffic data for each city
    • The data is stored in the .npy format as 3D tensors with dimension [height, width, time].
      • It can be read via np.load(FILE_PATH) after import numpy as np in Python.
      • You can easily convert it to other formats; see the NumPy documentation.
    • There are also images (average daily/peak-hour traffic for weekdays and weekends) per city for visualisation of the traffic data.
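Loading and converting a traffic file can be sketched as follows. This is a minimal example, not part of the dataset tooling: the helper names `load_traffic` and `to_long_table`, the file name `demo_city.npy` and the CSV layout are all assumptions made for illustration, and a small stand-in array is used in place of a real file from synthetic-traffic/.

```python
import os
import tempfile

import numpy as np


def load_traffic(file_path):
    """Load a traffic tensor and check that it is 3D: [height, width, time]."""
    traffic = np.load(file_path)
    if traffic.ndim != 3:
        raise ValueError(f"expected a 3D tensor, got shape {traffic.shape}")
    return traffic


def to_long_table(traffic):
    """Flatten a [height, width, time] tensor into rows of (row, col, t, value)."""
    height, width, steps = traffic.shape
    rows, cols, ts = np.meshgrid(
        np.arange(height), np.arange(width), np.arange(steps), indexing="ij"
    )
    return np.column_stack(
        [rows.ravel(), cols.ravel(), ts.ravel(), traffic.ravel()]
    )


# Stand-in data (hypothetical): a tiny 2 x 3 grid with 21 time steps.
demo = np.arange(2 * 3 * 21, dtype=float).reshape(2, 3, 21)

with tempfile.TemporaryDirectory() as tmp:
    # Replace demo_path with the path to a real .npy file from the dataset.
    demo_path = os.path.join(tmp, "demo_city.npy")
    np.save(demo_path, demo)

    traffic = load_traffic(demo_path)
    table = to_long_table(traffic)

    # One possible conversion: a CSV file with one row per pixel and time step.
    np.savetxt(
        os.path.join(tmp, "demo_city.csv"),
        table,
        delimiter=",",
        header="row,col,t,value",
        comments="",
    )
```

The long-table layout is only one option; for large cities a binary format may be more practical than CSV.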

When using this dataset, please cite our paper using the following BibTeX entry:

    title = "CartaGenie: Context-Driven Synthesis of City-Scale Mobile Network Traffic Snapshots",
    author = "Kai Xu and Rajkarn Singh and Hakan Bilen and Marco Fiore and Marina, {Mahesh K.} and Yue Wang",
    year = "2022",
    month = jan,
    day = "23",
    booktitle = "Proceedings of The 20th International Conference on Pervasive Computing and Communications (PerCom 2022)",

or use one of the other citation formats available on the University of Edinburgh Research Explorer page.

Context data

We provide the context data for 5 cities in Germany: Aachen, Bonn, Dresden, Frankfurt and Munich. The context data is a set of layers of publicly available information about a city, stored as 2D images, one for each of the 27 contextual attributes. This includes census data (i.e. population), 11 types of land use (e.g. whether or not a location is a green area) and 14 types of points of interest (e.g. whether or not a location is a cafe). See Section III.A and Figure 4 in the paper for the full list of attributes, their descriptions and how the data was obtained.
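To work with all attributes of a city at once, the 2D layers can be stacked into a single tensor. The sketch below only illustrates the idea: it uses random stand-in arrays rather than files from context/, and the helper name `stack_context` and the [height, width, attribute] ordering are assumptions, not the layout used by the CartaGenie code.

```python
import numpy as np

# Hypothetical stand-ins: 27 random 2D layers of identical spatial size.
rng = np.random.default_rng(0)
height, width, num_attributes = 40, 50, 27
layers = [rng.random((height, width)) for _ in range(num_attributes)]


def stack_context(layers):
    """Stack same-sized 2D layers into one [height, width, attribute] tensor."""
    shapes = {layer.shape for layer in layers}
    if len(shapes) != 1:
        raise ValueError(f"layers have different shapes: {sorted(shapes)}")
    return np.stack(layers, axis=-1)


context = stack_context(layers)
print(context.shape)
```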

Synthetic mobile traffic data

We provide synthetic spatiotemporal mobile traffic data generated by feeding the provided context data into the pre-trained CartaGenie model. The spatial size of the traffic data differs from city to city because it depends on the spatial size of that city's context data. Specifically, the traffic data is smaller than the corresponding context data, which ensures that each pixel in the traffic data has sufficient surrounding context. In the time dimension, the traffic data covers 3 weeks, i.e. 7 x 3 = 21 time steps, starting on a Sunday. In terms of granularity, we provide aggregated daily data and peak-hour data as separate files.
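Because the series starts on a Sunday and has 21 time steps, weekday and weekend steps can be separated by index. The following is a minimal sketch under two assumptions: each time step corresponds to one day, and "weekend" means Saturday and Sunday. The helper names and the stand-in array are hypothetical.

```python
import numpy as np

DAYS_PER_WEEK = 7


def weekend_mask(num_steps):
    """Boolean mask over time steps; index 0 is a Sunday, index 6 a Saturday."""
    day_of_week = np.arange(num_steps) % DAYS_PER_WEEK
    return (day_of_week == 0) | (day_of_week == 6)


def weekday_weekend_means(traffic):
    """Average a [height, width, time] tensor over weekdays and over weekends."""
    mask = weekend_mask(traffic.shape[-1])
    weekday_mean = traffic[..., ~mask].mean(axis=-1)
    weekend_mean = traffic[..., mask].mean(axis=-1)
    return weekday_mean, weekend_mean


# Stand-in data (hypothetical): every pixel holds its time index as its value.
demo = np.broadcast_to(np.arange(21, dtype=float), (4, 5, 21))

mask = weekend_mask(demo.shape[-1])
weekday_mean, weekend_mean = weekday_weekend_means(demo)
print(np.flatnonzero(mask))  # time indices treated as weekend days
```

The same masks can be applied to either the daily or the peak-hour tensors, provided both use one time step per day.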

The data generation process

The provided synthetic data is generated in two steps, in order to comply with the NDAs that cover the operator-provided original mobile traffic dataset and the operator's support for this work. First, we train the CartaGenie model on one month of original mobile traffic data for 4 cities in Country 1 (City A to City D, as described in Section III.A of the paper) together with their associated publicly available context data. Second, we use the model trained on either aggregated daily or peak-hour data to generate the corresponding mobile traffic data for the 5 German cities listed above, giving each city's context data as input to the model.

