# Visualizing OD mobilities & edgebundling in QGIS

## Intro

Data showing spatial relations between places is common. In fact, we visualized flight connections between airports during week 2 and plotted mobilities between Flickr users' home countries and protected areas as chord diagrams in non-carto-vis-Python.ipynb. An origin and a destination (OD) are the minimum information needed to draw a trajectory.

Working with big (or even small) OD line data can overcrowd a static map very quickly. For example, a naïve "hairball" visualization ([Poorthuis, 2018](https://doi.org/10.22224/gistbok/2018.3.5)) of this practical's data without any styling looks like this:

![Hairball visualization of lines](Figures/Practical_3_edgebundling_hairball.png)

Our example data contains only 472 mobilities, each heading to one of three regions in Germany. Even so, the map above tells us very little about the mobility dynamics. Which of the three regions is the most popular? Where do the mobilities originate?

We've seen in previous tutorials how finetuning the symbology by reducing linewidth, adding transparency and choosing a more pleasant color scheme improves the readability of such mobility maps. However, we can do more.

OD data capture simple, abstracted mobilities (e.g., [city bike trips between stations](https://hri.fi/data/en_GB/dataset/helsingin-ja-espoon-kaupunkipyorilla-ajatut-matkat)). In these cases, it's beneficial to show not just which places are connected but also the magnitude of the mobilities between them.

In this tutorial, you will learn:

1. Two approaches for visualizing OD data in QGIS.
2. The principles of edgebundling and when it may be the tool to go with.

The tutorial closes by pointing to resources and tools for getting deeper into the subjects.

## Prerequisites

### Plugin
- [Edgebundling plugin](https://github.com/ait-energy/qgis-edge-bundling/tree/master) by Anita Graser. 
This plugin **is not available** in the public plugin repository. Instead, follow the instructions below to install it:

<div style="background-color: #66f2ff; padding: 10px; border-left: 5px solid #00a4b3; margin-bottom: 10px;">
    
1. Download *processing_edgebundling.zip* from [this link](QGIS-files/processing_edgebundling.zip) or from the path *QGIS-files/processing_edgebundling.zip*
    
2. Open the plugin installation window (*Plugins > Manage and Install plugins*)
    
3. Go to *Install from zip*, select the zip file you just downloaded and install the plugin. 
    
4. The function *Force-directed edge bundling* has been added to the *Processing Toolbox*.
</div>

### Data
We'll work with data that describes [student mobilities in the European Union's Erasmus](http://data.europa.eu/88u/dataset/erasmus-mobility-raw-data) exchange program at the level of statistical NUTS2 regions. The full dataset has been processed by Tuomas Väisänen and colleagues as part of the [Mobi-Twin](https://www.helsinki.fi/en/researchgroups/digital-geography-lab/projects/mobi-twin) project at the [Digital Geography Lab](https://www.helsinki.fi/en/researchgroups/digital-geography-lab/), University of Helsinki. Find the [full dataset here](https://zenodo.org/records/14332354) and a [data description paper here](https://doi.org/10.1038/s41597-025-04789-0).

- [Download an excerpt we'll be using (erasmus-mobility-data.gpkg) here](data/erasmus-mobility-data.gpkg)

### QGIS files
As always, there are several style files and a QGIS processing model file that runs the whole processing chain.

- You can download all the files from [this link](QGIS-files/QGIS-files-week3.zip) or download them individually from the folder QGIS-files.

## Graduated line map

One way to represent quantity is, naturally, to make certain connections more prominent. With a graduated line map, we can use width and color for that purpose.

Add data from *erasmus-mobility-data* to the project. Examine the line layer in the map view and open its attribute table.

The geopackage contains two layers:
- *2018_student_mobility_NUTS2_germany_top3*: Erasmus student exchanges that have their destination in German NUTS2 regions. The data has been filtered to only include mobilities towards the three most visited [NUTS2 areas in Germany](https://en.wikipedia.org/wiki/NUTS_statistical_regions_of_Germany): Berlin (DE30), Köln (DEA2) and Oberbayern (DE21).
- *NUTS2 Centroids*: Centroids of the three regions. Used for labelling.

Let's calculate how many connections there are between each origin and destination region, similarly to how we did it in week 2's global map:

<div style="background-color: #66f2ff; padding: 10px; border-left: 5px solid #00a4b3; margin-bottom: 10px;">
   
1. Run the processing tool *Aggregate*.
2. Parameters:
    - Input layer: *2018_student_mobility_NUTS2_germany_top3*
    - Group by expression: *OD_ID*
    - Aggregates:
        1. Keep only *OD_ID* and *fid*, remove others.
        2. Aggregate function
            - OD_ID: *first_value*
            - fid: *count*
        3. Name:
            - fid: *mobilities*
    
</div>
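
For readers who think in code, the grouping that *Aggregate* performs here can be sketched in plain Python. This is only an illustration of the idea, not how QGIS implements the tool: the `OD_ID` values below are made up, and their format is an assumption.

```python
from collections import Counter

# Hypothetical OD_ID values, one per mobility record (not taken from the dataset).
od_ids = [
    "FR10_DE30", "FR10_DE30", "ES30_DE21",
    "FR10_DE30", "ES30_DE21", "ITC4_DEA2",
]

# Group by OD_ID and count the rows in each group. This mirrors
# "fid: count" renamed to "mobilities" in the Aggregate dialog.
mobilities = Counter(od_ids)

for od_id, count in mobilities.most_common():
    print(od_id, count)
```

Each distinct `OD_ID` ends up as one row with a `mobilities` count, which is the value we style by in the next step.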

### Styling
QGIS offers two ways to emphasize a *graduated* style: size or color. This will usually be enough, but what if you want to use both variables at the same time? It's very much doable, but we'll need a bit more fiddling.

<div style="background-color: #66f2ff; padding: 10px; border-left: 5px solid #00a4b3; margin-bottom: 10px;">
   
1. Apply a graduated style to the *Aggregated* layer.
2. Choose color as the styling method. Select what you think is an appropriate number of classes, classification method and color scale. (Example: *4*, *Natural breaks* and *Reds*.)
3. To modify linewidth, we'll use data-defined overrides. 
    1. Open up *Symbol > Configure symbol > Width > DD override symbol > Edit*.
    2. Paste the expression `scale_linear( "mobilities", minimum("mobilities"), maximum("mobilities"), 0.2,2.5)`
    
        a. Read this expression as: use values from the field *mobilities* and scale them to a new value between 0.2 and 2.5.
    3. Modify the transparency and other style definitions as you wish.
    
    
</div>

This data-defined method differs from rule-based or graduated approaches in that we're not classifying the data but smoothly scaling the linewidth between the minimum (0.2 mm) and the maximum (2.5 mm). However, styling like this complicates some other aspects: for example, automatically creating an accurate legend becomes harder.
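
To see what the linear scaling does to individual values, here is a minimal Python sketch of the same idea. The function name mirrors the QGIS expression, but this is a re-implementation for illustration only; the mobility counts are hypothetical, and clamping out-of-range values is a choice made for this sketch.

```python
def scale_linear(value, domain_min, domain_max, range_min, range_max):
    """Linearly map value from [domain_min, domain_max] to [range_min, range_max]."""
    if domain_max == domain_min:
        return range_min  # avoid division by zero when all counts are equal
    # Clamp so widths never leave the target range (a choice made for this sketch).
    value = max(domain_min, min(domain_max, value))
    share = (value - domain_min) / (domain_max - domain_min)
    return range_min + share * (range_max - range_min)


# Hypothetical mobility counts per OD pair (not taken from the dataset).
counts = [1, 5, 12, 24]
widths = [scale_linear(c, min(counts), max(counts), 0.2, 2.5) for c in counts]

for c, w in zip(counts, widths):
    print(f"{c:>3} mobilities -> {w:.2f} mm")
```

With these counts, the widths come out as 0.20, 0.60, 1.30 and 2.50 mm: every additional mobility adds the same 0.1 mm, with no class breaks in between.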

<div style="background-color: #39f98f; padding: 10px; border-left: 5px solid #059445; margin-bottom: 10px;">
This example uses the following styles:
    
- *graduated_line_style.qml*
- *nuts2_centroids_label_style.qml*
    
</div>

![Graduated line map](Figures/graduated_line_map.jpeg)

#### QGIS tip

Wondering where that background world map came from?

Simply type "world" into the field that shows the current coordinates and press Enter. This will add a simple world map of country boundaries, most likely based on [Natural Earth](https://www.naturalearthdata.com/).

## Edgebundling
Instead of aggregating by attribute information, there are methods to aggregate, or cluster, by location. Edgebundling is a clustering technique for line features (see [Graser et al., 2019](https://doi.org/10.1177/1473871617738122)). It can be used to reduce visual clutter in line maps.

We'll be using a plugin that implements force-directed edgebundling for QGIS. This is, to our knowledge, the only edgebundling implementation that has been published for QGIS, although it is by no means the only edgebundling technique available (see examples of EB algorithms in [Wallinger et al., 2021](https://arxiv.org/pdf/2108.05467)).

<div style="background-color: #66f2ff; padding: 10px; border-left: 5px solid #00a4b3; margin-bottom: 10px;">
   
1. Run *Force-directed edge bundling* from the processing toolbox.
2. Parameters:
    - Input layer: *2018_student_mobility_NUTS2_germany_top3*
    - Use cluster field: Leave this deselected
    - Initial step size: 1000
    - You may leave the other parameters as-is.
    
</div>

### Edgebundling and parameters
Descriptions of these parameters were a bit tough to find, but to our understanding, based on [Graser et al. (2019)](https://doi.org/10.1177/1473871617738122), their effects are as follows:

- `Initial step size` \[map units\]: Larger values will cause more distortion, possibly also artifacts. It uses map units (meters in the data's CRS).
- `Compatibility` \[0–1\]: Defines how many edges are involved in the bundling. Lower values take more processing time but produce stronger bundling.
- `Cycles` & `iterations` \[>0\]: higher values will result in better outcomes at the expense of processing time.
- `Cluster field` \[Yes / No\]: A feature in this implementation of edgebundling. Instead of bundling all lines, it can bundle a set number of clusters. May reduce computation time at the expense of accuracy.
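
To get a feel for how a compatibility threshold decides which edges interact, here is a minimal Python sketch using angle compatibility, one measure commonly used in force-directed edge bundling. Whether the plugin computes compatibility exactly this way is an assumption, and the edges and thresholds below are hypothetical.

```python
import math


def angle_compatibility(edge_p, edge_q):
    """|cos(angle)| between two edges: 1 for parallel, 0 for perpendicular."""
    (px1, py1), (px2, py2) = edge_p
    (qx1, qy1), (qx2, qy2) = edge_q
    pvx, pvy = px2 - px1, py2 - py1
    qvx, qvy = qx2 - qx1, qy2 - qy1
    len_p = math.hypot(pvx, pvy)
    len_q = math.hypot(qvx, qvy)
    if len_p == 0 or len_q == 0:
        return 0.0  # a zero-length edge has no direction
    return abs((pvx * qvx + pvy * qvy) / (len_p * len_q))


def compatible_pairs(edges, threshold):
    """Index pairs of edges whose compatibility reaches the threshold."""
    pairs = []
    for i in range(len(edges)):
        for j in range(i + 1, len(edges)):
            if angle_compatibility(edges[i], edges[j]) >= threshold:
                pairs.append((i, j))
    return pairs


# Hypothetical edges in projected coordinates (meters).
edges = [
    ((0, 0), (100, 0)),    # due east
    ((0, 10), (100, 30)),  # roughly east
    ((0, 0), (0, 100)),    # due north
]

print(compatible_pairs(edges, 0.9))  # [(0, 1)]
print(compatible_pairs(edges, 0.1))  # [(0, 1), (1, 2)]
```

Lowering the threshold lets more pairs of edges attract each other, which fits the behaviour described above: more work per iteration and stronger bundling.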

Edgebundling is a fickle craft. Good parameter values will be dependent on the dataset and its scale – finding a good mix will likely be a process of trial and error.

Of these, `initial step size` will be especially influential for the look of the map. 

Below is our data processed with initial step sizes of 1000, 2000, 5000 and 10,000 while keeping other parameters constant. Notice how larger values cause larger distortions and larger bundles, whereas smaller values produce a more conservative outcome.

<div style="background-color: #39f98f; padding: 10px; border-left: 5px solid #059445; margin-bottom: 10px;">

These examples use the style: **bundled_edges_thin.qml**
    
</div>

![Edgebundling starting step comparison](Figures/Practical_3_edgebundling_starting_step_comparison.png)

### Styling
Bundling the lines only helps somewhat to distinguish the routes (and even that's up for debate!). We'll still need smart styling of the layer to make our map more useful.

Some of the ideas that went into this style:

<div style="background-color: #ffa64d; padding: 10px; border-left: 5px solid #cc6600; margin-bottom: 10px;">

- Categorized layer style with *DESTINATION* as the value field. A qualitative color scheme should be used here. For example:
    - DE21: `#ffa719` (orange)
    - DE30: `#ff23e9` (violet)
    - DEA2: `#63bbff` (blue)
- High transparency (opacity 20 %) to make the clusters of lines stand out.
- Exaggerated linewidth (0.5 mm)
- Remember to consider the layout and map elements!
    - For example, remember to fit the printout page to the data. In this case, it's rather square.
    
</div>

![Edgebundling final](Figures/edgebundling_final_map.jpeg)

Compare the edgebundled linemap to the graduated one. What do they highlight well? What are their weaknesses? When would, for example, a chord diagram be better suited to describe mobilities between places?

### Where to dig deeper into edgebundling?
Edgebundling can be used to create some really striking flow maps, given the right data and a lot of parameter fiddling. Force-directed edgebundling has some downsides as well. For one, it doesn't scale particularly well to large datasets. This is why the example data is a small extract of the full dataset, with only a few hundred lines to three destinations: processing anything bigger might take anywhere from minutes to days. Also note that this example data only has movements to Germany, whereas OD data usually has mobilities both to and from each place. Finally, cutting-edge algorithms are unlikely to have ready-made implementations in QGIS. For those, a programmatic approach is usually the wiser choice.
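
To make the scaling point concrete, here is a rough back-of-the-envelope sketch in Python. It assumes that every line is compared with every other line at least once, which is how pairwise force-directed bundling is usually described; the plugin's actual workload (for example with the cluster option enabled) may differ. The larger edge counts are hypothetical.

```python
def edge_pairs(n_edges):
    """Number of unordered edge pairs among n_edges lines."""
    return n_edges * (n_edges - 1) // 2


# 472 is the size of our example data; the other sizes are hypothetical.
for n in (472, 5_000, 50_000):
    print(f"{n:>6} edges -> {edge_pairs(n):>13,} pairs")
```

Our 472 lines already give 111,156 pairs, and the number grows roughly with the square of the line count, before multiplying by cycles and iterations.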

Below are a few examples of tools for programming languages. The repositories linked have various cool examples that use other algorithms, as well!

- Python:
    - [Edge-bundling tool](https://doi.org/10.5281/zenodo.14532547) by Väisänen et al., the same group that created the dataset we worked on
        - Check out this [very cool EB visualization by them](https://vis.social/@waeiski/114352700408484081).
    - [Edge-path bundling](https://github.com/xpeterk1/edge-path-bundling).
        - An implementation of a paper by [Wallinger et al. 2021](https://arxiv.org/pdf/2108.05467)
        - Check out [this demo](https://mwallinger-tu.github.io/edge-path-bundling/)
- Tools for R:
    - [Edgebundle](https://github.com/schochastics/edgebundle)
    - [Implementation in GGraph](https://r-graph-gallery.com/hierarchical-edge-bundling.html)

## Replicating the processing flow of this notebook
To replicate this processing flow, run the processing model *lines-edgebundling-model.model3*. Open the model in QGIS from the leftmost button below *Processing toolbox* -> *Open existing model*. 

To run the model, you will need to add the example data and have the style files from the folder QGIS-files at hand. Please also note that this model includes some hard-coded field names (as do most of these models!). The models are meant for replicating this notebook, and repurposing them for general use might require some modifications.

![Line map model](Figures/lines-and-bundling-model.png)