Distances on Directed Graphs in R
Clone or download

README.md

Build Status AppVeyor Build Status codecov Project Status: Active CRAN_Status_Badge CRAN Downloads CII Best Practices

dodgr: Distances on Directed Graphs in R

R package for calculating pairwise distances on dual-weighted directed graphs using Priority Queue Shortest Paths. Dual-weighted directed graphs are directed graphs with two sets of weights so that weight1(A->B) != weight1(B->A)—the directed property—and weight2(A->B) != weight1(A->B)—the dual property. dodgr calculates shortest paths according to one weight, while distances along paths are calculating using the other weight. A canonical example of a dual-weighted directed graph is a street network to be used for routing. Routes are usually calculated by weighting different kinds of streets or ways according to a particular mode of transport, while the desired output is a direct, unweighted distance.

Installation

You can install dodgr from github with:

# install.packages("devtools")
devtools::install_github("ATFutures/dodgr")

Note on macOS

On macOS, if you need "gfortran", you could install "gcc" from brew or similar package manager, but on brew "gfortran" is included in "gcc". After installing gcc you may still have R complaining about library not found for -lgfortran. Others have written about this issue in detail, we recommend this quick answer on SO and adding a line FLIBS=-L/usr/local/lib/gcc/8/ (given that you installed gcc8) to your ~/.R/Makevars file for R to be aware of your gcc path.

Usage

The primary function,

d <- dodgr_dists (graph = graph, from = pts, to = pts)

produces a square matrix of distances between all points listed in pts and routed along the dual-weighted directed network given in graph. An even simpler usage allows calculation of pair-wise distances between a set of geographical coordinates (here, for a sizey chunk of New York City):

xlim <- c (-74.12931, -73.99214)
ylim <- c (40.70347, 40.75354)
npts <- 1000
pts <- data.frame (x = xlim [1] + runif (npts) * diff (xlim),
                   y = ylim [1] + runif (npts) * diff (ylim))
st <- Sys.time ()
d <- dodgr_dists (from = pts)
Sys.time () - st
#> Time difference of 9.473597 secs
range (d, na.rm = TRUE)
#> [1]  0.00000 16.64285

This will automatically download the street network (using osmdata), and even then calculating distances between 1,000 points – that’s 1,000,000 pairwise distances! – can be done in around 10 seconds.

Other Functions

The other main functions of dodgr are dodgr_paths, to return the actual paths between pairs of vertices, and dodgr_flows to aggregate flows as routed throughout a network according to sets of start and end points (origins and destinations), and associated densities or numbers travelling between each of these.

The dodgr graph structure

A graph or network in dodgr is represented as a flat table (data.frame, tibble, data.table, whatever) of minimally four columns: from, to, weight, and distance. The first two can be of arbitrary form (numeric or character); weight is used to evaluate the shortest paths, and the desired distances are evaluated by summing the values of distance along those paths. For a street network example, weight will generally be the actual distance multiplied by a priority weighting for a given mode of transport and type of way, while distance will be the pysical distance.

dodgr includes the conversion functions:

  1. dodgr_to_sfc to convert spatial dodgr graphs into Simple Features format used by the sf package.
  2. dodgr_to_igraph to convert (not necessarily spatial) dodgr graphs into igraph format (not yet implemented); and
  3. dodgr_to_tidygraph to convert (not necessarily spatial) dodgr graphs into tidygraph format (not yet implemented).

Further detail

For more detail, see the main package vignette, along with a second vignette detailing benchmark timings, showing that under many circumstances, dodgr performs considerably faster than equivalent routines from the igraph package.