Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to calculate the effective distance between nodes in a graph? #3

Open
joheli opened this issue Sep 10, 2020 · 4 comments
Open

How to calculate the effective distance between nodes in a graph? #3

joheli opened this issue Sep 10, 2020 · 4 comments
Assignees
Labels
question Further information is requested

Comments

@joheli
Copy link
Owner

joheli commented Sep 10, 2020

Folks I would like to understand how to correctly apply the function eff_dist of package NetOrigin and there is one step I am not sure of (see below the paragraph in bold).

First, let's create a few nodes with random edges and weights:

# set seed using R version 4.0.2 (2020-06-22)
set.seed(321)

# helper function to exclude non-unique links, see below
sl <- function(a, b) {
  v <- c(a, b)
  s <- sort(v)
  paste(s, collapse = "")
}

# create an undirected graph with random edges and weights
nodes <- letters[1:6]
edges0 <- data.frame(from = sample(nodes, 30, TRUE), to = sample(nodes, 30, TRUE)) %>%
  rowwise() %>%
  mutate(link = sl(from, to)) %>% # exclude non-unique links using function 'sl' (see above)
  filter(to != from) %>% # exclude links to self
  mutate(weight = runif(n = 1))

edges1 <- edges0
edges1$dupllink <- duplicated(edges1$link)

edges <- filter(edges1, !dupllink) %>%
  select(from, to, weight) %>%
  arrange(from, to, weight)

edges now looks like this:

> edges
# A tibble: 14 x 3
# Rowwise: 
   from  to    weight
   <chr> <chr>  <dbl>
 1 a     e     0.596 
 2 b     a     0.979 
 3 b     c     0.0852
 4 b     e     0.162 
 5 b     f     0.792 
 6 c     a     0.914 
 7 c     f     0.165 
 8 d     a     0.634 
 9 d     b     0.187 
10 d     c     0.957 
11 d     f     0.997 
12 e     c     0.599 
13 e     f     0.311 
14 f     a     0.531 

Now, let's proceed to create a beautiful graph:

# gr is an undirected graph
gr <- igraph::graph_from_data_frame(edges, directed = FALSE, vertices = nodes)

# for plotting use tidygraph
gr.tidy <- tidygraph::as_tbl_graph(gr)

# create a ggraph object
p.gr <- ggraph(gr.tidy, layout = "graphopt") +
  # geom_edge_link(aes(width = weight), alpha = .2, lineend = "round") + 
  geom_edge_link(aes(width = weight, alpha = weight), lineend = "round", color = "red") + 
  scale_edge_width(range = c(0.1, 12)) +
  scale_edge_alpha(range = c(.05, .4)) +
  geom_node_point(size = 5) + 
  geom_node_text(aes(label = name), repel = T, size = 10, color = "darkgreen") +
  theme_graph()

# print on screen
print(p.gr)

The result (p.gr) looks like this:

Rplot

The final step(s) are where I am unsure:

# Prepare calculation of effective distance
amx <- igraph::as_adjacency_matrix(gr, attr = "weight", sparse = T) %>% as.matrix
# Calculate transition probability matrix - this is where I am lost, because a direction is introduced:
p <- amx/rowSums(amx)
# Calculate effective distances
ed <- NetOrigin::eff_dist(p)

ed looks like this:

> ed
         a        b        c        d        e        f
a 0.000000 2.317359 2.385861 2.752019 2.812498 2.928399
b 1.812437 0.000000 4.198298 3.467534 3.607921 2.024182
c 2.090858 4.408217 0.000000 2.044702 2.513411 3.800440
d 2.476735 3.697172 2.064421 0.000000 4.577832 2.023613
e 2.028844 3.329189 2.024760 4.069462 0.000000 2.680275
f 2.660908 2.261614 3.827953 2.031407 3.196439 0.000000

My question revolves around the creation of p: up to this point the matrices have been 'symmetrical' (i.e. the graph has been 'undirected') but now e.g. 'a to b' is not equivalent to 'b to a'. E.g. if you look at the first column (heading 'a') the ranking of the distances (b < e < c) do not correspond to image p.gr or table edges (where b < c < e).

Is this a feature of the function or am I interpreting the output in wrong way?

Thank you for any hints.

@joheli joheli added the question Further information is requested label Sep 10, 2020
@joheli joheli pinned this issue Sep 11, 2020
@jmanitz
Copy link

jmanitz commented Sep 14, 2020

@jmanitz
Copy link

jmanitz commented Sep 15, 2020

Dear Johannes,

Thanks a lot for your comment and careful considerations. You are correct, the effective distance is not a metric in the mathematical sense as it does not fullfill symmetry, other conditions are fullfilled:

  • non-negative,
  • identity of indiscernibles
  • symmetry
  • triangle in equality

Some more theory on the derivation can be found here:

  • Manitz, J., Kneib, T., Schlather, M., Helbing, D. and Brockmann, D. (2014) Origin detection during food-borne disease outbreaks - a case study of the 2011 EHEC/HUS outbreak in Germany. PLoS Currents Outbreaks, 1. <DOI: 10.1371/currents.outbreaks.f3fdeb08c5b9de7c09ed9cbcef5f01f2> url: http://currents.plos.org/outbreaks/index.html%3Fp=36515.html
  • Manitz, J., Harbering, J., Schmidt, M., Kneib, T. and Schöbel, A., 2017. Source estimation for propagation processes on complex networks with an application to delays in public transportation systems. Royal Statistical Society. Journal. Series C: Applied Statistics, 66(3), pp.521-536. url: https://core.ac.uk/download/pdf/154416059.pdf

Does that clarify your question?

Best regards from Boston
Juliane

@joheli
Copy link
Owner Author

joheli commented Sep 15, 2020

Dear Juliane,

thank you so much for your comment and the references - I obviously have to delve deeper into the matter.

Could you maybe comment on the following to further send me on the right track?

  • If effective distances are non-symmetric by nature, does that mean that I can only estimate them on a directed graph?
  • Could you also give an example (or use the one above) - how do I estimate effective distances with NetOrigin::eff_dist given a directed, weighted graph?

Also, how do I interpret the output, i.e. in a matrix of effective distances generated by NetOrigin::eff_dist

A B
A X1 X3
B X2 X4

Is X2 B to A or A to B?

Kind regards,

Johannes

@jmanitz
Copy link

jmanitz commented Sep 16, 2020

Hi Johannes,

no worries at all, I am happy to help!

The effective distance works for directed and undirected graphs. The implementation is able to accommodate that too. Here a little toy example:

# adjacency matrix with random weights
net <- matrix(nrow=6, ncol=6, data=runif(36))
isSymmetric(net)

# transition probabilties
p <- net/rowSums(net)
image(p)

# effective distance
eff <- eff_dist(p)
image(eff)

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants