---
title: "Indirect relations in networks"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{04 indirect relations in networks}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
This vignette describes why indirect relations matter in networks, how
they are used in centrality indices, and how they are implemented in the `netrankr` package.
________________________________________________________________________________
## Theoretical Background
A one-mode network can be described as a *dyadic variable* $x\in \mathcal{W}^\mathcal{D}$,
where $\mathcal{W}$ is the value range of the network (in the simple case of
unweighted networks $\mathcal{W}=\{0,1\}$) and $\mathcal{D}=\mathcal{N}\times\mathcal{N}$
describes the dyadic domain of actors $\mathcal{N}$.
\
\
Observed presence or absence of ties (the value range is binary) is usually not
the relation of interest for network analytic tasks. Instead, mostly implicitly,
relations are *transformed* into a new set of *indirect* relations on the basis
of the *observed* relations. As an example, consider (shortest path) distances in the
underlying graph. While they are fairly easy to derive from an observed network
of contacts, it is impossible for actors in a network to answer the question
"How far away are you from others you are not connected with?". We denote generic
transformed networks from an observed network $x$ as $\tau(x)$.
\
\
With this notion of indirect relations, we can express centrality indices in
a common framework as
$$
c_\tau(i)=\sum\limits_{t \in \mathcal{N}} \tau(x)_{it}
$$
Degree and closeness centrality, for instance, can be obtained by setting $\tau=id$
and $\tau=dist$, respectively. Other indices need additional specifications, which
can be found in [Brandes (2016)](https://dx.doi.org/10.1177/2059799116630650) or
[Schoch & Brandes (2016)](https://doi.org/10.1017/S0956792516000401).
\
With this framework, we can characterize centrality indices as degree-like
measures in a suitably transformed network $\tau(x)$.
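
As a quick check of this framework, the following minimal sketch sums the rows of $\tau(x)$ for $\tau=id$ and $\tau=dist$ on a small, hypothetical toy network (the graph `toy` is made up for illustration and is not part of the package data). It uses only `igraph` functions. Note that the row sums for $\tau=dist$ are total distances; the usual closeness score is their reciprocal.

```{r framework_check, message=FALSE}
library(igraph)

# Hypothetical toy network: four actors a, b, c, d and four ties.
toy <- graph_from_literal(a - b, b - c, c - d, b - d)

# tau = id: the transformed network is the adjacency matrix itself.
tau_id <- as_adjacency_matrix(toy, sparse = FALSE)
c_degree <- rowSums(tau_id)

# tau = dist: the transformed network holds shortest path distances.
tau_dist <- distances(toy)
c_dist <- rowSums(tau_dist)

# Row sums of the identity transformation are the degrees.
c_degree
# Row sums of the distances; 1 / c_dist gives closeness-style scores.
c_dist
```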
________________________________________________________________________________
## Indirect relations in the `netrankr` package
```{r setup, warning=FALSE,message=FALSE}
library(netrankr)
library(igraph)
```
The `netrankr` package implements a great variety of indirect relations that are
(or could be) used for centrality related considerations in a network. All indirect
relations can be computed with the `indirect_relations()` function, by specifying
the `type` parameter.
```{r indirstandard}
data("dbces11")
g <- dbces11
# adjacency
A <- indirect_relations(g, type = "adjacency")
# shortest path distances
D <- indirect_relations(g, type = "dist_sp")
# dyadic dependencies (as used in betweenness centrality)
B <- indirect_relations(g, type = "depend_sp")
# resistance distance (as used in information centrality)
R <- indirect_relations(g, type = "dist_resist")
# Logarithmic forest distance (parametrized family of distances)
LF <- indirect_relations(g, type = "dist_lf", lfparam = 1)
# Walk distance (parametrized family of distances)
WD <- indirect_relations(g, type = "dist_walk", dwparam = 0.001)
# Random walk distance
RW <- indirect_relations(g, type = "dist_rwalk")
# See ?indirect_relations for further options
```
Indirect relations are represented as matrices, similar to the adjacency matrix. The matrices below show
the shortest path distances and the pairwise dependencies (used, e.g., for betweenness).
```{r example_mat}
D
B
```
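
To connect such a matrix back to a centrality index, its rows have to be aggregated. The sketch below does this for shortest path distances in two ways: by hand with `rowSums()`, and with `aggregate_positions()`, a `netrankr` function that this vignette does not otherwise cover. The object names `D_sp`, `closeness_manual` and `closeness_pkg` are chosen here for illustration, and the two results are expected to agree only under the assumption that `type = "invsum"` is the inverse of the row sums, as described in `?aggregate_positions`.

```{r aggregate_sketch, message=FALSE}
library(netrankr)

data("dbces11")

# Shortest path distances, recomputed so that the chunk runs on its own.
D_sp <- indirect_relations(dbces11, type = "dist_sp")

# Sum each actor's distances to all others, then invert.
closeness_manual <- 1 / rowSums(D_sp)

# The same aggregation through the package helper (assumed equivalent).
closeness_pkg <- aggregate_positions(D_sp, type = "invsum")

# Compare the two score vectors, ignoring names.
all.equal(unname(closeness_manual), unname(closeness_pkg))
```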
The function takes an additional parameter `FUN` which can be used to pass a function
to further transform relations. The main use is to obtain indirect relations based on walk counts.
```{r indirwalks}
# count the limit proportion of walks (used for eigenvector centrality)
W <- indirect_relations(g, type = "walks", FUN = walks_limit_prop)
# count the number of walks of arbitrary length between nodes, weighted by
# the inverse factorial of their length (used for subgraph centrality)
S <- indirect_relations(g, type = "walks", FUN = walks_exp)
```
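
The weighting described in the comments above can be illustrated in base R. Entry $(i,j)$ of the matrix power $A^k$ counts the walks of length $k$ from $i$ to $j$, so weighting by the inverse factorial amounts to the series $\sum_k A^k/k!$. The helper `walk_series()` below is hypothetical and truncates that series at `kmax`; it is a sketch of the idea, not a description of how `walks_exp()` is implemented.

```{r walks_series_sketch}
# Hypothetical helper for illustration; not the package's walks_exp().
walk_series <- function(A, kmax = 20) {
  term <- diag(nrow(A))      # A^0 / 0! is the identity matrix
  total <- term
  for (k in seq_len(kmax)) {
    term <- term %*% A / k   # turns A^(k-1)/(k-1)! into A^k/k!
    total <- total + term
  }
  total
}

# Adjacency matrix of a path a - b - c, written out by hand.
A_path <- matrix(c(0, 1, 0,
                   1, 0, 1,
                   0, 1, 0), nrow = 3, byrow = TRUE)

# Walks of length 2: entry (i, j) counts the walks from i to j.
A_path %*% A_path

# Factorially weighted walk counts, summed over lengths 0 to kmax.
S_toy <- walk_series(A_path)

# The diagonal holds the weighted closed walk counts of each node.
diag(S_toy)
```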
Additional parameters can also be passed to calculate parameterized versions of
relations.
```{r indirparam}
# Calculate dist(s,t)^-alpha
D <- indirect_relations(g, type = "dist_sp", FUN = dist_dpow, alpha = 2)
```
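
What $dist(s,t)^{-\alpha}$ amounts to can be seen on a distance matrix small enough to write out by hand. The helper `inv_power()` below is hypothetical and only mimics the formula in the comment above. In particular, setting the diagonal to zero is an assumption of this sketch; how `dist_dpow()` treats self-distances should be checked in `?transform_relations`. With `alpha = 1`, the row sums are sums of reciprocal distances, i.e. harmonic closeness scores.

```{r dpow_sketch}
# Hypothetical helper for illustration; not the package's dist_dpow().
inv_power <- function(d, alpha = 1) {
  out <- d^(-alpha)  # a distance of 0 on the diagonal becomes Inf here
  diag(out) <- 0     # assumption: self-distances contribute nothing
  out
}

# Shortest path distances of a path a - b - c, written out by hand.
d_path <- matrix(c(0, 1, 2,
                   1, 0, 1,
                   2, 1, 0), nrow = 3, byrow = TRUE,
                 dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# Squared inverse distances: distant pairs are discounted more strongly.
inv_power(d_path, alpha = 2)

# Sums of reciprocal distances per actor.
rowSums(inv_power(d_path, alpha = 1))
```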
To view all predefined transformation functions see `?transform_relations`. The
predefined functions follow the naming scheme `<relation>_<transformation>`.
The `dist_` functions are thus only meaningful for distance-type relations such as
`type="dist_sp"` or `type="dist_resist"`. Likewise, the `walks_` functions apply to `type="walks"`.
The predefined functions are not exhaustive and just constitute the most common
transformations. It is, however, straightforward to pass your own transformation function.
```{r own_func}
dist_integration <- function(x) {
  # rescale distances: 1 for adjacent pairs, smaller values for distant pairs
  1 - (x - 1) / max(x)
}
D <- indirect_relations(g, type = "dist_sp", FUN = dist_integration)
```
The function `dist_integration()` computes
$$
\tau(x)_{ij}=1-\frac{dist(i,j)-1}{\max_{s,t}\; dist(s,t)}
$$
which is used in the centrality index *integration* defined by [Valente and Foreman (1998)](https://doi.org/10.1016/S0378-8733(97)00007-5).
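
To see what this transformation does to individual entries, the following sketch applies the same formula to a hand-made distance matrix of a path with four nodes. The names `integration_transform`, `d_path4`, `tau_toy` and `off_diag` are hypothetical and only serve the illustration. Adjacent pairs map to 1 and the most distant pair to $1/\max$, while the diagonal (distance 0) maps to $1 + 1/\max$. Dropping the diagonal before summing, as done here, is an assumption of the sketch rather than part of the definition above.

```{r integration_toy}
# Same formula as dist_integration(), restated so the chunk runs on its own.
integration_transform <- function(x) {
  1 - (x - 1) / max(x)
}

# Shortest path distances of a path with four nodes: |i - j|.
d_path4 <- abs(outer(1:4, 1:4, "-"))

# Transformed relations, including the diagonal entries.
tau_toy <- integration_transform(d_path4)
tau_toy

# Row sums without the diagonal (a choice made for this sketch).
off_diag <- tau_toy
diag(off_diag) <- 0
rowSums(off_diag)
```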

The computed relations can be used to build centrality indices (e.g. with the provided RStudio addin
`index_builder()`), but also to derive partial rankings with `positional_dominance()`.
Consult the respective [vignette](positional_dominance.html) for help.