# lovre / intermediacy

Java program for computing the intermediacy of the nodes in a graph for the specified source and target nodes.
Java
Lovro Šubelj Initial commit
Latest commit 847dbf1 Aug 10, 2018
Type Name Latest commit message Commit time
Failed to load latest commit information.
nets Aug 10, 2018
src Aug 10, 2018
Intermediacy.jar

## Intermediacy of the nodes in a graph

The repository includes Java program for computing the intermediacy of the nodes in a graph for the specified source and target nodes. The intermediacy is defined as the probability that a node is located on a directed path between the source and target nodes whereby each edge of the graph is sampled independently with some probability.

Particularly, the intermediacy was developed to estimate the importance of scientific publications based on a network of citations between these publications. Given two publications dealing with a specific research topic, an older publication and a more recent one, intermediacy can be used to identify publications that appear to play a major role in the intellectual development from the older to the more recent publication. These are publications that, based on citation links, are important in connecting the older and the more recent publication.

The measure of intermediacy is fully described in a preprint at arxiv.org (available soon).

### Command line usage

The implementation has no dependencies and does not require any compiling. It can be run as:

java -jar Intermediacy.jar -i <file> -s <source> -t <target> [-p <list>] [-z <samples>] [-h]

The command line arguments are as follows:

-i,--input <file> Pajek or TSV file (example: 'nets/toy.net')

-s,--source <source> source node index (example: '1')

-t,--target <target> target node index (example: '2')

-p,--probability <list> edge probability (default: '0.3,0.5,0.7')

-z,--samples <samples> Monte Carlo samples (default: '100000')

-h,--help prints help and usage message

### Details of the implementation

The implementation first parses the command line arguments as described above.

The implementation then reads a graph from the specified Pajek or TSV file. If the filename ends with the extension .net, the Pajek method is called. Otherwise, the TSV method is called. Next, the graph is reduced to the intermediate graph consisting only of nodes located on a directed path between the source and target nodes, since the intermediacy of the remaining nodes equals zero by definition.

For each of the specified probabilities of the edges in the graph, the intermediacy of the nodes is approximated by Monte Carlo sampling. The implementation approximates the intermediacy using the specified number of Monte Carlo samples $z$ with the standard error $\sqrt{\phi(1-\phi)/z}$, where $\phi$ is the intermediacy of a node.

Finally, the intermediacies of the nodes for each of the specified probabilities are written to a TSV file. Only the nodes in the intermediate graph are included in the TSV file. The file is put in the same folder as the graph file, while the filename is set to the name of the graph followed by the extension _phi.tsv.

The implementation runs in time $\mathcal{O}(pzm)$, where $m$ is the number of edges in the intermediate graph, $z$ is the number of Monte Carlo samples and $p$ is the number of the specified probabilities.

The source code of the implementation is located in folder src.

### Example input and output

Example Pajek input file included as nets/toy.net:

*vertices 5
1 "s"
2 "u"
3 "v"
4 "w"
5 "t"
*arcs
1 2
1 3
2 3
2 4
3 5
4 3
4 5


Example command line usage of the implementation:

java -jar Intermediacy.jar -i nets/toy.net -s 1 -t 5


Example command line printout of the execution:


... | INTERMEDIACY

Date | 8/10/18 10:31 PM

... | NETWORK

Network | 'toy'
Source | '1'
Target | '5'

Nodes | 5 (5)
Edges | 7 (7)
Degree | 2.800 (2.800)

Time | 0.01 sec
Memory | 0.00 GB

... | MONTE CARLO

Probability | 0.300
Samples | 100,000

Intermediacy | ...
Source | 0.13440 ± 0.00108
Target | 0.13440 ± 0.00108
'3' | 0.11295 ± 0.00196
'2' | 0.05589 ± 0.00142
'4' | 0.03245 ± 0.00110

Time | 0.19 sec
Memory | 0.05 GB

... | MONTE CARLO

Probability | 0.500
Samples | 100,000

Intermediacy | ...
Source | 0.39649 ± 0.00155
Target | 0.39649 ± 0.00155
'3' | 0.32659 ± 0.00291
'2' | 0.23282 ± 0.00262
'4' | 0.15583 ± 0.00225

Time | 0.08 sec
Memory | 0.06 GB

... | MONTE CARLO

Probability | 0.700
Samples | 100,000

Intermediacy | ...
Source | 0.72411 ± 0.00141
Target | 0.72411 ± 0.00141
'3' | 0.61524 ± 0.00302
'2' | 0.53938 ± 0.00309
'4' | 0.41563 ± 0.00305

Time | 0.09 sec
Memory | 0.01 GB

... | INTERMEDIACY

Time | 0.40 sec
Memory | 0.01 GB



Example TSV output file included as nets/toy_phi.tsv:

id	in_degree	out_degree	phi_0.3	phi_0.5	phi_0.7
1	0	2	0.1344	0.39649	0.72411
2	1	2	0.05589	0.23282	0.53938
3	3	1	0.11295	0.32659	0.61524
4	1	2	0.03245	0.15583	0.41563
5	2	0	0.1344	0.39649	0.72411

You can’t perform that action at this time.