GPLATNET is a method for discovering the connections between the nodes in a network. That is, given continuous-valued observations from each node over time, GPLATNET can discover which nodes are connected to each other. The name is a combination of GP and LATNET: the GP component refers to Gaussian Process, and LATNET refers to LATent NETwork. We refer to GPLATNET simply as LATNET in the paper and below.
The model is introduced in the following paper:

Variational Network Inference: Strong and Stable with Concrete Support. Amir Dezfouli, Edwin V. Bonilla, Richard Nock. ICML (2018)
We have a network with N nodes and T observations from each node. Observations are denoted by the matrix Y (of size N by T) and the times of the observations are denoted by the vector t (of size T). The aim of LATNET is to find (i) which nodes are connected to each other and (ii) the strengths of those connections.
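A hypothetical illustration of this input format (the array names and sizes here are only for demonstration and are not part of the LATNET code):

```python
import numpy as np

# N nodes, each observed at the same T time points.
N, T = 5, 100

rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 10.0, size=T))  # observation times (length T)
Y = rng.standard_normal((N, T))              # row j holds node j's observations over time
```

Each row of Y is one node's time series, aligned against the shared time vector t.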
In the examples mentioned in the experiments section, the results of running the experiments are saved into several files, as below:
- alpha.csv. This file represents a matrix of size N by N. Column i and row j corresponds to α_ij in the paper, which is the parameter of the posterior Concrete distribution over element ij of A.
- p.csv. This file represents a matrix of size N by N, and entry ij can be interpreted as the probability that node j is connected to node i. See the paper for the precise definition of each element.
- mu.csv. This file represents a matrix of size N by N containing the mean of W, i.e., the mean of the Normal distribution that determines the strength of the connection from node j to node i. That is, column i and row j corresponds to μ_ij in the paper.
- sigma2.csv. This file represents a matrix of size N by N containing the variance of W, i.e., the variance of the Normal distribution that determines the strength of the connection from node j to node i. That is, column i and row j corresponds to σ²_ij in the paper.
- hyp.csv. This file contains the optimized hyper-parameters (four values); see the paper for their definitions.
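A minimal sketch of post-processing these output files. The file names come from this README, but the helper name and the 0.5 threshold are my own choices, not prescribed by the paper:

```python
import numpy as np

def estimate_network(p, mu, threshold=0.5):
    """Keep connection ij when its posterior probability exceeds the threshold."""
    adjacency = p > threshold                 # boolean N x N adjacency estimate
    strengths = np.where(adjacency, mu, 0.0)  # mask the posterior means of W
    return adjacency, strengths

# In practice the matrices would be read from the saved files, e.g.:
#   p  = np.loadtxt("p.csv", delimiter=",")
#   mu = np.loadtxt("mu.csv", delimiter=",")
p = np.array([[0.1, 0.9],
              [0.6, 0.2]])
mu = np.array([[0.0, 1.5],
               [-0.8, 0.0]])
adjacency, strengths = estimate_network(p, mu)
```

The threshold trades off false positives against missed connections; a different cut-off may suit a different application.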
The folder experiments contains the data and code for running the experiments reported in the paper. There are four experiments, mentioned below. Note that for ease of running the experiments, the data are included in the repository.
- fun_conn.py. In this experiment the aim is to recover which brain regions are connected to each other, given the activity of each region over time. To run this experiment you can try

python -m experiments.fun_conn <n_threads>

where n_threads refers to the number of threads/processes used to run in parallel. The data used for the experiment were downloaded from the source reported in this paper: Smith SM, Miller KL, Salimi-Khorshidi G, et al. Network modelling methods for FMRI. Neuroimage. 2011;54(2):875-891.
- prices.py. Given the median prices of different suburbs in Sydney, the aim is to recover which suburbs are connected to each other in terms of their median prices. The data was downloaded from http://www.housing.nsw.gov.au.
- gene_reg.py. Given the activity of each gene, the aim of this experiment is to find which genes influence each other. The data contains the activity of 800 genes. Please refer to the paper for the source of the data.
- gene_reg_full.py. Given the activity of each gene, the aim of this experiment is to find which genes influence each other. The data contains the activity of 6178 genes. Please refer to the paper for the source of the data.
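The <n_threads> argument mentioned above controls how many runs are processed in parallel. The actual dispatch inside the experiment scripts may differ; the following is only a generic sketch of that pattern, with a placeholder task standing in for a LATNET fit:

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder standing in for fitting LATNET to one dataset.
def run_one(dataset_id):
    return dataset_id * 2

n_threads = 4  # corresponds to the <n_threads> command-line argument
with ThreadPoolExecutor(max_workers=n_threads) as pool:
    results = list(pool.map(run_one, range(8)))
```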
Inside the experiments folder above, there is a folder called R which contains the baseline methods and graphs. It should be clear from the file names which experiment each belongs to. Note that the files pwling.R and mentappr.R are re-implemented in R based on their corresponding Matlab code, and they are used for running the PW-LiNGAM algorithm.