# DBN Structure Inference

The idea is to infer a posterior for the *structure* of a Dynamic Bayesian Network (DBN), given some data.

We formulate this task with the following model:

$$ P(G | X) \propto P(X | G) \cdot P(G) $$

* $P(G)$ is a prior distribution over DBN structures. We'll assume it has the form
$$P(G) \propto \exp \left( -\lambda |G \setminus G^\prime| \right)$$
where $|G \setminus G^\prime|$ denotes the number of edges in $G$ that are not present in some reference graph $G^\prime$, and $\lambda \geq 0$ controls how strongly such edges are penalized.
* $P(X | G)$ is the marginal likelihood of the DBN structure. That is, it's the likelihood of the DBN after the network parameters have been integrated out -- it scores network *structure*. 
* If we assume some reasonable priors for network parameters, $P(X|G)$ can be obtained in closed form. In this work, we'll use the following marginal likelihood:
    
    $$P(X | G) \propto \prod_{i=1}^p (1 + n)^{-(2^{|\pi(i)|} - 1)/2} \left( X_i^{+ T} X_i^+ - \frac{n}{n+1} X_i^{+ T} B_i (B_i^T B_i)^{-1} B_i^T X_i^+ \right)^{-\frac{n}{2}}$$ 
    where $X_i^+$ and $B_i$ are built from the data ($X_i^+$ is the response vector for variable $i$, and $B_i$ is the design matrix formed from its parents $\pi(i)$), and $n$ is the total number of timesteps in the dataset. This marginal likelihood results from an empirical prior over the regression coefficients and an improper ($\propto 1/\sigma^2$) prior for the regression noise variance.
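
Since MCMC only needs the posterior up to an additive constant on the log scale, it can help to see the two factors combined. Taking logs of the prior and the marginal likelihood above gives:

$$\log P(G | X) = -\lambda |G \setminus G^\prime| - \sum_{i=1}^p \left[ \frac{2^{|\pi(i)|} - 1}{2} \log(1 + n) + \frac{n}{2} \log \left( X_i^{+ T} X_i^+ - \frac{n}{n+1} X_i^{+ T} B_i (B_i^T B_i)^{-1} B_i^T X_i^+ \right) \right] + \text{const}$$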

## Get some data

For now, we'll work with some data used by Hill et al. in their 2012 paper, _Bayesian Inference of Signaling Network Topology in a Cancer Cell Line_.

It gives the differential phosphorylation levels of 20 proteins, in a cancer cell line perturbed by EGF. This is a well-studied signaling pathway; the goal is to produce a graph describing the dependencies between proteins in this pathway. 

NOTE: I have confirmed that the ordering of proteins in `protein_vec` is identical to the ordering in the columns of `timeseries_data`. So we can depend on that.

## Build the model

Implement the graph prior distribution:

$$P(G) \propto \exp \left( -\lambda |G \setminus G^\prime| \right)$$
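
As a minimal sketch of this prior on the log scale (not the implementation in `dbn_models.jl`), assuming both graphs are stored as Boolean adjacency matrices; the function name `log_graph_prior` and the toy matrices are hypothetical:

```julia
# Unnormalized log-prior: log P(G) = -lambda * |G \ G'| + const.
# Assumption: adj[i, j] == true encodes an edge i -> j in both matrices.
function log_graph_prior(adj::AbstractMatrix{Bool},
                         ref_adj::AbstractMatrix{Bool},
                         lambda::Real)
    size(adj) == size(ref_adj) ||
        throw(DimensionMismatch("adjacency matrices must have equal size"))
    # Count the edges present in G but absent from the reference graph G'.
    n_extra = count(adj .& .!ref_adj)
    return -lambda * n_extra
end

ref = Bool[0 1 0; 0 0 1; 0 0 0]
g   = Bool[0 1 1; 0 0 1; 1 0 0]   # edges 1->3 and 3->1 are not in `ref`
log_graph_prior(g, ref, 3.0)      # -6.0
```

Edges that appear in the reference graph cost nothing, so the prior only discourages departures from $G^\prime$; it does not reward keeping reference edges.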

Implement the DBN's marginal likelihood:

$$P(X | G) \propto \prod_{i=1}^p (1 + n)^{-(2^{|\pi(i)|} - 1)/2} \left( X_i^{+ T} X_i^+ - \frac{n}{n+1} X_i^{+ T} B_i (B_i^T B_i)^{-1} B_i^T X_i^+ \right)^{-\frac{n}{2}}$$
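
A rough sketch of one factor of this product on the log scale, under stated assumptions: `x_plus` plays the role of $X_i^+$, `B` plays the role of $B_i$, $n$ is taken to be the number of rows, and the exponent $2^{|\pi(i)|} - 1$ is taken to be the number of columns of `B` (which holds when `B` contains every interaction term among the parents). The name `log_marginal_node` is hypothetical and this is not the code in `dbn_models.jl`.

```julia
using LinearAlgebra

# Log of one factor of P(X | G), up to an additive constant.
function log_marginal_node(x_plus::AbstractVector{<:Real}, B::AbstractMatrix{<:Real})
    n = length(x_plus)
    size(B, 1) == n || throw(DimensionMismatch("B must have one row per observation"))
    d = size(B, 2)            # assumed equal to 2^|pi(i)| - 1
    ss = dot(x_plus, x_plus)  # X_i^{+T} X_i^+
    if d > 0
        # Least-squares coefficients; B * beta is the projection of x_plus
        # onto the column space of B, i.e. B (B'B)^{-1} B' x_plus.
        beta = B \ x_plus
        ss -= n / (n + 1) * dot(x_plus, B * beta)
    end
    return -0.5 * d * log(1 + n) - 0.5 * n * log(ss)
end

x = [1.0, 2.0, 3.0]
log_marginal_node(x, zeros(3, 0))          # no parents
log_marginal_node(x, reshape(x, 3, 1))     # one perfectly predictive parent
```

Solving the least-squares problem with `\` avoids forming $(B_i^T B_i)^{-1}$ explicitly. The full log marginal likelihood would be the sum of this quantity over all $p$ variables.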

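Some things to note:
* We're kind of shoe-horning this marginal likelihood into Gen. The probabilistic programming ethos entails modeling the entire data-generating process; here we instead score the graph structure directly with the closed-form marginal likelihood. Integrating out the parameters analytically ought to provide better performance during inference, though.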

Helper functions:

The marginal likelihood distribution:

## Inference

### Metropolis-Hastings over directed graphs

Proposal distribution:

Involution function:
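
As a generic illustration only (not the proposal or involution used in this notebook), the simplest involutive move on a directed graph is a single-edge toggle: applying it twice returns the original graph. The sketch below is plain Julia rather than Gen, all names are hypothetical, and it assumes every edge set is a valid structure (edges run between consecutive time slices, so no acyclicity check is needed).

```julia
using Random

# Toggle edge (i, j). Applying the same toggle twice restores the graph,
# so the move is its own inverse (an involution).
function toggle_edge(adj::AbstractMatrix{Bool}, i::Integer, j::Integer)
    new_adj = copy(adj)
    new_adj[i, j] = !adj[i, j]
    return new_adj
end

# One Metropolis-Hastings step targeting exp(log_target(adj)).
# The edge is chosen uniformly, so the proposal is symmetric and the
# acceptance ratio reduces to the ratio of target densities.
function mh_toggle_step(rng::AbstractRNG, adj::AbstractMatrix{Bool}, log_target)
    V = size(adj, 1)
    i, j = rand(rng, 1:V), rand(rng, 1:V)
    proposed = toggle_edge(adj, i, j)
    log_alpha = log_target(proposed) - log_target(adj)
    accepted = log(rand(rng)) < log_alpha
    return (accepted ? proposed : adj), accepted
end

# Toy target: the graph prior alone, with lambda = 3 and a small reference graph.
ref_toy = Bool[0 1 0; 0 0 1; 0 0 0]
toy_log_target(a) = -3.0 * count(a .& .!ref_toy)

rng = MersenneTwister(1)
state = copy(ref_toy)
for _ in 1:100
    global state
    state, _ = mh_toggle_step(rng, state, toy_log_target)
end
```

A real sampler for this model would use the full log-posterior as `log_target` and would likely add richer moves (such as edge swaps) to improve mixing.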

### Our inference program

# TESTING THE NEW MODEL

In [None]:
using Gen
using GLMNet
include("PSDiGraph.jl")
using .PSDiGraphs
include("dbn_preprocessing.jl")
include("dbn_models.jl")
include("dbn_proposals.jl")
include("dbn_inference.jl")
using PyPlot
using Profile
using ProfileView

In [None]:
timeseries_data_path = "data/mukherjee_data.csv"
protein_names_path = "data/protein_names.csv"
reference_adj_path = "data/prior_graph.csv"
timesteps_path = "data/time.csv"

In [None]:
(timeseries_vec, protein_vec, ref_adj, timesteps) = hill_2012_preprocess(timeseries_data_path, 
                                                                         protein_names_path, 
                                                                         reference_adj_path, 
                                                                         timesteps_path);

In [None]:
Gen.load_generated_functions()

In [None]:
clear_caches()
regression_deg = 1
lambda_max = 10.0
n_samples = 200
fixed_lambda = 3.0
burnin = 100
thinning = 10
lambda_step = 0.3
V = length(protein_vec)
t = 0.4

results, acc = dbn_mcmc_inference(ref_adj, timeseries_vec, regression_deg, lambda_max,
                                  n_samples, burnin, thinning, 
                                  ps_smart_swp_update_loop,
                                  update_results_z_lambda!, #TODO
                                  (lambda_step,),
                                  ((V, t), V));
                                  #update_lambda=false,
                                  #fixed_lambda=fixed_lambda,
                                  #track_acceptance=true,
                                  #update_acceptances! =update_acc_z_lambda!) # TODO

In [None]:
transpose(results[1])

In [None]:
hist(results[2])
show()

In [None]:
# typeof(update_results_z_lambda!) <: Function

In [None]:
# matshow(transpose(ref_adj), cmap="Greys")
# matshow(transpose(edge_posterior), cmap="Greys")
# transpose(edge_posterior)[:,5]

In [None]:
# hill_result = convert(Matrix{Float64}, CSV.read("data/edge_prob_matrix.csv"))
# matshow(hill_result, cmap="Greys")
# hill_result[:,5]

In [None]:
using JSON

In [None]:
d = Dict("cat" => [1;2;3;], "dog" => Dict("woof" => [4;5;6], "bow-wow" => ["ruff";nothing]))

In [None]:
d["dog"]

In [None]:
JSON.json(d)

In [None]:
f = open("temp.json", "w")

In [None]:
write(f, JSON.json(d))

In [None]:
close(f)

In [None]:
time()

In [None]:
abs2.([-1.0; 0.0; 1.0] .- 0.0)

In [None]:
typeof([[[1.0];[[3.0;4.0]]];[5.0]])

In [1]:
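# Base case: a flat vector of numbers is reduced directly by f.
function myreduce(f::Function, vec::Vector{T}) where {T<:Number}
    return f(vec)
end

# Recursive case: each sample is itself a (possibly nested) vector.
# Gather the same field from every sample and reduce each field separately,
# so the result mirrors the structure of a single sample.
function myreduce(f::Function, vec::Vector{Vector{T}}) where T
    return [myreduce(f, [item[field] for item in vec]) for (field, _) in enumerate(vec[1])]
end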

myreduce (generic function with 2 methods)

In [2]:
"""
Compute the sample averages over a vector of MCMC results
"""
function sample_average(sample_vec::Vector)
    return myreduce(v->1.0*sum(v)/length(v), sample_vec)
end

"""
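Compute the sample variances over a vector of MCMC results.
NOTE: `sample_vec .- mu` broadcasts over the outer (sample) dimension instead of
centering each sample field by field; the redefinition further down fixes this with `mybinop`.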
"""
function sample_variance(sample_vec::Vector)
    mu = sample_average(sample_vec)
    return myreduce(x->sum(abs2.(x))/length(sample_vec), sample_vec .- mu)
end


sample_variance

In [3]:
"""
Perform a binary operation f(vec1, vec2) on arbitrarily nested vectors
(assume identical structure between vec1 and vec2, though)
"""
function mybinop(f::Function, vec1::Vector{T}, vec2::Vector{U}) where {T,U}
    return [mybinop(f, v1i, vec2[i]) for (i,v1i) in enumerate(vec1)]
end

"""
Base case for `mybinop`: apply the binary operation f(v1, v2) to two scalars
"""
function mybinop(f::Function, v1::T, v2::U) where {T<:Number,U<:Number}
    return f(v1, v2)
end

mybinop

In [4]:
"""
Compute the unbiased sample variances (dividing by n - 1) over a vector of MCMC results,
centering each sample field by field with `mybinop`
"""
function sample_variance(sample_vec::Vector)
    mu = sample_average(sample_vec)
    centered = [mybinop(-, s, mu) for s in sample_vec]
    return myreduce(x->sum(abs2.(x))/(length(centered)-1.0), centered)
end


sample_variance

In [5]:
arr = [[[[1.0]], [true]], [[[1.0]], [true]], [[[4.0]], [true]], [[[4.0]], [true]]]
println(typeof(arr))
sample_average(arr)

Array{Array{Array{T,1} where T,1},1}


2-element Array{Array{T,1} where T,1}:
 Array{Float64,1}[[2.5]]
 [1.0]                  

In [None]:
typeof(arr)

In [6]:
sample_variance(arr)

2-element Array{Array{T,1} where T,1}:
 Array{Float64,1}[[3.0]]
 [0.0]                  