##### Copyright 2018 The TensorFlow Authors.

Licensed under the Apache License, Version 2.0 (the "License");

In [0]:
#@title Licensed under the Apache License, Version 2.0 (the "License"); { display-mode: "form" }
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Latent Space Models for Neural Data with TFP

<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://drive.google.com/file/d/18Vs8j8SYrO9jw6bPK8c4JlAS5uE6Kwat/view?usp=sharing"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href=""><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
</table>
<br>
<br>
<br>


The original content comes from [this repository](https://github.com/blei-lab/edward) and [this tutorial](http://edwardlib.org/tutorials/latent-space-models), created by [the Blei Lab](http://www.cs.columbia.edu/~blei/). The initial version of this tutorial was written by Maja Rudolph.

Ported to TensorFlow Probability by Matthew McAteer ([`@MatthewMcAteer0`](https://twitter.com/MatthewMcAteer0)), with help from the TFP team at Google ([`tfprobability@tensorflow.org`](mailto:tfprobability@tensorflow.org)).

---

>[Dependencies & Prerequisites](#scrollTo=2ZtWUjXYRXQi)

>[Introduction](#scrollTo=2ZtWUjXYRXQi)

>>[Data](#scrollTo=2ZtWUjXYRXQi)

>>[Model](#scrollTo=2ZtWUjXYRXQi)

>>[Inference](#scrollTo=2ZtWUjXYRXQi)

>>[Criticism](#scrollTo=2ZtWUjXYRXQi)

>[References](#scrollTo=2ZtWUjXYRXQi)

## Dependencies & Prerequisites

In [0]:
!pip3 install -q tfp-nightly
!pip3 install -q observations

In [0]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

# import edward as ed
import IPython.display  # used by `draw_graph` below
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

# from edward.models import Normal, Poisson
from observations import celegans

In [0]:
def session_options(enable_gpu_ram_resizing=True, enable_xla=True):
    """
    Allowing the notebook to make use of GPUs if they're available.
    
    XLA (Accelerated Linear Algebra) is a domain-specific compiler for linear 
    algebra that optimizes TensorFlow computations.
    """
    config = tf.ConfigProto()
    config.log_device_placement = True
    if enable_gpu_ram_resizing:
        # `allow_growth=True` makes it possible to connect multiple colabs to your
        # GPU. Otherwise the colab malloc's all GPU ram.
        config.gpu_options.allow_growth = True
    if enable_xla:
        # Enable XLA. https://www.tensorflow.org/performance/xla/.
        config.graph_options.optimizer_options.global_jit_level = (
            tf.OptimizerOptions.ON_1)
    return config


def reset_sess(config=None):
    """
    Convenience function to create the TF graph & session or reset them.
    """
    if config is None:
        config = session_options()
    global sess
    tf.reset_default_graph()
    try:
        sess.close()
    except:
        pass
    sess = tf.InteractiveSession(config=config)

    
def evaluate(tensors):
    """
    A "universal" evaluate function that works in both Graph mode (default)
    and Eager mode (https://www.tensorflow.org/guide/eager) in TensorFlow.
    """
    if tf.executing_eagerly():
        # Eager tensors already hold values; convert each one to NumPy.
        return [t.numpy() for t in tensors]
    # Run directly in the default session created by `reset_sess`.
    return tf.get_default_session().run(tensors)

reset_sess()


def strip_consts(graph_def, max_const_size=32):
  """
  Strip large constant values from graph_def.
  """
  strip_def = tf.GraphDef()
  for n0 in graph_def.node:
    n = strip_def.node.add()
    n.MergeFrom(n0)
    if n.op == 'Const':
      tensor = n.attr['value'].tensor
      size = len(tensor.tensor_content)
      if size > max_const_size:
        tensor.tensor_content = bytes("<stripped %d bytes>"%size, 'utf-8')
  return strip_def


def draw_graph(model, *args, **kwargs):
  """
  Visualize TensorFlow graph.
  """
  graph = tf.Graph()
  with graph.as_default():
    model(*args, **kwargs)
  graph_def = graph.as_graph_def()
  strip_def = strip_consts(graph_def, max_const_size=32)
  code = """
      <script>
        function load() {{
          document.getElementById("{id}").pbtxt = {data};
        }}
      </script>
      <link rel="import" href="https://tensorboard.appspot.com/tf-graph-basic.build.html" onload=load()>
      <div style="height:600px">
        <tf-graph-basic id="{id}"></tf-graph-basic>
      </div>
  """.format(data=repr(str(strip_def)), id='graph'+str(np.random.rand()))

  iframe = """
      <iframe seamless style="width:1200px;height:620px;border:0" srcdoc="{}"></iframe>
  """.format(code.replace('"', '&quot;'))
  IPython.display.display(IPython.display.HTML(iframe))

## Introduction

Many scientific fields involve the study of network data, including social networks, networks in statistical physics, biological networks, and information networks (Goldenberg, Zheng, Fienberg, & Airoldi, 2010; Newman, 2010).

What can we learn about nodes in a network from their connectivity patterns? We can begin to study this using a latent space model (Hoff, Raftery, & Handcock, 2002). Latent space models embed the nodes of a network in a latent space, where the likelihood of forming an edge between two nodes depends on their distance in that space.

We will analyze network data from neuroscience.

## Data

The data comes from [Mark Newman's repository](http://www-personal.umich.edu/~mejn/netdata/).
It is a weighted, directed network representing the neural network of
the nematode
[C. elegans](https://en.wikipedia.org/wiki/Caenorhabditis_elegans)
compiled by Watts & Strogatz (1998) using experimental data
by White, Southgate, Thomson, & Brenner (1986).

The neural network consists of around $300$ neurons. Each connection
between neurons
is associated with a weight (positive integer) capturing the strength
of the connection.
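
To make the data format concrete, here is a minimal sketch of a weighted, directed network stored as an adjacency matrix. The 4-neuron matrix `toy_adjacency` is made up for illustration, and the convention that entry `(i, j)` holds the weight of the connection from neuron `i` to neuron `j` is an assumption, not something taken from the dataset's documentation.

```python
import numpy as np

# Hypothetical 4-neuron network (not the real data). Entry (i, j) is assumed
# to be the weight of the directed connection from neuron i to neuron j;
# 0 means there is no connection.
toy_adjacency = np.array([
    [0, 3, 0, 1],
    [0, 0, 2, 0],
    [1, 0, 0, 0],
    [0, 0, 5, 0],
])

num_neurons = toy_adjacency.shape[0]
num_edges = int(np.count_nonzero(toy_adjacency))
out_strength = toy_adjacency.sum(axis=1)  # total outgoing weight per neuron
in_strength = toy_adjacency.sum(axis=0)   # total incoming weight per neuron
# A directed network generally has an asymmetric adjacency matrix.
is_directed = not np.array_equal(toy_adjacency, toy_adjacency.T)

print(num_neurons, num_edges, is_directed)
print(out_strength, in_strength)
```

The same summaries can be computed on the real adjacency matrix once it is loaded, provided it follows the same layout.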

First, we load the data.

In [0]:
x_train = celegans("~/data")

## Model

What can we learn about the neurons from their connectivity patterns? Using
a latent space model (Hoff et al., 2002), we will learn a latent
embedding for each neuron to capture the similarities between them.

Each neuron $n$ is a node in the network and is associated with a latent
position $z_n\in\mathbb{R}^K$.
We place a Gaussian prior on each of the latent positions.

In the latent space model, the log-odds of an edge between nodes $i$ and
$j$ depend on the Euclidean distance between their latent
representations, $|z_i - z_j|$. Here, we
model the edge weights $Y_{ij}$ with a Poisson likelihood
whose rate is the reciprocal of that distance, so nodes that are close
in latent space tend to share heavier connections. The
generative process is as follows:

1. For each node $n=1,\ldots,N$,
\begin{align}
z_n \sim N(0,I).
\end{align}
2. For each edge $(i,j)\in\{1,\ldots,N\}\times\{1,\ldots,N\}$,
\begin{align}
Y_{ij} \sim \text{Poisson}\Bigg(\frac{1}{|z_i - z_j|}\Bigg).
\end{align}
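
The two steps above can be simulated directly with NumPy. This is a minimal sketch under stated assumptions: the sizes `N = 5` and `K = 3` are toy values, and self-edges ($i = j$) are given rate 0, because the distance from a node to itself is zero and its reciprocal is undefined.

```python
import numpy as np

rng = np.random.RandomState(0)
N, K = 5, 3  # toy sizes, chosen only for illustration

# Step 1: z_n ~ N(0, I) for each node.
z = rng.normal(size=(N, K))

# Step 2: Y_ij ~ Poisson(1 / |z_i - z_j|) for each ordered pair with i != j.
diff = z[:, None, :] - z[None, :, :]        # shape (N, N, K)
dist = np.sqrt((diff ** 2).sum(axis=-1))    # shape (N, N), zero on the diagonal
off_diag = ~np.eye(N, dtype=bool)
rate = np.zeros((N, N))
rate[off_diag] = 1.0 / dist[off_diag]       # assumption: self-edges get rate 0
y = rng.poisson(rate)                       # simulated edge weights

print(y)
```

Because the rate depends only on the distance, the rate matrix is symmetric, although the sampled weights `y[i, j]` and `y[j, i]` are drawn independently and will generally differ.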

In TFP, we write the model as follows.

In [0]:
import tensorflow_probability as tfp
tfd = tfp.distributions

N = x_train.shape[0]  # number of nodes (neurons)
K = 3  # latent dimensionality

# Prior over latent positions; sampling gives a tensor of shape [N, K].
prior = tfd.Normal(loc=tf.zeros([N, K]), scale=tf.ones([N, K]))
z = prior.sample()

# Calculate N x N distance matrix.
# 1. Create a column vector, [||z_1||^2, ||z_2||^2, ..., ||z_N||^2], and
# tile it to create N identical columns.
xp = tf.tile(tf.reduce_sum(tf.pow(z, 2), 1, keepdims=True), [1, N])
# 2. Create a N x N matrix where entry (i, j) is ||z_i||^2 + ||z_j||^2
# - 2 z_i^T z_j.
xp = xp + tf.transpose(xp) - 2 * tf.matmul(z, z, transpose_b=True)
# 3. Invert the pairwise distances and make the rate along the diagonal
# close to zero.
xp = 1.0 / tf.sqrt(xp + tf.diag(tf.zeros(N) + 1e3))

x = tfd.Poisson(rate=xp)
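
The distance computation in the cell above relies on the expansion $|z_i - z_j|^2 = |z_i|^2 + |z_j|^2 - 2 z_i^\top z_j$. As a sanity check, the following NumPy sketch compares that expansion with a brute-force computation on random toy positions. The array sizes and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.RandomState(0)
z = rng.normal(size=(6, 3))  # toy latent positions
N = z.shape[0]

# Expansion: ||z_i||^2 + ||z_j||^2 - 2 z_i^T z_j.
sq_norms = (z ** 2).sum(axis=1, keepdims=True)    # shape (N, 1)
sq_dist = sq_norms + sq_norms.T - 2.0 * z @ z.T   # shape (N, N)

# Brute-force reference: explicit differences between every pair of rows.
brute = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1)

# Adding 1e3 on the diagonal before inverting keeps the self-rates small
# and finite: 1 / sqrt(1e3) is roughly 0.03.
rate = 1.0 / np.sqrt(sq_dist + np.diag(np.full(N, 1e3)))

print(np.allclose(sq_dist, brute))
```

The expansion avoids materializing the `(N, N, K)` tensor of pairwise differences, which matters more as the number of nodes grows.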

## Inference

Maximum a posteriori (MAP) estimation is simple in Edward. Two lines are
required: instantiating inference and running it.

In [0]:
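# Edward API from the original tutorial; requires `import edward as ed`
# and Edward random variables `z` and `x`.
inference = ed.MAP([z], data={x: x_train})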

See this extended tutorial about
[MAP estimation in Edward](http://edwardlib.org/tutorials/map).
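
To show what MAP estimation computes for this model, here is a minimal NumPy sketch that maximizes the unnormalized log joint $\log p(Y, z)$ over the latent positions. It is not Edward's or TFP's implementation: it swaps in plain gradient ascent with a backtracking step size, runs on synthetic toy data, and skips self-edges. All function names are hypothetical.

```python
import numpy as np

def pairwise_dist(z):
    """Returns pairwise differences (N, N, K) and distances (N, N)."""
    diff = z[:, None, :] - z[None, :, :]
    return diff, np.sqrt((diff ** 2).sum(axis=-1))

def log_joint(z, y):
    """Unnormalized log p(y, z), dropping terms that are constant in z."""
    off = ~np.eye(z.shape[0], dtype=bool)
    _, dist = pairwise_dist(z)
    d = dist[off]
    # Poisson with rate 1/d contributes y * log(1/d) - 1/d per edge.
    log_lik = np.sum(-y[off] * np.log(d) - 1.0 / d)
    log_prior = -0.5 * np.sum(z ** 2)  # standard normal prior
    return log_lik + log_prior

def grad_log_joint(z, y):
    """Analytic gradient of `log_joint` with respect to z."""
    diff, dist = pairwise_dist(z)
    np.fill_diagonal(dist, 1.0)  # placeholder; diagonal terms are zeroed below
    # Edges (i, j) and (j, i) share the same distance, so both involve z_i.
    coef = (-(y + y.T) / dist + 2.0 / dist ** 2) / dist
    np.fill_diagonal(coef, 0.0)
    return (coef[:, :, None] * diff).sum(axis=1) - z

rng = np.random.RandomState(0)
n, k = 8, 2
y = rng.poisson(1.0, size=(n, n))  # synthetic edge weights, not the real data
z = rng.normal(size=(n, k))

initial = log_joint(z, y)
current = initial
for _ in range(200):
    g = grad_log_joint(z, y)
    step = 0.1
    # Backtracking: halve the step until the objective improves.
    while step > 1e-10:
        candidate = z + step * g
        value = log_joint(candidate, y)
        if value > current:
            z, current = candidate, value
            break
        step *= 0.5

print(initial, current)
```

Because a step is only accepted when it improves the objective, the log joint never decreases across iterations. A library optimizer would typically replace the hand-written gradient with automatic differentiation.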

One could instead run variational inference. This requires specifying
a variational model and instantiating `KLqp`.

In [0]:
import tensorflow as tf
import tensorflow_probability as tfp
# Assumes the user supplies `likelihood`, `prior`, and `surrogate_posterior`
# functions, and that each returns a
# `tfp.distributions.Distribution`-like object.
elbo_loss = tfp.vi.monte_carlo_csiszar_f_divergence(
    f=tfp.vi.kl_reverse,  # Equivalent to "Evidence Lower BOund"
    p_log_prob=lambda z: likelihood(z).log_prob(x) + prior().log_prob(z),
    q=surrogate_posterior(x),
    num_draws=1)
train = tf.train.AdamOptimizer(
    learning_rate=0.01).minimize(elbo_loss)

In [0]:
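# Alternatively, run variational inference with the KL divergence.
# Edward API from the original tutorial; requires `import edward as ed`,
# `from edward.models import Normal`, and Edward random variables `z`, `x`.
# The variational parameters are shaped [N, K] to match `z`.
qz = Normal(loc=tf.get_variable("qz/loc", [N, K]),
            scale=tf.nn.softplus(tf.get_variable("qz/scale", [N, K])))
inference = ed.KLqp({z: qz}, data={x: x_train})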

See this extended tutorial about
[variational inference in Edward](http://edwardlib.org/tutorials/variational-inference).

Finally, the following line runs the inference procedure for 2500
iterations.

In [0]:
# Add variational inference from TFP example
inference.run(n_iter=2500)

2500/2500 [100%] ██████████████████████████████ Elapsed: 11s | Loss: 35984.855


In [0]:
# Visualizing the graph we've constructed
# draw_graph(linear_mixed_effects_model, features_train)

## References
1. Goldenberg, A., Zheng, A. X., Fienberg, S. E., & Airoldi, E. M. (2010). [A survey of statistical network models.](https://www.nowpublishers.com/article/Details/MAL-005) Foundations and Trends in Machine Learning.
2. Hoff, P. D., Raftery, A. E., & Handcock, M. S. (2002). [Latent space approaches to social network analysis.](https://amstat.tandfonline.com/doi/abs/10.1198/016214502388618906#.W2X9PN-YW8g) Journal of the American Statistical Association, 97(460), 1090–1098.
3. Newman, M. (2010). [Networks: An introduction.](https://books.google.com/books?hl=en&lr=&id=YdZjDwAAQBAJ&oi=fnd&pg=PP1&dq=networks+an+introduction+newman&ots=V_IZ1Qkcrv&sig=e32XDZs8gzBAHzsBGEYVmWQhsis#v=onepage&q=networks%20an%20introduction%20newman&f=false) Oxford University Press.
4. Watts, D. J., & Strogatz, S. H. (1998). [Collective dynamics of ‘small-world’ networks.](https://www.nature.com/articles/30918) Nature, 393(6684), 440–442.
5. White, J. G., Southgate, E., Thomson, J. N., & Brenner, S. (1986). [The structure of the nervous system of the nematode Caenorhabditis elegans.](https://pdfs.semanticscholar.org/0bb2/1a76b5604211927cbc3c18f64437adbd834a.pdf) Philosophical Transactions of the Royal Society of London B: Biological Sciences, 314(1165), 1–340.

