This notebook is for demonstration purposes only. Most of the simulation models here are shown only for undirected, loopless graphs. Simply add `directed=True` to make them directed or `loops=True` to allow self-loops. Note that when an undirected graph is requested, additional parameters (for example, `Wt`, `P`, and `Wtargs`) may also need to be symmetric; otherwise an error is thrown.
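
As a rough illustration of the symmetry requirement, the sketch below checks whether a block-probability matrix equals its own transpose. The helper `looks_symmetric` and the two example matrices are hypothetical and are not part of `graphstats`; the library's own validation may differ.

```python
import numpy as np

def looks_symmetric(X):
    """Return True if X is a square 2-D array equal to its transpose."""
    X = np.asarray(X)
    return X.ndim == 2 and X.shape[0] == X.shape[1] and np.array_equal(X, X.T)

# Hypothetical block-probability matrices, for illustration only.
P_sym = np.array([[0.6, 0.3],
                  [0.3, 0.7]])
P_asym = np.array([[0.6, 0.3],
                   [0.1, 0.7]])

sym_ok = looks_symmetric(P_sym)     # suitable for an undirected graph
asym_ok = looks_symmetric(P_asym)   # would need a directed graph instead
```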

In [91]:
from graphstats.simulations.simulations import *
from graphstats.utils.utils import *
import numpy as np  # explicit import; the cells below use np directly
import pandas as pd
from plotly.offline import plot, iplot, init_notebook_mode
import plotly.graph_objs as go
import math

init_notebook_mode(connected=True)


def plot_mtx(A, title=""):
    """
    A basic function to plot an adjacency matrix.
    """
    Adf = pd.DataFrame(A).stack().rename_axis(['y', 'x']).reset_index(name="Weight")
    trace = go.Heatmap(x=Adf.x, y=Adf.y, z=Adf.Weight)
    data = [trace]
    layout=go.Layout(width=550, height=550,
                     title=title,
                     xaxis=dict(title="Node Out"),
                     yaxis=dict(title="Node In", autorange="reversed"))
    fig = go.Figure(data=data, layout=layout)
    iplot(fig)

# Simulated Graph Models

## ER

Simulate from $ER(n, p)$ or $ER(n, M)$.

The examples below use $n = 100$, $p = 0.3$, and $M = 500$.

In [92]:
n = 100; p = 0.3; M = 500
Anp = er_np(n, p)
plot_mtx(Anp, "ER({:d}, {:.1f})".format(n, p))
Anm = er_nm(n, M)
plot_mtx(Anm, "ER({:d}, {:d})".format(n, M))

In [93]:
# basic checks
print(is_symmetric(Anp)); print(is_symmetric(Anm))
print(is_loopless(Anp)); print(is_loopless(Anm))
print(math.isclose(np.mean(Anp), p, abs_tol=0.02)); print(Anm.sum()/2 == M)

True
True
True
True
True
True
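
To make the two parameterizations concrete, here is a minimal NumPy-only sketch of how an undirected, loopless $ER(n, p)$ and $ER(n, M)$ graph could be sampled. The functions `sketch_er_np` and `sketch_er_nm` are hypothetical stand-ins written for this illustration; they are not the `graphstats` implementations of `er_np` and `er_nm`.

```python
import numpy as np

def sketch_er_np(n, p, rng):
    """Undirected, loopless ER(n, p): one Bernoulli(p) draw per vertex pair."""
    upper = np.triu(rng.random((n, n)) < p, k=1)   # keep the strict upper triangle
    return (upper | upper.T).astype(float)         # mirror it to symmetrize

def sketch_er_nm(n, M, rng):
    """Undirected, loopless ER(n, M): exactly M edges chosen uniformly."""
    rows, cols = np.triu_indices(n, k=1)           # all n(n-1)/2 candidate pairs
    chosen = rng.choice(rows.size, size=M, replace=False)
    A = np.zeros((n, n))
    A[rows[chosen], cols[chosen]] = 1
    return A + A.T

rng = np.random.default_rng(0)
A_np = sketch_er_np(100, 0.3, rng)
A_nm = sketch_er_nm(100, 500, rng)
```

In $ER(n, p)$ the number of edges is random, while in $ER(n, M)$ it is fixed at $M$, which is why the check above compares `Anm.sum()/2` to `M` exactly but compares the density of `Anp` to `p` only up to a tolerance.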


## zi-Dist Models

Construct a zero-inflated model from any distribution function that accepts a `size` argument. This covers both user-defined custom distributions and the `numpy.random.*` distributions. For this simple example, we show a graph sampled from a zero-inflated Poisson distribution (specified by edge probability):

\begin{align*}
    G \in \mathcal{G} \sim ZIP(n, p, \lambda)
\end{align*}

and a graph from a zero-inflated normal distribution (using edge count):

\begin{align*}
    G \in \mathcal{G} \sim ZIN(n, M, \mu, \sigma)
\end{align*}

where $n=100$, $p=0.4$, $M=400$, $\lambda = 5$, $\mu=10$, $\sigma=4$:

In [110]:
n = 100; p=0.4; M=400; lam = 5; loc=10; scale=4
# note that np.random.poisson takes arguments for size and lam; we pass lam
# as a kwarg and the size is handled by the zi_np function to ensure consistency
# with other model parameters specified
Azip_np = zi_np(n, p, np.random.poisson, lam=lam)
plot_mtx(Azip_np, "ZIP({:d}, {:.1f}, {:d})".format(n, p, lam))
Azip_nm = zi_nm(n, M, np.random.normal, loc=loc, scale=scale)
plot_mtx(Azip_nm, "ZIN({:d}, {:d}, {:.1f}, {:.1f})".format(n, M, loc, scale))

In [95]:
# basic checks
print(is_symmetric(Azip_np)); print(is_symmetric(Azip_nm))
print(is_loopless(Azip_np)); print(is_loopless(Azip_nm))
print(math.isclose(np.mean(Azip_np != 0), p, abs_tol=0.02)); print((Azip_nm != 0).sum()/2 == M)

# expectations of params
print(math.isclose(np.mean(Azip_np[Azip_np != 0]), lam, abs_tol=0.5))
print(math.isclose(np.mean(Azip_nm[Azip_nm != 0]), loc, abs_tol=0.5))
print(math.isclose(np.std(Azip_nm[Azip_nm != 0]), scale, abs_tol=0.5))

True
True
True
True
True
True
True
True
True
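
The role of the `size` argument can be seen in the following sketch of zero-inflated sampling: an edge mask is drawn first, and the distribution function is asked for one weight per candidate edge. The function `sketch_zi_np` is a hypothetical illustration that assumes this mask-then-weight scheme; it is not the `graphstats` implementation of `zi_np`.

```python
import numpy as np

def sketch_zi_np(n, p, dist, rng, **kwargs):
    """Undirected, loopless zero-inflated graph with edge probability p.

    dist is any callable that accepts size= plus its own keyword arguments.
    """
    rows, cols = np.triu_indices(n, k=1)         # candidate vertex pairs
    present = rng.random(rows.size) < p          # which pairs receive an edge
    weights = dist(size=rows.size, **kwargs)     # one weight per candidate pair
    A = np.zeros((n, n))
    A[rows, cols] = present * weights            # absent edges are set to zero
    return A + A.T

rng = np.random.default_rng(0)
A_zip = sketch_zi_np(100, 0.4, rng.poisson, rng, lam=5)
```

Note that a Poisson weight can itself be zero, so the fraction of non-zero entries can fall slightly below $p$ under this scheme.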


## Binary SBM

Here, we define the SBM:

\begin{align*}
    G \in \mathcal{G} \sim SBM(N, C, P)
\end{align*}

where:
+ $N \in \mathbb{Z}_+$ is the number of vertices, with vertex set $V = \{v_i\}_{i=1}^N$,
+ $C = \{C_i\}_{i=1}^K, C_i \subseteq V, C_i \cap C_j = \emptyset$ if $i \neq j$,
+ $P \in [0, 1]^{K \times K}$ where entry $P_{ij}$ indicates the probability of an edge between communities $i, j$

Below we show the case where $N = 100$,

$C_1 = \{v_i\}_{i=1}^{30}$, $C_2 = \{v_i\}_{i=31}^{70}$, and $C_3 = \{v_i\}_{i=71}^{100}$

\begin{align*}
    P = \begin{bmatrix}
        0.6 & 0.3 & 0.2 \\
        0.3 & 0.7 & 0.6 \\
        0.2 & 0.6 & 0.1
    \end{bmatrix}
\end{align*}

See the [exhaustive tests](https://github.com/neurodata/graspy/tree/master/tests) for detailed checks that each model recovers the appropriate parameters; we omit such checks from here on for brevity.

In [109]:
n = [30, 40, 30]
P = np.vstack(([0.6, 0.3, 0.2], [0.3, 0.7, 0.6], [0.2, 0.6, 0.1]))
Asbm = binary_sbm(n, P)
plot_mtx(Asbm, "SBM(n, C, P)")
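
To connect the definition to the code, the sketch below expands the community sizes and the block matrix $P$ into a full $N \times N$ matrix of edge probabilities and then samples each vertex pair once. The function `sketch_binary_sbm` is a hypothetical NumPy-only illustration, not the `graphstats` implementation of `binary_sbm`.

```python
import numpy as np

def sketch_binary_sbm(n, P, rng):
    """Undirected, loopless binary SBM from community sizes n and block matrix P."""
    labels = np.repeat(np.arange(len(n)), n)     # community label of each vertex
    probs = P[np.ix_(labels, labels)]            # N x N edge-probability matrix
    upper = np.triu(rng.random(probs.shape) < probs, k=1)
    return (upper | upper.T).astype(float), labels

rng = np.random.default_rng(0)
sizes = [30, 40, 30]
P_blocks = np.array([[0.6, 0.3, 0.2],
                     [0.3, 0.7, 0.6],
                     [0.2, 0.6, 0.1]])
A_sbm, labels = sketch_binary_sbm(sizes, P_blocks, rng)
```

With these sizes, vertices 1 to 30 carry label 0, vertices 31 to 70 carry label 1, and vertices 71 to 100 carry label 2, matching $C_1$, $C_2$, and $C_3$.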

## Zero-Inflated SBM with simple function

Here, we define the weighted SBM with a single weight function:

\begin{align*}
    G \in \mathcal{G} \sim wSBM(N, C, P, w)
\end{align*}

where:
+ $N \in \mathbb{Z}_+$ is the number of vertices, with vertex set $V = \{v_i\}_{i=1}^N$,
+ $C = \{C_i\}_{i=1}^K, C_i \subseteq V, C_i \cap C_j = \emptyset$ if $i \neq j$,
+ $P \in [0, 1]^{K \times K}$ where entry $P_{ij}$ indicates the probability of an edge between communities $i, j$
+ $w: V \times V \rightarrow \mathbb{R}$ is a weight function.

Below we show the case where $N = 100$,

$C_1 = \{v_i\}_{i=1}^{30}$, $C_2 = \{v_i\}_{i=31}^{70}$, and $C_3 = \{v_i\}_{i=71}^{100}$

\begin{align*}
    P = \begin{bmatrix}
        0.6 & 0.3 & 0.2 \\
        0.3 & 0.7 & 0.6 \\
        0.2 & 0.6 & 0.1
    \end{bmatrix}
\end{align*}

and weights sampled from $\mathcal{N}(4, 4)$ (mean 4, standard deviation 4). Note that, unlike with `zi_np` and `zi_nm`, the weight function's parameters cannot be passed as trailing keyword arguments; they must instead be passed as a dictionary via the named argument `Wtargs`.

In [108]:
n = [30, 40, 30]
P = np.vstack(([0.6, 0.3, 0.2], [0.3, 0.7, 0.6], [0.2, 0.6, 0.1]))
loc=4; scale=4
Asbm = weighted_sbm(n, P, Wt=np.random.normal, Wtargs={'loc': loc, 'scale': scale})
plot_mtx(Asbm, "wSBM(n, C, P, w)")
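
One way to picture how a `Wtargs` dictionary might be consumed is to unpack it into the weight function alongside `size`. The sketch below does this for a single weight function shared by all blocks. The function `sketch_weighted_sbm` is a hypothetical illustration under that assumption, not the `graphstats` implementation of `weighted_sbm`.

```python
import numpy as np

def sketch_weighted_sbm(n, P, Wt, Wtargs, rng):
    """Undirected, loopless SBM whose edges all share one weight distribution."""
    labels = np.repeat(np.arange(len(n)), n)           # community of each vertex
    probs = P[np.ix_(labels, labels)]                  # N x N edge probabilities
    rows, cols = np.triu_indices(labels.size, k=1)     # candidate vertex pairs
    present = rng.random(rows.size) < probs[rows, cols]
    A = np.zeros(probs.shape)
    # Unpack the dictionary into the weight function, one draw per edge.
    A[rows[present], cols[present]] = Wt(size=int(present.sum()), **Wtargs)
    return A + A.T

rng = np.random.default_rng(0)
P_blocks = np.array([[0.6, 0.3, 0.2],
                     [0.3, 0.7, 0.6],
                     [0.2, 0.6, 0.1]])
A_wsbm = sketch_weighted_sbm([30, 40, 30], P_blocks, rng.normal,
                             {'loc': 4, 'scale': 4}, rng)
```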

## Zero-Inflated SBM with function for each block

Here, we define the weighted SBM with a separate weight function for each block:

\begin{align*}
    G \in \mathcal{G} \sim wSBM(N, C, P, W)
\end{align*}

where:
+ $N \in \mathbb{Z}_+$ is the number of vertices, with vertex set $V = \{v_i\}_{i=1}^N$,
+ $C = \{C_i\}_{i=1}^K, C_i \subseteq V, C_i \cap C_j = \emptyset$ if $i \neq j$,
+ $P \in [0, 1]^{K \times K}$ where entry $P_{ij}$ indicates the probability of an edge between communities $i, j$
+ $W_{i, j}: V \times V \rightarrow \mathbb{R}$ is a weight function for block $i, j$.

Below we show the case where $N = 100$,

$C_1 = \{v_i\}_{i=1}^{30}$, $C_2 = \{v_i\}_{i=31}^{70}$, and $C_3 = \{v_i\}_{i=71}^{100}$

\begin{align*}
    P = \begin{bmatrix}
        0.6 & 0.3 & 0.2 \\
        0.3 & 0.7 & 0.6 \\
        0.2 & 0.6 & 0.1
    \end{bmatrix}
\end{align*}

and weight matrix:
\begin{align*}
    W = \begin{bmatrix}
        \mathcal{N}(10, 3) & \mathcal{N}(10, 3) & \mathcal{N}(10, 3) \\
        \mathrm{Pois}(2) & \mathrm{Pois}(2) & \mathrm{Pois}(2) \\
        \mathrm{Unif}(-15, -10) & \mathrm{Unif}(-15, -10) & \mathrm{Unif}(-15, -10)
    \end{bmatrix}
\end{align*}

Note that, as before, the weight functions' parameters cannot be passed as trailing keyword arguments; they must instead be passed via the named argument `Wtargs`, here as a $3 \times 3$ array of dictionaries with one entry per block. Also, since our weight functions and weight arguments are asymmetric across blocks, we must specify that the graph is `directed`.

In [111]:
n = [30, 40, 30]
P = np.vstack(([0.6, 0.3, 0.2], [0.3, 0.7, 0.6], [0.2, 0.6, 0.1]))
loc = 10; scale = 3; lam = 2; low = -15; high = -10
# build weight matrix as 3x3
Wt = np.vstack(([np.random.normal, np.random.normal, np.random.normal],
               [np.random.poisson, np.random.poisson, np.random.poisson],
               [np.random.uniform, np.random.uniform, np.random.uniform]))

# build parameter matrix as 3x3
nparams = {'loc': loc, 'scale': scale}; pparams = {'lam': lam}
uparams = {'low': low, 'high': high}

Wtargs = np.vstack(([nparams, nparams, nparams],
                  [pparams, pparams, pparams],
                  [uparams, uparams, uparams]))

Asbm = weighted_sbm(n, P, directed=True, Wt=Wt, Wtargs=Wtargs)
plot_mtx(Asbm, "WSBM(n, C, P, W)")
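
The per-block case can be sketched as a loop over blocks, where block $(i, j)$ draws its edge mask from $P_{ij}$ and its weights from `Wt[i][j]` with the keyword arguments in `Wtargs[i][j]`. The function `sketch_block_weighted_sbm` is a hypothetical NumPy-only illustration of that idea for a directed, loopless graph; it is not the `graphstats` implementation of `weighted_sbm`.

```python
import numpy as np

def sketch_block_weighted_sbm(n, P, Wt, Wtargs, rng):
    """Directed, loopless SBM with a separate weight function per block."""
    bounds = np.concatenate(([0], np.cumsum(n)))       # block start/end indices
    N = int(bounds[-1])
    A = np.zeros((N, N))
    for i in range(len(n)):
        for j in range(len(n)):
            shape = (n[i], n[j])
            mask = rng.random(shape) < P[i][j]                   # edges in block (i, j)
            weights = Wt[i][j](size=shape, **Wtargs[i][j])       # block-specific weights
            A[bounds[i]:bounds[i + 1], bounds[j]:bounds[j + 1]] = mask * weights
    np.fill_diagonal(A, 0)                             # remove self-loops
    return A

rng = np.random.default_rng(0)
P_blocks = [[0.6, 0.3, 0.2], [0.3, 0.7, 0.6], [0.2, 0.6, 0.1]]
nparams = {'loc': 10, 'scale': 3}
pparams = {'lam': 2}
uparams = {'low': -15, 'high': -10}
Wt_blocks = [[rng.normal] * 3, [rng.poisson] * 3, [rng.uniform] * 3]
Wtargs_blocks = [[nparams] * 3, [pparams] * 3, [uparams] * 3]
A_blocks = sketch_block_weighted_sbm([30, 40, 30], P_blocks,
                                     Wt_blocks, Wtargs_blocks, rng)
```

Because each row of blocks uses a different distribution, block $(i, j)$ and block $(j, i)$ generally disagree, so the resulting matrix is not symmetric. This is the reason the cell above requests `directed=True`.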