Merge branch 'dev' into ldt-parallel
bdpedigo committed May 19, 2021
2 parents ad04089 + 4d97391 commit bcc4466
Showing 6 changed files with 375 additions and 143 deletions.
154 changes: 127 additions & 27 deletions docs/tutorials/embedding/OutOfSampleEmbed.ipynb
@@ -11,7 +11,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Suppose we've embedded the nodes of a graph into Euclidean space using Adjacency Spectral Embedding (ASE). \n",
"Suppose we've embedded the nodes of a graph into Euclidean space using Adjacency Spectral Embedding (ASE) or Laplacian Spectral Embedding (LSE). \n",
"Then, suppose we gain access to new nodes not seen in the original graph. We sometimes wish to determine their latent positions without the computationally-expensive task of re-embedding an entirely new adjacency matrix.\n",
"\n",
"More formally, suppose we have computed the embedding $\\hat{X} \\in \\textbf{R}^{n \\times d}$ from some adjacency matrix $A \\in \\textbf{R}^{n \\times n}$. \n",
@@ -21,7 +21,7 @@
"\n",
"$W \\in \\textbf{R}^{m \\times n}$ is a matrix with each row being an adjacency vector, for $m$ out-of-sample vertices.\n",
"\n",
"We can obtain this estimation with ASE's `transform` method. \n",
"We can obtain this estimation with ASE's or LSE's `transform` method. \n",
"Running through the Adjacency Spectral Embedding tutorial is recommended prior to this tutorial."
]
},
@@ -40,10 +40,11 @@
"\n",
"from graspologic.simulations import sbm\n",
"from graspologic.embed import AdjacencySpectralEmbed as ASE\n",
"from graspologic.embed import LaplacianSpectralEmbed as LSE\n",
"from graspologic.plot import heatmap, pairplot\n",
"from graspologic.utils import remove_vertices\n",
"\n",
"np.random.seed(9002)\n",
"np.random.seed(1234)\n",
"import warnings\n",
"warnings.filterwarnings('ignore')"
]
@@ -92,7 +93,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Embedding"
"### Embedding (ASE)"
]
},
{
@@ -110,11 +111,11 @@
"source": [
"# Generate an embedding with ASE\n",
"ase = ASE(n_components=2)\n",
"X_hat = ase.fit_transform(A)\n",
"X_hat_ase = ase.fit_transform(A)\n",
"\n",
"# predicted latent positions\n",
"w = ase.transform(a)\n",
"w"
"w_ase = ase.transform(a)\n",
"w_ase"
]
},
{
@@ -159,7 +160,39 @@
" return plot\n",
"\n",
"# Plot all latent positions\n",
"plot_oos(X_hat, w, labels=labels, oos_labels=[0], title=\"Out-of-Sample Embeddings (2-block SBM)\");"
"plot_oos(X_hat_ase, w_ase, labels=labels, oos_labels=[0], title=\"ASE Out-of-Sample Embeddings (2-block SBM)\");"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Embedding (LSE)\n",
"Similarly, we can use Laplacian Spectral Embedding (LSE). We fit the embedding on the in-sample graph, then call its `transform` method to estimate the latent position of the out-of-sample vertex."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Generate an embedding with LSE\n",
"lse = LSE(n_components=2)\n",
"X_hat_lse = lse.fit_transform(A)\n",
"\n",
"# predicted latent positions\n",
"w_lse = lse.transform(a)\n",
"w_lse"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"plot_oos(X_hat_lse, w_lse, labels=labels, oos_labels=[0], title=\"LSE Out-of-Sample Embeddings (2-block SBM)\");"
]
},
{
@@ -200,15 +233,34 @@
"source": [
"# Generate an embedding with ASE\n",
"ase = ASE(n_components=2)\n",
"X_hat = ase.fit_transform(A)\n",
"X_hat_ase = ase.fit_transform(A)\n",
"\n",
"# predicted latent positions\n",
"w_ase = ase.transform(a)\n",
"print(f\"The out-of-sample prediction output has dimensions {w_ase.shape}\\n\")\n",
"\n",
"# Plot all latent positions\n",
"plot_oos(X_hat_ase, w_ase, labels, oos_labels=oos_labels,\n",
" title=\"ASE Out-of-Sample Embeddings (2-block SBM)\");"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Generate an embedding with LSE\n",
"lse = LSE(n_components=2)\n",
"X_hat_lse = lse.fit_transform(A)\n",
"\n",
"# predicted latent positions\n",
"w = ase.transform(a)\n",
"print(f\"The out-of-sample prediction output has dimensions {w.shape}\\n\")\n",
"w_lse = lse.transform(a)\n",
"print(f\"The out-of-sample prediction output has dimensions {w_lse.shape}\\n\")\n",
"\n",
"# Plot all latent positions\n",
"plot_oos(X_hat, w, labels, oos_labels=oos_labels,\n",
" title=\"Out-of-Sample Embeddings (2-block SBM)\");"
"plot_oos(X_hat_lse, w_lse, labels, oos_labels=oos_labels,\n",
" title=\"LSE Out-of-Sample Embeddings (2-block SBM)\");"
]
},
{
@@ -252,20 +304,20 @@
"outputs": [],
"source": [
"# Fit our directed graph\n",
"X_hat, Y_hat = ase.fit_transform(A)\n",
"X_hat_ase, Y_hat_ase = ase.fit_transform(A)\n",
"\n",
"# predicted latent positions\n",
"w = ase.transform(a)\n",
"print(f\"output of `ase.transform(a)` is {type(w)}\", \"\\n\")\n",
"print(f\"out latent positions: \\n{w[0]}\\n\")\n",
"print(f\"in latent positions: \\n{w[1]}\")"
"w_ase = ase.transform(a)\n",
"print(f\"output of `ase.transform(a)` is {type(w_ase)}\", \"\\n\")\n",
"print(f\"out latent positions: \\n{w_ase[0]}\\n\")\n",
"print(f\"in latent positions: \\n{w_ase[1]}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Plotting directed latent predictions"
"### Plotting directed ASE latent predictions"
]
},
{
@@ -274,8 +326,41 @@
"metadata": {},
"outputs": [],
"source": [
"plot_oos(X_hat, w[0], labels, oos_labels=oos_labels, title=\"Out Latent Predictions\")\n",
"plot_oos(Y_hat, w[1], labels, oos_labels=oos_labels, title=\"In Latent Predictions\")"
"plot_oos(X_hat_ase, w_ase[0], labels, oos_labels=oos_labels, title=\"ASE Out Latent Predictions\")\n",
"plot_oos(Y_hat_ase, w_ase[1], labels, oos_labels=oos_labels, title=\"ASE In Latent Predictions\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Fit our directed graph\n",
"X_hat_lse, Y_hat_lse = lse.fit_transform(A)\n",
"\n",
"# predicted latent positions\n",
"w_lse = lse.transform(a)\n",
"print(f\"output of `lse.transform(a)` is {type(w_lse)}\", \"\\n\")\n",
"print(f\"out latent positions: \\n{w_lse[0]}\\n\")\n",
"print(f\"in latent positions: \\n{w_lse[1]}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Plotting directed LSE latent predictions"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"plot_oos(X_hat_lse, w_lse[0], labels, oos_labels=oos_labels, title=\"LSE Out Latent Predictions\")\n",
"plot_oos(Y_hat_lse, w_lse[1], labels, oos_labels=oos_labels, title=\"LSE In Latent Predictions\")"
]
},
{
@@ -320,12 +405,27 @@
"outputs": [],
"source": [
"# Embed and transform\n",
"X_hat, Y_hat = ase.fit_transform(A)\n",
"w = ase.transform(a)\n",
"X_hat_ase, Y_hat_ase = ase.fit_transform(A)\n",
"w_ase = ase.transform(a)\n",
"\n",
"# Plot\n",
"plot_oos(X_hat_ase, w_ase[0], labels, oos_labels=oos_labels, title=\"ASE Out Latent Predictions\")\n",
"plot_oos(Y_hat_ase, w_ase[1],labels, oos_labels=oos_labels, title=\"ASE In Latent Predictions\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Embed and transform\n",
"X_hat_lse, Y_hat_lse = lse.fit_transform(A)\n",
"w_lse = lse.transform(a)\n",
"\n",
"# Plot\n",
"plot_oos(X_hat, w[0], labels, oos_labels=oos_labels, title=\"Out Latent Predictions\")\n",
"plot_oos(Y_hat, w[1],labels, oos_labels=oos_labels, title=\"In Latent Predictions\")"
"plot_oos(X_hat_lse, w_lse[0], labels, oos_labels=oos_labels, title=\"LSE Out Latent Predictions\")\n",
"plot_oos(Y_hat_lse, w_lse[1],labels, oos_labels=oos_labels, title=\"LSE In Latent Predictions\")"
]
}
],
@@ -345,9 +445,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
"version": "3.6.13"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
}
105 changes: 13 additions & 92 deletions graspologic/embed/ase.py
@@ -1,19 +1,8 @@
# Copyright (c) Microsoft Corporation and contributors.
# Licensed under the MIT License.

import warnings
import numpy as np
from sklearn.utils.validation import check_is_fitted
import networkx as nx

from .base import BaseSpectralEmbed
from ..utils import (
import_graph,
is_fully_connected,
augment_diagonal,
pass_to_ranks,
is_unweighted,
)
from ..utils import augment_diagonal


class AdjacencySpectralEmbed(BaseSpectralEmbed):
@@ -70,14 +59,16 @@ class AdjacencySpectralEmbed(BaseSpectralEmbed):
to the ground truth.
concat : bool, optional (default False)
If graph is directed, whether to concatenate left and right (out and in) latent positions along axis 1.
If graph is directed, whether to concatenate left and right (out and in) latent
positions along axis 1.
Attributes
----------
n_features_in_: int
Number of features passed to the :func:`~graspologic.embed.AdjacencySpectralEmbed.fit` method.
Number of features passed to the
:func:`~graspologic.embed.AdjacencySpectralEmbed.fit` method.
latent_left_ : array, shape (n_samples, n_components)
Estimated left latent positions of the graph.
latent_right_ : array, shape (n_samples, n_components), or None
@@ -161,93 +152,23 @@ def fit(self, graph, y=None):

self._reduce_dim(A)

# for out-of-sample
inv_eigs = np.diag(1 / self.singular_values_)
self._pinv_left = self.latent_left_ @ inv_eigs
if self.latent_right_ is not None:
self._pinv_right = self.latent_right_ @ inv_eigs

self.is_fitted_ = True

return self

def transform(self, X):
def _compute_oos_prediction(self, X, directed):
"""
Obtain latent positions from an adjacency matrix or matrix of out-of-sample
vertices. For more details on transforming out-of-sample vertices, see the
:ref:`tutorials <embed_tutorials>`. For mathematical background, see [2].
Computes the out-of-sample latent position estimation.
Parameters
----------
X : array-like or tuple, original shape or (n_oos_vertices, n_vertices).
The original fitted matrix ("graph" in fit) or new out-of-sample data.
If ``X`` is the original fitted matrix, returns a matrix close to
``self.fit_transform(X)``.
If ``X`` is an out-of-sample matrix, n_oos_vertices is the number
of new vertices, and n_vertices is the number of vertices in the
original graph. If tuple, graph is directed and ``X[0]`` contains
edges from out-of-sample vertices to in-sample vertices.
X: np.ndarray
Out-of-sample adjacency data to embed.
directed: bool
Whether the fitted graph is directed.
Returns
-------
array_like or tuple, shape (n_oos_vertices, n_components)
or (n_vertices, n_components).
Array of latent positions. Transforms the fitted matrix if it was passed
in.
If ``X`` is an array or tuple containing adjacency vectors corresponding to
new nodes, returns the estimated latent positions for the new out-of-sample
adjacency vectors.
If undirected, returns array.
If directed, returns ``(X_out, X_in)``, where ``X_out`` contains
latent positions corresponding to nodes with edges from out-of-sample
vertices to in-sample vertices.
Notes
-----
If the matrix was diagonally augmented (e.g., ``self.diag_aug`` was True), ``fit``
followed by ``transform`` will produce a slightly different matrix than
``fit_transform``.
To get the original embedding, using ``fit_transform`` is recommended. In the
directed case, if A is the original in-sample adjacency matrix, the tuple
(A.T, A) will need to be passed to ``transform`` if you do not wish to use
``fit_transform``.
out : array_like or tuple, shape
"""

# checks
check_is_fitted(self, "is_fitted_")
if isinstance(X, nx.classes.graph.Graph):
X = import_graph(X)
directed = self.latent_right_ is not None

# correct types?
if directed and not isinstance(X, tuple):
if X.shape[0] == X.shape[1]: # in case original matrix was passed
msg = """A square matrix A was passed to ``transform`` in the directed case.
If this was the original in-sample matrix, either use ``fit_transform``
or pass a tuple (A.T, A). If this was an out-of-sample matrix, directed
graphs require a tuple (X_out, X_in)."""
raise TypeError(msg)
else:
msg = "Directed graphs require a tuple (X_out, X_in) for out-of-sample transforms."
raise TypeError(msg)
if not directed and not isinstance(X, np.ndarray):
raise TypeError("Undirected graphs require array input")

# correct shape in y?
latent_rows = self.latent_left_.shape[0]
_X = X[0] if directed else X
X_cols = _X.shape[-1]
if _X.ndim > 2:
raise ValueError("out-of-sample vertex must be 1d or 2d")
if latent_rows != X_cols:
msg = "out-of-sample vertex must be shape (n_oos_vertices, n_vertices)"
raise ValueError(msg)

# workhorse code
if not directed:
return X @ self._pinv_left
elif directed:
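
The undirected branch above multiplies the out-of-sample rows by `_pinv_left`, which the removed `fit` lines build as `latent_left_ @ np.diag(1 / singular_values_)`. The following is a minimal NumPy-only sketch of that computation, not the library's implementation: the helper name `oos_embed_sketch` and the noise-free 2-block matrix `P` are assumptions made for illustration, and the truncated SVD here stands in for whatever `_reduce_dim` does internally.

```python
import numpy as np


def oos_embed_sketch(A, a, n_components):
    """Hypothetical helper: undirected out-of-sample embedding via truncated SVD."""
    U, S, _ = np.linalg.svd(A)
    U, S = U[:, :n_components], S[:n_components]
    X_hat = U * np.sqrt(S)   # in-sample latent positions, U @ diag(sqrt(S))
    pinv_left = X_hat / S    # column-wise scaling, same as X_hat @ np.diag(1 / S)
    return X_hat, a @ pinv_left


# Assumed toy input: noise-free 2-block matrix P = Z B Z^T (6 nodes, 3 per block)
B = np.array([[0.5, 0.1], [0.1, 0.5]])
Z = np.repeat(np.eye(2), 3, axis=0)
P = Z @ B @ Z.T

# One out-of-sample vertex whose connection profile matches block 0
a = np.array([[0.5, 0.5, 0.5, 0.1, 0.1, 0.1]])
X_hat, w = oos_embed_sketch(P, a, n_components=2)
```

Because `a` equals a row of `P` for a block-0 node and `P` is exactly rank 2 and positive semi-definite, `w[0]` coincides with the in-sample position `X_hat[0]`. With a sampled (noisy) adjacency matrix the out-of-sample estimate would only be close to its block's positions, not identical.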