Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flag persistence generators #29

Merged
Merged
Show file tree
Hide file tree
Changes from 53 commits
Commits
Show all changes
54 commits
Select commit Hold shift + click to select a range
9ff0131
WIP first sketch (won't compile)
ulupo Sep 24, 2021
14dc467
[CPP] Fix compilation issue, refactor retrieving representative simpl…
Oct 5, 2021
65aeaf7
[CPP] Retrieve representative simplices from computation in CPP
Oct 5, 2021
964f6b6
[CPP] Disable sorting of barcodes to keep alignment with representati…
Oct 5, 2021
1fbe5c2
[CPP] Fix not enough memory pre-allocated for all dimensions
Oct 5, 2021
5e3a590
[PY] Add draft of Python interface and Bindings support
Oct 5, 2021
ff7bd9b
[CPP] Retrieve only "True" representative simplices
Oct 5, 2021
339472c
[CPP] Retrieve representative simplices with an indexing 1:1 with bar…
Oct 5, 2021
1e1eb92
[CPP] Refactor with clang-format
Oct 5, 2021
8b1c0cb
[CPP] Update size of representative simplices, to go from dim 1 to dim N
Oct 5, 2021
4a97c28
[CPP] Allow to sort barcodes and reproduce the same sort on rep simpl…
Oct 5, 2021
1072f52
[CPP] Add representative simplicies for dim 0
Oct 7, 2021
95b57cf
[PY] Add output of representative simplices for dim 0 (finite and ess)
Oct 7, 2021
61840e9
[TEST] Add test for verify that all edges in rpsm are valid in the dm
Oct 7, 2021
bf69be1
Remove array conversions, enforce int type for indices, ensure array …
ulupo Oct 7, 2021
0baafe4
[PY] Change dtype to int64 in flag generators
ulupo Oct 18, 2021
6b347f4
[CPP] Refactor get_youngest_edge_simplex function
Oct 19, 2021
37225a2
[CPP] Refactor code related to representative simplicies
Oct 19, 2021
1b2aa7e
[CPP] Fix vertices order in dimension 0 for representative simplicies
Oct 19, 2021
473f8bc
[PY/TEST] Add exception if user uses collapser + representatives.
Oct 19, 2021
2743ffb
[DOC] Add draft for input and output documentation of representative
Oct 19, 2021
eca9f2a
[EXAMPLE] Add a section about representative simplices example
Oct 19, 2021
ad79ea6
Fix birth simplices for finite 0-dimensional bars
ulupo Oct 22, 2021
6627e7f
Actual fix for 0-dimensional persistence generator
ulupo Oct 23, 2021
bd8b361
[CPP] Refactor @ulupo fix for representative in dim 0
Oct 24, 2021
60483d6
Add @ctralie in copyright notice
ulupo Oct 24, 2021
d64fbba
Add inline comment as suggested by @MonkeyBreaker
ulupo Oct 24, 2021
b52e25e
[CPP] Update as suggested by @ulupo method link of class union_find
Oct 24, 2021
d5a9356
[CPP] Revert latest commit
Oct 24, 2021
f84563e
[CPP] Attempt fix for link_and_get_birth_vertex
ulupo Oct 24, 2021
3c90e3a
Fix _link_and_get_birth_vertex
ulupo Oct 25, 2021
c3ce584
Address else branch forgotten in e04203ce31d8efecf9c31dde53f50ec507ce…
ulupo Oct 25, 2021
fd4e651
[CPP] Refactor union_find related code
Oct 25, 2021
45e7377
[CPP] Add @ulupo update on comment for link_roots_and_get_birth_vertex
Oct 25, 2021
8f937d8
[TEST] Add regression test for rpsm with non 0 values on diagonal
Oct 26, 2021
ab3a15f
[CPP] Fix suggested by @ulupo for non 0 values on diagonal of dm
Oct 26, 2021
afa3d04
[TEST] Update comment and add additional test in test_rpsm_non_0_diag…
Oct 26, 2021
3c944a3
Rename ret_representative_simplices to return_generators and refactor…
ulupo Oct 26, 2021
7cbe8f4
Update kwarg name in tests
ulupo Oct 26, 2021
c1edba1
Change name to return_generators in bindings
ulupo Oct 26, 2021
77b6051
Mathematically intelligible solution to 0-dimensional generators comp…
ulupo Oct 28, 2021
0bd5334
Improve jupyter tutorial
ulupo Oct 29, 2021
8006541
Use np.inf instead of np.infty
ulupo Oct 29, 2021
d270ebd
Notebook refactor
ulupo Oct 29, 2021
48eaf72
Mention of ripser.py in basic tutorial
ulupo Oct 29, 2021
f5dfcc6
[CPP] Add inline comments in link_roots_and_get_birth_vertex
ulupo Oct 29, 2021
01b57af
[TEST] add to test_..._diagonal_dim0 check for the number of points
Oct 29, 2021
4024080
[TEST] Add simple test for finite generate at dim > 0 suggested by @u…
Oct 29, 2021
b8d70bb
[CPP] Rename representative by generator
Oct 29, 2021
102fda7
[CPP] Fix typos in two inline comments
ulupo Oct 31, 2021
cbfbd23
[CPP] fix small errors in inline comments
ulupo Oct 31, 2021
0abb23b
[TEST] Update test for higher dimensions as suggested by @ulupo
Oct 31, 2021
4dd2c3a
Fix import in flag_persistence_generators.ipynb
ulupo Nov 1, 2021
8dfc412
[PY] Improve test docstring
ulupo Nov 1, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
213 changes: 98 additions & 115 deletions examples/basic_tutorial.ipynb

Large diffs are not rendered by default.

168 changes: 168 additions & 0 deletions examples/flag_persistence_generators.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,168 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "a40d1d2e-c23e-4a89-9a0c-694ce37fb8f0",
"metadata": {},
"source": [
"# Tutorial: Flag persistence generators\n",
"\n",
"In this notebook, we show how to retrieve the vertices and edges responsible for the creation/destruction of persistent topological features in a Vietoris–Rips filtration. We use the same setup and kind of dataset as in the [basic tutorial](https://github.com/giotto-ai/giotto-ph/blob/main/examples/basic_tutorial.ipynb)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c821577f-4928-4403-90c3-d4b6ef4c5440",
"metadata": {},
"outputs": [],
"source": [
"# Install missing dependencies\n",
"import sys\n",
"!{sys.executable} -m pip install giotto-tda"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7e4bf17b-f6b4-444c-a344-38ff096c7f4a",
"metadata": {},
"outputs": [],
"source": [
"# Giotto-ph\n",
"from gph import ripser_parallel\n",
"\n",
"# Import utils\n",
"import numpy as np\n",
"from gtda.homology._utils import _postprocess_diagrams\n",
"\n",
"# To generate dataset\n",
"from sklearn import datasets\n",
"\n",
"# Plotting\n",
"from plotly import graph_objects as go\n",
"from gtda.plotting import plot_diagram, plot_point_cloud"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "af416ee2-3e72-4e43-b919-78647c90e4f6",
"metadata": {},
"outputs": [],
"source": [
"# Make a noisy circle point cloud\n",
"data = datasets.make_circles(n_samples=100, noise=0.1, factor=0.7, random_state=42)[0]\n",
"\n",
"# Plot the point cloud\n",
"plot_point_cloud(data)"
]
},
{
"cell_type": "markdown",
"id": "37c9dda6-a36b-4615-a2ab-0f48d5149773",
"metadata": {},
"source": [
"The information on vertices and edges responsible for the creation/destruction of persistent topological features can be retrieved by passing `return_generators=True` to `ripser_parallel`. A new entry is added to the output dictionary, with key `\"gens\"` and corresponding value a tuple of length 4 organized schematically as follows:\n",
"\n",
" 0. vertices creating and edges destroying *finite* $0$-dimensional features;\n",
" 1. edges creating and destroying *finite* $d$-dimensional features, $d \\geq 1$;\n",
" 2. vertices creating *infinite* $0$-dimensional features;\n",
" 3. edges creating *infinite* $d$-dimensional features, $d \\geq 1$.\n",
"\n",
"(Vertices are encoded by their indices and edges as pairs of vertex indices.) In the case of entries 1 and 3 (higher dimensions), that information is organized by homology dimension. So, for example, calling `gens` this tuple (value in the dictionary):\n",
"\n",
" - `gens[1]` and `gens[3]` are lists containing `maxdim` 2D integer `numpy` arrays, while `gens[0]` and `gens[2]` are `numpy` arrays;\n",
" - the edges creating and destroying finite features in dimension 1 are stored in `gens_finite_1 = gens[1][0]`;\n",
" - `gens_finite_1` is a 2D integer `numpy` array with as many rows as there are finite features in dimension 1;\n",
" - The `i`th finite feature in the 1-dimensional barcode is created by edge `gens_finite_1[i, :2]` and destroyed by edge `gens_finite_1[i, 2:]`.\n",
"\n",
"This way of presenting persistence birth and death vertices/edges agrees with other persistent homology packages and in particular with [GUDHI](http://gudhi.gforge.inria.fr/)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a3dd8319-d0dc-4cb1-98ce-b30b126cf62b",
"metadata": {},
"outputs": [],
"source": [
"# Compute the persistence information\n",
"persistence_info = ripser_parallel(data, return_generators=True)\n",
"gens = persistence_info['gens']"
]
},
{
"cell_type": "markdown",
"id": "a4f4cb5a-1db4-43dc-b3a6-90682cb1bfe6",
"metadata": {},
"source": [
"Let us visualize the 1-dimensional generators for our point cloud, labelling them by the positional index of the corresponding feature in the persistence diagram and using blue for creation and red for destruction:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9b655265-23f2-4821-80f2-555db7970e35",
"metadata": {},
"outputs": [],
"source": [
"# Plot the point cloud\n",
"fig = plot_point_cloud(data)\n",
"\n",
"# In blue are the edges that create a finite persistent topological features in dimension 1.\n",
"# In red are the edges that destroy a finite persistent topological feature in dimension 1.\n",
"for i, edges in enumerate(gens[1][0]):\n",
" birth_edge = edges[0:2]\n",
" death_edge = edges[2:4]\n",
" x0_create, y0_create = data[birth_edge[0]]\n",
" x1_create, y1_create = data[birth_edge[1]]\n",
" x0_destroy, y0_destroy = data[death_edge[0]]\n",
" x1_destroy, y1_destroy = data[death_edge[1]] \n",
"\n",
" fig.add_shape(type='line',\n",
" x0=x0_create,\n",
" y0=y0_create,\n",
" x1=x1_create,\n",
" y1=y1_create,\n",
" line=dict(color='Blue'))\n",
" fig.add_shape(type='line',\n",
" x0=x0_destroy,\n",
" y0=y0_destroy,\n",
" x1=x1_destroy,\n",
" y1=y1_destroy,\n",
" line=dict(color='Red'))\n",
" fig.add_trace(\n",
" go.Scatter(x=[0.5 * (x0_create + x1_create) + 0.05, 0.5 * (x0_destroy + x1_destroy) + 0.05],\n",
" y=[0.5 * (y0_create + y1_create), 0.5 * (y0_destroy + y1_destroy)],\n",
" text=[str(i), str(i)],\n",
" mode=\"text\")\n",
" )\n",
"\n",
"fig.update_layout(showlegend=False)\n",
"fig.show()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.2"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
22 changes: 16 additions & 6 deletions gph/bindings/ripser_bindings.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -25,19 +25,26 @@ PYBIND11_MODULE(gph_ripser, m)
using namespace pybind11::literals;
m.doc() = "Ripser python interface";

py::class_<flagPersGen>(m, "flagPersGen", py::module_local())
.def_readwrite("finite_0", &flagPersGen::finite_0)
.def_readwrite("finite_higher", &flagPersGen::finite_higher)
.def_readwrite("essential_0", &flagPersGen::essential_0)
.def_readwrite("essential_higher", &flagPersGen::essential_higher);
// Because `ripser` could have two different modules after compilation
// It's necessary to add `py::module_local()` to prevent following issue:
// ImportError: generic_type: type "ripserResults" is already registered!
// When same python module imports gtda_ripser and gtda_ripser_coeff
py::class_<ripserResults>(m, "ripserResults", py::module_local())
.def_readwrite("births_and_deaths_by_dim",
&ripserResults::births_and_deaths_by_dim)
.def_readwrite("flag_persistence_generators_by_dim",
&ripserResults::flag_persistence_generators)
.def_readwrite("num_edges", &ripserResults::num_edges);

m.def(
"rips_dm",
[](py::array_t<value_t>& D, int N, int modulus, int dim_max,
float threshold, int num_threads) {
float threshold, int num_threads, bool return_generators) {
// Setup distance matrix and figure out threshold
auto D_ = static_cast<value_t*>(D.request().ptr);
std::vector<value_t> distances(D_, D_ + N);
Expand All @@ -57,20 +64,21 @@ PYBIND11_MODULE(gph_ripser, m)
ripserResults res;
ripser<compressed_lower_distance_matrix> r(std::move(dist), dim_max,
threshold, ratio,
modulus, num_threads);
modulus, num_threads,
return_generators);
r.compute_barcodes();
r.copy_results(res);
res.num_edges = num_edges;
return res;
},
"D"_a, "N"_a, "modulus"_a, "dim_max"_a, "threshold"_a, "num_threads"_a,
"ripser distance matrix");
"return_generators"_a, "ripser distance matrix");

m.def(
"rips_dm_sparse",
[](py::array_t<index_t>& I, py::array_t<index_t>& J,
py::array_t<value_t>& V, int NEdges, int N, int modulus, int dim_max,
float threshold, int num_threads) {
float threshold, int num_threads, bool return_generators) {
auto I_ = static_cast<index_t*>(I.request().ptr);
auto J_ = static_cast<index_t*>(J.request().ptr);
auto V_ = static_cast<value_t*>(V.request().ptr);
Expand All @@ -80,7 +88,8 @@ PYBIND11_MODULE(gph_ripser, m)
// Setup distance matrix and figure out threshold
ripser<sparse_distance_matrix> r(
sparse_distance_matrix(I_, J_, V_, NEdges, N, threshold),
dim_max, threshold, ratio, modulus, num_threads);
dim_max, threshold, ratio, modulus, num_threads,
return_generators);
r.compute_barcodes();
// Report the number of edges that were added
int num_edges = 0;
Expand All @@ -95,5 +104,6 @@ PYBIND11_MODULE(gph_ripser, m)
return res;
},
"I"_a, "J"_a, "V"_a, "NEdges"_a, "N"_a, "modulus"_a, "dim_max"_a,
"threshold"_a, "num_threads"_a, "ripser sparse distance matrix");
"threshold"_a, "num_threads"_a, "return_generators"_a,
"ripser sparse distance matrix");
}