Skip to content

gabbyangeloro/Pyscapes

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

pyscapes: Persistence landscapes in python

This packages implements persistence landscapes in a user-friendly fashion in python. Persistence landscapes are a vectorization scheme for persistent homology, providing a robust set of feature maps.

Persistence Landscapes were introduced in Bubenik. These provide a functional representation of persistence diagrams amenable to statistical analysis. They have been proven to be very useful in a variety of empirical and theoretical applications. See the examples provided in Bubenik.

Our implementation follows the algorithms provided in Bubenik, Dlotko. Dlotko has also implemented many of these algorithms in C++ directly in the Persistence Landscape Toolbox, available here.

Why persistence landscapes?

  • They were one of the first vectorization schemes introduced for persistent homology, yet we aren't aware of any implementations in python TDA libraries, e.g., scikit-tda.
  • They are (essentially) non-parametric feature maps. Hence they do not require any hyperparameter tuning.
  • They have already been shown to be quite successful in a variety of applications.

Goals of this implementation

  • To interface cleanly with existing libraries for computing persistent homology (e.g., ripser.py).
  • To interface cleanly with existing libraries for machine learning (e.g., scikit-learn).

Example

While the code is not yet ready for public consumption, here we give a brief overview of how it can be used.

Exact vs Grid

We provide two different implementations of Persistence Landscapes:

  • PersLandscapeExact provides an exact implementation. All methods and computations done in this class are as accurate as the floating point arithmetic of Python. In particular, there are no approximations in our calculations. Landscape functions are internally stored as a list of their critical points. Instantiation is fast but arithmetic operations, such as summing and averaging can be quite slow because of this.

  • PersistenceLandscapeGrid provides an approximate implementation. The landscape functions are sampled on an equally spaced grid, defined by start, stop, and num_steps. This approximation allows us to easily embed the landscape in a num_steps-dimensional real vector space. Instantiating these can be slower than their exact counterparts due to this sampling. However, arithmetic operations tend to be much faster because of this approximation, especially because of Numpy's vectorized operations.

The API is designed to be as consistent as possible between the two instantiations. Arithmetic operations and norms are defined for both. Each has a compute flag

Examples of both are provided below.

Sample code, exact

import numpy as np
from ripser import ripser
from PersistenceLandscapeExact import PersLandscapeExact  # to be updated

data = np.random.random_sample((200,2)) # generate random points
diagrams = ripser(data)['dgms'] # compute persistent homology
M = PersistenceLandscapeGrid(dgms=diagrams, hom_deg=1) # instantiate
M.compute_landscape() # compute the landscape

Computing the actual functions that comprise a persistence landscape can be computationally intensive, so we don't compute them upon instantiation by default. Instead they're only computed after the compute_landscape method is called, serving as an initialization method.

Basic arithmetic operations are implemented. This allows for arbitrary linear combinations of persistence landscapes, including averages. Norms are implemented for quantifying differences and to ease the use of permutation tests.

L = 2*M
J = M + M
K = M/5
L.p_norm(p=2) # 0.02026
L.infinity_norm() # 0.063817

Persistence Landscapes can also be plotted with plot_landscape. Here are some examples.

Sample code, grid

from tadasets import torus
from PersistenceLandscapeGrid import PersistenceLandscapeGrid  # to be updated

torus_pts = torus(n=1000) # 1000 points on a torus
torus_dgms = ripser(torus_pts)['dgms'] # compute PH
tor_pl = PersistenceLandscapeGrid(dgms=torus_dgms, hom_deg=1, compute=True) # compute and instantiate

plot_landscape(landscape=tor_pl, title=r"PL for a torus in degree 1")

See the documentation and examples for more details.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published