This repository contains data and code for computing single-parameter persistent homology rank functions and biparameter persistent homology rank invariants, and for reproducing the figures and results in the paper "Computational Stability for Persistence Rank Function Machine Learning".
Persistent homology is one of the most important tools in topological data analysis, studying how homological features of data persist over scale. Commonly, persistent homology is represented using the persistence diagram or the persistence barcode. However, these representations do not lend themselves naturally to statistical and machine learning algorithms.
We explore persistent homology rank functions as equivalent functional representations that are naturally suited to methods in functional data analysis and that also extend to the multiparameter setting.
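To make the link between barcodes and rank functions concrete, here is a minimal sketch in Python. It assumes the common convention that a bar is a half-open interval [birth, death) and that the rank function at (s, t) with s <= t counts the bars born at or before s that are still alive after t. The function names `rank_function` and `rank_function_on_grid` are hypothetical and are not taken from the scripts in this repository.

```python
import numpy as np

def rank_function(barcode, s, t):
    """Count bars [birth, death) with birth <= s and death > t (assumed convention)."""
    if s > t:
        raise ValueError("the rank function is only defined for s <= t")
    bars = np.asarray(barcode, dtype=float).reshape(-1, 2)
    births, deaths = bars[:, 0], bars[:, 1]
    return int(np.sum((births <= s) & (deaths > t)))

def rank_function_on_grid(barcode, grid):
    """Evaluate the rank function at all pairs (grid[i], grid[j]) with i <= j.

    `grid` is assumed to be sorted in increasing order. Entries below the
    diagonal (where s > t) are set to 0.
    """
    grid = np.asarray(grid, dtype=float)
    bars = np.asarray(barcode, dtype=float).reshape(-1, 2)
    born = bars[:, 0][:, None] <= grid[None, :]   # shape (n_bars, n_grid)
    alive = bars[:, 1][:, None] > grid[None, :]   # shape (n_bars, n_grid)
    # counts[i, j] = number of bars with birth <= grid[i] and death > grid[j]
    counts = born.astype(int).T @ alive.astype(int)
    return np.triu(counts)

# Toy barcode for illustration only; np.inf marks a bar that never dies.
barcode = [(0.0, 1.0), (0.2, 0.5), (0.4, np.inf)]
values = rank_function_on_grid(barcode, [0.0, 0.25, 0.5, 0.75])
```

Sampling the rank function on a fixed grid in this way turns each barcode into an array of the same shape, which is the kind of input that standard functional data analysis and machine learning methods expect. With the toy barcode above, `rank_function(barcode, 0.3, 0.6)` counts only the first bar.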
This repository is split into the following:
- `HRV_Application`: contains code and data related to the application of classifying HRV between healthy individuals and post-stroke patients in Section 3.1 of the paper.
  - `Classification on HRV data.ipynb`: notebook to reproduce all figures and results from Section 3.1.
  - `Classification of HRV data with persistence images.ipynb`: notebook to reproduce results for classification using persistence images.
  - `interpolatedRR_CON.csv` and `normalRRs_CON.csv` are the data required.
- `Compute_Biparameter_Rank_Invariant`: contains Python scripts for computing biparameter rank invariants (which are adapted from the original code for computing multiparameter persistence landscapes found here).
- `Simulation_Study`: contains code for the simulation study found in Section 3.2.
  - `Simulation.ipynb`: notebook to reproduce all figures and results from Section 3.2.
  - `simulation_tools.py` and `compute_rank_function_from_barcode.py` contain supplementary code for the notebook.
- `Lung_CT_Application`: contains code and data related to the application of classifying states of lung tumours from MRIs found in Section 3.3 of the paper.
  - `Application in Lung Tumour Classification.ipynb`, `dcm to point clouds to rank functions.ipynb` and `For visualization.ipynb` are notebooks to reproduce all figures and results from Section 3.3. More explanations can be found within the notebooks.
  - `Diagnosis.csv`: contains lung tumour diagnosis.
  - `Meta.csv`: contains lung tumour MRI information, including whether images contain contrast material.
  - `masks`: contains segmented tumour masks.
  - `point_couds`: contains point clouds from the tumour surface.
  - `small_point_clouds`: contains subsampled point clouds.
  - `Functions`: contains supplementary scripts to generate point clouds from the tumour surface from CT scans, sourced from (https://github.com/robinvndaele/TDA_LungLesion).
Single-parameter persistent homology rank functions can be computed in both Julia and Python; we compute two-parameter persistent homology rank functions using RIVET in combination with Python scripts. To run the IPython notebooks in this repository, the following libraries are required.
Julia is a high-level, general-purpose dynamic programming language. We compute single-parameter persistent homology using the Ripserer package and we also require the following packages:
- CSV
- DataFrames
- Plots
- MultivariateStats
- Statistics
- GLM
- StatsBase
- Lathe
- MLBase
- ClassImbalance
- ROCAnalysis
- LIBSVM
- Random
Python is a high-level, versatile, and easy-to-read programming language widely used for applications including data analysis and machine learning. We require the following Python packages:
RIVET is a tool for computing two-parameter persistent homology. We provide code to compute the rank invariants from the RIVET outputs in the folder `Compute_Biparameter_Rank_Invariant`.