Add Persistent Homology Computation Backend for VR #1

MonkeyBreaker · 2021-05-31T12:00:03Z

This PR adds the first version of the HPC backend for computing persistent homology (PH) on Vietoris-Rips (VR) filtrations.
This backend is based on Ripser and on the lock-free version of Ripser.

The Python interface is the one used in giotto-tda, which is in turn based on the ripser.py Python interface.

Remaining tasks:

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

* Fix windows compilation
* Fix hash_map when indices have a value of (0, 0)
* Remove optimization for max dimension, we return the number of dimensions as input
* Essential pairs prevent adding one with wrong value

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>


ulupo · 2021-06-30T09:58:04Z

To adapt the giotto-tda tests into tests for the weights here, you could start from this (there are probably small syntax errors, as I have not tested it):

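import numpy as np
import pytest
from numpy.testing import assert_almost_equal
from scipy.sparse import csr_matrix
from scipy.spatial.distance import pdist, squareform

# `ripser` is the Python entry point added in this PR; import it from
# wherever the package exposes it.

X_pc = np.array([[2., 2.47942554],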
                 [2.47942554, 2.84147098],
                 [2.98935825, 2.79848711],
                 [2.79848711, 2.41211849],
                 [2.41211849, 1.92484888]])

X_dist = squareform(pdist(X_pc))

X_pc_sparse = csr_matrix(X_pc)
X_dist_sparse = csr_matrix(X_dist)

X_dist_disconnected = np.array([[0, np.inf], [np.inf, 0]])

X_vrp_exp = [
    np.array([[0., 0.43094373],
              [0., 0.5117411],
              [0., 0.60077095],
              [0., 0.62186205]]),
    np.array([[0.69093919, 0.80131882]])
    ]


def test_wrp_notimplemented_string_weights():
    with pytest.raises(ValueError, match="'foo' passed for `weights` but the "
                                         "only allowed string is 'DTM'"):
        ripser(X_pc, weights="foo")


def test_wrp_notimplemented_p():
    with pytest.raises(NotImplementedError):
        ripser(X_pc, weights="DTM", weight_params={"p": 1.2})


@pytest.mark.parametrize("X, metric", [(X_pc, "euclidean"),
                                       (X_pc_sparse, "euclidean"),
                                       (X_dist, "precomputed"),
                                       (X_dist_sparse, "precomputed")])
@pytest.mark.parametrize("weight_params", [{"p": 1}, {"p": 2}, {"p": np.inf}])
@pytest.mark.parametrize("collapse_edges", [True, False])
@pytest.mark.parametrize("thresh", [np.inf, 0.8])
def test_wrp_same_as_vrp_when_zero_weights(X, metric, weight_params,
                                           collapse_edges, thresh):
    # Pass the parametrized input X; the original call omitted it
    X_wrp = ripser(X, weights=lambda x: np.zeros(x.shape[0]),
                   weight_params=weight_params,
                   metric=metric,
                   collapse_edges=collapse_edges,
                   thresh=thresh)

    for i in range(2):
        assert_almost_equal(X_wrp[i], X_vrp_exp[i])


X_dtm_exp = {1: [np.array([[0.95338798, 1.474913],
                           [1.23621261, 1.51234496],
                           [1.21673107, 1.68583047],
                           [1.30722439, 1.73876917]]),
                 np.array([[]])],
             2: [np.array([[0.95338798, 1.08187652],
                           [1.23621261, 1.2369417],
                           [1.21673107, 1.26971364],
                           [1.30722439, 1.33688354]]),
                 np.array([[]])],
             np.inf: [np.array([[]]),
                      np.array([[]])]}


@pytest.mark.parametrize("X, metric", [(X_pc, "euclidean"),
                                       (X_pc_sparse, "euclidean"),
                                       (X_dist, "precomputed"),
                                       (X_dist_sparse, "precomputed")])
@pytest.mark.parametrize("weight_params", [{"p": 1}, {"p": 2}, {"p": np.inf}])
@pytest.mark.parametrize("collapse_edges", [True, False])
def test_dtm(X, metric, weight_params, collapse_edges):
    X_dtm = ripser(X, weights="DTM", weight_params=weight_params,
                   metric=metric, collapse_edges=collapse_edges)
    
    for i in range(2):
        assert_almost_equal(X_dtm[i], X_dtm_exp[weight_params["p"]][i])

MonkeyBreaker · 2021-06-30T13:53:06Z

@ulupo, thank you for the code snippet.
I reworked it so that the tests pass!
I needed to make some adaptations, and I went with the most obvious ones:

* Add a function that performs steps similar to the _postprocess_diagrams method in WeightedRipsPersistence.
* It was weird, but in the DTM test two empty arrays did not have the same shape. To avoid adding too much code, I simply skip the assert when both arrays are empty.
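
A minimal sketch of the workaround described above, assuming diagrams are NumPy arrays of (birth, death) rows. The helper name `assert_diagrams_almost_equal` is hypothetical and is not taken from the PR; it only illustrates skipping the assertion when both arrays are empty.

```python
import numpy as np
from numpy.testing import assert_almost_equal


def assert_diagrams_almost_equal(actual, expected):
    """Compare two diagrams, skipping the check when both are empty.

    Hypothetical helper, not the function added in the PR.
    """
    actual = np.asarray(actual)
    expected = np.asarray(expected)
    if actual.size == 0 and expected.size == 0:
        # Shapes such as (1, 0) and (0, 2) differ, but neither holds a pair
        return
    assert_almost_equal(actual, expected)


# Both arrays are empty, so no assertion is run despite the shape mismatch
assert_diagrams_almost_equal(np.array([[]]), np.empty((0, 2)))
```

Non-empty diagrams still go through `assert_almost_equal`, so a real mismatch is reported as usual.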

ulupo · 2021-06-30T14:31:25Z

Thanks @MonkeyBreaker, and sorry I made you do this extra work. I suggest replacing np.array([[]]) with np.empty((0, 2)), which has the right shape, and putting np.inf instead of the threshold values in the *_exp arrays, so you don't have to code the replace-inf function.
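
The shape difference behind this suggestion can be checked directly in NumPy. The variable names below are illustrative only:

```python
import numpy as np
from numpy.testing import assert_almost_equal

nested_empty = np.array([[]])      # one row with zero columns
empty_diagram = np.empty((0, 2))   # zero (birth, death) pairs

print(nested_empty.shape)   # (1, 0)
print(empty_diagram.shape)  # (0, 2)

# Two empty diagrams with the same (0, 2) shape compare equal
assert_almost_equal(np.empty((0, 2)), empty_diagram)
```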

MonkeyBreaker · 2021-06-30T14:32:27Z

Let me give it a shot. If it makes things cleaner, better to go this way :).

* Adapt test for wrp because of precision
* When no threshold provided and input is not sparse, compute enclosing radius and represent data in a dense way
* When threshold is provided and input is sparse, don't compute enclosing radius and represent data in a sparse way

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>
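
For reference, a sketch of the enclosing radius as it is defined in Ripser: the smallest value, over all points, of the largest distance from that point to any other. Whether this backend computes it in exactly this way is an assumption, and the function name `enclosing_radius` is hypothetical.

```python
import numpy as np


def enclosing_radius(dist):
    """Min over points of the max distance to any other point.

    Hypothetical helper; assumes `dist` is a dense, square distance matrix.
    """
    dist = np.asarray(dist, dtype=float)
    return float(np.min(np.max(dist, axis=1)))


# Three points on a line at positions 0, 1 and 3
dist = np.array([[0., 1., 3.],
                 [1., 0., 2.],
                 [3., 2., 0.]])

print(enclosing_radius(dist))  # 2.0
```

Above this radius the Vietoris-Rips complex is a cone over the minimizing point, so no further changes in homology can occur and longer edges can be ignored.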

review-notebook-app · 2021-07-02T14:21:17Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

julian added 30 commits May 31, 2021 13:46

[GIT] Add junction and turf as submodules

385c5a9

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[CPP] Move Collapser from cpp to src

24dd149

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[CPP] Add ripser lock-free backend first version

3f2fe77

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[TEST] Add ripser test on Python

6c605be

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[CI] Add installation of hypothesis for Python test

d3ab341

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[PY] Update interface to support number of threads

aab285d

* Disable Thresh optimization due to unexpected results with negative edges

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[TEST] Disable >2 coefficient fields support

a80de1b

* Fix a test sorting an unexpected value

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[GIT] Add ignore of hypothesis folder created during test

30651a1

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

Add sklearn as a requirement

021d863

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[CPP] Remove pthread include

7f4cfc7

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[CPP] Fix issues with MSVC

70ac414

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[CPP] Fix error on MSVC compilator

1d7770c

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[BUILD] Enable C++14 wide compilation flag

1d96f62

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[CPP] Disable static thread_local

a10cb5b

Currently experience segmentation faults ...

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[CI] Remove on push action, only keep for pull request

78ff3d3

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[TEST] Fix data generation to be bounded in 32 bit floating point

1236eee

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[CPP] Fix bug of adding an essential when the column was already reduced

83d5ce2

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[PY] Add num_threads for dense case

e055efe

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[CI] Add limit to CI for 10 min maximum

0ac5a71

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[CPP] Update ripser backend

4e2f54f

* Uniformed chunk size computation
* Use index_pivot instead of pivot (primitive type vs class)

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[TEST] Generate datasets with values > 0 instead of >= 0

600621d

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[CPP] Fix issue when nb elems to process is lower than nb threads available

7e26e62

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[CPP] Update ripser backend

7f13e13

* remove wrong fix in compute_pairs
* Only sort and create hashmap if at least one column is present
* compressed_distance_matrix from class to struct

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[CPP] Ensure when calling collapser that data is float32

2ce8ffa

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[PY] Update enclosing radius

7912b4c

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[TEST] Update dataset generation to be bounded in 32-bit floats

cf5f310

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[TEST] Add test for varying number of threads

885d205

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[CPP] Rename ripser.cpp into ripser.h, move bindings from ripser.h

d99fffb

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[CPP] Update types for bindings and remove code from dev

3c29c1b

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

julian added 4 commits June 29, 2021 22:01

[CPP] Fix Windows compilation due to not explicitely template sorting function ...

0e4e268

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[DOC] Add RELEASE.rst to allow backlog of changes

937d84b

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[DOC] Update version of the package

0cddae3

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[CPP] Document hash map hacks

2a17575

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

julian added 4 commits June 30, 2021 13:24

[CI] Remove "needs" in order to build wheels when needed

c4f5a18

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[CPP] Fix issue compiling on Windows

42e8b96

Honestly I do not understand why it failing ...

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[PY] Refactor Weight Vietoris-Rips code

0c762cd

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[TEST] Add test suite for Weighted Vietoris-Rips

d2c5b53

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

julian and others added 7 commits June 30, 2021 16:43

[TEST] Add @ulupo suggestion for wrp test

c89190b

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[CPP] Add support to number of threads=-1 => using maximal number of thread

a91c077

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

[DOC] Update headers, contribution of users

77e5133

Signed-off-by: julian <julian.burellaperez@heig-vd.ch>

add documetation generation

9003865

fix img and gif links

2092990

Merge pull request #4 from giotto-ai/local_nb

263edd6

fix img and gif links

MonkeyBreaker mentioned this pull request Jul 2, 2021

GitHub pages #3

Merged

giotto-learn and others added 6 commits July 2, 2021 15:15

Merge pull request #3 from giotto-ai/github_pages

8928430

GitHub pages

Merge branch 'c++_backend' into tutorial_nb

9d5d861

implement review comments

71582c2

Merge branch 'tutorial_nb' of https://github.com/giotto-ai/giotto-ph into tutorial_nb

d8b1ee0

implement review comments, part 2

bb68c23

Merge pull request #5 from giotto-ai/tutorial_nb

f8ce2a5

Tutorial nb

MonkeyBreaker changed the title ~~[WIP] Add Persistent Homology Computation Backend for VR~~ Add Persistent Homology Computation Backend for VR Jul 2, 2021

MonkeyBreaker merged commit 37f2b17 into main Jul 2, 2021

MonkeyBreaker deleted the c++_backend branch July 2, 2021 14:34
