[BUG] Dead comment in compute_dim_0_pairs/incorrect removal of edges with zero filtration value #31

ulupo · 2021-10-22T13:51:35Z

Am i right to think that this comment

Lines 936 to 937 in c7f5d46

    
           // TODO: Get correct birth times if the edges are negative (required for 
        
           // lower star)

is referring to an issue that is now resolved?

ulupo · 2021-10-22T13:58:54Z

Actually I now worry that this is still unaddressed, because of

giotto-ph/gph/src/ripser.h

Line 961 in c7f5d46

if (get_diameter(e) != 0) {

ulupo · 2021-10-24T14:01:56Z

Continuing from #29 (comment)

About the check if (get_diameter(e) != 0) {, I am not sure I understand the issue.
It just ensure that the diameter of the edge is not 0.

There is an important conceptual difference between ripser.py and original ripser, and it is that original ripser assumes all inputs are matrices with zero along the diagonal and non-negative off-diagonal entries. This corresponds to assuming all inputs are distance matrices, since there cannot be negative distances and the distance of a point from itself is zero for any reasonable notion of distance. Original ripser in dimension 0 allows for the possibility that some off-diagonal distances are zero, but if that is the case it does not create a bar (because it would be a trivial bar which is born at 0 and dies at 0). This is enforced by the if (get_diameter(e) != 0) {, check.

But this is conceptually incorrect for us and for ripser.py. Our assumptions for the input are just that entry (i, j) is greater than or equal to both entry (i, i) and entry (j, j). There should be nothing special about zero off-diagonal entries and we should even allow for negative entries (on or off-diagonal). I think we already do this correctly in higher dimensions, but it seems to me that there might still be issues in dimension 0.

There is a unit test that really should be added to this repository and it is the following:

Create a dense distance matrix dm by running squareform(pdist(X)) where X is a point cloud. Compute its diagrams dgms (up to dimension 1 or 2).
Create dm_neg_vert = dm - 1. Compute its diagrams dgms_neg_vert. Check that dgms_neg_vert[d] = dgms[d] - 1 for all dimensions d.
Check that dm[0, 1] != 0 and create dm_one_zero_edge = dm - dm[0, 1]. Compute its diagrams dgms_one_zero_edge. Check that dgms_one_zero_edge [d] = dgms_one_zero_edge[d] - dm[0, 1] for all dimensions d.

I suspect that we currently fail test 3 in dimension 0.

ulupo · 2021-11-02T18:37:36Z

@MonkeyBreaker here's a complete test demonstrating that there is a bug:

import numpy as np
from scipy.spatial.distance import pdist, squareform
from gph import ripser_parallel

np.random.seed(0)

maxdim = 0
kwargs = {"metric": "precomputed", "maxdim": maxdim}

shape = (5, 4)

X = np.random.random(shape)
dm = squareform(pdist(X))

dgms_orig = ripser_parallel(dm, **kwargs)["dgms"]

offset = 1
dm_neg_vert = dm - offset
dgms_neg_vert = ripser_parallel(dm_neg_vert, **kwargs)["dgms"]

edge_val = dm[0, 1]
assert edge_val

dm_one_zero_edge = dm - edge_val

dgms_one_zero_edge = ripser_parallel(dm_one_zero_edge, **kwargs)["dgms"]

for dim in range(maxdim + 1):
    print(f"""# DIMENSION {dim}

Original dm:
{dgms_orig[dim]}
Subtract {offset} from dm, and add {offset} back to the diagram:
{dgms_neg_vert[dim] + np.float32(offset)}
Subtract dm[0, 1] from dm (make one edge zero), and add dm[0, 1] back to the diagram:
{dgms_one_zero_edge[dim] + np.float32(edge_val)}
    
    """)

All diagrams should be identical, but the last one has one fewer bar (!) because we removed an edge just because it was zero. ~~Furthermore, it has incorrect birth times.~~ (see edit below)

I'm thinking we have TWO issues here: the TODO comment is not dead (there is an issue with birth values clearly), AND we should not remove edges just because they are zero. The second issue is solved in this example by removing the if (get_diameter(e) != 0) check as I mentioned above, but not the first! (see edit below)

ulupo · 2021-11-02T18:45:15Z

EDIT TO PREVIOUS MESSAGE: It turns out the wrong birth times were just about float64 to float32 conversions, the test works now that I've added float32 casting explicitly. This reminds me that we really should address #19!

Fixes issue giotto-ai#31

@MonkeyBreaker

…i#31 Also revert import ordering change as suggested by @MonkeyBreaker

@ubauer

* Remove check for zero-diameter edge in compute_dim_0_pairs Fixes issue #31 * Avoid get_dense_distance_matrices producing overflow warnings This was due to adding non-finite diagonal elements to themselves when doing dm + dm.T, but we can avoid having those elements in the first place * Add unit test/regression test for equivariance * [CPP] Add inline comment to explain divergence from @ubauer's ripser * [CPP] code style changes after @MonkeyBreaker's review * Add separate regression test for matrix of the kind in issue #31 Also revert import ordering change as suggested by @MonkeyBreaker

ulupo mentioned this issue Oct 24, 2021

Flag persistence generators #29

Merged

10 tasks

ulupo added this to To do in v0.2.0 via automation Oct 25, 2021

ulupo added the bug Something isn't working label Oct 25, 2021

ulupo mentioned this issue Oct 31, 2021

Dead comments in 0-dimensional computation/preferential treatment of edges with zero value scikit-tda/ripser.py#135

Open

ulupo mentioned this issue Nov 2, 2021

Remove check for zero diameters in 0-dimensional code #41

Closed

2 tasks

MonkeyBreaker moved this from To do to In progress in v0.2.0 Nov 4, 2021

ulupo changed the title ~~Dead comment?~~ [BUG] Dead comment in compute_dim_0_pairs/incorrect removal of edges with zero filtration value Nov 8, 2021

ulupo added a commit to ulupo/giotto-ph that referenced this issue Nov 8, 2021

Remove check for zero-diameter edge in compute_dim_0_pairs

f16e637

Fixes issue giotto-ai#31

ulupo mentioned this issue Nov 8, 2021

[CPP] Remove zero diameter check for edges in 0-dim computation #43

Merged

ulupo added a commit to ulupo/giotto-ph that referenced this issue Nov 8, 2021

Add separate regression test for matrix of the kind in issue giotto-a…

08b6a43

…i#31 Also revert import ordering change as suggested by @MonkeyBreaker

ulupo closed this as completed in #43 Nov 8, 2021

v0.2.0 automation moved this from In progress to Done Nov 8, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] Dead comment in compute_dim_0_pairs/incorrect removal of edges with zero filtration value #31

[BUG] Dead comment in compute_dim_0_pairs/incorrect removal of edges with zero filtration value #31

ulupo commented Oct 22, 2021

ulupo commented Oct 22, 2021

ulupo commented Oct 24, 2021 •

edited

ulupo commented Nov 2, 2021 •

edited by MonkeyBreaker

ulupo commented Nov 2, 2021

[BUG] Dead comment in compute_dim_0_pairs/incorrect removal of edges with zero filtration value #31

[BUG] Dead comment in compute_dim_0_pairs/incorrect removal of edges with zero filtration value #31

Comments

ulupo commented Oct 22, 2021

ulupo commented Oct 22, 2021

ulupo commented Oct 24, 2021 • edited

ulupo commented Nov 2, 2021 • edited by MonkeyBreaker

ulupo commented Nov 2, 2021

ulupo commented Oct 24, 2021 •

edited

ulupo commented Nov 2, 2021 •

edited by MonkeyBreaker