-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] Dead comment in compute_dim_0_pairs/incorrect removal of edges with zero filtration value #31
Comments
Actually I now worry that this is still unaddressed, because of Line 961 in c7f5d46
|
Continuing from #29 (comment)
There is an important conceptual difference between ripser.py and original ripser, and it is that original ripser assumes all inputs are matrices with zero along the diagonal and non-negative off-diagonal entries. This corresponds to assuming all inputs are distance matrices, since there cannot be negative distances and the distance of a point from itself is zero for any reasonable notion of distance. Original ripser in dimension 0 allows for the possibility that some off-diagonal distances are zero, but if that is the case it does not create a bar (because it would be a trivial bar which is born at 0 and dies at 0). This is enforced by the But this is conceptually incorrect for us and for ripser.py. Our assumptions for the input are just that entry (i, j) is greater than or equal to both entry (i, i) and entry (j, j). There should be nothing special about zero off-diagonal entries and we should even allow for negative entries (on or off-diagonal). I think we already do this correctly in higher dimensions, but it seems to me that there might still be issues in dimension 0. There is a unit test that really should be added to this repository and it is the following:
I suspect that we currently fail test 3 in dimension 0. |
@MonkeyBreaker here's a complete test demonstrating that there is a bug: import numpy as np
from scipy.spatial.distance import pdist, squareform
from gph import ripser_parallel
np.random.seed(0)
maxdim = 0
kwargs = {"metric": "precomputed", "maxdim": maxdim}
shape = (5, 4)
X = np.random.random(shape)
dm = squareform(pdist(X))
dgms_orig = ripser_parallel(dm, **kwargs)["dgms"]
offset = 1
dm_neg_vert = dm - offset
dgms_neg_vert = ripser_parallel(dm_neg_vert, **kwargs)["dgms"]
edge_val = dm[0, 1]
assert edge_val
dm_one_zero_edge = dm - edge_val
dgms_one_zero_edge = ripser_parallel(dm_one_zero_edge, **kwargs)["dgms"]
for dim in range(maxdim + 1):
print(f"""# DIMENSION {dim}
Original dm:
{dgms_orig[dim]}
Subtract {offset} from dm, and add {offset} back to the diagram:
{dgms_neg_vert[dim] + np.float32(offset)}
Subtract dm[0, 1] from dm (make one edge zero), and add dm[0, 1] back to the diagram:
{dgms_one_zero_edge[dim] + np.float32(edge_val)}
""") All diagrams should be identical, but the last one has one fewer bar (!) because we removed an edge just because it was zero.
|
EDIT TO PREVIOUS MESSAGE: It turns out the wrong birth times were just about float64 to float32 conversions, the test works now that I've added float32 casting explicitly. This reminds me that we really should address #19! |
…i#31 Also revert import ordering change as suggested by @MonkeyBreaker
* Remove check for zero-diameter edge in compute_dim_0_pairs Fixes issue #31 * Avoid get_dense_distance_matrices producing overflow warnings This was due to adding non-finite diagonal elements to themselves when doing dm + dm.T, but we can avoid having those elements in the first place * Add unit test/regression test for equivariance * [CPP] Add inline comment to explain divergence from @ubauer's ripser * [CPP] code style changes after @MonkeyBreaker's review * Add separate regression test for matrix of the kind in issue #31 Also revert import ordering change as suggested by @MonkeyBreaker
Am i right to think that this comment
giotto-ph/gph/src/ripser.h
Lines 936 to 937 in c7f5d46
is referring to an issue that is now resolved?
The text was updated successfully, but these errors were encountered: