This issue was moved to a discussion.

Get the indices of the vectors forming a hole #138

Closed
AlkanGoktug opened this issue Jan 28, 2022 · 2 comments
Comments

@AlkanGoktug

Hello,

I am using ripser to analyze holes in my dataset. For this, I use persistent homology in dimension 1 (the loops counted by the first Betti number).

I would like to know if there is a way to access the indices of the vectors from the dataset which form the holes. Unfortunately, I can only access the birth and death values of the holes. I would also like to access the indices of the vectors that form each hole.

Thanks in advance.

@outlace

outlace commented Mar 14, 2023

Yes, right now it seems Ripser can only compute the persistence diagrams, but you have no way of analyzing the individual data points that compose the various topological components found. If my PD shows my data has two cycles, then I want to know which points are on cycle 1 and which points are on cycle 2, for example.

@catanzaromj
Contributor

There are ways to determine which points might lie near the hole which the persistence diagram is detecting. Please see this page in the docs for an example of how to use ripser.py to find representative co-cycles.
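As a rough illustration of the representative co-cycles mentioned above, here is a minimal sketch (not taken from the docs page). It assumes ripser.py's `ripser` function with `do_cocycles=True`, whose result contains a `cocycles` list parallel to `dgms`; each H1 cocycle is an array with rows `(i, j, value)`. The helper names `cocycle_vertices` and `most_persistent_h1` are hypothetical, and the indices refer to rows of `X` only when no subsampling (`n_perm`) is used.

```python
import numpy as np


def cocycle_vertices(cocycle):
    """Sorted indices of the data points touched by a representative cocycle.

    Each row holds the vertex indices of one simplex followed by the
    cocycle's value on it, so the last column is dropped.
    """
    simplices = np.asarray(cocycle)[:, :-1]
    return sorted({int(v) for v in simplices.ravel()})


def most_persistent_h1(X):
    """Return (birth, death, point indices) for the longest-lived H1 class.

    Assumes the H1 diagram is non-empty.
    """
    from ripser import ripser  # requires the ripser.py package

    result = ripser(X, maxdim=1, do_cocycles=True)
    dgm1 = result["dgms"][1]
    # cocycles[1][k] is a representative for the k-th row of dgms[1].
    k = int(np.argmax(dgm1[:, 1] - dgm1[:, 0]))
    birth, death = dgm1[k]
    return birth, death, cocycle_vertices(result["cocycles"][1][k])
```

Calling `most_persistent_h1(X)` on a point cloud `X` gives one possible set of indices for the most persistent loop. These are the points touched by one representative cocycle, which is not necessarily the set of points lying on the hole itself.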

However, translating from a (co)homological feature, like a point in a persistence diagram, to a feature within the actual data, like a (co)cycle representative or a list of the data points which "gave rise" to that feature, can be very subtle. Homology and cohomology classes are equivalence classes, and what you are both asking for are representatives within an equivalence class. A priori, no one representative is better than any other. This is analogous to asking for a representative of the class of 1 within the integers modulo 5. You could say 1 should be the representative, but 6, 11, 16, and -4 are all equally valid answers. To make the topological situation even more difficult, the cycle representative need not be stable. This means that if your data is perturbed slightly, there are theoretical guarantees that the points within the persistence diagram will only move slightly, but there is no guarantee that your cycle representative will only move slightly; in fact, it can move arbitrarily far away from the initial representative.
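The modulo 5 analogy can be made concrete with a few lines of Python. This is only an illustration of "many representatives, one class"; the function name `representatives` is made up for the example.

```python
def representatives(residue, modulus, low, high):
    """All integers in [low, high] that belong to the class of `residue`."""
    return [n for n in range(low, high + 1) if n % modulus == residue % modulus]


# Every element of this list represents the same class as 1 modulo 5.
reps = representatives(1, 5, -4, 16)
print(reps)  # [-4, 1, 6, 11, 16]
```

Nothing in the arithmetic singles out one of these integers as the "right" one. The same is true of cycle representatives unless an extra criterion is imposed.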

The domain-specific/scientific aspect of the problem you're solving might lead you to natural choices for which representatives to choose (e.g., minimal length, minimal energy, or minimal cost).

@scikit-tda scikit-tda locked and limited conversation to collaborators Jul 6, 2024
@catanzaromj catanzaromj converted this issue into discussion #168 Jul 6, 2024
