Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

get_element_instances and get_centroids should not (?) return zero element #614

Open
giovp opened this issue Jul 4, 2024 · 4 comments
Open

Comments

@giovp
Copy link
Member

giovp commented Jul 4, 2024

if isinstance(element, DataArray):
# get unique labels value (including 0 if present)
instances = da.unique(element.data).compute()

I think that is counter-intuitive to return zero, since it's not an instance. I would maybe consider to add an optional argument return_zero: bool = False wdyt?

@giovp
Copy link
Member Author

giovp commented Jul 4, 2024

I think we also don't allow to have "0" as a value in the index of shapes, but I am not entirely sure about that @LucaMarconato ? ~~ If not, I think we should probably also enforce that in order to avoid misunderstanding between instances retrieved from labels or shapes, wdyt? ~~

EDIT: we don't enforce it, I wonder, should we?

@giovp
Copy link
Member Author

giovp commented Jul 4, 2024

same here I think, what is the "centroid" value of an empty space in a label?

u = da.unique(portion, return_counts=True)

@giovp giovp changed the title get_element_instancesshould not (?) return zero get_element_instances and get_centroids should not (?) return zero element Jul 4, 2024
@LucaMarconato
Copy link
Member

LucaMarconato commented Jul 5, 2024

I agree, we should probably not return it for Labels, and for shapes it could indeed be confusing to have it. The problem is that for shapes there are some readers where we set instance_key to np.arange(len(shapes)) (as opposed to have the string values that were default). So, if we set an error for shapes we would break the compatibility with (at least) Visium.

I suggest the following:

  • add an optional argument to get_centroids() include_background: bool = False to ignore the background for labels. This will close the current issue.
  • for the moment, or for good, keep allowing 0 for shapes.

In this way, the inconvenient case that we leave uncovered is when a user takes some shapes data that has instances named 0 (such as Visium), and rasterizes the shapes into labels. In this way in fact the shapes 0 will turn into the background.

@LucaMarconato
Copy link
Member

I suggest to open a separate issue to discuss a strategy to deal with 0 in shapes and points (we can add an intro to this coment and turn it into an issue):

  • remove all the np.arange(len(...)) from spatialdata-io and spatialdata-sandbox and make the indices start from 1 instead.

This keeps the shapes<>tables consistent but create one problem: if the user parsed some data with an earlier version of spatialdata-io and had hardcoded some indices, now the indices are shifted. Not sure what's the best way to deal with this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants