Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PY-611] Silently ignore "empty" masks during import #741

Merged
merged 2 commits into from Dec 15, 2023
Merged

Conversation

saurbhc
Copy link
Member

@saurbhc saurbhc commented Dec 8, 2023

Problem

Silently ignore "empty" masks (i.e. masks that do not have a corresponding raster layer) during import

Solution

During import, checking if the mask annotation is present in raster_layer - then further checking the same in the whole dense_rle flat array, if not present, don't import the mask (i.e., remove the mask annotation from import request data and also remove it from raster_layer mask-mapping)

Changelog

Silently ignore "empty" masks (i.e. masks that do not have a corresponding raster layer) during import

@saurbhc saurbhc added the enhancement New feature or request label Dec 8, 2023
@saurbhc saurbhc self-assigned this Dec 8, 2023
_annotation_mask_reference_i = mask_annotation_ids_mapping[_annotation_id]

if dense_rle is not None:
for i in range(0, len(dense_rle), 2):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this loop may take few seconds, in the end this in extreme case is as long as many pixels we have on the image. Meanwhile, for each mask we're doing this search, so quite high time complexity. Maybe I'm a bit too paranoid, but wdyt about building first some kind of hash lookup table for those IDs instead of looping though same dense_rle so many times?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Optimised the code in a new commit - introduced 2 variables (so that we don't have to iterate annotations every single time):

  1. rl: stores raster_layer annotation
  2. rl_dense_rle_ids: stores a simple hash-set of dense_rle - only pair[0] unique keys. Example: {0, 2}

Iteration on dense_rle is optimised as well, was iterating on every single element in the list before, now moved to a simple set conversion like rl_dense_rle_ids = set(rl.data["dense_rle"][::2]) - this will collect all unique pair[0] keys that we need.

Please have a look and let me know your thoughts?

Copy link
Contributor

@Nathanjp91 Nathanjp91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is something we should test. Not here, but e2e should include an import raster mask with/without empty layers. Can you add tickets to linear under Py Grooming and Backlog for this.

@saurbhc saurbhc merged commit a2b61f6 into master Dec 15, 2023
13 checks passed
@saurbhc saurbhc deleted the PY-611 branch December 15, 2023 05:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants