Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: post-processing transform doesn't work with boundary labels #767

Open
NickleDave opened this issue Jul 19, 2024 · 0 comments
Open

BUG: post-processing transform doesn't work with boundary labels #767

NickleDave opened this issue Jul 19, 2024 · 0 comments

Comments

@NickleDave
Copy link
Collaborator

Currently the output of transforms.frame_labels.functional.postprocess is the same as the input when we pass in boundary_labels to use to segment.

The root cause is that transforms.frame_labels.boundary_inds_from_boundary_labels does not check whether the input boundary_labels is a 1-d vector. And then it return nonzero(boundary_labels)[0]. Right now we pass in boundary_labels as a 2-d vector with the first dimension/axis of size 1, so this gives us back a vector of zeros, because zero is the correct index along the first axis for every occurrence of a 1--the 2nd axis is the vector of 1s and 0s.

The short term fix will be to call row_or_1d with the boundary_labels argument to convert it to a vector (or raise an error if it's some other unexpected shape). We already do this with the frame_labels argument at the top of postprocess and we should do the same with boundary_labels. We should also do this inside the boundary_inds_from_boundary_labels function itself. And we should add tests that the functions are now working as expected--something I expect that I failed to do when I first added this functionality.

Long term, I kind of don't like this "validate and convert if necessary" approach. I copied it from scikit-learn. It just feels weird to have the side effect of converting to a 1-d vector. It also feels weird that we validate/convert in these two functions but not in the function "between" them, segment_inds_list_from_boundary_labels, that actually produces the list of indexing vectors used by the postprocess function. I guess my logic is that we want to do the validation inside the top-level function, and we also need to make sure that the lowest level helper function works on its own, so we double-extra-validate there. If we do it inside the function between them, now we are really being redundant. I don't have any better ideas for how to do this right now, just making note.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant