Skip to content

RSDK-10765 - Let mlvision resize to mlmodel's input size #5002

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 27, 2025

Conversation

kharijarrett
Copy link
Member

Hey, so this fix comes from a bug Nick ran into. Basically if someone uses a custom mlmodel module with a set input size different from the original image, we mess up the bounding box. Like, any absolute bbox coordinates that were read in were "normalized" using the original image and not the resized one. I think we've been so far getting lucky either with proportional bbox tensors or mlmodels that have [1 -1 -1 3] (variable) size input.

Here we just fix it by checking if we resized the image and using the appropriate image dims for normalization.

Tested via Python SDK with Nick's funky ncnn-ml module (0.1.1-rc1) and with our classic EffDet mlmodel setup. Nick's module used to not work now it does! And EffDet didn't break!

@kharijarrett kharijarrett requested a review from bhaney May 21, 2025 18:35
@viambot viambot added the safe to test This pull request is marked safe to test from a trusted zone label May 21, 2025
@@ -110,7 +112,7 @@ func attemptToBuildDetector(mlm mlmodel.Service,
return nil, err
}

boundingBoxes, err := ml.FormatDetectionOutputs(outNameMap, outMap, origW, origH, boxOrder, labels)
boundingBoxes, err := ml.FormatDetectionOutputs(outNameMap, outMap, origW, origH, boxOrder, labels, wasResized, resizeW, resizeH)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since resizeW and resizeH could be set to the origW and origH, then just passing resizeW and resizeH in place of origW and origH in this method should fix the issue without any change to the method.

Suggested change
boundingBoxes, err := ml.FormatDetectionOutputs(outNameMap, outMap, origW, origH, boxOrder, labels, wasResized, resizeW, resizeH)
boundingBoxes, err := ml.FormatDetectionOutputs(outNameMap, outMap, resizeW, resizeH, boxOrder, labels)

Copy link
Member

@bhaney bhaney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Create a unit test with a fake ML Model service that does this exact thing:

  1. has a fixed input size which requires resize
  2. outputs absolute bounding box coordinates relative to the resized image
  3. Check that the bounding boxes from the vision service are what you expect

Copy link
Member

@bhaney bhaney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thing -- public functions in the ml directory in RDK are used by ML on cloud inference, Do Not change these public functions, as it will introduce a breaking change.

Instead, feed in the resized dimensions into FormatDetectionOutputs's origW and origH

ml/detections.go Outdated
@@ -21,7 +21,7 @@ const (

// FormatDetectionOutputs formats the output tensors from a model into detections.
func FormatDetectionOutputs(outNameMap *sync.Map, outMap Tensors, origW, origH int,
boxOrder []int, labels []string,
boxOrder []int, labels []string, wasResized bool, resizedW, resizedH int,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You cannot change the signature of this public function - I believe it used by ML @tahiyasalam

@kharijarrett
Copy link
Member Author

Sure. Made the fix without changing the method. Still working on writing the test.

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels May 21, 2025
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels May 27, 2025
@kharijarrett kharijarrett requested a review from bhaney May 27, 2025 19:30
Copy link
Member

@bhaney bhaney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot! LGTM!

@kharijarrett kharijarrett merged commit fea8c59 into viamrobotics:main May 27, 2025
18 checks passed
@kharijarrett kharijarrett deleted the RSDK-10765 branch May 27, 2025 21:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
safe to test This pull request is marked safe to test from a trusted zone
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants