-
Notifications
You must be signed in to change notification settings - Fork 123
RSDK-10765 - Let mlvision resize to mlmodel's input size #5002
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
services/vision/mlvision/detector.go
Outdated
@@ -110,7 +112,7 @@ func attemptToBuildDetector(mlm mlmodel.Service, | |||
return nil, err | |||
} | |||
|
|||
boundingBoxes, err := ml.FormatDetectionOutputs(outNameMap, outMap, origW, origH, boxOrder, labels) | |||
boundingBoxes, err := ml.FormatDetectionOutputs(outNameMap, outMap, origW, origH, boxOrder, labels, wasResized, resizeW, resizeH) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since resizeW
and resizeH
could be set to the origW
and origH
, then just passing resizeW
and resizeH
in place of origW
and origH
in this method should fix the issue without any change to the method.
boundingBoxes, err := ml.FormatDetectionOutputs(outNameMap, outMap, origW, origH, boxOrder, labels, wasResized, resizeW, resizeH) | |
boundingBoxes, err := ml.FormatDetectionOutputs(outNameMap, outMap, resizeW, resizeH, boxOrder, labels) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Create a unit test with a fake ML Model service that does this exact thing:
- has a fixed input size which requires resize
- outputs absolute bounding box coordinates relative to the resized image
- Check that the bounding boxes from the vision service are what you expect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another thing -- public functions in the ml
directory in RDK are used by ML on cloud inference, Do Not change these public functions, as it will introduce a breaking change.
Instead, feed in the resized dimensions into FormatDetectionOutputs
's origW and origH
ml/detections.go
Outdated
@@ -21,7 +21,7 @@ const ( | |||
|
|||
// FormatDetectionOutputs formats the output tensors from a model into detections. | |||
func FormatDetectionOutputs(outNameMap *sync.Map, outMap Tensors, origW, origH int, | |||
boxOrder []int, labels []string, | |||
boxOrder []int, labels []string, wasResized bool, resizedW, resizedH int, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You cannot change the signature of this public function - I believe it used by ML @tahiyasalam
Sure. Made the fix without changing the method. Still working on writing the test. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot! LGTM!
Hey, so this fix comes from a bug Nick ran into. Basically if someone uses a custom mlmodel module with a set input size different from the original image, we mess up the bounding box. Like, any absolute bbox coordinates that were read in were "normalized" using the original image and not the resized one. I think we've been so far getting lucky either with proportional bbox tensors or mlmodels that have [1 -1 -1 3] (variable) size input.
Here we just fix it by checking if we resized the image and using the appropriate image dims for normalization.
Tested via Python SDK with Nick's funky ncnn-ml module (0.1.1-rc1) and with our classic EffDet mlmodel setup. Nick's module used to not work now it does! And EffDet didn't break!