fix: SDPose - resize input always #13349
📝 Walkthrough
Three modules received refinements to pose estimation and detection inference. In the Gaussian blur operation of pose heatmap processing, the truncation parameter was explicitly set to 2.5. In RT-DETR detection inference, execution was restructured to process batches in 32-item sub-batches instead of processing the entire input at once. In SDPose keypoint extraction, device and dtype handling was added, two helper functions were introduced for image resizing and keypoint coordinate remapping, and VAE encoding calls were refactored to work with resized images across different processing modes.
Pre-merge checks: 2 passed, 1 failed (1 warning).
The model can't handle images that are too large, so the input should always be resized; currently this is only done when bboxes are used. Also fixed a small error in the heatmap postprocessing so that it matches the original cv2 method more closely, and added batching to RT-DETR, since it would OOM on a very large number of input frames.
Also added support for the intermediate dtype in the pose drawing node.
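For illustration, the sub-batching idea can be sketched roughly as below. This is a minimal pure-Python stand-in, not the code from this PR: `run_in_sub_batches` and `infer_fn` are hypothetical names, and with real tensors the per-chunk outputs would be concatenated rather than collected in a list. The chunk size of 32 is the value mentioned in the walkthrough.

```python
from typing import Callable, List, Sequence


def run_in_sub_batches(
    items: Sequence,
    infer_fn: Callable[[Sequence], List],
    batch_size: int = 32,
) -> List:
    """Run infer_fn over items in chunks of at most batch_size.

    Hypothetical helper: infer_fn stands in for a detector forward pass
    and must return one result per input item.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    outputs: List = []
    for start in range(0, len(items), batch_size):
        # Only this slice is "in flight" at once, which bounds peak memory.
        chunk = items[start:start + batch_size]
        outputs.extend(infer_fn(chunk))
    return outputs
```

Peak memory then scales with the chunk size rather than with the total number of frames, at the cost of a few more forward calls.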
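As a rough sketch of what always resizing the input implies for the output coordinates: keypoints predicted on the resized image have to be mapped back to the original resolution. The helpers below are hypothetical (`resize_scale` and `remap_keypoints` are not the names used in the PR), and they assume a plain stretch to a fixed target size with no padding or cropping; the target size itself is a caller-supplied assumption.

```python
from typing import List, Tuple


def resize_scale(
    orig_w: int, orig_h: int, target_w: int, target_h: int
) -> Tuple[float, float]:
    """Per-axis scale factors for stretching an image to the target size."""
    if min(orig_w, orig_h, target_w, target_h) <= 0:
        raise ValueError("all dimensions must be positive")
    return target_w / orig_w, target_h / orig_h


def remap_keypoints(
    keypoints: List[Tuple[float, float]], scale_x: float, scale_y: float
) -> List[Tuple[float, float]]:
    """Map (x, y) keypoints from resized-image space back to the original."""
    # Dividing by the scale undoes the resize applied before inference.
    return [(x / scale_x, y / scale_y) for x, y in keypoints]
```

If the real helpers preserve aspect ratio or pad the image, the remapping would also need to subtract the padding offset before rescaling.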
Issue brought up by a user here:
#12661 (comment)