New PR for native pose detection on iOS using ONNX yolov8 pose model. #8907
base: main
Conversation
CLA Assistant Lite bot: All Contributors have signed the CLA. ✅
👋 Hello @pbanavara, thank you for submitting an Ultralytics YOLOv8 🚀 PR! To allow your work to be integrated as seamlessly as possible, we advise you to:
- ✅ Verify your PR is up-to-date with the ultralytics/ultralytics main branch. If your PR is behind, you can update your code by clicking the 'Update branch' button or by running git pull and git merge main locally.
- ✅ Verify all YOLOv8 Continuous Integration (CI) checks are passing.
- ✅ Update YOLOv8 Docs for any new or updated features.
- ✅ Reduce changes to the absolute minimum required for your bug fix or feature addition. "It is not daily increase but daily decrease, hack away the unessential. The closer to the source, the less wastage there is." — Bruce Lee

See our Contributing Guide for details and let us know if you have any questions!
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@           Coverage Diff           @@
##             main    #8907   +/-  ##
=======================================
  Coverage   75.99%   76.00%
=======================================
  Files         121      121
  Lines       15332    15332
=======================================
+ Hits        11652    11653       +1
+ Misses       3680     3679       -1

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
I have read the CLA Document and I sign the CLA
@pbanavara thanks for the PR! I think this might be a bit much to incorporate though. Do you have any ONNX benchmarks on iOS? We have an open-source iOS framework for deploying YOLOv8 at https://github.com/ultralytics/yolo-ios-app that I think is much, much simpler than this PR. Perhaps you could take a look and update that repo with additional functionality for pose?
Hi @glenn-jocher, sorry, I was out for a few days; I was in surgery. Will update this PR shortly by incorporating this with the YOLO iOS app.
@pbanavara hi there! No worries at all, I hope everything went well with your surgery and you're on the path to a quick recovery 🙏. Looking forward to your updates on integrating the PR with the YOLO iOS app. If you have any questions or need any assistance along the way, feel free to reach out. Take care!
@glenn-jocher quick update: I am able to get the model working in the yolo-ios-app and am just figuring out how to draw the rect and the keypoints. I am reusing your boundingboxes, but the rect is not painting. However, the processing is very slow, around 3.5 to 4 fps; I don't think this ONNX model is practical at such speeds. The object detection CoreML model, for comparison, runs at almost 30 fps. So the only scalable option is to convert the pose model into CoreML. I will start working on that and see if I can complete the body pose painting in the previewLayer.
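For reference, per-frame throughput comparisons like the 3.5–4 fps vs ~30 fps numbers above can be reproduced with a small timing harness. This is a hedged Python sketch: `measure_fps` and `infer` are illustrative names, not code from this PR, and `infer` stands in for whatever per-frame inference call is being measured.

```python
import time

def measure_fps(infer, frames, warmup=3):
    """Estimate sustained frames-per-second for a per-frame inference callable.

    A few warmup calls are made first so lazy initialization and cache
    effects don't skew the measurement; the timed loop then runs every frame.
    """
    for frame in frames[:warmup]:
        infer(frame)  # warmup, not timed
    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed
```

Timing over many frames rather than a single call matters on mobile, where thermal throttling and Neural Engine scheduling can make the first few inferences unrepresentative.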
@pbanavara hi there! 🌟 Great to hear you've made progress with integrating the model into the yolo-ios-app. Drawing keypoints and bounding boxes certainly requires some tweaks, especially in rendering them efficiently. The discrepancy in FPS between the ONNX model and the CoreML model is indeed significant. CoreML is highly optimized for iOS devices, providing a more seamless integration and faster processing times, so your plan to convert the pose model into CoreML sounds like the right approach to enhance performance.

For the drawing issue, ensure your drawing code executes on the main thread, something like this for keypoints:

```swift
DispatchQueue.main.async {
    // Your code to draw keypoints here
}
```

This might also solve the slow rendering issues if they're not related to model inference speed. Keep us updated on your progress with the CoreML conversion, and feel free to reach out if you hit any snags! Looking forward to seeing the body pose detection running smoothly on the iOS app.
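Another common reason a rect fails to paint is that box coordinates are still in the model's letterboxed input space rather than the preview layer's coordinate space, so the rect lands off-screen. A minimal sketch of that mapping, written in NumPy for clarity rather than Swift (the function name, the 640x640 letterbox assumption, and the argument layout are illustrative, not from this PR or the yolo-ios-app):

```python
import numpy as np

def scale_coords(boxes, model_hw, view_hw):
    """Map xyxy boxes from model input space (letterboxed, e.g. 640x640)
    back to the original view/frame space.

    Assumes the frame was resized by a uniform gain and centered with
    symmetric padding (the usual YOLO letterbox preprocessing).
    """
    mh, mw = model_hw
    vh, vw = view_hw
    gain = min(mh / vh, mw / vw)        # uniform resize factor applied to the frame
    pad_x = (mw - vw * gain) / 2        # horizontal letterbox padding
    pad_y = (mh - vh * gain) / 2        # vertical letterbox padding
    out = boxes.astype(float).copy()
    out[:, [0, 2]] = (out[:, [0, 2]] - pad_x) / gain
    out[:, [1, 3]] = (out[:, [1, 3]] - pad_y) / gain
    return out
```

Keypoint (x, y) pairs need the same un-padding and un-scaling before being drawn on the preview layer.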
@glenn-jocher I got the pose detection model to work. The key part of the code is here: https://github.com/pbanavara/yolo-ios-app/blob/main/YOLO/ViewController.swift lines 223 to 249 and https://github.com/pbanavara/yolo-ios-app/blob/main/YOLO/Utils.swift. The FPS is low but definitely usable. Attaching a video output to show the slow FPS (unfortunately the FPS label isn't showing, as the preview layer is removed to save memory). This includes a Podfile and an xcworkspace; I'm not sure how to integrate this in the main app. If you have some ideas, let me know. The code is mostly all yours except for the pose model changes. I have removed the CoreML-related code as much as I could. https://drive.google.com/file/d/13R9U0VXTIIgkwFy8ujqULbPZ14AnIJhe/view?usp=sharing
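For readers following along, the raw YOLOv8-pose head output that code like the linked `ViewController.swift` has to unpack is typically a (1, 56, N) tensor per image: 4 box values (cx, cy, w, h), 1 confidence, then 17 keypoints of (x, y, visibility) each. A hedged NumPy sketch of that decode, for illustration only (function name is invented, NMS is omitted, and this is not the PR's code):

```python
import numpy as np

def decode_pose_output(pred, conf_thres=0.25):
    """Decode a raw YOLOv8-pose output of shape (1, 56, N) into
    xyxy boxes, confidence scores, and per-person keypoints."""
    p = pred[0].T                        # (N, 56): one row per candidate
    scores = p[:, 4]                     # person confidence
    mask = scores > conf_thres
    p, scores = p[mask], scores[mask]
    cx, cy, w, h = p[:, 0], p[:, 1], p[:, 2], p[:, 3]
    boxes = np.stack([cx - w / 2, cy - h / 2,
                      cx + w / 2, cy + h / 2], axis=1)
    kpts = p[:, 5:].reshape(-1, 17, 3)   # (M, 17, [x, y, visibility])
    return boxes, scores, kpts
```

The same transpose-filter-reshape sequence carries over to Swift; the main pitfall is remembering that the tensor is channel-major, so each of the 8400-or-so candidates is a column, not a row.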
@pbanavara Hey, fantastic work on getting the pose detection model up and running! 🎉 It's really exciting to see such progress. I took a look at your code snippets; you've done a great job integrating the pose model. The video demo also helps to visualize the performance; even if the FPS is on the lower side, it's a great starting point.

Regarding the FPS label issue, it might be worth looking into overlay techniques that don't interfere with the memory optimization strategies you've implemented. Sometimes, drawing directly on a separate overlay view can help.

For integrating this into the main yolo-ios-app, we could consider modularizing the pose detection feature as an optional model that can be toggled depending on the use case. This way, we can maintain the core functionality while also offering extended capabilities like pose detection to users who need it.

As for removing CoreML-related code, that makes perfect sense given the shift towards ONNX for pose detection. However, as we discussed, exploring CoreML conversion for enhanced performance could be a valuable next step.

If you're open to it, perhaps you could create a pull request with your changes? That way, we can review the integration process together and find the best approach to merge it seamlessly. Also, don't hesitate to share any further thoughts or needs for assistance. Your contribution is greatly appreciated!
Thank you @glenn-jocher, I have submitted a PR and fixed the FPS label issue as well. It was a dumb mistake on my part.
@pbanavara Awesome, thanks for submitting the PR and for fixing the FPS label issue! 🎉 I'll take a look at it shortly. Great work!
This is a native approach for running YOLO pose detection models on iOS. The other option is to convert the model to CoreML. This is based on the ONNX models provided by Microsoft. Please refer to the README for details.
🛠️ PR Summary
Made with ❤️ by Ultralytics Actions
WARNING ⚠️ this PR is very large; the summary may not cover all changes.
This large code block defines the OrtApi structure, which is a comprehensive collection of function pointers that make up the ONNX Runtime (ORT) C API. This API provides mechanisms to create and manage resources such as tensors, models, sessions, and execution environments. It supports a wide range of functionalities including loading models, running inference sessions, managing custom operators, allocating memory, and much more.

🌟 Summary
The OrtApi structure serves as the primary interface for interacting with ONNX Runtime via C, offering a rich set of functionalities for model execution and management.

📊 Key Changes

🎯 Purpose & Impact