New PR for native pose detection on iOS using ONNX YOLOv8 pose model #8907

Open. Wants to merge 3 commits into main.

Conversation

@pbanavara pbanavara commented Mar 13, 2024

This is a native approach for running YOLO pose detection models on iOS. The other option is to convert the model to CoreML. This is based on the ONNX models provided by Microsoft. Please refer to the README for details.

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

WARNING ⚠️ this PR is very large, summary may not cover all changes.

This large code block defines the OrtApi structure, which is a comprehensive collection of function pointers that make up the ONNX Runtime (ORT) C API. This API provides mechanisms to create and manage resources such as tensors, models, sessions, and execution environments. It supports a wide range of functionalities including loading models, running inference sessions, managing custom operators, allocating memory, and much more.

🌟 Summary

The OrtApi structure serves as the primary interface for interacting with ONNX Runtime via C, offering a rich set of functionalities for model execution and management.

📊 Key Changes

  • Provides functions for session management, including creating and running sessions.
  • Offers tensor manipulation capabilities, such as creation and data getters and setters.
  • Supports memory management, including custom allocators.
  • Allows customization and extension via custom operators.
  • Enables configuration of session options and execution providers (such as CUDA, TensorRT).
  • Contains utilities for asynchronous execution and shape inference.

🎯 Purpose & Impact

  • Ease of Use: Simplifies interaction with ONNX Runtime by providing C API access, making it accessible for applications written in C or other languages capable of interfacing with C.
  • Extensibility: Facilitates the integration of custom operations and execution providers, enhancing flexibility and supporting specialized use cases.
  • Performance Optimization: Offers detailed configuration options for sessions and execution providers, allowing users to finely tune performance to their specific needs.
  • Cross-Platform and Cross-Device: Supports a variety of execution providers, enabling efficient model execution across different hardware platforms.
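
Whatever language binds to this API, a fixed-shape model input means each camera frame must be letterboxed (scaled with aspect ratio preserved, then padded) before it is handed to a session. A minimal sketch of that arithmetic, in pure Python for illustration; the 640×640 input size and function name are assumptions based on typical YOLOv8 exports, not part of this PR:

```python
def letterbox_params(src_w, src_h, dst=640):
    """Compute the scale and symmetric padding needed to fit a frame
    into a square dst x dst model input while preserving aspect ratio."""
    scale = min(dst / src_w, dst / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x = (dst - new_w) / 2  # left/right padding in pixels
    pad_y = (dst - new_h) / 2  # top/bottom padding in pixels
    return scale, pad_x, pad_y

# Example: a 1920x1080 camera frame scaled into a 640x640 tensor.
scale, pad_x, pad_y = letterbox_params(1920, 1080)
```

The same numbers are reused after inference: a detection at model coordinate x maps back to the frame as (x - pad_x) / scale.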

github-actions bot commented Mar 13, 2024

CLA Assistant Lite bot: All Contributors have signed the CLA. ✅

github-actions bot left a comment

👋 Hello @pbanavara, thank you for submitting an Ultralytics YOLOv8 🚀 PR! To allow your work to be integrated as seamlessly as possible, we advise you to:

  • ✅ Verify your PR is up-to-date with the ultralytics/ultralytics main branch. If your PR is behind, you can update your code by clicking the 'Update branch' button or by running git pull and git merge main locally.
  • ✅ Verify all YOLOv8 Continuous Integration (CI) checks are passing.
  • ✅ Update YOLOv8 Docs for any new or updated features.
  • ✅ Reduce changes to the absolute minimum required for your bug fix or feature addition. "It is not daily increase but daily decrease, hack away the unessential. The closer to the source, the less wastage there is." — Bruce Lee

See our Contributing Guide for details and let us know if you have any questions!

Copy link

codecov bot commented Mar 13, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 76.00%. Comparing base (d608565) to head (05a9187).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #8907   +/-   ##
=======================================
  Coverage   75.99%   76.00%           
=======================================
  Files         121      121           
  Lines       15332    15332           
=======================================
+ Hits        11652    11653    +1     
+ Misses       3680     3679    -1     
Flag         Coverage Δ
Benchmarks   36.02% <ø> (ø)
GPU          37.92% <ø> (-0.02%) ⬇️
Tests        71.30% <ø> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.

@pbanavara (Author)

I have read the CLA Document and I sign the CLA

@glenn-jocher (Member)

@pbanavara thanks for the PR! I think this might be a bit much to incorporate though.

Do you have any ONNX benchmarks on iOS? We have an open-source iOS framework for deploying YOLOv8 in https://github.com/ultralytics/yolo-ios-app that I think is much, much simpler than this PR. Perhaps you could take a look and update that repo with additional functionality for pose?
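
To collect the on-device benchmark numbers being asked about, one option is to timestamp each processed frame and report a rolling average. A minimal sketch, in pure Python for illustration (on iOS the same arithmetic would live in Swift); the class name and 30-frame window are illustrative choices, not anything from this PR:

```python
import time


class FPSMeter:
    """Reports average frames-per-second over the most recent frames."""

    def __init__(self, window=30):
        self.window = window
        self.stamps = []

    def tick(self, now=None):
        """Record one processed frame; return the current average FPS."""
        self.stamps.append(time.monotonic() if now is None else now)
        self.stamps = self.stamps[-self.window:]  # keep a rolling window
        if len(self.stamps) < 2:
            return 0.0
        elapsed = self.stamps[-1] - self.stamps[0]
        return (len(self.stamps) - 1) / elapsed


meter = FPSMeter()
```

Calling meter.tick() once per inference and drawing the result on an overlay gives a live FPS readout without touching the capture pipeline.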

[screenshot attached]

@Burhan-Q Burhan-Q added the pose Related to pose Task label Mar 15, 2024
@pbanavara (Author)

Hi @glenn-jocher sorry I was out for a few days. Was in surgery. Will update this PR shortly by incorporating this with the YOLO iOS app.

@glenn-jocher (Member)

@pbanavara hi there! No worries at all, I hope everything went well with your surgery and you're on the path to a quick recovery 🙏.

Looking forward to your updates on integrating the PR with the YOLO iOS app. If you have any questions or need any assistance along the way, feel free to reach out. Take care!

@pbanavara (Author)

@glenn-jocher quick update

I am able to get the model working in the yolo-ios-app; I am just figuring out how to draw the rect and the keypoints. I am reusing your bounding boxes, but the rect is not painting. The processing is also very slow, around 3.5 to 4 fps. I don't think this ONNX model is practical at such speeds.

The object detection CoreML model, for comparison, runs at almost 30 fps, so the only scalable option is to convert the pose model into CoreML. I will start working on that and see if I can complete the body pose painting in the previewLayer.
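
For reference when wiring up the rect and keypoint drawing: a YOLOv8-pose head typically emits, per candidate, 4 box values (cx, cy, w, h), 1 object confidence, and 17 COCO keypoints as (x, y, visibility) triples, 56 values in total. That layout is an assumption about the exported model, not something stated in this PR; a pure-Python decoding sketch under that assumption:

```python
def decode_pose_candidate(row, conf_thresh=0.25):
    """Decode one 56-value candidate from a YOLOv8-pose output row.

    row: [cx, cy, w, h, conf, kpt0_x, kpt0_y, kpt0_v, ...] (56 floats).
    Returns (box, conf, keypoints) or None if below the threshold.
    """
    assert len(row) == 56, "expected 4 box + 1 conf + 17*3 keypoint values"
    cx, cy, w, h, conf = row[:5]
    if conf < conf_thresh:
        return None
    # Convert the center-size box to corner coordinates for drawing.
    box = (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
    kpts = [tuple(row[5 + 3 * i: 8 + 3 * i]) for i in range(17)]
    return box, conf, kpts
```

Keypoints with a low visibility value are usually skipped when painting, and all coordinates still need the letterbox scale/padding undone before they land on the preview layer.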

@glenn-jocher (Member)

@pbanavara hi there! 🌟 Great to hear you've made progress with integrating the model into the yolo-ios-app. Drawing keypoints and bounding boxes certainly requires some tweaks, especially in rendering them efficiently.

The discrepancy in FPS between the ONNX model and the CoreML model is indeed significant. CoreML is highly optimized for iOS devices, providing a more seamless integration and faster processing times, so your plan to convert the pose model into CoreML sounds like the right approach to enhance performance.

For the drawing issue, ensure your drawing code executes on the main thread, something like this for keypoints:

DispatchQueue.main.async {
    // UIKit and CALayer updates must run on the main thread;
    // draw the keypoints and bounding boxes here once inference completes.
}

This might also solve the slow rendering issues if they're not related to model inference speed.

Keep us updated on your progress with the CoreML conversion, and feel free to reach out if you hit any snags! Looking forward to seeing the body pose detection running smoothly on the iOS app.

@pbanavara (Author)

@glenn-jocher I got the pose detection model to work.

The key part of the code is here:

https://github.com/pbanavara/yolo-ios-app/blob/main/YOLO/ViewController.swift

lines 223 to 249

and

https://github.com/pbanavara/yolo-ios-app/blob/main/YOLO/Utils.swift

The FPS is low but definitely usable. Attaching a video output to show the slow FPS (unfortunately the FPS label isn't showing, as the preview layer is removed to save memory).

This includes a Podfile and an xcworkspace. I'm not sure how to integrate this into the main app; if you have some ideas let me know. The code is mostly all yours except for the pose model changes. I have removed the CoreML-related code as much as I could.

https://drive.google.com/file/d/13R9U0VXTIIgkwFy8ujqULbPZ14AnIJhe/view?usp=sharing

@glenn-jocher (Member)

@pbanavara Hey, fantastic work on getting the pose detection model up and running! 🎉 It's really exciting to see such progress. I took a look at your code snippets; you've done a great job integrating the pose model. The video demo also helps to visualize the performance; even if the FPS is on the lower side, it's a great starting point.

Regarding the fps label issue, it might be worth looking into overlay techniques that don't interfere with memory optimization strategies you've implemented. Sometimes, drawing directly on a separate overlay view can help.

For integrating this into the main yolo-ios-app, we could consider modularizing the pose detection feature as an optional model that can be toggled depending on the use case. This way, we can maintain the core functionality while also offering extended capabilities like pose detection to users who need it.

As for removing CoreML related code, that makes perfect sense given the shift towards ONNX for pose detection. However, as we discussed, exploring CoreML conversion for enhanced performance could be a valuable next step.

If you're open to it, perhaps you could create a pull request with your changes? That way, we can review the integration process together and find the best approach to merge it seamlessly. Also, don't hesitate to share any further thoughts or needs for assistance. Your contribution is greatly appreciated!

@pbanavara (Author)

Thank you @glenn-jocher I have submitted a PR. Fixed the FPS label issue as well. It was a dumb mistake on my part.

ultralytics/yolo-ios-app#13

@glenn-jocher (Member)

@pbanavara Awesome, thanks for submitting the PR and for fixing the FPS label issue! 🎉 I'll take a look at it shortly. Great work!

Labels: pose (Related to pose Task)
4 participants