Replace yolo model #9

Open
TongyuanLiu opened this issue Mar 8, 2024 · 10 comments

Comments

@TongyuanLiu

Hi, I'm thoroughly impressed with the work you've accomplished. It's been incredibly helpful for my school's design project.

We want to use HoloLens to recognize a basketball, so we decided to swap in a different YOLO model. Did you convert a .pt file to .onnx yourself, or did you download the .onnx file from another GitHub repository? When we tried to replace the YOLO model, we got an error that seems to be caused by the output size not matching. Your YOLO model (256x320) outputs float32[1,5040,85], but ours outputs float32[Concatoutput_dim_0,7].

We took the following steps to convert .pt to .onnx.
We started at https://github.com/PINTO0309/PINTO_model_zoo/tree/main/307_YOLOv7
From its demo folder, we found YOLOv7 with ONNX in Python: https://github.com/ibaiGorordo/ONNX-YOLOv7-Object-Detection
From there we found the original YOLOv7 model: https://github.com/WongKinYiu/yolov7
In that repository we found the "Pytorch to ONNX with NMS (and inference)" Colab notebook: https://colab.research.google.com/github/WongKinYiu/yolov7/blob/main/tools/YOLOv7onnx.ipynb
We used this notebook to convert the yolov7 .pt file to .onnx with input size 256x320, but the output is different from yours, as mentioned above.
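
The two output shapes in this comment can be related with a little arithmetic. The sketch below assumes the usual YOLO detection strides of 8, 16 and 32, three anchors per grid cell for a raw YOLOv7 export, and one candidate per cell for the anchor-free YOLOv8 head; `candidate_count` is a hypothetical helper, not something from the YoloHolo code.

```python
def candidate_count(height, width, strides=(8, 16, 32), anchors_per_cell=1):
    """Number of candidate boxes a YOLO head emits for a given input size."""
    cells = sum((height // s) * (width // s) for s in strides)
    return cells * anchors_per_cell


# Raw YOLOv7 export (assumed: 3 anchors per grid cell).
# Each row then holds 85 values: 4 box + 1 objectness + 80 COCO classes.
v7_rows = candidate_count(256, 320, anchors_per_cell=3)

# YOLOv8 (assumed: anchor-free, one candidate per grid cell).
v8_columns = candidate_count(256, 320)

print(v7_rows, v8_columns)
```

For a 256x320 input this gives 5040 and 1680, matching the first shape quoted above and the 1680 that appears later in the thread. The [N, 7] shape, on the other hand, is consistent with an export that has NMS baked in: one row per final detection (the linked notebook appears to unpack each row as batch_id, x0, y0, x1, y1, cls_id, score) rather than one row per raw candidate.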

Thanks in advance.

@LocalJoost
Owner

Hi, I am glad my little sample worked for you. I did not make the model myself; I downloaded it. If you want to use a different model, I would suggest reading https://localjoost.github.io/HoloLens-AI-training-a-YoloV8-model-locally-on-custom-pictures-to-recognize-objects-in-3D-space/ and using this branch: https://github.com/LocalJoost/YoloHolo/tree/airplanedetection

@TongyuanLiu
Author

Thank you. That's very helpful.

@Sheltim233

Sheltim233 commented Mar 29, 2024

I'm also involved in this project. Following your guidance, we've successfully trained and integrated our model into HoloLens 2. However, we now aim to recognize two objects simultaneously, whereas the HoloLens 2 currently supports recognition of only one object at a time. Could you advise on any necessary modifications to the Unity C# files to facilitate this? Your suggestions would be invaluable, and I sincerely appreciate your time.

We changed V8AirplaneTranslator, replacing detectableObjects with our two new labels, but it only ever detects one of the objects and never the other.

@LocalJoost
Owner

V8AirplaneTranslator is only a very simple class that translates the detected object's class id into a text, as you probably have noticed. I am not quite sure what your problem is. Do you want it to be able to recognize different objects, like I showed in https://localjoost.github.io/HoloLens-AI-using-Yolo-ONNX-models-to-localize-objects-in-3D-space/, or do you want it to recognize different objects at the same time? I think the code should support that.

@Sheltim233

Thank you for your response. Our objective is to simultaneously detect two distinct objects. Despite modifying the detectableObjects, we're encountering a problem where only one type of object is recognized. Specifically, we can detect multiple instances of object A, but object B remains undetected on the HoloLens 2. Interestingly, when we evaluate the .pt file independently, both objects A and B are correctly identified. We're keen on understanding your insights or recommendations on addressing this challenge.

@LocalJoost
Owner

@Sheltim233 Sorry for the slow response. I have spent quite some time debugging this and I can confirm that my code does indeed work and recognizes multiple objects in one go. But apparently, that only works when you set the probability filter lower than my default setting. With my default of 0.65, I almost always get only one object:
image
However, if I lower the Minimum Probability to 0.3, I get multiple objects per picture. The downside is a much higher false positive rate. So that's the setting you need to play with, I guess.
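
The effect of that setting can be illustrated with a minimal sketch. The detections and their probabilities below are made up for illustration, and `filter_by_probability` is a hypothetical stand-in for whatever filter the app applies; only the two thresholds, 0.65 and 0.3, come from the comment above.

```python
def filter_by_probability(detections, minimum_probability):
    """Keep only detections at or above the minimum probability."""
    return [d for d in detections if d["probability"] >= minimum_probability]


# Made-up candidates for one camera frame.
detections = [
    {"label": "object A", "probability": 0.81},
    {"label": "object B", "probability": 0.42},
    {"label": "object A", "probability": 0.33},
    {"label": "object B", "probability": 0.12},
]

strict = filter_by_probability(detections, 0.65)   # default setting
relaxed = filter_by_probability(detections, 0.3)   # lowered setting

print([d["label"] for d in strict])
print([d["label"] for d in relaxed])
```

With these numbers the strict filter keeps a single detection, while the relaxed one keeps three, including a weaker candidate that may or may not be a real object. That is the trade-off between missed objects and false positives.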

@TongyuanLiu
Author

TongyuanLiu commented Apr 19, 2024

Thanks for your reply. May I ask what's the difference between the confidence and classProbabilities?
Screenshot 2024-04-18 234915
I noticed that your airplane YOLO model has an output tensor of [1, 5, 1680]: 4 values for the bounding box and 1 for the confidence. The for loop won't be entered, since i starts from 5 (if the model can only detect one class). This is where I got confused, because it seems the YOLOv8 model uses the first four values for the bounding box and the rest of the values for the probabilities of each class. (ultralytics/ultralytics#751 and ultralytics/ultralytics#8421)
image

Our trained model can detect 2 classes (ball, hoop). The output tensor is [1, 6, 1680], where the 6 values are 4 for the bounding box and 2 for the class probabilities. However, after we changed V8AirplaneTranslator, it can still only detect the ball (I guess because it is index 0). I'm not sure if this is related to the different format of the YOLOv8 output tensor or to the overlap threshold.
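
To make the [1, 6, 1680] layout concrete, here is a minimal sketch of pulling one candidate out of such a tensor. It assumes the tensor arrives as a flat, row-major buffer, so that value r of candidate i sits at index r * 1680 + i; `candidate_values` and the fake buffer are hypothetical and only illustrate the indexing.

```python
NUM_CANDIDATES = 1680
NUM_CLASSES = 2
ROWS = 4 + NUM_CLASSES  # 4 box values followed by one probability per class


def candidate_values(flat_output, index, rows=ROWS, candidates=NUM_CANDIDATES):
    """Gather the `rows` values of one candidate from a row-major [1, rows, candidates] buffer."""
    return [flat_output[r * candidates + index] for r in range(rows)]


# Fake buffer in which every value encodes its own (row, candidate) position.
flat = [r * 10000 + c for r in range(ROWS) for c in range(NUM_CANDIDATES)]

values = candidate_values(flat, 7)
box, class_probabilities = values[:4], values[4:]
print(box, class_probabilities)
```

Under this reading every candidate has exactly as many class probabilities as the model has classes, and there is no separate confidence value between the box and the class scores.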

@LocalJoost
Owner

LocalJoost commented Apr 19, 2024

@TongyuanLiu:

May I ask what's the difference between the confidence and classProbabilities?

There is none. At least not in my code. It's a different word for the same thing.

As to your remark about V8: I am not quite sure what you mean. I described the difference I found between YoloV7 models and the YoloV8 models created by the Ultralytics tools here: https://localjoost.github.io/HoloLens-AI-training-a-YoloV8-model-locally-on-custom-pictures-to-recognize-objects-in-3D-space/#v7-versus-v8-. I'd suggest you lower the minimum probability (which equals minimal confidence) and see what happens.

@TongyuanLiu
Author

TongyuanLiu commented Apr 19, 2024

Thank you for the quick reply. I edited my previous comment. What confuses me is this: if the YOLO model can only detect one class, the code assigns that probability to the confidence variable and the for loop is never entered. If the YOLO model can detect two classes, the for loop is entered once, so maxIndex is still 0 because there is only 1 probability in the classProbabilities array. Suppose my classes are defined as 0: ball, 1: hoop, and I want to detect a hoop in an image. Won't classProbabilities.IndexOf(classProbabilities.Max()) return 0, making maxIndex 0, so that MostLiklyObject is ball?
image

I also ran into another problem: if I give it a picture that contains only a hoop, the program cannot label it (it can only detect balls).
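
The behaviour described here can be reproduced in a few lines. This is a hypothetical reconstruction based only on the description in this thread (confidence read from index 4, class scores read from index 5 onward), not a copy of YoloV8Item.cs, and the candidate values are made up.

```python
def parse_with_confidence_slot(values):
    """Hypothetical parse: 4 box values, 1 confidence, class scores from index 5 on."""
    confidence = values[4]
    class_probabilities = values[5:]
    if class_probabilities:
        max_index = class_probabilities.index(max(class_probabilities))
    else:
        max_index = 0  # single-class model: the loop is never entered
    return confidence, max_index


labels = ["ball", "hoop"]

# One candidate from a 2-class model: cx, cy, w, h, p(ball), p(hoop).
hoop_candidate = [160.0, 128.0, 40.0, 60.0, 0.02, 0.91]

confidence, max_index = parse_with_confidence_slot(hoop_candidate)
print(confidence, labels[max_index])
```

For a clear hoop the parse reports a confidence of 0.02 and the label "ball": the hoop probability ends up as the only entry in the class list, so the index of the maximum is always 0, and the confidence used for filtering is really the ball probability.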

Thank you so much for your help in advance.

@TongyuanLiu
Author

I think I might know how to fix this issue. The program cannot show class 1 because the confidence is always the probability of class 0. Therefore, when I want it to recognize class 1, the ProcessV8Item function never adds it to boxesMeetingConfidenceLevel. I changed YoloV8Item.cs accordingly, and that fixed the issue.
Screenshot 2024-04-19 012157
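
The actual change is only visible in the screenshot, so the following is a guess at its general shape rather than the real code: treat every value after the four box values as a class probability and use the largest one as the confidence. `parse_all_class_scores` and the sample candidates are hypothetical.

```python
def parse_all_class_scores(values):
    """Assumed fix: all values after the 4 box values are class probabilities."""
    class_probabilities = values[4:]
    confidence = max(class_probabilities)
    max_index = class_probabilities.index(confidence)
    return confidence, max_index


labels = ["ball", "hoop"]

# Made-up candidates: cx, cy, w, h, p(ball), p(hoop).
hoop_candidate = [160.0, 128.0, 40.0, 60.0, 0.02, 0.91]
ball_candidate = [80.0, 90.0, 30.0, 30.0, 0.88, 0.05]

for candidate in (hoop_candidate, ball_candidate):
    confidence, max_index = parse_all_class_scores(candidate)
    print(labels[max_index], confidence)
```

With this parse a single-class model still works, since its one probability becomes both the confidence and the only class score, and a hoop candidate can now pass the confidence filter on its own probability.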


3 participants