This repository has been archived by the owner on Aug 28, 2024. It is now read-only.

Android app crashes after clicking the DETECT button #102

Open
navidnayyem opened this issue Mar 2, 2021 · 26 comments

Comments

@navidnayyem

navidnayyem commented Mar 2, 2021

We are working on a custom-dataset model that we trained in a Kaggle notebook. After training finished, the notebook produced the weight files best.pt and last.pt in the outputs. Now we are trying to integrate the best.pt file into the PyTorch Android demo app (GitHub link: https://github.com/pytorch/android-demo-app/tree/master/ObjectDetection). I converted best.pt to best.torchscript.pt by following ultralytics/yolov5#251, copied the exported file into the assets folder, and changed the class names in classes.txt. The problem is that the app runs fine at first, but when I click the DETECT button, after some time the app closes and shuts down. I then watched the logcat in Android Studio and saw the error messages attached below. We are looking for a solution to this error.
[screenshots: Android Studio logcat error messages]

@IvanKobzarev
Contributor

cc @jeffxtang

@jeffxtang
Contributor

Looks like your model's output is a tensor list instead of the tuple that the model bundled with the demo app outputs. Below is from IValue.java, which you can open from Android Studio by holding down the Cmd key (on Mac) and clicking the toTuple method:

  private static final int TYPE_CODE_TUPLE = 7;
  private static final int TYPE_CODE_BOOL_LIST = 8;
  private static final int TYPE_CODE_LONG_LIST = 9;
  private static final int TYPE_CODE_DOUBLE_LIST = 10;
  private static final int TYPE_CODE_TENSOR_LIST = 11;
  private static final int TYPE_CODE_LIST = 12;

You need to make the output of your model the same type as the output of the model used for the demo app.
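
For reference, a minimal defensive sketch (hypothetical debugging code, not part of the demo app) that inspects the returned IValue in MainActivity.java before converting it, using the type-check methods that IValue already provides:

    IValue out = mModule.forward(IValue.from(inputTensor));
    if (out.isTuple()) {
        // the model bundled with the demo app returns a tuple
        IValue[] outputTuple = out.toTuple();
    } else if (out.isTensorList()) {
        // a custom export may return a tensor list instead
        Tensor[] outputs = out.toTensorList();
    }

This only tells you what the exported model actually returns; the real fix, as noted above, is to export the model with the same output type the demo app expects.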

@navidnayyem
Author

First I downloaded the repository zip from https://github.com/jeffxtang/yolov5 and extracted it. Then I opened the Anaconda prompt and tried to convert best.pt to best.torchscript.pt, but before the torchscript.pt file is exported, some error messages appear in the Anaconda prompt. I saved my best.pt file in the folder and ran the following command:

python models/export.py --weights best.pt --img 512 --batch 16

After showing some error messages, it eventually prints 'TorchScript export success, saved as best.torchscript.pt'. I need a solution for this problem. @jeffxtang

[screenshot: Anaconda prompt messages during export]

@jeffxtang
Contributor

jeffxtang commented Mar 3, 2021

  1. Have you tried python models/export.py and used the generated yolov5s.torchscript.pt to see if it works with the app?

  2. Add the following two lines after y = model(img) # dry run, then run python models/export.py --weights best.pt --img 512 --batch 16. Do you see torch.Size([16, 25200, 85])?

    a,b = y
    print(a.shape)

We don't know how your custom model was trained, but most likely your custom model's inference result y is not a tuple but a list. You need to debug by comparing the model(img) results when running python models/export.py and python models/export.py --weights best.pt --img 512 --batch 16.
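
As a minimal sketch of that comparison (hypothetical debugging lines, assuming they are pasted into models/export.py right after y = model(img)):

    print(type(y))  # tuple for the stock yolov5s export; a list suggests a modified head
    if isinstance(y, (tuple, list)):
        for i, item in enumerate(y):
            if torch.is_tensor(item):
                print(i, item.shape)  # stock model at 640 with batch 16: torch.Size([16, 25200, 85]) first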

@navidnayyem
Author

navidnayyem commented Mar 3, 2021

  1. I have tried python models/export.py and used the generated yolov5s.torchscript.pt. It also shows the same messages, but in the end it exports yolov5s.torchscript.pt. When I add it to the assets folder, the app works and detects perfectly on the image provided in the assets folder.
    [screenshots: export output and successful detection]
  2. I added the following two lines after y = model(img):
    a,b = y
    print(a.shape)

    I used my custom-trained best.pt file. After running 'python models/export.py --weights best.pt --img 512 --batch 16' I don't see torch.Size([16, 25200, 85]); I see torch.Size([16, 16128, 26]).
    [screenshot: printed shape]

Actually, I trained my model in a Kaggle notebook; after training finishes, the best.pt file is generated in the output. If my custom model's inference result y is a list, is there any way to convert it to a tuple? If you want, I can share my custom-trained best.pt file with you so you can take a look and suggest how to integrate it into this app. @jeffxtang

@jeffxtang
Contributor

jeffxtang commented Mar 3, 2021

The reasons you see torch.Size([16, 16128, 26]) instead of torch.Size([16, 25200, 85]) are: 1. you set the input image size to 512 instead of the default 640, and 2. your custom model's output column count (number of classes + 5) is 26 instead of 85. You'll need to change the following values in PrePostProcessor.java:

    static int mInputWidth = 640;
    static int mInputHeight = 640;

    // model output is of size 25200*85
    private static int mOutputRow = 25200; // as decided by the YOLOv5 model for input image of size 640*640
    private static int mOutputColumn = 85; 

The error you see when running on Android (BTW, what you see when running export.py are not errors but warning messages) is because your model's forward method outputs a list of tensors instead of a tuple. You can either change your model code, or in Android MainActivity.java change IValue[] outputTuple = mModule.forward(IValue.from(inputTensor)).toTuple(); to IValue[] outputList = mModule.forward(IValue.from(inputTensor)).toList(); and then do some comparison debugging to see which element in the list is similar to the first element of the tuple returned when using the yolov5 model downloadable for the app that already works. The outputs floats array a few lines after the forward call will give you the info you need.
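
For reference, a quick sanity check of where those numbers come from (assuming the standard YOLOv5 head with 3 anchors per grid cell at strides 8, 16 and 32; the variable names below are just for illustration):

    rows_640 = 3 * ((640//8)**2 + (640//16)**2 + (640//32)**2)  # 3*(6400+1600+400) = 25200
    rows_512 = 3 * ((512//8)**2 + (512//16)**2 + (512//32)**2)  # 3*(4096+1024+256) = 16128
    cols_coco   = 80 + 5  # 80 classes + (x, y, w, h, objectness) = 85
    cols_custom = 21 + 5  # 21 classes + the same 5 box fields   = 26

So a 26-column output implies a 21-class custom model.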

@navidnayyem
Author

navidnayyem commented Mar 6, 2021

I trained my model with image size 640. After running 'python models/export.py --weights best.pt --img 640 --batch 16', I see torch.Size([16, 25200, 26]), and I changed the following values in PrePostProcessor.java to:
    static int mInputWidth = 640;
    static int mInputHeight = 640;

    // model output is of size 25200*26
    private static int mOutputRow = 25200; // as decided by the YOLOv5 model for input image of size 640*640
    private static int mOutputColumn = 26;

[screenshot: PrePostProcessor.java changes]

I also changed MainActivity.java from IValue[] outputTuple = mModule.forward(IValue.from(inputTensor)).toTuple(); to IValue[] outputList = mModule.forward(IValue.from(inputTensor)).toList(); but the same type of error is showing again.
[screenshot: error message]
As you said, my model's output is a tensor list instead of a tuple. Is there any way to convert it from a tensor list to a tuple in code, or any way to integrate it into this app by changing the detection code? @jeffxtang @IvanKobzarev

@jeffxtang
Contributor

If your model's output is a tensor list, then mModule.forward(IValue.from(inputTensor)).toList() or mModule.forward(IValue.from(inputTensor)).toTensorList() should work. But your latest Android error message says "actual type 7", meaning your model's output is TYPE_CODE_TUPLE now, while your first reported Android error message from 7 days ago said "actual type 11", meaning your model's output then was TYPE_CODE_TENSOR_LIST.

So looks like you have changed your model's output type - then you should change your Android code back to mModule.forward(IValue.from(inputTensor)).toTuple().

If you still have the problem, you can send a downloadable link to your model file and I'll give it a try.
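
If you would rather force a tuple on the export side than keep flipping the Android call, one option is a small wrapper traced in place of the bare model. A minimal sketch (the class name TupleOutputWrapper is made up for illustration, and it assumes tracing happens in models/export.py):

    import torch

    class TupleOutputWrapper(torch.nn.Module):
        # Hypothetical wrapper: if the underlying model returns a list,
        # re-emit it as a tuple so the traced module matches toTuple().
        def __init__(self, model):
            super().__init__()
            self.model = model

        def forward(self, x):
            y = self.model(x)
            return tuple(y) if isinstance(y, list) else y

    # in export.py: ts = torch.jit.trace(TupleOutputWrapper(model), img)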

@navidnayyem
Author

navidnayyem commented Mar 9, 2021

It works. Thank you. Now I am going to check the camera view portion. @jeffxtang

@jeffxtang
Contributor

Glad it works for you! What was the cause and fix for the problem?

@navidnayyem
Author

navidnayyem commented Mar 10, 2021

  1. As you suggested, I changed mModule.forward(IValue.from(inputTensor)).toList() back to mModule.forward(IValue.from(inputTensor)).toTuple(). Then,
  2. In PrePostProcessor.java, on line 143, I changed Result result = new Result(cls, outputs[i*85+4], rect); to Result result = new Result(cls, outputs[i*26+4], rect);
    With that, it works for the model trained at 640x640 image size. Now I am trying to integrate the model trained at 512x512 image size into the Android app. I hope it will also work; if any problem occurs, I will let you know. Thank you. @jeffxtang

@navidnayyem
Author

I exported my best.pt model to best.torchscript.pt with image size 512 (torch.Size([16, 16128, 26])). Then I integrated it into the app and changed the following values in PrePostProcessor.java:
    static int mInputWidth = 512;
    static int mInputHeight = 512;

    // model output is of size 16128*26
    private static int mOutputRow = 16128; // as decided by the YOLOv5 model for input image of size 512*512
    private static int mOutputColumn = 26;

I also changed MainActivity.java from IValue[] outputTuple = mModule.forward(IValue.from(inputTensor)).toTuple(); to IValue[] outputList = mModule.forward(IValue.from(inputTensor)).toList(); but the error I uploaded before is showing again.
[screenshot: error message]
Here is a downloadable link to my model file, including classes.txt and the test images (assets folder), so you can give it a try:

https://drive.google.com/file/d/1yh1uycn-40TXKFd8z9cOIInJRzRFYt2n/view?usp=sharing
@jeffxtang

@jeffxtang
Contributor

This is a similar but different error message. You used to have "expected IValue type 7, actual type 11" and "expected IValue type 12, actual type 7", and now you're getting "expected IValue type 12, actual type 11", meaning your model's output is now TYPE_CODE_TENSOR_LIST (in Android Studio, simply click into toList and you'll enter IValue.java, where the following constants are defined):

  private static final int TYPE_CODE_TUPLE = 7;
  private static final int TYPE_CODE_BOOL_LIST = 8;
  private static final int TYPE_CODE_LONG_LIST = 9;
  private static final int TYPE_CODE_DOUBLE_LIST = 10;
  private static final int TYPE_CODE_TENSOR_LIST = 11;
  private static final int TYPE_CODE_LIST = 12;

So you need to change toList() to toTensorList().

@navidnayyem
Author

navidnayyem commented Mar 12, 2021

When I change toList() to toTensorList(), a red line appears under that line. @jeffxtang
[screenshot: IDE error]

Then I changed it to Tensor[] outputList = mModule.forward(IValue.from(inputTensor)).toTensorList(); and toTensor() on line 224 is now marked in red.
[screenshot: IDE error]
Then I removed toTensor() and ran it.
[screenshot]
A different error message appears: ArrayIndexOutOfBoundsException. My mOutputColumn value is 26.
[screenshot: crash log]

@jeffxtang
Contributor

You need to find out the shape of your model's outputList: how many tensors are in it? What's the size of outputTensor (outputList[0])? Is the first element the right output to pass to the method in PrePostProcessor?
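
A quick way to answer those questions is to log everything the model returns (a hypothetical sketch for MainActivity.java, assuming the toTensorList() change above):

    Tensor[] outputList = mModule.forward(IValue.from(inputTensor)).toTensorList();
    android.util.Log.d("ObjectDetection", "tensor count: " + outputList.length);
    for (Tensor t : outputList) {
        // shape() returns a long[]; look for the tensor whose last two
        // dimensions match mOutputRow and mOutputColumn
        android.util.Log.d("ObjectDetection", java.util.Arrays.toString(t.shape()));
    }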

@nlm-yuh5

nlm-yuh5 commented Apr 6, 2021

@navidnayyem @jeffxtang I encountered the same error "expected IValue type 7, actual type 11" as you did. For me, the problem was that I did not do the optimization suggested in the tutorial:

Then edit models/export.py to make two changes:

Change line 50 from model.model[-1].export = True to model.model[-1].export = False

Add the following two lines of model optimization code after line 57, between ts = torch.jit.trace(model, img) and ts.save(f):

    from torch.utils.mobile_optimizer import optimize_for_mobile
    ts = optimize_for_mobile(ts)

After adding the code above, this problem went away.
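
Put together, the relevant stretch of models/export.py then looks roughly like this (a sketch only; exact line numbers vary between yolov5 versions):

    from torch.utils.mobile_optimizer import optimize_for_mobile

    model.model[-1].export = False  # was True (around line 50)
    # ... dry run and tracing ...
    ts = torch.jit.trace(model, img)
    ts = optimize_for_mobile(ts)    # the added mobile-optimization pass
    ts.save(f)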

@rachtibat

I love you all! Thanks, I had the same issue

@animeshkalita82

animeshkalita82 commented Aug 5, 2021

When I change toList() to toTensorList(), a red line appears under that line. @jeffxtang
[screenshot: IDE error]

Then I changed it to Tensor[] outputList = mModule.forward(IValue.from(inputTensor)).toTensorList(); and toTensor() on line 224 is now marked in red.
[screenshot: IDE error]
Then I removed toTensor() and ran it.
[screenshot]
A different error message appears: ArrayIndexOutOfBoundsException. My mOutputColumn value is 26.
[screenshot: crash log]

I also faced the same ArrayIndexOutOfBoundsException even though I was using 3 classes and had mOutputColumn = 8. I did some analysis and found that the problem was that I was running the export script with --batch 1, so the shape of a was torch.Size([1, 25200, 8]). Since I had trained my model with --batch 16, I changed the batch size to --batch 16:
!python models/export.py --batch 16 --weights best.pt
@jeffxtang I do not know if this makes sense or not, but it resolved my issue and the model is now working fine on Android.

@myasser63

myasser63 commented Aug 31, 2021

Hello, I am using the same export.py method to convert my custom yolov5s model and loaded it into the PyTorch Android demo app (GitHub link: https://github.com/pytorch/android-demo-app/tree/master/ObjectDetection). The app builds with no errors; however, it launches and immediately crashes every time.

The error message:

E/AndroidRuntime: FATAL EXCEPTION: main
Process: org.pytorch.demo.objectdetection, PID: 9957
java.lang.RuntimeException: Unable to start activity ComponentInfo{org.pytorch.demo.objectdetection/org.pytorch.demo.objectdetection.MainActivity}: com.facebook.jni.CppException: PytorchStreamReader failed locating file bytecode.pkl: file not found ()
Exception raised from valid at ../caffe2/serialize/inline_container.cc:157 (most recent call first):
(no backtrace available)
at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3611)
at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3775)
at android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:85)
at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:135)
at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:95)
at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2246)
at android.os.Handler.dispatchMessage(Handler.java:106)
at android.os.Looper.loop(Looper.java:233)
at android.app.ActivityThread.main(ActivityThread.java:8010)
at java.lang.reflect.Method.invoke(Native Method)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:631)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:978)
Caused by: com.facebook.jni.CppException: PytorchStreamReader failed locating file bytecode.pkl: file not found ()
Exception raised from valid at ../caffe2/serialize/inline_container.cc:157 (most recent call first):
(no backtrace available)
at org.pytorch.LiteNativePeer.initHybrid(Native Method)
at org.pytorch.LiteNativePeer.<init>(LiteNativePeer.java:23)
at org.pytorch.LiteModuleLoader.load(LiteModuleLoader.java:29)
at org.pytorch.demo.objectdetection.MainActivity.onCreate(MainActivity.java:184)
at android.app.Activity.performCreate(Activity.java:8006)
at android.app.Activity.performCreate(Activity.java:7990)
at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1329)
at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3584)
... 11 more
W/System: A resource failed to call close.
I/Process: Sending signal. PID: 9957 SIG: 9

@Amy9876

Amy9876 commented Oct 27, 2021

The reasons you see torch.Size([16, 16128, 26]) instead of torch.Size([16, 25200, 85]) are: 1. you set the input image size to 512 instead of the default 640, and 2. your custom model's output column count (number of classes + 5) is 26 instead of 85. You'll need to change the following values in PrePostProcessor.java:

    static int mInputWidth = 640;
    static int mInputHeight = 640;

    // model output is of size 25200*85
    private static int mOutputRow = 25200; // as decided by the YOLOv5 model for input image of size 640*640
    private static int mOutputColumn = 85;

The error you see when running on Android (BTW, what you see when running export.py are not errors but warning messages) is because your model's forward method outputs a list of tensors instead of a tuple. You can either change your model code, or in Android MainActivity.java change IValue[] outputTuple = mModule.forward(IValue.from(inputTensor)).toTuple(); to IValue[] outputList = mModule.forward(IValue.from(inputTensor)).toList(); and then do some comparison debugging to see which element in the list is similar to the first element of the tuple returned when using the yolov5 model downloadable for the app that already works. The outputs floats array a few lines after the forward call will give you the info you need.

Hello, what if the output shape is torch.Size([1, 80, 80, 15]), i.e. 4 dimensions rather than 3? What should I do then? Thanks @jeffxtang

@blackCmd

It works for me.
Thank you!

@vyvy3n

vyvy3n commented Apr 18, 2022

@navidnayyem I guess you were able to use this object detection app with an img-size other than 640. Would you provide some suggestions on this issue 233? Thanks!

@atultiwari

  1. Have you tried python models/export.py and used the generated yolov5s.torchscript.pt to see if it works with the app?
  2. Add the following two lines after y = model(img) # dry run, then run python models/export.py --weights best.pt --img 512 --batch 16. Do you see torch.Size([16, 25200, 85])?
    a,b = y
    print(a.shape)

We don't know how your custom model was trained, but most likely your custom model's inference result y is not a tuple but a list. You need to debug by comparing the model(img) results when running python models/export.py and python models/export.py --weights best.pt --img 512 --batch 16.

Hi,
I am also facing this issue when I use a model with an image size other than the default (640).

As you suggested, I printed the shape of "a" (screenshot attached) and got torch.Size([1, 25200, 10]).

My understanding from the comment above is that the 2nd number determines mOutputRow (25200) and the 3rd number determines mOutputColumn (since I have 5 classes, its value would be 10). Please correct me if I interpreted this wrong.

I don't know what the first number in this shape means (its value is 1 in my case, while it was 16 in the comment above). If you could help me understand it, that would be a great help.

Also, I trained my model with an image size of 320, so why am I getting 25200 as the 2nd value in a.shape (isn't that for image size 640)?

[screenshot: printed shape]

@pdeubel

pdeubel commented Jan 8, 2023

If somebody still has issues with a custom image size and custom dataset see my comment in another issue.

@yscc-16

yscc-16 commented Apr 26, 2023

Hello, I have a different error on my end and am looking for a solution:
E/AndroidRuntime: FATAL EXCEPTION: Thread-3
Process: org.pytorch.YourEyes, PID: 8325
com.facebook.jni.CppException: forward() Expected a value of type 'Tensor' for argument 'x' but instead found type 'List[Tensor]'.
Position: 1
Declaration: forward(torch.models.yolo.Model self, Tensor x) -> ((Tensor, Tensor[]))

Exception raised from checkArg at ../aten/src/ATen/core/function_schema_inl.h:162 (most recent call first):
(no backtrace available)
at org.pytorch.NativePeer.forward(Native Method)
at org.pytorch.Module.forward(Module.java:52)
at org.pytorch.YourEyes.walking.WalkActivity.run(WalkActivity.java:238)
at java.lang.Thread.run(Thread.java:923)

@yscc-16

yscc-16 commented Apr 28, 2023

@jeffxtang
The reasons you see torch.Size([16, 16128, 26]) instead of torch.Size([16, 25200, 85]) are: 1. you set the input image size to 512 instead of the default 640, and 2. your custom model's output column count (number of classes + 5) is 26 instead of 85. You'll need to change the following values in PrePostProcessor.java:

    static int mInputWidth = 640;
    static int mInputHeight = 640;

    // model output is of size 25200*85
    private static int mOutputRow = 25200; // as decided by the YOLOv5 model for input image of size 640*640
    private static int mOutputColumn = 85;

The error you see when running on Android (BTW, what you see when running export.py are not errors but warning messages) is because your model's forward method outputs a list of tensors instead of a tuple. You can either change your model code, or in Android MainActivity.java change IValue[] outputTuple = mModule.forward(IValue.from(inputTensor)).toTuple(); to IValue[] outputList = mModule.forward(IValue.from(inputTensor)).toList(); and then do some comparison debugging to see which element in the list is similar to the first element of the tuple returned when using the yolov5 model downloadable for the app that already works. The outputs floats array a few lines after the forward call will give you the info you need.

Hello, following your suggestion, I changed IValue[] outputTuple = mModule.forward(IValue.from(inputTensor)).toTuple(); to IValue[] outputList = mModule.forward(IValue.from(inputTensor)).toList();
But there was a new error:
E/AndroidRuntime: FATAL EXCEPTION: Thread-3
Process: org.pytorch.YourEyes, PID: 16827
java.lang.RuntimeException: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
File "code/torch/models/yolo.py", line 32, in forward
_22 = getattr(self.model, "2")
_23 = getattr(self.model, "1")
_24 = (getattr(self.model, "0")).forward(x, )
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
_25 = (_22).forward((_23).forward(_24, ), )
_26 = (_20).forward((_21).forward(_25, ), )
File "code/torch/models/common.py", line 10, in forward
_0 = self.conv
_1 = torch.slice(x, 2, 0, 9223372036854775807, 2)
_2 = torch.slice(_1, 3, 0, 9223372036854775807, 2)
~~~~~~~~~~~ <--- HERE
_3 = torch.slice(x, 2, 1, 9223372036854775807, 2)
_4 = torch.slice(_3, 3, 0, 9223372036854775807, 2)

Traceback of TorchScript, original code (most recent call last):
.\models\common.py(92): forward
D:\Program Files (x86)\Anaconda3\envs\torch\lib\site-packages\torch\nn\modules\module.py(709): _slow_forward
D:\Program Files (x86)\Anaconda3\envs\torch\lib\site-packages\torch\nn\modules\module.py(725): _call_impl
.\models\yolo.py(136): forward_once
.\models\yolo.py(116): forward
D:\Program Files (x86)\Anaconda3\envs\torch\lib\site-packages\torch\nn\modules\module.py(709): _slow_forward
D:\Program Files (x86)\Anaconda3\envs\torch\lib\site-packages\torch\nn\modules\module.py(725): _call_impl
D:\Program Files (x86)\Anaconda3\envs\torch\lib\site-packages\torch\jit\_trace.py(934): trace_module
D:\Program Files (x86)\Anaconda3\envs\torch\lib\site-packages\torch\jit\_trace.py(733): trace
models/export.py(57): <module>
RuntimeError: Dimension out of range (expected to be in range of [-3, 2], but got 3)

    at org.pytorch.NativePeer.forward(Native Method)
    at org.pytorch.Module.forward(Module.java:52)
    at org.pytorch.YourEyes.walking.WalkActivity.run(WalkActivity.java:239)
    at java.lang.Thread.run(Thread.java:923)

What should I do?

@jeffxtang
