
Invalid output Tensor index: 1 when trying to run a tiny-yolov3 model on TFLite #39803

Closed · SirTapir opened this issue May 23, 2020 · 22 comments

Labels: comp:lite (TF Lite related issues), TFLiteConverter (For issues related to TFLite converter), type:others (issues not falling in bug, performance, support, build and install or feature)

Comments

@SirTapir

Following @jvishnuvardhan's suggestion in #39157 (comment), I'm creating a new issue.

I'm facing an error when trying to run a tiny-yolov3 model in TensorFlow Lite's Object Detection Android Demo.
When I run the app on a mobile phone, it crashes with the following error:

E/AndroidRuntime: FATAL EXCEPTION: inference
    Process: org.tensorflow.lite.examples.detection, PID: 5535
    java.lang.IllegalArgumentException: Invalid output Tensor index: 1
        at org.tensorflow.lite.NativeInterpreterWrapper.getOutputTensor(NativeInterpreterWrapper.java:292)
        at org.tensorflow.lite.NativeInterpreterWrapper.run(NativeInterpreterWrapper.java:166)
        at org.tensorflow.lite.Interpreter.runForMultipleInputsOutputs(Interpreter.java:314)
        at org.tensorflow.lite.examples.detection.tflite.TFLiteObjectDetectionAPIModel.recognizeImage(TFLiteObjectDetectionAPIModel.java:204)
        at org.tensorflow.lite.examples.detection.DetectorActivity$2.run(DetectorActivity.java:181)
        at android.os.Handler.handleCallback(Handler.java:873)
        at android.os.Handler.dispatchMessage(Handler.java:99)
        at android.os.Looper.loop(Looper.java:214)
        at android.os.HandlerThread.run(HandlerThread.java:65)

I'm using yolov3-tiny, trained via transfer learning with Alexey's implementation to detect 2 custom objects (knife and machete).

Mystic's implementation was then used to convert the .weights file to a .pb.

Then I used the following code to convert the .pb file to .tflite:

import tensorflow as tf

# Frozen graph produced from the Darknet .weights file
graph_def_file = "frozen_darknet_yolov3_model.pb"
input_arrays = ["inputs"]
output_arrays = ["output_boxes"]

converter = tf.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file, input_arrays, output_arrays)
tflite_model = converter.convert()
with open("converted_model.tflite", "wb") as f:
    f.write(tflite_model)

(I'm running this code on TensorFlow 1.15.)

The resulting .tflite is then moved to the assets folder of the TFLite object_detection example: https://github.com/tensorflow/examples/tree/master/lite/examples/object_detection/android

The .tflite and label file that I used can be found here: https://drive.google.com/file/d/1Av7Q1mjLdOIEE81oNnt8cLENlXykxnEu/view?usp=sharing

I changed the following in DetectorActivity.java:

TF_OD_API_INPUT_SIZE from 300 to 416
TF_OD_API_IS_QUANTIZED from true to false

Then I changed the following in TFLiteObjectDetectionAPIModel.java:

NUM_DETECTIONS from 10 to 2535
d.outputLocations = new float[1][NUM_DETECTIONS][4] to d.outputLocations = new float[1][NUM_DETECTIONS][7];

Here's the DetectorActivity.java and TFLiteObjectDetectionAPIModel.java that I use here

Any assistance would be appreciated

@SirTapir SirTapir added the type:others (issues not falling in bug, performance, support, build and install or feature) label May 23, 2020
@jvishnuvardhan jvishnuvardhan added the comp:lite (TF Lite related issues) and TFLiteConverter (For issues related to TFLite converter) labels May 23, 2020
@Anasel23

@SirTapir Happy that you re-opened the issue; I really hope that someone can help.

@Anasel23

@SirTapir Still having the same problem?

@SirTapir (Author)

@Anasel23 Yes, no progress yet

@Anasel23

@SirTapir Do you think it could be a problem in the architecture of the model itself? I mean, the tiny version was created so it can be used on mobiles. Could we change its architecture by removing some layers and do another training?

@SirTapir (Author)

@Anasel23 I'll be honest, I have no idea. I've been facing this error for weeks now and I'm not really sure where the problem lies.
It could be a conversion problem (from .weights to .pb, then to .tflite), or maybe something else that I've changed in TensorFlow Lite's Object Detection Android Demo.

@Anasel23

@SirTapir I've been facing the same problem for weeks now too. You should know that I've followed the exact same process! Hoping to find the solution.

@SirTapir (Author)

@jvishnuvardhan
Here are my .weights, .cfg, and .pb files if needed.

@Anasel23 commented May 28, 2020

@SirTapir Any updates on this error? Did you train your yolov3_tiny_obj with the weights that you installed using this command:
./darknet partial cfg/yolov3-tiny.cfg yolov3-tiny.weights yolov3-tiny.conv.15 15?

@SirTapir (Author)

@Anasel23 Yes, I trained the model with the same command. By the way, I've also opened a question on Stack Overflow here; maybe it could be of some help.
On the topic of updates regarding this error:

  • I've successfully run the model without crashing. I did this by putting only one array into the output map, like this:
Map<Integer, Object> outputMap = new HashMap<>();
outputMap.put(0, outputLocations);

tfLite.runForMultipleInputsOutputs(inputArray, outputMap);

  • Now the problem is translating that output into actual data: object locations, scores, class labels, and the number of detections. Here is the output of said array:
 Array at: 0 values: [-25.297955] [-6.9190693] [65.46178] [35.47879] [6.7820656E-6] [0.51488364] [0.5272327]
    TFLiteObjectDetectionAPIModel: Array at: 1 values: [-91.8242] [-40.757454] [129.2085] [72.019424] [2.2218128E-6] [0.37300995] [0.5925319]
    TFLiteObjectDetectionAPIModel: Array at: 2 values: [-240.25125] [-186.38759] [274.3983] [222.5612] [1.127338E-5] [0.27641284] [0.67838424]
    TFLiteObjectDetectionAPIModel: Array at: 3 values: [-16.850494] [-2.0965796] [118.82944] [28.96283] [1.1363889E-5] [0.3750859] [0.53759706]
    TFLiteObjectDetectionAPIModel: Array at: 4 values: [-74.77507] [-21.789557] [171.89941] [58.995293] [2.7761434E-6] [0.3655538] [0.72778356]
    TFLiteObjectDetectionAPIModel: Array at: 5 values: [-187.20813] [-144.38745] [278.30975] [174.90073] [3.4437292E-6] [0.3618639] [0.5931993]
    TFLiteObjectDetectionAPIModel: Array at: 6 values: [14.077995] [1.0179415] [152.3075] [27.06137] [1.11327045E-5] [0.3666517] [0.54309994]
    TFLiteObjectDetectionAPIModel: Array at: 7 values: [-48.26387] [-15.610519] [214.6371] [55.597824] [1.2245713E-6] [0.49970642] [0.5791726]
    TFLiteObjectDetectionAPIModel: Array at: 8 values: [-146.7292] [-127.653015] [309.5728] [160.98468] [2.0313819E-6] [0.60291785] [0.3433442]
    TFLiteObjectDetectionAPIModel: Array at: 9 values: [36.78253] [-0.8245907] [190.24797] [27.01309] [2.506639E-5] [0.36374664] [0.48420942]
    TFLiteObjectDetectionAPIModel: Array at: 10 values: [-18.765198] [-14.468082] [247.22986] [54.61629] [2.6518353E-6] [0.39860374] [0.5662671]

There are 2535 arrays with differing values; I only provided 10 of them as an example.
Can anybody help translate these values into workable data? I would like to know which values correspond to object locations, detection scores, class labels, and the number of detections.

Here's my latest TFLiteObjectDetectionAPIModel.java if needed, @jvishnuvardhan

@SirTapir (Author) commented Jun 5, 2020

Alright, I got it now: why the array is shaped [1][2535][7].

YOLOv3 makes predictions across 3 different scales. The detection layers make detections at feature maps of three different sizes, with strides 32, 16, and 8 respectively. This means that, with a 416 x 416 input, we make detections on 13 x 13, 26 x 26, and 52 x 52 grids.

The tiny version of YOLOv3 apparently uses only 2 scales, with strides 32 and 16. So, with a 416 x 416 input, it makes detections on 13 x 13 and 26 x 26 grids.

The shape [1][2535][7] arises because Mystic's implementation concatenates the results of the 13 x 13 and 26 x 26 detections. So we get 2535 from ((13 x 13) + (26 x 26)) * 3.
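The arithmetic above can be sketched in Python (the function name here is illustrative, not from any of the repos involved):

```python
# Tiny-YOLOv3 predicts on two grids (strides 32 and 16) with 3 anchor
# boxes per grid cell; full YOLOv3 adds a third grid at stride 8.
def num_predictions(input_size, strides=(32, 16), boxes_per_cell=3):
    total = 0
    for stride in strides:
        grid = input_size // stride      # 13 and 26 for a 416 x 416 input
        total += grid * grid * boxes_per_cell
    return total

print(num_predictions(416))              # 2535
```

The same function with strides (32, 16, 8) reproduces the 10647 predictions of full YOLOv3 at 416 x 416.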

The last axis of size [7] holds the bounding box, the object confidence, and the per-class probabilities.
As my custom model only detects 2 classes, the result is [7] (4 box coordinates + 1 object confidence + n class confidences). If you convert the original yolov3-tiny model trained on COCO, it will result in [85], because COCO has 80 classes.
Index 0-3: bounding box coordinates
Index 4: confidence of any object appearing in that box
Index 5 to end: probability of each class appearing at that box location
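Based on that layout, a single prediction row can be split like this (a sketch only; it assumes the converter has already applied the activations, which the sample values above suggest, and the helper name is made up for illustration):

```python
def split_prediction(row, num_classes=2):
    # Each row holds 4 + 1 + num_classes values:
    # box coordinates, objectness, then one probability per class.
    assert len(row) == 5 + num_classes
    box = row[0:4]
    objectness = row[4]
    class_probs = row[5:]
    best_class = max(range(num_classes), key=lambda c: class_probs[c])
    score = objectness * class_probs[best_class]
    return box, best_class, score

# First row from the log above: the second class wins (0.527 > 0.515),
# but the tiny objectness (~6.8e-6) makes the combined score negligible.
box, cls, score = split_prediction(
    [-25.297955, -6.9190693, 65.46178, 35.47879,
     6.7820656e-6, 0.51488364, 0.5272327])
```

Multiplying objectness by the best class probability is the usual way to get a final per-detection score before thresholding.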

Converting that array into a working detector is still a challenge, but at least I understand where to begin.

Resource:
Understanding Mystic's implementation
Where Mystic's implementation was inspired from

@SirTapir SirTapir closed this as completed Jun 5, 2020
@agh372 commented Nov 13, 2020

@SirTapir

final float xPos = outputLocations[0][i][0];
final float yPos = outputLocations[0][i][1];
final float w = outputLocations[0][i][2];
final float h = outputLocations[0][i][3];
final RectF rectF = new RectF(
    xPos - w / 2,
    yPos - h / 2,
    xPos + w / 2,
    yPos + h / 2);

Do you extract the bounding box information like this? I'm getting weird results after doing this: all the bounding boxes in my frame are stacked in the top left, and there is only one class title showing for all the bounding boxes.

Here is the full code:

outputLocations = new float[1][2535][15];

Object[] inputArray = {imgData};
Map<Integer, Object> outputMap = new HashMap<>();
outputMap.put(0, outputLocations);

Trace.endSection();

// Run the inference call.
Trace.beginSection("run");
tfLite.runForMultipleInputsOutputs(inputArray, outputMap);
Trace.endSection();

final ArrayList<Recognition> recognitions = new ArrayList<>();

for (int i = 0; i < outputLocations[0].length; ++i) { // 2535
  float maxClass = 0;
  int detectedClass = -1;
  final float[] classes = new float[labels.size()];
  final float confidence = sigmoid(outputLocations[0][i][4]);

  for (int c = 0; c < labels.size(); ++c) {
    classes[c] = outputLocations[0][i][5 + c];
  }
  for (int c = 0; c < labels.size(); ++c) {
    if (classes[c] > maxClass) {
      detectedClass = c;
      maxClass = classes[c];
    }
  }
  final float score = maxClass;
  Log.d("Scores: ", " " + score);

  if (score > 0.5) {
    final float xPos = outputLocations[0][i][0];
    final float yPos = outputLocations[0][i][1];
    final float w = outputLocations[0][i][2];
    final float h = outputLocations[0][i][3];
    final RectF rectF = new RectF(
        xPos - w / 2,
        yPos - h / 2,
        xPos + w / 2,
        yPos + h / 2);

    recognitions.add(
        new Recognition("" + i, labels.get(detectedClass), score, rectF));
  }
}
Trace.endSection(); // "recognizeImage"
return recognitions;
}
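Boxes piling up in the top-left corner usually means the raw head outputs were never decoded against the grid and anchors, and overlapping detections on top of that still need non-max suppression. A greedy NMS step can be sketched like this (plain Python with illustrative names, not code from the demo; boxes are [left, top, right, bottom]):

```python
def iou(a, b):
    # Intersection-over-union of two boxes given as [x1, y1, x2, y2].
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

def nms(boxes, scores, iou_thresh=0.45):
    # Greedy NMS: keep the highest-scoring box, drop overlapping ones, repeat.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep

boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]
```

The first two boxes overlap heavily (IoU ≈ 0.68), so only the higher-scoring one survives; the isolated third box is kept.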

@SirTapir (Author)

@agh372 Ah, I didn't continue with Mystic's converter. I switched to hunglc007's converter at https://github.com/hunglc007/tensorflow-yolov4-tflite and used the Android example in their repo as a reference to get yolov3 working in a mobile app.

@agh372 commented Nov 13, 2020

@SirTapir Do we need anchors and masks? But that is YoloClassifier4, right? Is there any difference between their implementations?

@SirTapir (Author)

@agh372 It's been a while, but IIRC yes, we still need anchors and masks. In that repo you can also convert yolov3 models; you just need to put in yolov3's mask and anchor values.
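For reference, the Darknet-style decode those anchors and masks feed into can be sketched as follows. The anchor list is the stock yolov3-tiny one from the Darknet cfg; the mask just picks which three of these pairs each detection head uses. The function layout here is illustrative, not the repo's actual code:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Stock yolov3-tiny anchors from the Darknet cfg, as (width, height) pairs:
ANCHORS = [(10, 14), (23, 27), (37, 58), (81, 82), (135, 169), (344, 319)]

def decode_box(tx, ty, tw, th, cx, cy, anchor, stride):
    # Darknet-style decode for one cell: the x/y offsets are squashed with a
    # sigmoid and added to the cell index; w/h scale the anchor exponentially.
    bx = (sigmoid(tx) + cx) * stride
    by = (sigmoid(ty) + cy) * stride
    bw = anchor[0] * math.exp(tw)
    bh = anchor[1] * math.exp(th)
    return bx, by, bw, bh

# Raw outputs of all zeros land in the middle of cell (0, 0) of the 13 x 13
# head (stride 32), with the anchor's own width and height:
print(decode_box(0, 0, 0, 0, 0, 0, ANCHORS[3], 32))  # (16.0, 16.0, 81.0, 82.0)
```

This is why undecoded raw outputs drawn directly as pixel coordinates cluster near the origin: the offsets are tiny numbers meant to be combined with the grid index and anchor first.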

@agh372 commented Nov 13, 2020

Sorry to bother you, but I have been struggling to get the bounding boxes. I have a yolov3-tiny.tflite model.

Could you tell me the name of the function you used? There are many commented-out functions in this class (YoloV4Classifier.java).
I would be really grateful. Thank you!

@SirTapir (Author)

@agh372 I used most of them; just change the mask and anchor values so that they match yolov3 and you should be good to go.

@agh372 commented Nov 13, 2020

Thank you! Last question: anchors and masks are fixed for all yolov3-tiny models (irrespective of the input shape), right?

@SirTapir (Author)

@agh372 AFAIK yes, but don't quote me on that; I haven't refreshed my memory on YOLOv3.

@agh372 commented Nov 13, 2020

Thank you so much! @SirTapir

@agh372 commented Nov 13, 2020

@SirTapir
I used https://github.com/hunglc007/tensorflow-yolov4-tflite and I'm still getting results like this! Did you face this, or have any idea what it could be?
(attached screenshot: WhatsApp Image 2020-11-13 at 3:23:28 AM)

@SirTapir (Author)

@agh372 Sorry, I don't know what problem could produce the result in your picture. Maybe try opening an issue in that repo?

@agh372 commented Nov 16, 2020

@SirTapir I figured out the post-processing part! I have a question regarding the order. I observed that your tflite model has a shape of [1, 2535, 7].

What I don't understand is this: I read that each grid cell of the 13 x 13 scale predicts 3 bounding boxes, hence the formula (B * (C + 5)), where B = 3 in my case.

So, in the array of 2535, if the 0th index represents the first bounding box of the first cell (0,0) of the 13 x 13 scale,

what does the 1st index represent?

Does it represent the second bounding box for the first cell (0,0) of the 13 x 13 scale,

or

does it represent the first bounding box for the second cell (0,1) of the 13 x 13 scale?


5 participants