
A/libc: Fatal signal 11 (SIGSEGV), code 1, fault addr 0x70 in tid 28007 #34313

Closed
kazuimotn opened this issue Nov 15, 2019 · 11 comments
Labels
comp:lite TF Lite related issues stale This label marks the issue/pr stale - to be closed automatically if no activity stat:awaiting response Status - Awaiting response from author TF 2.0 Issues relating to TensorFlow 2.0 type:bug Bug

Comments

kazuimotn commented Nov 15, 2019

There was a problem when loading a converted custom model (LSTM.tflite) in Android Studio. The model was converted with

converter.experimental_new_converter = True

System information
model training machine

  • Ubuntu 16.04 (LTS)
  • GTX1080
  • tensorflow 1.13.1
  • Keras 2.2.4
  • Python 3.6.8

model converting & loading machine

  • MacOS Mojave 10.14.5
  • Python 3.7.1
  • tensorflow 2.0.0, tf-nightly 2.1.0.dev20191113
  • Android Studio 3.4.1
  • targetSdkVersion 28

Smartphone

  • Android 8.0.0
  • SONY SO-04J (docomo xperia)

I can convert the custom LSTM model from .h5 to .tflite with the following code:

import tensorflow as tf

model = tf.keras.models.load_model("LSTM.h5")
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.experimental_new_converter = True
tflite_model = converter.convert()
with open("LSTM.tflite", "wb") as f:
    f.write(tflite_model)
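Before shipping the converted file to Android, it can help to sanity-check that the serialized bytes really are a TFLite flatbuffer. This is a minimal sketch, not an official API: it only checks the flatbuffer file identifier, which for TFLite models is the ASCII string "TFL3" at byte offset 4.

```python
# Sanity-check a serialized TFLite model before deploying it (a sketch, not an
# official API). A TFLite flatbuffer carries the ASCII file identifier "TFL3"
# at byte offset 4 (the first 4 bytes are the flatbuffer root offset).
import os

def looks_like_tflite(model_bytes: bytes) -> bool:
    """True if the buffer starts like a TFLite flatbuffer."""
    return len(model_bytes) >= 8 and model_bytes[4:8] == b"TFL3"

# Path taken from the conversion snippet above; skipped if the file is absent.
if os.path.exists("LSTM.tflite"):
    with open("LSTM.tflite", "rb") as f:
        print("valid TFLite header:", looks_like_tflite(f.read()))
```

A buffer that fails this check would also be rejected by the interpreter, so it is a cheap first test when a model crashes on load.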

Next, I want to load this model (LSTM.tflite) in my Android app. I already have a class that performs inference with a custom 1D-CNN model (1DCNN.tflite):

package iis.kmjlab.kazuimotn.sartips.others;
import android.content.res.AssetFileDescriptor;
import android.content.res.AssetManager;
import org.tensorflow.lite.Interpreter;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class TensorFlowLiteClassifier {

    private Interpreter interpreter;
    public static final String MODEL_FILE = "1DCNN.tflite";
    public static final int LABEL_NUM = 2;

    /**
     * Registers interpreter
     */
    private TensorFlowLiteClassifier(Interpreter interpreter) {
        this.interpreter = interpreter;
    }

    /**
     * Loads model into interpreter
     */
    public static TensorFlowLiteClassifier classifier(AssetManager assetManager, String modelPath) throws IOException {
        ByteBuffer byteBuffer = loadModelFile(assetManager, modelPath);
        Interpreter interpreter = new Interpreter(byteBuffer);
        return new TensorFlowLiteClassifier(interpreter);
    }

    /**
     * Loads model
     */
    private static MappedByteBuffer loadModelFile(AssetManager assets, String path) throws IOException {
        AssetFileDescriptor file = assets.openFd(path);
        FileInputStream stream = new FileInputStream(file.getFileDescriptor());
        FileChannel channel = stream.getChannel();
        long startOffset = file.getStartOffset();
        long declaredLength = file.getDeclaredLength();
        return channel.map(FileChannel.MapMode.READ_ONLY, startOffset, declaredLength);
    }

    /**
     * Function that actually performs inference
     */
    public float[][] predictProbabilities(float[][] input) {
        float[][] output = new float[1][LABEL_NUM];
        interpreter.run(input, output);
        return output;
    }
}

When I use 1DCNN.tflite, this class works perfectly. However, when I change the model to LSTM.tflite, I get an error at this line:

Interpreter interpreter = new Interpreter(byteBuffer);

The error in logcat is the following:
A/libc: Fatal signal 11 (SIGSEGV), code 1, fault addr 0x70 in tid 28007

How can I deal with this error?
From my investigation, it seems to occur when the prepared array is exceeded for some reason.
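On the "prepared array is exceeded" guess: with the Java API, passing an array whose shape does not match the model's input tensor is a common source of native crashes, because `Interpreter.run()` copies into a fixed-size native buffer. The helper below is only an illustration with made-up dimensions, comparing element counts of a host array shape against a tensor shape such as the one reported by `get_input_details()`.

```python
# Illustration (hypothetical shapes): a native crash or buffer error can mean
# the array passed to Interpreter.run() does not match the model's input
# tensor. Compare total element counts before invoking the interpreter.
from functools import reduce
from operator import mul

def num_elements(shape) -> int:
    """Total element count of a tensor shape, e.g. [1, 28, 2] -> 56."""
    return reduce(mul, shape, 1)

def shapes_compatible(array_shape, tensor_shape) -> bool:
    """True if a host buffer of array_shape can fill tensor_shape exactly."""
    return num_elements(array_shape) == num_elements(tensor_shape)

# Example with made-up dimensions: a [1][50] Java array cannot feed a
# [1, 50, 3] LSTM input tensor.
print(shapes_compatible([1, 50], [1, 50, 3]))     # False: 50 vs 150 elements
print(shapes_compatible([1, 50, 3], [1, 50, 3]))  # True
```

Checking this on the Python side (where the model already runs) against the `float[][]` shape used in the Java class can rule shape mismatch in or out.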

@jvishnuvardhan jvishnuvardhan added the comp:lite TF Lite related issues label Nov 15, 2019
@jvishnuvardhan jvishnuvardhan self-assigned this Nov 15, 2019
@jvishnuvardhan jvishnuvardhan added TF 2.0 Issues relating to TensorFlow 2.0 type:bug Bug labels Nov 15, 2019
@jvishnuvardhan jvishnuvardhan added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Nov 15, 2019
@gargn gargn assigned miaout17 and haozha111 and unassigned gargn Nov 15, 2019
haozha111 (Contributor) commented:
I'm wondering if it's an issue with the Android APK environment. Does the same LSTM flatbuffer model run on desktop?

kazuimotn (Author) commented Nov 16, 2019

I can infer probabilities using the same LSTM.tflite model with this Python code:

import tensorflow as tf
interpreter = tf.compat.v2.lite.Interpreter("LSTM.tflite")
interpreter.allocate_tensors()
input  = interpreter.tensor(interpreter.get_input_details()[0]["index"])
output = interpreter.tensor(interpreter.get_output_details()[0]["index"])
for i in range(10):
  input().fill(1)
  interpreter.invoke()
  print("inference %s" % output())

The output is the following:

inference [[9.5466515e-09 1.0000000e+00]]
inference [[9.5466515e-09 1.0000000e+00]]
inference [[9.5466515e-09 1.0000000e+00]]
inference [[9.5466515e-09 1.0000000e+00]]
inference [[9.5466515e-09 1.0000000e+00]]
inference [[9.5466515e-09 1.0000000e+00]]
inference [[9.5466515e-09 1.0000000e+00]]
inference [[9.5466515e-09 1.0000000e+00]]
inference [[9.5466515e-09 1.0000000e+00]]
inference [[9.5466515e-09 1.0000000e+00]]

My model is a binary classifier, so this output is what I want.
In other words, does this mean that the Android APK environment is misconfigured?

kazuimotn (Author) commented:
It might be due to some native methods here:
TensorflowLite/library/src/main/java/org/tensorflow/lite/NativeInterpreterWrapper.java

private static native long createErrorReporter(int size);
private static native long createModelWithBuffer(MappedByteBuffer modelBuffer, long errorHandle);
private static native long createInterpreter(long modelHandle);

/**
 * Initializes a {@code NativeInterpreterWrapper} with a {@code MappedByteBuffer}. The
 * MappedByteBuffer should not be modified after the construction of a {@code
 * NativeInterpreterWrapper}.
 */
NativeInterpreterWrapper(MappedByteBuffer mappedByteBuffer) {
  modelByteBuffer = mappedByteBuffer;
  errorHandle = createErrorReporter(ERROR_BUFFER_SIZE);
  modelHandle = createModelWithBuffer(modelByteBuffer, errorHandle);
  interpreterHandle = createInterpreter(modelHandle);
}

@tensorflowbutler tensorflowbutler removed the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Nov 16, 2019
lucasveljacic commented:
Hi! Could any of you solve this issue?
I'm having the same error with an LSTM classifier.
In fact, not exactly the same; I'm actually getting this:

java.lang.IllegalArgumentException: ByteBuffer is not a valid flatbuffer model
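For what it's worth, one documented cause of "ByteBuffer is not a valid flatbuffer model" on Android is that the build compresses the .tflite asset, so the region memory-mapped by `loadModelFile` is not the raw model bytes. The TensorFlow Lite Android docs suggest disabling compression for model assets in build.gradle; this is a general precaution, not a confirmed fix for this particular report:

```groovy
android {
    aaptOptions {
        // Keep .tflite assets uncompressed so they can be memory-mapped.
        noCompress "tflite"
    }
}
```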

luang008 commented:
I solved a similar issue by following this guide, in particular by adding this to build.gradle to filter out unnecessary ABI dependencies:

android {
    defaultConfig {
        ndk {
            abiFilters 'armeabi-v7a', 'arm64-v8a'
        }
    }
}

Saduf2019 (Contributor) commented:
@kazuimotn
Please update as per above comment.

@Saduf2019 Saduf2019 added the stat:awaiting response Status - Awaiting response from author label Sep 4, 2020
google-ml-butler commented:
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.

@google-ml-butler google-ml-butler bot added the stale This label marks the issue/pr stale - to be closed automatically if no activity label Sep 11, 2020
google-ml-butler commented:
Closing as stale. Please reopen if you'd like to work on this further.

StewardH commented:
I am seeing the same crash: Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR)

This occurs while running the standard object detection routine using the MobileNet SSD .tflite model that can be found in the TF model zoo.

The issue was replicated on a Samsung Galaxy S20+ as well as on Motorola phones, so the crash seems to be phone-independent. On Motorola phones the crash occurs within 2 hours of running the detection, while on the Samsung Galaxy it sometimes occurs only after 8 hours of running the same code on the same input data in a repetitive loop. Filtering the ABIs did not affect this case.
Did anyone resolve this crash in a more general fashion?

It seems like it occurs on the RenderThread during the DeferredLayerUpdate.
Here is my crash report:
--------- beginning of crash
2022-02-13 15:33:42.474 18219-18249/? A/libc: Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x2820 in tid 18249 (RenderThread), pid 18219 (dia.rec)
2022-02-13 15:33:42.626 18717-18717/? A/DEBUG: *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
2022-02-13 15:33:42.626 18717-18717/? A/DEBUG: Build fingerprint: 'motorola/kyoto_retaile/kyoto:11/RRKS31.Q3-19-97-3/bc988:user/release-keys'
2022-02-13 15:33:42.626 18717-18717/? A/DEBUG: Revision: 'PVT1'
2022-02-13 15:33:42.626 18717-18717/? A/DEBUG: ABI: 'arm64'
2022-02-13 15:33:42.628 18717-18717/? A/DEBUG: Timestamp: 2022-02-13 15:33:42-0500
2022-02-13 15:33:42.628 18717-18717/? A/DEBUG: pid: 18219, tid: 18249, name: RenderThread >>> dia.rec <<<
2022-02-13 15:33:42.628 18717-18717/? A/DEBUG: uid: 10324
2022-02-13 15:33:42.628 18717-18717/? A/DEBUG: signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x2820
2022-02-13 15:33:42.628 18717-18717/? A/DEBUG: x0 b4000076636106d0 x1 b4000076636dbed0 x2 0000007576fea8c0 x3 00000000000000ff
2022-02-13 15:33:42.628 18717-18717/? A/DEBUG: x4 0000000000000003 x5 b4000076c35c8980 x6 b4000076c35c8980 x7 0000000000000000
2022-02-13 15:33:42.628 18717-18717/? A/DEBUG: x8 0000000000000000 x9 0000000000000001 x10 0000000000000000 x11 0000000063610768
2022-02-13 15:33:42.628 18717-18717/? A/DEBUG: x12 00000000e1b5b851 x13 2365000000084108 x14 001c222856e7e2a0 x15 000021d5f589e147
2022-02-13 15:33:42.628 18717-18717/? A/DEBUG: x16 00000078764327f8 x17 00000078739dbb70 x18 0000007576d54000 x19 b4000076636106d0
2022-02-13 15:33:42.628 18717-18717/? A/DEBUG: x20 b4000076636dbed0 x21 b4000076636dbef0 x22 0000000000002800 x23 b4000075f35b75b8
2022-02-13 15:33:42.628 18717-18717/? A/DEBUG: x24 b4000076235d4c50 x25 0000007576feacc0 x26 0000007576feaff8 x27 00000000000fe000
2022-02-13 15:33:42.628 18717-18717/? A/DEBUG: x28 00000000000fc000 x29 0000007576fea870
2022-02-13 15:33:42.628 18717-18717/? A/DEBUG: lr 0000007874205500 sp 0000007576fea870 pc 000000787420ae00 pst 0000000020001000
2022-02-13 15:33:42.669 18717-18717/? A/DEBUG: backtrace:
2022-02-13 15:33:42.669 18717-18717/? A/DEBUG: #00 pc 0000000000239e00 /system/lib64/libhwui.so (android::uirenderer::Layer::Layer(android::uirenderer::RenderState&, sk_sp, int, SkBlendMode)+304) (BuildId: 0e3f217960712f3e10c4b8b3b618903d)
2022-02-13 15:33:42.669 18717-18717/? A/DEBUG: #1 pc 00000000002344fc /system/lib64/libhwui.so (android::uirenderer::DeferredLayerUpdater::apply()+428) (BuildId: 0e3f217960712f3e10c4b8b3b618903d)
2022-02-13 15:33:42.669 18717-18717/? A/DEBUG: #2 pc 000000000021fbd8 /system/lib64/libhwui.so (_ZNSt3__110__function6__funcIZN7android10uirenderer12renderthread13DrawFrameTask11postAndWaitEvE3$_0NS_9allocatorIS6_EEFvvEEclEv$c303f2d2360db58ed70a2d0ac7ed911b+684) (BuildId: 0e3f217960712f3e10c4b8b3b618903d)
2022-02-13 15:33:42.669 18717-18717/? A/DEBUG: #3 pc 000000000020e504 /system/lib64/libhwui.so (android::uirenderer::WorkQueue::process()+220) (BuildId: 0e3f217960712f3e10c4b8b3b618903d)
2022-02-13 15:33:42.669 18717-18717/? A/DEBUG: #4 pc 000000000022f41c /system/lib64/libhwui.so (android::uirenderer::renderthread::RenderThread::threadLoop()+88) (BuildId: 0e3f217960712f3e10c4b8b3b618903d)
2022-02-13 15:33:42.669 18717-18717/? A/DEBUG: #5 pc 0000000000015414 /system/lib64/libutils.so (android::Thread::_threadLoop(void*)+260) (BuildId: 82f928b900a93dc07b75aefd76a59775)
2022-02-13 15:33:42.669 18717-18717/? A/DEBUG: #6 pc 0000000000014cd8 /system/lib64/libutils.so (thread_data_t::trampoline(thread_data_t const*)+412) (BuildId: 82f928b900a93dc07b75aefd76a59775)
2022-02-13 15:33:42.669 18717-18717/? A/DEBUG: #7 pc 00000000000af86c /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+64) (BuildId: ced3426ba03689dee1bf74b8de9c436a)
2022-02-13 15:33:42.669 18717-18717/? A/DEBUG: #8 pc 0000000000050110 /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+64) (BuildId: ced3426ba03689dee1bf74b8de9c436a)

yen-dang-ts commented:
I got the same issue.
