
Segmentation Fault when using Coral Edge TPU #62371

Closed
Skillnoob opened this issue Nov 10, 2023 · 31 comments
Assignees
Labels
comp:lite (TF Lite related issues), stat:awaiting response (Status - Awaiting response from author), TF2.14 (For issues related to Tensorflow 2.14.x), type:bug (Bug)

Comments

@Skillnoob

Skillnoob commented Nov 10, 2023

Issue type

Bug

Have you reproduced the bug with TensorFlow Nightly?

Yes

Source

binary

TensorFlow version

2.14.0

Custom code

Yes

OS platform and distribution

Raspberry Pi OS Bookworm

Mobile device

No response

Python version

3.11.2

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

No response

GPU model and memory

No response

Current behavior?

The program crashes as soon as it tries to load the Edge TPU model.
I already asked in the YOLOv8 Discord, and after looking at the stack trace they told me to open an issue here.
The hardware is a Raspberry Pi 5 with 8 GB of RAM, but the same issue also happens on a Raspberry Pi 4B with 2 GB of RAM.

Standalone code to reproduce the issue

import cv2
from ultralytics import YOLO


def main():
    model = YOLO('model_edgetpu.tflite', task='detect')  # must be a YOLOv8 model converted to TFLite and compiled for the Coral Edge TPU

    camera = cv2.VideoCapture(0)

    while camera.isOpened():
        success, frame = camera.read()
        if not success:
            break

        results = model.predict(frame)

        annotated_frame = results[0].plot()

        cv2.imshow("YOLOv8 Inference", annotated_frame)

        if cv2.waitKey(1) & 0xFF == ord("q"):
            break

    camera.release()
    cv2.destroyAllWindows()


if __name__ == '__main__':
    main()
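
For isolating the crash from Ultralytics and OpenCV, here is a rough sketch that drives the TFLite interpreter and the Edge TPU delegate directly (the model path is a placeholder; per the stack trace below, the failure is inside ModifyGraphWithDelegate while the delegate is being attached):

import numpy as np
import tensorflow as tf

MODEL_PATH = 'model_edgetpu.tflite'  # placeholder: any Edge TPU-compiled model


def main():
    # Attaching libedgetpu is what drives Subgraph::ModifyGraphWithDelegate,
    # the frame at the top of the stack trace below.
    delegate = tf.lite.experimental.load_delegate('libedgetpu.so.1')

    interpreter = tf.lite.Interpreter(model_path=MODEL_PATH,
                                      experimental_delegates=[delegate])
    interpreter.allocate_tensors()

    # If the delegate was applied successfully, a dummy inference should run.
    inp = interpreter.get_input_details()[0]
    interpreter.set_tensor(inp['index'], np.zeros(inp['shape'], dtype=inp['dtype']))
    interpreter.invoke()
    print('output shape:', interpreter.get_output_details()[0]['shape'])


if __name__ == '__main__':
    main()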

Relevant log output

Crash when running through gdb to get the stack trace:

(gdb) run ai_model_copy.py run ai_model_copy.py 
Starting program: /usr/bin/python3 ai_model_copy.py run ai_model_copy.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff5a4f180 (LWP 2502)]
[New Thread 0x7ffff523f180 (LWP 2503)]
[New Thread 0x7ffff0a2f180 (LWP 2504)]
[New Thread 0x7fffd9bef180 (LWP 2505)]
[New Thread 0x7fffd73df180 (LWP 2506)]
[New Thread 0x7fffd4bcf180 (LWP 2507)]
[Detaching after vfork from child process 2508]
[Detaching after vfork from child process 2509]
[Detaching after vfork from child process 2510]
[New Thread 0x7fff94aaf180 (LWP 2517)]
[New Thread 0x7fff9229f180 (LWP 2518)]
[New Thread 0x7fff91a8f180 (LWP 2519)]
Loading mode_edgetpu.tflite for TensorFlow Lite Edge TPU inference...
[New Thread 0x7fff89bef180 (LWP 2521)]
[Thread 0x7fff89bef180 (LWP 2521) exited]
[New Thread 0x7fff89bef180 (LWP 2522)]
[Thread 0x7fff89bef180 (LWP 2522) exited]
[New Thread 0x7fff89bef180 (LWP 2523)]
[New Thread 0x7fff893df180 (LWP 2524)]
[New Thread 0x7fff88bcf180 (LWP 2525)]
[Thread 0x7fff88bcf180 (LWP 2525) exited]
[Thread 0x7fff893df180 (LWP 2524) exited]
[New Thread 0x7fff893df180 (LWP 2526)]
[New Thread 0x7fff88bcf180 (LWP 2527)]
[New Thread 0x7fff83fff180 (LWP 2528)]

Thread 1 "python3" received signal SIGSEGV, Segmentation fault.
0x00007fff8bbee22c in tflite::Subgraph::ReplaceNodeSubsetsWithDelegateKernels(TfLiteRegistration, TfLiteIntArray const*, TfLiteDelegate*) ()
   from /home/pi/.local/lib/python3.11/site-packages/tensorflow/lite/python/interpreter_wrapper/_pywrap_tensorflow_interpreter_wrapper.so

Stack trace captured using gdb:

#0  0x00007fff8bbee22c in tflite::Subgraph::ReplaceNodeSubsetsWithDelegateKernels(TfLiteRegistration, TfLiteIntArray const*, TfLiteDelegate*) ()
   from /home/pi/.local/lib/python3.11/site-packages/tensorflow/lite/python/interpreter_wrapper/_pywrap_tensorflow_interpreter_wrapper.so
#1  0x00007fff8bbee1e0 in tflite::Subgraph::ReplaceNodeSubsetsWithDelegateKernels(TfLiteContext*, TfLiteRegistration, TfLiteIntArray const*, TfLiteDelegate*) ()
   from /home/pi/.local/lib/python3.11/site-packages/tensorflow/lite/python/interpreter_wrapper/_pywrap_tensorflow_interpreter_wrapper.so
#2  0x00007fff89c6be24 in ?? () from /lib/aarch64-linux-gnu/libedgetpu.so.1
#3  0x00007fff8bbf3058 in tflite::Subgraph::ModifyGraphWithDelegateImpl(TfLiteDelegate*) () from /home/pi/.local/lib/python3.11/site-packages/tensorflow/lite/python/interpreter_wrapper/_pywrap_tensorflow_interpreter_wrapper.so
#4  0x00007fff8bbf3624 in tflite::Subgraph::ModifyGraphWithDelegate(TfLiteDelegate*) () from /home/pi/.local/lib/python3.11/site-packages/tensorflow/lite/python/interpreter_wrapper/_pywrap_tensorflow_interpreter_wrapper.so
#5  0x00007fff8bbe5ce8 in tflite::impl::Interpreter::ModifyGraphWithDelegateImpl(TfLiteDelegate*) () from /home/pi/.local/lib/python3.11/site-packages/tensorflow/lite/python/interpreter_wrapper/_pywrap_tensorflow_interpreter_wrapper.so
#6  0x00007fff8b92af70 in tflite::interpreter_wrapper::InterpreterWrapper::ModifyGraphWithDelegate(TfLiteDelegate*) ()
   from /home/pi/.local/lib/python3.11/site-packages/tensorflow/lite/python/interpreter_wrapper/_pywrap_tensorflow_interpreter_wrapper.so
#7  0x00007fff8b9273a4 in pybind11::cpp_function::initialize<pybind11_init__pywrap_tensorflow_interpreter_wrapper(pybind11::module_&)::$_23, pybind11::object, tflite::interpreter_wrapper::InterpreterWrapper&, unsigned long, pybind11::name, pybind11::is_method, pybind11::sibling, char [60]>(pybind11_init__pywrap_tensorflow_interpreter_wrapper(pybind11::module_&)::$_23&&, pybind11::object (*)(tflite::interpreter_wrapper::InterpreterWrapper&, unsigned long), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, char const (&) [60])::{lambda(pybind11::detail::function_call&)#1}::__invoke(pybind11::detail::function_call&) ()
   from /home/pi/.local/lib/python3.11/site-packages/tensorflow/lite/python/interpreter_wrapper/_pywrap_tensorflow_interpreter_wrapper.so
#8  0x00007fff8b91772c in pybind11::cpp_function::dispatcher(_object*, _object*, _object*) () from /home/pi/.local/lib/python3.11/site-packages/tensorflow/lite/python/interpreter_wrapper/_pywrap_tensorflow_interpreter_wrapper.so
#9  0x00000000004c9d5c in ?? ()
#10 0x0000000000494548 in _PyObject_MakeTpCall ()
#11 0x00000000004aa23c in _PyEval_EvalFrameDefault ()
#12 0x00000000004e2cec in _PyFunction_Vectorcall ()
#13 0x000000000049c1d8 in _PyObject_FastCallDictTstate ()
#14 0x00000000004edc24 in ?? ()
#15 0x00000000004944d8 in _PyObject_MakeTpCall ()
#16 0x00000000004aa23c in _PyEval_EvalFrameDefault ()
#17 0x00000000004e2cec in _PyFunction_Vectorcall ()
#18 0x00000000004f3390 in PyObject_Call ()
#19 0x00000000004ae388 in _PyEval_EvalFrameDefault ()
#20 0x00000000004e2cec in _PyFunction_Vectorcall ()
#21 0x000000000049c1d8 in _PyObject_FastCallDictTstate ()
#22 0x00000000004edc24 in ?? ()
#23 0x00000000004944d8 in _PyObject_MakeTpCall ()
#24 0x00000000004aa23c in _PyEval_EvalFrameDefault ()
#25 0x00000000004a0b60 in PyEval_EvalCode ()
#26 0x00000000005fafa8 in ?? ()
#27 0x00000000005f7bd0 in ?? ()
#28 0x0000000000608760 in ?? ()
#29 0x0000000000608308 in _PyRun_SimpleFileObject ()
#30 0x0000000000608070 in _PyRun_AnyFileObject ()
#31 0x000000000060631c in Py_RunMain ()
#32 0x00000000005d0154 in Py_BytesMain ()
#33 0x00007ffff7ce7780 in __libc_start_call_main (main=main@entry=0x5cfff4 <_start+52>, argc=argc@entry=4, argv=argv@entry=0x7ffffffff0f8) at ../sysdeps/nptl/libc_start_call_main.h:58
#34 0x00007ffff7ce7858 in __libc_start_main_impl (main=0x5cfff4 <_start+52>, argc=4, argv=0x7ffffffff0f8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=<optimized out>) at ../csu/libc-start.c:360
#35 0x00000000005cfff0 in _start ()
@google-ml-butler google-ml-butler bot added the type:bug Bug label Nov 10, 2023
@sushreebarsa sushreebarsa added comp:lite TF Lite related issues TF2.14 For issues related to Tensorflow 2.14.x labels Nov 11, 2023
@sushreebarsa sushreebarsa assigned pjpratik and unassigned pjpratik Nov 14, 2023
@sushreebarsa
Contributor

@Skillnoob Could you please make sure that you are using the correct TensorFlow Lite model and interpreter for your Edge TPU device?
The Edge TPU hardware and firmware need to be up to date to avoid such issues.
Thank you!

@sushreebarsa sushreebarsa added the stat:awaiting response Status - Awaiting response from author label Nov 14, 2023
@Skillnoob
Author

@sushreebarsa The model has been converted to TFLite with int8 quantization and compiled using the latest version of the Edge TPU compiler. The Edge TPU runtime is also on the latest version. The Ultralytics module uses the TFLite interpreter from the tensorflow package, IIRC.
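
For reference, a rough sketch of the int8 post-training quantization that precedes the edgetpu_compiler step (the saved-model path, output filename, and representative dataset are placeholders; Ultralytics performs the equivalent internally during export):

import numpy as np
import tensorflow as tf


def representative_dataset():
    # Placeholder calibration data; a real export uses a sample of training images.
    for _ in range(100):
        yield [np.random.rand(1, 640, 640, 3).astype(np.float32)]


converter = tf.lite.TFLiteConverter.from_saved_model('yolov8n_saved_model')  # placeholder path
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

with open('yolov8n_full_integer_quant.tflite', 'wb') as f:
    f.write(converter.convert())

# The resulting .tflite is then compiled for the Edge TPU with the
# edgetpu_compiler command-line tool and loaded on the device.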

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label Nov 14, 2023
@LakshmiKalaKadali
Contributor

@Skillnoob , Could you please provide the model_edgetpu.tflite file to better understand the issue and to investigate further?

Thank You

@LakshmiKalaKadali LakshmiKalaKadali added the stat:awaiting response Status - Awaiting response from author label Nov 17, 2023
@google-ml-butler google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label Nov 17, 2023
@LakshmiKalaKadali
Contributor

LakshmiKalaKadali commented Nov 21, 2023

@Skillnoob, I have verified the TFLite file you provided. It seems that your model is using custom ops. Please find the gist. Could you please check these instructions (documentation) for custom ops and let us know whether they have been followed?

Thank You

@LakshmiKalaKadali LakshmiKalaKadali added the stat:awaiting response Status - Awaiting response from author label Nov 21, 2023
@Skillnoob
Author

@LakshmiKalaKadali The error you got inside the notebook is to be expected, as the model relies on the delegate from the Coral USB Accelerator.
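
To make this concrete: without libedgetpu attached, the CPU-only interpreter cannot resolve the Edge TPU custom op, so something like the following (model path is a placeholder) is expected to fail rather than indicate a broken model:

import tensorflow as tf

try:
    # Placeholder path; any *_edgetpu.tflite model shows the same behaviour.
    interpreter = tf.lite.Interpreter(model_path='model_edgetpu.tflite')
    interpreter.allocate_tensors()
except (RuntimeError, ValueError) as e:
    # Expected on a machine without libedgetpu: the graph contains an
    # Edge TPU custom op that only the delegate can resolve.
    print('Expected failure without the delegate:', e)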

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label Nov 21, 2023
@LakshmiKalaKadali
Contributor

Hi @pkgoogle ,
Please look into the issue.

Thank you

@pkgoogle

Hi @Skillnoob, it seems like there is a dependency on some installation instructions from Coral? Can you let us know the Coral instructions you followed prior to receiving this error? Also please note that the website explicitly states:

Python 3.6 - 3.9

as a requirement.

Apparently you are using 3.11.2; you might want to try Python 3.9 and see if that resolves your issue. Thanks for your help.

@pkgoogle pkgoogle added the stat:awaiting response Status - Awaiting response from author label Nov 27, 2023
@Skillnoob
Author

Skillnoob commented Nov 27, 2023

@pkgoogle Hi. I've previously tried the same program on a Raspberry Pi 4B with Python 3.9.2 and had the same crash, but couldn't run it through gdb as the Pi would just freeze. I also tried it on the Pi 5 using an Anaconda virtual env with Python 3.9.18, and the same crash occurred.

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label Nov 27, 2023
@Skillnoob
Author

I recently tried to run this using tensorflow-aarch64 2.13.1 and 2.15.0, and the crash still occurs.

@Skillnoob
Author

An easier way of reproducing the issue (a Python equivalent of these steps is sketched below):

pip install ultralytics

yolo export model=yolov8n.pt format=edgetpu

Install the Edge TPU runtime as explained here

yolo detect val model=yolov8n_saved_model/yolov8n_full_integer_quant_edgetpu.tflite data=coco128.yaml batch=1
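
The same steps via the Ultralytics Python API, as a rough sketch (it assumes export() returns the path of the exported model, as described in the Ultralytics docs, and that the Edge TPU compiler is installed):

from ultralytics import YOLO

# Export yolov8n to an Edge TPU-compiled TFLite model.
exported_path = YOLO('yolov8n.pt').export(format='edgetpu')

# Validate on COCO128 with batch size 1; this is where the segfault
# shows up once the Edge TPU delegate is attached.
model = YOLO(exported_path, task='detect')
model.val(data='coco128.yaml', batch=1)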

@Skillnoob
Author

I also tried this with every TensorFlow version from the current latest down to 2.12.1 on a Raspberry Pi 4B with Python 3.9.2.

@Skillnoob
Author

This is now resolved thanks to https://github.com/feranick updating the libedgetpu runtime to support newer tflite_runtime versions in google-coral/edgetpu#812.


@feranick
Contributor

I would still keep this issue open. While there is a solution from my forked repository, Google should update the main one, and until then the issue is not solved.

@Skillnoob Skillnoob reopened this Jan 30, 2024
@Skillnoob
Author

@feranick I agree. But it seems that Google has completely abandoned the Coral project.

@pkgoogle

pkgoogle commented Jan 30, 2024

Hi @feranick I don't work with that repo but can you perhaps link your PR that fixes the issue? Maybe I can help push it from here.

Edit: nevermind.. found it. I think this is it? google-coral/libedgetpu#59

@Skillnoob
Author

@pkgoogle That's the correct PR.

@feranick
Contributor

New builds are finally available against TensorFlow v2.15.0 (current). I had to refactor the WORKSPACE to conform to the deprecation of the TF/Toolchain. All seem to be working for me (including on armhf, where it previously only worked with tflite_runtime v2.13.1).

I plan to do a new PR soon.

https://github.com/feranick/libedgetpu/releases/tag/v16.0-TF2.15.0-1

@feranick
Contributor

Hi @feranick I don't work with that repo but can you perhaps link your PR that fixes the issue? Maybe I can help push it from here.

Edit: nevermind.. found it. I think this is it? google-coral/libedgetpu#59

Thanks. That is the PR, but I plan to make a new one based on the last few commits that bring support to TensorFlow 2.15.0. I'll post here.

@feranick
Contributor

feranick commented Jan 31, 2024

BTW, in case someone needs updated tflite_runtime wheels, I prepared a few here.

https://github.com/feranick/TFlite-builds/releases/tag/v2.15.0
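
For anyone swapping in one of these wheels: tflite_runtime exposes the same interpreter API as tf.lite, so a quick smoke test looks roughly like this (model path is a placeholder):

from tflite_runtime.interpreter import Interpreter, load_delegate

interpreter = Interpreter(
    model_path='model_edgetpu.tflite',  # placeholder
    experimental_delegates=[load_delegate('libedgetpu.so.1')],
)
interpreter.allocate_tensors()
print('Edge TPU delegate attached; inputs:', interpreter.get_input_details())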

@cyclux

cyclux commented Feb 27, 2024

@feranick Thanks a lot! Finally, after a long journey down the rabbit hole, this is currently the only solution that works with Python 3.11.

@feranick
Contributor

feranick commented Feb 27, 2024

Hi @feranick I don't work with that repo but can you perhaps link your PR that fixes the issue? Maybe I can help push it from here.

Edit: nevermind.. found it. I think this is it? google-coral/libedgetpu#59

@pkgoogle Just wondering whether you had a chance to move this forward. The correct and current PR is this one:

google-coral/libedgetpu#60

@feranick
Contributor

Thanks very much to @pkgoogle and @Namburger at Google for merging the PR. The libedgetpu library is now fully updated, and I hope binaries will be made available soon through the official channel.

@pkgoogle

Hi @Skillnoob, as @feranick mentioned, the libedgetpu library is now updated. Can you test your case against master and see if it resolves your issue?

@pkgoogle pkgoogle added stat:awaiting response Status - Awaiting response from author and removed stat:awaiting tensorflower Status - Awaiting response from tensorflower labels Feb 29, 2024
@Skillnoob
Author

@pkgoogle I already tested with @feranick's builds a while ago, and they work without any issues on my RPi 5 with Python 3.11.

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label Feb 29, 2024
@pkgoogle

@Skillnoob, awesome. If you are confident this issue is resolved and have no more open items, please feel free to close it as completed. Thanks.

@pkgoogle pkgoogle added the stat:awaiting response Status - Awaiting response from author label Feb 29, 2024
