Tensorflow_xla model inference crash on Jetson AGX xavier #48104
Comments
@lcx2017, |
Also, TensorFlow v2.4.1 is compatible with CUDA 11.0 and cuDNN 8.0. Please take a look at the tested build configurations for more information.
Could you please update CUDA to v11.0 and check whether you still see the same error? Thanks! |
On the Jetson AGX Xavier platform, the system AI toolchain can only be updated through the NVIDIA JetPack SDK. I'm currently using JetPack 4.4, and JetPack does not support CUDA 11.0 yet. https://developer.nvidia.com/jetpack-sdk-44-archive |
Do you have a Jetson AGX Xavier platform on which to reproduce this issue? |
@lcx2017, Currently, I do not have the NVIDIA Jetson AGX Xavier Developer Kit. But a minimal code snippet would help us debug the issue and determine the source of the error easily. |
Thanks a lot! I have also tested JetPack 4.3 (TensorFlow 2.2) on the Jetson AGX Xavier (https://developer.nvidia.com/jetpack-43-archive) and found no crash there. After comparing JetPack 4.4 with TensorFlow 2.4 against JetPack 4.3 with TensorFlow 2.2, I found a new slow-latency issue with the tf.math.unsorted_segment_max op. |
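A latency comparison for tf.math.unsorted_segment_max could be measured along the following lines. This is only a minimal sketch, not the code used in this report: the helper names `median_latency` and `benchmark_segment_max`, the tensor shapes, and the segment count are all assumptions chosen for illustration, and it is also an assumption that XLA is enabled through `tf.function` (TensorFlow 2.4 spells the flag `experimental_compile`, later releases `jit_compile`).

```python
import time


def median_latency(fn, warmup=3, runs=20):
    """Return the median wall-clock time (seconds) of calling fn()."""
    for _ in range(warmup):
        fn()  # warm-up calls absorb tracing and compilation cost
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    samples.sort()
    return samples[len(samples) // 2]


def benchmark_segment_max(rows=100_000, cols=16, num_segments=1_000):
    """Time unsorted_segment_max with and without XLA (sizes are assumed)."""
    import tensorflow as tf  # lazy import: the timing helper works without it

    data = tf.random.uniform([rows, cols])
    segment_ids = tf.random.uniform([rows], maxval=num_segments, dtype=tf.int32)

    def segment_max():
        return tf.math.unsorted_segment_max(data, segment_ids, num_segments)

    plain = tf.function(segment_max)
    try:
        xla = tf.function(segment_max, jit_compile=True)
    except TypeError:
        # Older releases such as 2.4 only accept experimental_compile.
        xla = tf.function(segment_max, experimental_compile=True)

    # .numpy() forces the result back to the host so the timing includes
    # the full execution rather than just the launch.
    return {
        "plain": median_latency(lambda: plain().numpy()),
        "xla": median_latency(lambda: xla().numpy()),
    }
```

Calling `benchmark_segment_max()` on both JetPack installations and comparing the returned medians would show whether the slowdown is specific to one TensorFlow version or to the XLA path.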
I hit a TensorFlow XLA crash during model inference on an NVIDIA Jetson AGX Xavier (aarch64) system.
System information
crash.log
gdb.log
Issues:
1. With XLA enabled, model inference crashes almost every time (Xavier aarch64 system).
2. With some of the custom ops (CPU compute ops) turned off, the crash still occurs in about 7 out of 10 runs (Xavier aarch64 system).
3. The same code with the same tensorflow-2.4.1 version runs successfully without crashing on an x86 system with a V100 GPU.
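Crash rates like the ones above can be counted by launching the inference in a fresh process several times and recording how many runs fail. The sketch below uses only the Python standard library; the function name `count_failures` and the script name `run_inference.py` in the usage comment are hypothetical and not part of this report.

```python
import subprocess
import sys


def count_failures(cmd, runs=10, timeout=600):
    """Run cmd `runs` times and return (failures, runs).

    A run counts as a failure if it exits with a non-zero code, is killed
    by a signal (negative return code on POSIX, e.g. -11 for SIGSEGV),
    or exceeds the timeout.
    """
    failures = 0
    for _ in range(runs):
        try:
            proc = subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            failures += 1
            continue
        if proc.returncode != 0:
            failures += 1
    return failures, runs


# Hypothetical usage, assuming the inference lives in run_inference.py:
# failures, total = count_failures([sys.executable, "run_inference.py"])
# print(f"{failures} of {total} runs failed")
```

Running each attempt in its own process keeps one crash from affecting the next attempt, so the same count can be repeated with XLA on, with XLA off, and with individual custom ops disabled.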