
A crash due to check-fail can be triggered in QuantizedConv2D #59927

Closed
shijy16 opened this issue Mar 8, 2023 · 3 comments · Fixed by #63288
Labels
awaiting review Pull request awaiting review comp:ops OPs related issues TF 2.12 For issues related to Tensorflow 2.12 type:bug Bug

Comments


shijy16 commented Mar 8, 2023


Issue Type

Bug

Have you reproduced the bug with TF nightly?

Yes

Source

binary

Tensorflow Version

2.13.0.dev20230307

Custom Code

No

OS Platform and Distribution

Linux Ubuntu 20.04

Mobile device

No response

Python version

3.8

Bazel version

No response

GCC/Compiler version

No response

CUDA/cuDNN version

CUDA 11.5

GPU model and memory

No response

Current Behaviour?

A crash due to a check-fail can be triggered in QuantizedConv2D and its public API `tf.compat.v1.nn.quantized_conv2d`.

Standalone code to reproduce the issue

import tensorflow as tf
print(tf.__version__)
strides = [1, 128, 128, 1]
padding = "SAME"
dilations = [1, 1, 1, 1]
# Input with an empty width dimension (shape [2, 1, 0, 1]); this is what triggers the check-fail.
input = tf.cast(tf.random.uniform([2, 1, 0, 1], minval=0, maxval=64, dtype=tf.int64), dtype=tf.quint8)
filter = tf.cast(tf.random.uniform([1, 1, 1, 1], minval=0, maxval=64, dtype=tf.int64), dtype=tf.quint8)
min_input = tf.random.uniform([], dtype=tf.float32)
max_input = tf.random.uniform([], dtype=tf.float32)
min_filter = tf.random.uniform([], dtype=tf.float32)
max_filter = tf.random.uniform([], dtype=tf.float32)
# res = tf.raw_ops.QuantizedConv2D(
res = tf.compat.v1.nn.quantized_conv2d(
    strides=strides,
    padding=padding,
    dilations=dilations,
    input=input,
    filter=filter,
    min_input=min_input,
    max_input=max_input,
    min_filter=min_filter,
    max_filter=max_filter,
)
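Because the failing CHECK aborts the entire process rather than raising a Python error, callers cannot recover with try/except. A minimal defensive sketch, assuming the caller validates shapes before invoking the op (`has_empty_dim` is a hypothetical helper, not part of TensorFlow's API):

```python
# Hypothetical workaround: reject empty tensors before invoking
# QuantizedConv2D, since the kernel CHECK-fails (and aborts the whole
# process) when an output spatial dimension computes to zero.
def has_empty_dim(shape):
    """Return True if any dimension of `shape` is zero."""
    return any(int(d) == 0 for d in shape)

# The repro input shape [2, 1, 0, 1] has an empty width dimension.
assert has_empty_dim([2, 1, 0, 1])      # would crash QuantizedConv2D
assert not has_empty_dim([2, 1, 1, 1])  # safe to pass to the op
```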

Relevant log output

2023-03-08 10:44:27.398763: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-03-08 10:44:27.448284: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-03-08 10:44:28.229207: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2.13.0-dev20230307
2023-03-08 10:44:29.975476: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1635] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 7865 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB, pci bus id: 0000:2f:00.0, compute capability: 7.0
2023-03-08 10:44:30.000649: I tensorflow/core/common_runtime/process_function_library_runtime.cc:586] Finished graph optimizations for MultiDevice function "__wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0" with target device "/job:localhost/replica:0/task:0/device:GPU:0". Took 0 secs.
2023-03-08 10:44:30.179857: I tensorflow/core/common_runtime/process_function_library_runtime.cc:586] Finished graph optimizations for MultiDevice function "__wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0" with target device "/job:localhost/replica:0/task:0/device:GPU:0". Took 0 secs.
2023-03-08 10:44:30.185432: I tensorflow/core/common_runtime/process_function_library_runtime.cc:586] Finished graph optimizations for MultiDevice function "__wrapped__RandomUniformInt_device_/job:localhost/replica:0/task:0/device:GPU:0" with target device "/job:localhost/replica:0/task:0/device:GPU:0". Took 0 secs.
2023-03-08 10:44:30.193600: I tensorflow/core/common_runtime/process_function_library_runtime.cc:586] Finished graph optimizations for MultiDevice function "__wrapped__Cast_device_/job:localhost/replica:0/task:0/device:CPU:0" with target device "/job:localhost/replica:0/task:0/device:CPU:0". Took 0 secs.
2023-03-08 10:44:30.195940: I tensorflow/core/common_runtime/process_function_library_runtime.cc:586] Finished graph optimizations for MultiDevice function "__wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0" with target device "/job:localhost/replica:0/task:0/device:GPU:0". Took 0 secs.
2023-03-08 10:44:30.201918: I tensorflow/core/common_runtime/process_function_library_runtime.cc:586] Finished graph optimizations for MultiDevice function "__wrapped__QuantizedConv2D_device_/job:localhost/replica:0/task:0/device:CPU:0" with target device "/job:localhost/replica:0/task:0/device:CPU:0". Took 0 secs.
2023-03-08 10:44:30.202557: F tensorflow/core/kernels/quantized_conv_ops.cc:572] Check failed: out_cols > 0 (0 vs. 0)
Aborted (core dumped)
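The failing check sits in the output-size computation of quantized_conv_ops.cc. A minimal sketch of the arithmetic, assuming the standard SAME-padding output-size formula (out = ceil(in / stride)):

```python
import math

def same_padding_out_size(in_size, stride):
    """Output spatial size under SAME padding: ceil(in_size / stride)."""
    return math.ceil(in_size / stride)

# Repro: the input width is 0 (shape [2, 1, 0, 1]) and the column stride
# is 128, so out_cols computes to 0 and `Check failed: out_cols > 0 (0
# vs. 0)` fires instead of a recoverable InvalidArgument error.
assert same_padding_out_size(0, 128) == 0
# A non-empty width yields a positive output size and no crash.
assert same_padding_out_size(256, 128) == 2
```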
@google-ml-butler google-ml-butler bot added the type:bug Bug label Mar 8, 2023
@SuryanarayanaY SuryanarayanaY added comp:ops OPs related issues TF 2.12 For issues related to Tensorflow 2.12 labels Mar 9, 2023
SuryanarayanaY (Collaborator)

Hi @shijy16 ,

I have replicated the reported issue with tf-nightly (2.13.0-dev20230308) on Colab.

Please refer to the snapshot below. Thanks for reporting.

[Screenshot: Colab session reproducing the crash, 2023-03-09]

@SuryanarayanaY SuryanarayanaY added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Mar 14, 2023
SuryanarayanaY (Collaborator)

Still an issue with tf-nightly (2.15.0-dev20231004). The crash persists, and the logs are not very informative.

[Screenshot: crash with tf-nightly 2.15.0-dev20231004, 2023-10-04]

@SuryanarayanaY SuryanarayanaY added awaiting review Pull request awaiting review and removed stat:awaiting tensorflower Status - Awaiting response from tensorflower labels Mar 10, 2024