
A crash due to check-fail can be triggered in QuantizedConv2D #59927

Closed
shijy16 opened this issue Mar 8, 2023 · 3 comments · Fixed by #63288
Labels
awaiting review Pull request awaiting review comp:ops OPs related issues TF 2.12 For issues related to Tensorflow 2.12 type:bug Bug

Comments


shijy16 commented Mar 8, 2023


Issue Type

Bug

Have you reproduced the bug with TF nightly?

Yes

Source

binary

Tensorflow Version

2.13.0.dev20230307

Custom Code

No

OS Platform and Distribution

Linux Ubuntu 20.04

Mobile device

No response

Python version

3.8

Bazel version

No response

GCC/Compiler version

No response

CUDA/cuDNN version

CUDA 11.5

GPU model and memory

No response

Current Behaviour?

A crash due to a check-fail can be triggered in QuantizedConv2D and its public API `tf.compat.v1.nn.quantized_conv2d`.

Standalone code to reproduce the issue

import tensorflow as tf
print(tf.__version__)
strides = [1, 128, 128, 1]
padding = "SAME"
dilations = [1, 1, 1, 1]
# Input with an empty width dimension (shape [2, 1, 0, 1]); this is what triggers the check-fail.
input = tf.cast(tf.random.uniform([2, 1, 0, 1], minval=0, maxval=64, dtype=tf.int64), dtype=tf.quint8)
filter = tf.cast(tf.random.uniform([1, 1, 1, 1], minval=0, maxval=64, dtype=tf.int64), dtype=tf.quint8)
min_input = tf.random.uniform([], dtype=tf.float32)
max_input = tf.random.uniform([], dtype=tf.float32)
min_filter = tf.random.uniform([], dtype=tf.float32)
max_filter = tf.random.uniform([], dtype=tf.float32)
# res = tf.raw_ops.QuantizedConv2D(
res = tf.compat.v1.nn.quantized_conv2d(
    strides=strides,
    padding=padding,
    dilations=dilations,
    input=input,
    filter=filter,
    min_input=min_input,
    max_input=max_input,
    min_filter=min_filter,
    max_filter=max_filter,
)
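Because the failing CHECK aborts the entire process rather than raising a Python error, callers cannot recover with try/except. A minimal defensive sketch, assuming the caller validates shapes before invoking the op (`has_empty_dim` is a hypothetical helper, not part of TensorFlow's API):

```python
# Hypothetical workaround: reject empty tensors before invoking
# QuantizedConv2D, since the kernel CHECK-fails (and aborts the whole
# process) when an output spatial dimension computes to zero.
def has_empty_dim(shape):
    """Return True if any dimension of `shape` is zero."""
    return any(int(d) == 0 for d in shape)

# The repro input shape [2, 1, 0, 1] has an empty width dimension.
assert has_empty_dim([2, 1, 0, 1])      # would crash QuantizedConv2D
assert not has_empty_dim([2, 1, 1, 1])  # safe to pass to the op
```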

Relevant log output

2023-03-08 10:44:27.398763: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-03-08 10:44:27.448284: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-03-08 10:44:28.229207: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2.13.0-dev20230307
2023-03-08 10:44:29.975476: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1635] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 7865 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB, pci bus id: 0000:2f:00.0, compute capability: 7.0
2023-03-08 10:44:30.000649: I tensorflow/core/common_runtime/process_function_library_runtime.cc:586] Finished graph optimizations for MultiDevice function "__wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0" with target device "/job:localhost/replica:0/task:0/device:GPU:0". Took 0 secs.
2023-03-08 10:44:30.179857: I tensorflow/core/common_runtime/process_function_library_runtime.cc:586] Finished graph optimizations for MultiDevice function "__wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0" with target device "/job:localhost/replica:0/task:0/device:GPU:0". Took 0 secs.
2023-03-08 10:44:30.185432: I tensorflow/core/common_runtime/process_function_library_runtime.cc:586] Finished graph optimizations for MultiDevice function "__wrapped__RandomUniformInt_device_/job:localhost/replica:0/task:0/device:GPU:0" with target device "/job:localhost/replica:0/task:0/device:GPU:0". Took 0 secs.
2023-03-08 10:44:30.193600: I tensorflow/core/common_runtime/process_function_library_runtime.cc:586] Finished graph optimizations for MultiDevice function "__wrapped__Cast_device_/job:localhost/replica:0/task:0/device:CPU:0" with target device "/job:localhost/replica:0/task:0/device:CPU:0". Took 0 secs.
2023-03-08 10:44:30.195940: I tensorflow/core/common_runtime/process_function_library_runtime.cc:586] Finished graph optimizations for MultiDevice function "__wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0" with target device "/job:localhost/replica:0/task:0/device:GPU:0". Took 0 secs.
2023-03-08 10:44:30.201918: I tensorflow/core/common_runtime/process_function_library_runtime.cc:586] Finished graph optimizations for MultiDevice function "__wrapped__QuantizedConv2D_device_/job:localhost/replica:0/task:0/device:CPU:0" with target device "/job:localhost/replica:0/task:0/device:CPU:0". Took 0 secs.
2023-03-08 10:44:30.202557: F tensorflow/core/kernels/quantized_conv_ops.cc:572] Check failed: out_cols > 0 (0 vs. 0)
Aborted (core dumped)
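The failing check sits in the output-size computation of quantized_conv_ops.cc. A minimal sketch of the arithmetic, assuming the standard SAME-padding output-size formula (out = ceil(in / stride)):

```python
import math

def same_padding_out_size(in_size, stride):
    """Output spatial size under SAME padding: ceil(in_size / stride)."""
    return math.ceil(in_size / stride)

# Repro: the input width is 0 (shape [2, 1, 0, 1]) and the column stride
# is 128, so out_cols computes to 0 and `Check failed: out_cols > 0 (0
# vs. 0)` fires instead of a recoverable InvalidArgument error.
assert same_padding_out_size(0, 128) == 0
# A non-empty width yields a positive output size and no crash.
assert same_padding_out_size(256, 128) == 2
```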
@google-ml-butler google-ml-butler bot added the type:bug Bug label Mar 8, 2023
@SuryanarayanaY SuryanarayanaY added comp:ops OPs related issues TF 2.12 For issues related to Tensorflow 2.12 labels Mar 9, 2023
SuryanarayanaY (Collaborator)

Hi @shijy16 ,

I have replicated the reported issue with tf-nightly (2.13.0-dev20230308) on Colab.

Please refer to the snapshot below. Thanks for reporting.

[Screenshot: Colab session reproducing the crash, 2023-03-09]

@SuryanarayanaY SuryanarayanaY added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Mar 14, 2023
SuryanarayanaY (Collaborator)

Still an issue with tf-nightly (2.15.0-dev20231004). The crash persists, and the logs are not very informative.

[Screenshot: crash with tf-nightly 2.15.0-dev20231004, 2023-10-04]

@SuryanarayanaY SuryanarayanaY added awaiting review Pull request awaiting review and removed stat:awaiting tensorflower Status - Awaiting response from tensorflower labels Mar 10, 2024