Conv2D computes wrongly in Windows OS #64396

Shuo-Sun20 · 2024-03-25T07:25:12Z

Issue type

Bug

Have you reproduced the bug with TensorFlow Nightly?

Yes

Source

source

TensorFlow version

tf 2.16

Custom code

Yes

OS platform and distribution

Windows 10

Mobile device

No response

Python version

No response

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

No response

GPU model and memory

No response

Current behavior?

On Windows, Conv2D produces a wrong output in some cases, while it behaves correctly in others.
This error does not occur on Linux, even with the same code.

An example of wrong execution:

The result l(x) has the wrong shape: it does not match what compute_output_shape reports.

I noticed that an existing issue, #63860, reports a similar error in Conv3D. I suspect Conv2D and Conv3D share the problem, since they have the same parent class, BaseConv.

Standalone code to reproduce the issue

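# This test case works fine on Linux but gives a wrong result on Windows.

from keras.layers import Conv2D
import numpy as np

x=np.random.rand(1,2,2,1)
# Layer arguments taken from the snippet posted later in this thread.
l=Conv2D(1,3,(1,1),'valid','channels_last', [1,1],1, 'linear', True)
print(l(x).shape)
print(l.compute_output_shape(x.shape))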

Relevant log output

TensorShape([1, 2, 2, 1])
(1, 0, 0, 1)

Venkat6871 · 2024-03-26T06:51:00Z

Hi @Shuo-Sun20,
I ran your code on Colab using TF v2.16.1 and did not see any issue. Please find the gist here for reference.

Thank you!

Shuo-Sun20 · 2024-03-26T09:03:47Z

This issue only occurs on Windows, so it will not show up on Colab (Linux).
Please try it on Windows.

NeoZhangJianyu · 2024-04-02T02:39:48Z

@Shuo-Sun20

Using the following code, I got the same result on both Linux and Windows:

from keras.layers import Conv2D
import numpy as np

x=np.random.rand(1,2,2,1)
l=Conv2D(1,3,(1,1),'valid','channels_last', [1,1],1, 'linear', True)
print(l(x).shape)
print(l.compute_output_shape(x.shape))

(1, 2, 2, 1)
(1, 0, 0, 1)

tensorflow 2.16.1
keras 3.1.1

The kernel size is 3, but the input size is less than 3. It's a strange case.
Could you confirm that these input parameters are right?
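To make the arithmetic behind the (1, 0, 0, 1) shape explicit, here is a minimal sketch of the usual output-size formula for 'valid' padding. The helper name valid_output_size is made up for illustration, and this is the textbook formula rather than code copied from Keras.

```python
def valid_output_size(input_size, kernel_size, stride=1, dilation=1):
    """Spatial output size of a convolution with 'valid' padding.

    Hypothetical helper for illustration; not a Keras or TensorFlow API.
    """
    # A dilated kernel covers (kernel_size - 1) * dilation + 1 input positions.
    effective_kernel = (kernel_size - 1) * dilation + 1
    # Count the positions where the kernel fits entirely inside the input.
    return (input_size - effective_kernel) // stride + 1


# Input 2 with kernel 3, as in this issue: the kernel does not fit anywhere.
print(valid_output_size(2, 3))  # 0
# A more typical case: input 5 with kernel 3.
print(valid_output_size(5, 3))  # 3
```

With a 2x2 input and a 3x3 kernel both spatial dimensions come out as 0, which agrees with the compute_output_shape result above.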

Shuo-Sun20 · 2024-04-02T06:07:08Z

You are right, kernel_size > input_size should be an invalid parameter combination, yet Conv2D currently generates a result without any warning. Maybe a checker should exist here?
The behavior of Conv2D is different on Linux and Windows (I failed to reproduce it on Colab); this inconsistency may need deeper inspection.
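A rough sketch of what such a checker could look like, assuming channels_last input and 'valid' padding. The function name check_conv_input and its signature are hypothetical; nothing like it is claimed to exist in Keras.

```python
def check_conv_input(input_shape, kernel_size, padding="valid", dilation_rate=1):
    """Raise ValueError if a 'valid' convolution would have no output positions.

    Hypothetical checker for illustration only; assumes channels_last layout,
    i.e. input_shape = (batch, *spatial, channels).
    """
    if padding != "valid":
        # With 'same' padding the input is padded, so a small input is fine.
        return
    spatial_shape = input_shape[1:-1]
    for axis, (size, kernel) in enumerate(zip(spatial_shape, kernel_size)):
        if size is None:
            # Unknown dimension: nothing to check statically.
            continue
        effective_kernel = (kernel - 1) * dilation_rate + 1
        if size < effective_kernel:
            raise ValueError(
                f"Spatial dimension {axis} has size {size}, which is smaller "
                f"than the effective kernel size {effective_kernel}."
            )


# The case from this issue: 2x2 input, 3x3 kernel.
try:
    check_conv_input((1, 2, 2, 1), (3, 3))
except ValueError as err:
    print("rejected:", err)
```

Raising an error rather than returning an empty tensor is a design choice in this sketch; a warning would also be a reasonable option.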

NeoZhangJianyu · 2024-04-07T07:37:06Z

@Shuo-Sun20

As for the checker for abnormal input, could you report it as a feature request in another issue?
I ran the case and got the same result on both Windows and Linux.
You said they were different in your case.
Could you share the full logs from both?
Please also share the output of 'pip list' on Windows and Linux.

Shuo-Sun20 · 2024-04-07T09:38:55Z

I'll report it in another issue.
I ran the following code on both Windows and Linux:

from keras.layers import Conv2D
import numpy as np

x=np.random.rand(1,2,2,1)
l=Conv2D(1,3,(1,1),'valid','channels_last', [1,1],1, 'linear', True)
print(l(x).shape)
print(l.compute_output_shape(x.shape))

On Windows the result is:

2024-04-07 16:35:31.739978: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-04-07 16:35:32.406322: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-04-07 16:35:33.580266: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
(1, 2, 2, 1)
(1, 0, 0, 1)

The pip list on Windows:

absl-py==2.1.0
astunparse==1.6.3
certifi==2024.2.2
charset-normalizer==3.3.2
flatbuffers==24.3.25
gast==0.5.4
google-pasta==0.2.0
grpcio==1.62.1
h5py==3.10.0
idna==3.6
importlib_metadata==7.1.0
keras==3.1.1
libclang==18.1.1
Markdown==3.6
markdown-it-py==3.0.0
MarkupSafe==2.1.5
mdurl==0.1.2
ml-dtypes==0.3.2
namex==0.0.7
numpy==1.26.4
opt-einsum==3.3.0
optree==0.11.0
packaging==24.0
protobuf==4.25.3
Pygments==2.17.2
requests==2.31.0
rich==13.7.1
six==1.16.0
tensorboard==2.16.2
tensorboard-data-server==0.7.2
tensorflow==2.16.1
tensorflow-intel==2.16.1
tensorflow-io-gcs-filesystem==0.31.0
termcolor==2.4.0
typing_extensions==4.11.0
urllib3==2.2.1
Werkzeug==3.0.2
wrapt==1.16.0
zipp==3.18.1

while in this Linux Colab, the result is:

(1, 0, 0, 1)
(1, 0, 0, 1)

I failed to install tensorflow-intel 2.16.1 in Colab (Linux), so I just installed tensorflow and keras using the regular pip command.
You can freely edit the code in the Colab and install packages.

The pip list is a little different since Colab preinstalls many packages used in deep learning. The list is too long to show in this comment. You can find it yourself through the shared link.

NeoZhangJianyu · 2024-04-09T08:26:25Z

@Shuo-Sun20
On Windows, my result is the same as yours.
On Linux, when the oneDNN path in TF is enabled with TF_ENABLE_ONEDNN_OPTS=1, the result is the same as on Windows.
With TF_ENABLE_ONEDNN_OPTS=0, the result is the same as yours (different from Windows).

I think this comes from the difference between the oneDNN code and the Eigen code in TF.

However, the following code does not work in the Colab environment you provided.
I guess the TF build in Colab doesn't support the oneDNN code path.
Maybe you could try it on a local or another Linux machine.

import os
os.environ['TF_ENABLE_ONEDNN_OPTS']='1'

from keras.layers import Conv2D
import numpy as np

x=np.random.rand(1,2,2,1)
l=Conv2D(1,3,(1,1),'valid','channels_last', [1,1],1, 'linear', True)
print(l(x).shape)
print(l.compute_output_shape(x.shape))
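To compare both settings side by side, one option is to run the same snippet in two fresh Python processes, one per value of TF_ENABLE_ONEDNN_OPTS. This is only a sketch: run_with_onednn and REPRO are names made up here, and it assumes the variable has to be in the environment before TensorFlow is imported, which is why each run gets its own process.

```python
import os
import subprocess
import sys


def run_with_onednn(flag, code):
    """Run `code` in a fresh Python process with TF_ENABLE_ONEDNN_OPTS=flag.

    Hypothetical helper for illustration; returns the child's stdout.
    """
    # Copy the current environment and override only the oneDNN switch.
    env = dict(os.environ, TF_ENABLE_ONEDNN_OPTS=flag)
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# The reproduction snippet from this thread, passed to the child process as a string.
REPRO = """
from keras.layers import Conv2D
import numpy as np

x = np.random.rand(1, 2, 2, 1)
l = Conv2D(1, 3, (1, 1), 'valid', 'channels_last', [1, 1], 1, 'linear', True)
print(l(x).shape)
print(l.compute_output_shape(x.shape))
"""

# Usage (requires tensorflow and keras to be installed):
# for flag in ("0", "1"):
#     print("TF_ENABLE_ONEDNN_OPTS=" + flag)
#     print(run_with_onednn(flag, REPRO))
```

If the two runs print different shapes for l(x) on the same machine, that would point at the oneDNN versus Eigen code paths rather than at the operating system.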

google-ml-butler bot added the type:bug Bug label Mar 25, 2024

google-ml-butler bot assigned Venkat6871 Mar 25, 2024

Venkat6871 added TF 2.16 subtype:windows Windows Build/Installation Issues labels Mar 26, 2024

Venkat6871 added the stat:awaiting response Status - Awaiting response from author label Mar 26, 2024

google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label Mar 26, 2024

GatGit12 mentioned this issue Mar 26, 2024

Please bring back native Windows CUDA support! #59918

Open

Venkat6871 assigned SuryanarayanaY and unassigned Venkat6871 Mar 27, 2024

SuryanarayanaY added the subtype:cpu-intel To track windows cpu issues label Mar 27, 2024

SuryanarayanaY added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Apr 3, 2024

This was referenced Apr 7, 2024

A checker is needed in Conv layers keras-team/keras#19457

Closed

A checker is needed for inputs of Conv layers. #65214

Open
