
tf.constant with tf.float16 results in incorrect outputs with Mac M1 chip #57010

Open
slyforce opened this issue Aug 4, 2022 · 7 comments
Assignees
Labels
comp:ops (OPs related issues), stat:awaiting response (Status - Awaiting response from author), subtype:macOS (macOS Build/Installation issues), TF 2.9 (Issues found in the TF 2.9 release (or RCs)), type:bug (Bug)

Comments

@slyforce

slyforce commented Aug 4, 2022


Issue Type

Bug

Source

binary

Tensorflow Version

2.9.2

Custom Code

No

OS Platform and Distribution

MacOS 12.3 arm64

Mobile device

No response

Python version

3.9.12

Bazel version

No response

GCC/Compiler version

No response

CUDA/cuDNN version

No response

GPU model and memory

No response

Current Behaviour?

With the pip wheel tensorflow_macos==2.9.2 (tensorflow_macos-2.9.2-cp39-cp39-macosx_11_0_arm64.whl), tf.constant in graph mode, or equivalently inside tf.function()-decorated functions, produces incorrect tensor contents when the meta-optimizer is enabled. Disabling the meta-optimizer yields correct behavior, as does using tf.float32.

Examination of the output generated with TF_CPP_MIN_LOG_LEVEL=0 TF_CPP_MAX_VLOG_LEVEL=3 shows that the binary content of the tensor is simply the value passed in, rather than the actual float16 bit pattern.

Issue #53260 appears to suffer from the same problem.
This also seems to be specific to the Mac M1 arm64 wheel: the behaviour is correct on Linux-based systems and wheels.
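As a side note (my own sketch, not confirmed from the TensorFlow sources): a correctly encoded float16 1.0 has the bit pattern 0x3C00. One plausible reading of the observed 0.0 result is that the low 16 bits of the float32 bit pattern of 1.0 (0x3F800000) get stored verbatim as the float16 payload, which decode to exactly 0.0. A minimal numpy illustration:

```python
import numpy as np

# A correctly encoded float16 1.0 has bit pattern 0x3C00.
assert np.float16(1.0).view(np.uint16) == 0x3C00

# float32 1.0 has bit pattern 0x3F800000.
bits32 = np.float32(1.0).view(np.uint32)
assert bits32 == 0x3F800000

# Hypothesis: if the low 16 bits of that float32 pattern were stored
# verbatim as a float16 payload, they would decode to 0.0 -- matching
# the "Fail! Contents of fetched tensor: 0.0" output above.
low16 = np.uint16(bits32 & 0xFFFF)
assert low16.view(np.float16) == np.float16(0.0)
```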

Standalone code to reproduce the issue

import numpy as np
import tensorflow as tf


def f(dtype, disable_meta_optimizer: bool):
  tf.config.optimizer.set_experimental_options(
    options={
      "disable_meta_optimizer": disable_meta_optimizer,
      # The graph will only consist of a single node; the default threshold is 4.
      "min_graph_nodes": 1,
    }
  )
  print(f"dtype={dtype} disable_meta_optimizer={disable_meta_optimizer}")
  with tf.Graph().as_default() as g, tf.compat.v1.Session(graph=g) as s:
    t = tf.constant(1.0, dtype=dtype)
    try:
      fetch = s.run(t)
      assert fetch.astype(np.float32) == np.full([], 1.0, np.float32)
    except AssertionError:
      print(f"Fail! Contents of fetched tensor: {fetch}")
    else:
      print("Success!")


for dtype in [tf.float32, tf.float16]:
  for disable_meta_optimizer in [True, False]:
    f(dtype, disable_meta_optimizer)

Relevant log output

dtype=<dtype: 'float32'> disable_meta_optimizer=True
Success!
dtype=<dtype: 'float32'> disable_meta_optimizer=False
Success!
dtype=<dtype: 'float16'> disable_meta_optimizer=True
Success!
dtype=<dtype: 'float16'> disable_meta_optimizer=False
Fail! Contents of fetched tensor: 0.0
@google-ml-butler google-ml-butler bot added the type:bug Bug label Aug 4, 2022
@slyforce slyforce changed the title from "tf.constant with tf.float16 results in incorrect outputs" to "tf.constant with tf.float16 results in incorrect outputs with Mac M1 chip" Aug 4, 2022
@mohantym mohantym added comp:ops OPs related issues TF 2.9 Issues found in the TF 2.9 release (or RCs) subtype:macOS macOS Build/Installation issues labels Aug 5, 2022
@mohantym mohantym assigned gadagashwini and unassigned mohantym Aug 5, 2022
@gadagashwini
Contributor

Hi @slyforce, I executed your code with TensorFlow 2.9.2 on a Mac.
It looks like it's working as expected.

dtype=<dtype: 'float32'> disable_meta_optimizer=True
Metal device set to: AMD Radeon Pro 555X

systemMemory: 32.00 GB
maxCacheSize: 2.00 GB

2022-08-08 14:18:07.615126: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2022-08-08 14:18:07.615370: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
2022-08-08 14:18:08.291252: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:354] MLIR V1 optimization pass is not enabled
Success!
dtype=<dtype: 'float32'> disable_meta_optimizer=False
2022-08-08 14:18:08.298983: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2022-08-08 14:18:08.299012: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
2022-08-08 14:18:08.300201: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:113] Plugin optimizer for device_type GPU is enabled.
Success!
dtype=<dtype: 'float16'> disable_meta_optimizer=True
2022-08-08 14:18:08.303720: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2022-08-08 14:18:08.303745: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
Success!
dtype=<dtype: 'float16'> disable_meta_optimizer=False
2022-08-08 14:18:08.307425: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2022-08-08 14:18:08.307450: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
2022-08-08 14:18:08.308204: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:113] Plugin optimizer for device_type GPU is enabled.
Success!

@gadagashwini gadagashwini added the stat:awaiting response Status - Awaiting response from author label Aug 8, 2022
@slyforce
Author

slyforce commented Aug 8, 2022

Thanks for looking into this!
My script output was generated on the Mac M1 Pro CPU, whereas you used the GPU device of an older Mac. Using the GPU via tensorflow-metal also results in the same incorrect behaviour with tf.float16.

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label Aug 8, 2022
@gadagashwini
Contributor

Hi @slyforce, I tested it on macOS Monterey Version 12.5 with the CPU version of TensorFlow. I followed the instructions mentioned here to install TensorFlow. Thank you!

@gadagashwini gadagashwini added the stat:awaiting response Status - Awaiting response from author label Aug 10, 2022
@maxhgerlach
Contributor

@gadagashwini, I see the same issue as OP, MacBook Pro 2021 with M1 Pro (aarch64), macOS 12.5.

TensorFlow is installed via the package tensorflow-macos==2.9.2, no tensorflow-metal needed.

The issue may very well be specific to arm64 (Apple Silicon). In that case you won't see it with an Intel CPU.

@slyforce
Author

slyforce commented Aug 11, 2022

What @maxhgerlach described reflects the current situation. (replying to remove the awaiting response tag)

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label Aug 11, 2022
@sachinprasadhs sachinprasadhs added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Aug 16, 2022
@sachinprasadhs
Contributor

Thanks for reporting the issue. I was able to reproduce the behavior with TensorFlow 2.9.2 on a macOS M1 machine.

@tilakrayal
Contributor

@slyforce,
I tried to execute the code with the latest TensorFlow version on macOS and observed that the output was as expected.
Kindly find the screenshot for the reference.

[screenshot of the output]

Thank you!

@tilakrayal tilakrayal added the stat:awaiting response Status - Awaiting response from author label Apr 26, 2024
@sachinprasadhs sachinprasadhs removed the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Apr 26, 2024