
Unwanted tf.function retracing when using variable-length inputs #38561

Closed
zaccharieramzi opened this issue Apr 15, 2020 · 27 comments
Assignees
Labels
comp:keras Keras related issues TF 2.1 for tracking issues in 2.1 release TF 2.2 Issues related to TF 2.2 type:bug Bug

Comments

@zaccharieramzi
Contributor

System information

  • Have I written custom code (as opposed to using a stock
    example script provided in TensorFlow): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
  • TensorFlow installed from (source or binary): pip
  • TensorFlow version (use command below): 2.2.0rc2
  • Python version: 3.6.8

Describe the current behavior

Many warnings about tf.function retracing appear when using a Keras model in a loop with variable-length inputs.

Describe the expected behavior

I would like to avoid retracing when it is not needed (for example, for a fully convolutional model).

Standalone code to reproduce the issue

from random import randint

import tensorflow as tf
from tensorflow.keras.layers import Conv1D
from tensorflow.keras.models import Sequential

model = Sequential()
model.add(Conv1D(8, 3))
model.build([None, 12, 1])

predict_tensors = [
    tf.random.normal([randint(1, 8), randint(4, 40), 1])
    for _ in range(10)
]
for t in predict_tensors:
    _ = model.predict(t)

Other info / logs

Logs:

WARNING: Logging before flag parsing goes to stderr.
W0406 09:22:52.525994 139643050075904 def_function.py:598] 5 out of the last 6 calls to <function Model.make_predict_function.<locals>.predict_function at 0x7f00a7fc1268> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings is likely due to passing python objects instead of tensors. Also, tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. Please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for more details.
W0406 09:22:52.615050 139643050075904 def_function.py:598] 6 out of the last 7 calls to <function Model.make_predict_function.<locals>.predict_function at 0x7f00a7fc1268> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings is likely due to passing python objects instead of tensors. Also, tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. Please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for more details.
W0406 09:22:52.653312 139643050075904 def_function.py:598] 7 out of the last 8 calls to <function Model.make_predict_function.<locals>.predict_function at 0x7f00a7fc1268> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings is likely due to passing python objects instead of tensors. Also, tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. Please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for more details.
W0406 09:22:52.706550 139643050075904 def_function.py:598] 8 out of the last 10 calls to <function Model.make_predict_function.<locals>.predict_function at 0x7f00a7fc1268> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings is likely due to passing python objects instead of tensors. Also, tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. Please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for more details.

This issue was originally described here, and some other people have had trouble with training as well.

When switching back to 2.1, the problem is gone.

@gurushantj
Contributor

As per my understanding, if the input tensor's shape or dtype changes (i.e. it is not constant), the function gets retraced. You may refer to https://www.tensorflow.org/api_docs/python/tf/function

@zaccharieramzi
Contributor Author

Yes, this is totally true, but I am not using tf.function directly myself. Maybe Keras is under the hood, but in any case it should handle inputs with varying shapes (but the same rank and "compatible" shapes) better, for example by specifying a dynamic input signature (see "Input signatures" in the doc).

Moreover, the behaviour I am describing is for version 2.2.0rc2, while the doc is still for 2.1, where there is no issue.
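For illustration, a dynamic input signature along the lines suggested above could look like this (a minimal sketch, not what Keras actually does internally; the `None` dimensions mark axes that are allowed to vary without retracing):

```python
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Conv1D(8, 3)])
model.build([None, None, 1])

# None for the batch and length axes: one trace serves every rank-3,
# single-channel input, so repeated calls do not retrace.
@tf.function(input_signature=[
    tf.TensorSpec(shape=[None, None, 1], dtype=tf.float32)])
def predict(x):
    return model(x)
```

With this signature, calling `predict` on tensors of shape `[2, 10, 1]` and `[5, 30, 1]` reuses the same concrete function.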

@ngc92
Contributor

ngc92 commented Apr 15, 2020

You can see the current doc here:
https://www.tensorflow.org/api_docs/python/tf/function?version=nightly
I think the option you need should be experimental_relax_shapes.

As a workaround, you could try wrapping the Keras model in an explicit tf.function, like this:

@tf.function(experimental_relax_shapes=True)
def predict(x):
    return model.predict(x)

@gurushantj
Contributor

gurushantj commented Apr 15, 2020

Yes, this is totally true, but I am not using tf.function directly myself. Maybe Keras is under the hood, but in any case it should handle inputs with varying shapes (but the same rank and "compatible" shapes) better, for example by specifying a dynamic input signature (see "Input signatures" in the doc).

Moreover, the behaviour I am describing is for version 2.2.0rc2, while the doc is still for 2.1, where there is no issue.

Following is the output of TF 2.1.0; the output seems to be the same:

/usr/local/bin/python3.7 /Users/gurushant/PycharmProjects/MTCNN/test6.py
2020-04-15 14:12:33.527382: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-04-15 14:12:33.545554: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fa56ad8c050 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-04-15 14:12:33.545588: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
WARNING:tensorflow:5 out of the last 5 calls to <function _make_execution_function.<locals>.distributed_function at 0x134d83290> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings is likely due to passing python objects instead of tensors. Also, tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. Please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for more details.
WARNING:tensorflow:6 out of the last 6 calls to <function _make_execution_function.<locals>.distributed_function at 0x134d83290> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings is likely due to passing python objects instead of tensors. Also, tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. Please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for more details.
WARNING:tensorflow:7 out of the last 7 calls to <function _make_execution_function.<locals>.distributed_function at 0x134d83290> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings is likely due to passing python objects instead of tensors. Also, tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. Please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for more details.
WARNING:tensorflow:8 out of the last 8 calls to <function _make_execution_function.<locals>.distributed_function at 0x134d83290> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings is likely due to passing python objects instead of tensors. Also, tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. Please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for more details.
WARNING:tensorflow:9 out of the last 9 calls to <function _make_execution_function.<locals>.distributed_function at 0x134d83290> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings is likely due to passing python objects instead of tensors. Also, tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. Please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for more details.

@zaccharieramzi
Contributor Author

@gurushantj yes, you are right. I don't know why I thought this was in 2.1; it's actually in 2.0 that the problem is gone.

Still, the documentation regarding the re-tracing is about the same.

@ngc92 I tried your workaround but got the following error:

ValueError: When using data tensors as input to a model, you should specify the `steps` argument.

@gurushantj
Contributor

gurushantj commented Apr 15, 2020

@gurushantj yes, you are right. I don't know why I thought this was in 2.1; it's actually in 2.0 that the problem is gone.

Still, the documentation regarding the re-tracing is about the same.

@ngc92 I tried your workaround but got the following error:

ValueError: When using data tensors as input to a model, you should specify the `steps` argument.

Could you please try the following and let me know:

Disable eager execution:

tf.compat.v1.disable_eager_execution()

and pass steps=1 to model.predict.

@amahendrakar
Contributor

I was able to reproduce the issue with TF v2.1, TF v2.2.0rc3, and TF-nightly. Please find the attached gist. Thanks!

@amahendrakar amahendrakar added TF 2.1 for tracking issues in 2.1 release TF 2.2 Issues related to TF 2.2 comp:autograph Autograph related issues labels Apr 15, 2020
@amahendrakar
Contributor

@zaccharieramzi,
Could you please check this comment from a similar issue and let us know if it works? Thanks!

@zaccharieramzi
Contributor Author

@amahendrakar I am not sure what I am supposed to see in that comment. The issue you linked suggests that this should be dealt with.

@zaccharieramzi
Contributor Author

@ngc92 still got an error: AttributeError: 'Tensor' object has no attribute '_numpy'.

@ngc92
Contributor

ngc92 commented Apr 15, 2020

Is this what you want to do?

@tf.function(experimental_relax_shapes=True)
def predict(t):
    return model(t)

for t in predict_tensors:
    _ = predict(t)

Note that you are no longer using any of the features of model.predict, but since you seem to be looping over examples by hand, that might be OK.

Also, TF 2.2 has support for a custom model.predict_function, i.e. you might be able to do something like

model.predict_function = tf.function(experimental_relax_shapes=True)(model.predict_function)

i.e. just wrapping the provided default function in something that relaxes shapes.
I haven't tried 2.2 yet, so I'm not very sure about the second suggestion.

@zaccharieramzi
Contributor Author

@ngc92 yes, this is a fair workaround. However, there are cases where you would want to use predict, for its callbacks or batch-size handling.

The second option you provided didn't work straight out of the box, but you can try things in TF 2.2 in Colab: https://colab.research.google.com/drive/1MfRPQyRhjrF7he7fymoIEG7k64YCd0Da

You will notice that in the case of evaluate (and I guess train), if you feed the variable-length input through a tf.data dataset, it doesn't retrace the function, suggesting a bug somewhere.
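The tf.data path mentioned above can be sketched as follows (a sketch under the assumption that each generated tensor is one whole batch; `output_signature` requires TF 2.4+, and its `None` dimensions let one trace cover all batch sizes and lengths):

```python
from random import randint

import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Conv1D(8, 3)])
model.build([None, None, 1])

# Each generated element is a whole batch, with varying batch size and length.
def gen():
    for _ in range(10):
        yield tf.random.normal([randint(1, 8), randint(4, 40), 1])

# The element spec keeps batch and length dimensions dynamic (None).
ds = tf.data.Dataset.from_generator(
    gen, output_signature=tf.TensorSpec(shape=[None, None, 1],
                                        dtype=tf.float32))

# Calling the model on dataset elements avoids one-trace-per-shape.
preds = [model(batch) for batch in ds]
```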

@gowthamkpr gowthamkpr assigned mdanatg and unassigned gowthamkpr Apr 19, 2020
@gowthamkpr gowthamkpr added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Apr 19, 2020
@mdanatg mdanatg added the comp:core issues related to core part of tensorflow label Apr 19, 2020
@mdanatg

mdanatg commented Apr 19, 2020

We're investigating; it seems that a newly added warning about function retracing fires more often than expected.

@kkimdev
Contributor

kkimdev commented Apr 20, 2020

The error message is not a new one, so this seems to come from the existing retracing-detection logic. I think the warning is working as intended (WAI), since the function really is being traced many times here. Perhaps Keras using experimental_relax_shapes is an option?

@mdanatg

mdanatg commented Apr 20, 2020

@omalleyt12 @fchollet

@zaccharieramzi
Contributor Author

@mdanatg do you have any news on this?

@mdanatg mdanatg added comp:keras Keras related issues and removed comp:autograph Autograph related issues comp:core issues related to core part of tensorflow labels May 8, 2020
@mdanatg

mdanatg commented May 8, 2020

@zaccharieramzi No fix yet. According to the code, the function that the warning mentions should be cached and traced only once.

@omalleyt12 any thoughts why the tracing happens so many times?

@zaccharieramzi
Contributor Author

@mdanatg OK, too bad. I have just one question though; maybe you have the answer.
Do you know if the fix provided by @ngc92, i.e.:

@tf.function(experimental_relax_shapes=True)
def predict(t):
    return model(t)

for t in predict_tensors:
    _ = predict(t)

still allows predict to benefit from a distribution strategy (typically MirroredStrategy)? My guess is that it does not, but I am not sure, and I am not sure how to test this on a single GPU (2 logical GPUs).
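For what it's worth, one way to get two logical devices for such a test is `tf.config.set_logical_device_configuration`; a rough sketch (shown on CPU here, but the same call works on a physical GPU by passing a `memory_limit` per logical device):

```python
import tensorflow as tf

# Split one physical device into two logical devices, then mirror across
# them. This must run before TensorFlow initializes its devices.
physical = tf.config.list_physical_devices('CPU')[0]
tf.config.set_logical_device_configuration(
    physical,
    [tf.config.LogicalDeviceConfiguration(),
     tf.config.LogicalDeviceConfiguration()])

strategy = tf.distribute.MirroredStrategy(['/cpu:0', '/cpu:1'])
with strategy.scope():
    model = tf.keras.Sequential([tf.keras.layers.Conv1D(8, 3)])
    model.build([None, None, 1])
```

Whether a hand-rolled tf.function wrapper still distributes across the replicas is exactly the open question here; this only sets up an environment in which to check.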

@mdanatg

mdanatg commented May 8, 2020

@guptapriya

@omalleyt12
Contributor

@zaccharieramzi Thanks for the issue! This should be fixed in the latest nightly.


@zaccharieramzi
Contributor Author

@omalleyt12 thanks! One question I didn't ask though: was this bug actually slowing anything down, or was I just getting annoyed by an unwanted warning?

@chopwoodwater

TF 2.3 still has this issue.

@RuralHunter

The problem still persists with version 2.4.1, and the workaround doesn't work on predict_on_batch:

    @tf.function(experimental_relax_shapes=True)
    def predict_on_batch(self, states):
        return self.model.predict_on_batch(states)

TF reports the error:

RuntimeError: Detected a call to `Model.predict_on_batch` inside a `tf.function`. `Model.predict_on_batch` is a high-level endpoint that manages its own `tf.function`. Please move the call to `Model.predict_on_batch` outside of all enclosing `tf.function`s. Note that you can call a `Model` directly on `Tensor`s inside a `tf.function` like: `model(x)`.

@RuralHunter

OK, I found that my problem was caused by multiple threads calling predict_on_batch. I added an empty predict call before launching the threads, and the warning was gone.

@aakashba

aakashba commented Apr 25, 2021

I have this issue when using model.predict inside a loop over 5 different models. The warning also leads to:

nvidia-smi
Failed to initialize NVML: Driver/library version mismatch

@A150852

A150852 commented Sep 12, 2021

My observation: in a multiprocessing setting, invoking predict() causes this warning, and when processing large amounts of data it eventually errors out (possibly a memory leak). Setting experimental_relax_shapes=True for the function invoked by multiple processes resolves the issue. Using model(input) instead of model.predict(input) also resolves the issue. So the key issue seems to be retracing whenever the input shape changes. The issue persists even when using tensors as input instead of Python objects.
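The two mitigations that recur in this thread can be sketched together like this (note: in newer TF releases `experimental_relax_shapes` was deprecated in favor of `reduce_retracing`, so treat the flag name as version-dependent):

```python
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Conv1D(8, 3)])
model.build([None, None, 1])

# Mitigation 1: call the model directly instead of model.predict(),
# bypassing the predict machinery that retraces per input shape.
y1 = model(tf.random.normal([2, 12, 1]))

# Mitigation 2: wrap the call in a shape-relaxing tf.function, so
# differently shaped inputs can share a relaxed trace.
@tf.function(experimental_relax_shapes=True)
def infer(x):
    return model(x)

y2 = infer(tf.random.normal([3, 20, 1]))
```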
