
Unexpected behavior calling tf.keras.Model.call() with named parameters #35902

Closed

yngtodd opened this issue Jan 15, 2020 · 4 comments

Labels
comp:keras Keras related issues · stat:awaiting tensorflower Status - Awaiting response from tensorflower · TF 2.0 Issues relating to TensorFlow 2.0 · type:bug Bug

Comments


yngtodd commented Jan 15, 2020

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 18.04.3 LTS
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 2.0.2
  • Python version: Python 3.7.4 Anaconda
  • CUDA/cuDNN version: ROCM:
Package: rocm-libs
Version: 3.0.6
Priority: optional
Section: devel
Maintainer: Advanced Micro Devices Inc.
Installed-Size: 13.3 kB
Depends: rocfft, rocrand, rocblas, hipblas, rocsparse, hipsparse, rocalution, rocprim, rocthrust, hipcub
Homepage: https://github.com/RadeonOpenCompute/ROCm
Download-Size: 802 B
APT-Manual-Installed: yes
APT-Sources: http://repo.radeon.com/rocm/apt/debian xenial/main amd64 Packages
Description: Radeon Open Compute (ROCm) Runtime software stack
  • GPU model and memory: AMD Radeon VII

The full environment script does not work for my machine, but:

`python -c "import tensorflow as tf; print(tf.version.GIT_VERSION, tf.version.VERSION)"`
>>> v2.0.0-rocm-3-g0826c3a 2.0.2

Describe the current behavior

I am seeing odd behavior when calling a tf.keras.Model's call method using the names of the method's parameters. The method works as expected with positional arguments, but raises a TypeError when all arguments are passed by name. However, when I call my model_instance.call() directly with the named parameters, things work as expected. It makes me wonder which __call__ method is actually being invoked when simply running model_instance().

Describe the expected behavior

Using the names of the parameters in a tf.keras.Model's call method should not be raising an error.

Code to reproduce the issue

First, a little setup showing that calling a tf.keras.layers.Attention instance from a user-defined function, call, works both with and without naming the positional arguments:

import tensorflow as tf


def call(q, v, k, mask_q=None, mask_v=None):
    """ Call attention instance """
    return attn(inputs=[q, v, k], mask=[mask_q, mask_v])

x = tf.random.uniform((1, 2, 2))
attn = tf.keras.layers.Attention(use_scale=True)
# positional arguments work well
call(x, x, x)
>>> <tf.Tensor: id=89, shape=(1, 2, 2), dtype=float32, numpy=
array([[[0.62968266, 0.6612503 ],
        [0.6235384 , 0.73767066]]], dtype=float32)>
# naming the parameters is also fine here
call(q=x, v=x, k=x)
>>> <tf.Tensor: id=89, shape=(1, 2, 2), dtype=float32, numpy=
array([[[0.62968266, 0.6612503 ],
        [0.6235384 , 0.73767066]]], dtype=float32)>

Things start getting weird when doing something similar within a tf.keras.Model:

class MyAttention(tf.keras.Model):
    
    def __init__(self):
        super(MyAttention, self).__init__()
        self.attention = tf.keras.layers.Attention(use_scale=True)
        
    def call(self, q, v, k, mask_q=None, mask_v=None):
        return self.attention(inputs=[q, v, k], mask=[mask_q, mask_v])


my_attention = MyAttention()
# Still works with positional arguments
my_attention(x, x, x)
>>> <tf.Tensor: id=106, shape=(1, 2, 2), dtype=float32, numpy=
array([[[0.62968266, 0.6612503 ],
        [0.6235384 , 0.73767066]]], dtype=float32)>
# Breaks when naming the arguments in my_attention:
my_attention(q=x, v=x, k=x)
>>> 
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-15-5fa3b47998d9> in <module>
----> 1 my_attention(q=x, v=x, k=x)

TypeError: __call__() missing 1 required positional argument: 'inputs'

Finally, if I explicitly call my_attention.call():

my_attention.call(q=x, v=x, k=x)
>>> <tf.Tensor: id=106, shape=(1, 2, 2), dtype=float32, numpy=
array([[[0.62968266, 0.6612503 ],
        [0.6235384 , 0.73767066]]], dtype=float32)>

Other info / logs
Here is a gist to show this behavior:
https://gist.github.com/yngtodd/f3bda25503a9611765ab33c1178db48c


yngtodd commented Jan 15, 2020

Interestingly, if I add **kwargs to MyAttention:

class MyAttention(tf.keras.Model):
    
    def __init__(self):
        super(MyAttention, self).__init__()
        self.attention = tf.keras.layers.Attention(use_scale=True)
        
    def call(self, q, v, k, mask_q=None, mask_v=None, **kwargs):
        """ Print **kwargs, then call tf.keras.layers.Attention """
        for key, value in kwargs.items(): 
            print(f'{key} == {value}') 
        return self.attention(inputs=[q, v, k], mask=[mask_q, mask_v])

I can check whether my_attention.call is being used:

# as expected:
my_attention(x, x, x, extra_arg='hi')
>>> extra_arg == hi
<tf.Tensor: id=58, shape=(1, 2, 2), dtype=float32, numpy=
array([[[0.74538416, 0.40650344],
        [0.6511749 , 0.39531568]]], dtype=float32)>
# fails to print the kwarg, complains about missing positional arg `inputs`
my_attention(q=x, v=x, k=x, extra_arg='hi')
>>>
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-15-c2133110731e> in <module>
----> 1 my_attention(q=x, v=x, k=x, extra_arg='hi')

TypeError: __call__() missing 1 required positional argument: 'inputs'
# passing just the first argument positionally makes it work again
my_attention(x, v=x, k=x, extra_arg='hi')
>>> extra_arg == hi
<tf.Tensor: id=63, shape=(1, 2, 2), dtype=float32, numpy=
array([[[0.74538416, 0.40650344],
        [0.6511749 , 0.39531568]]], dtype=float32)>
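Building on the observation above, a hedged workaround sketch (again plain Python, not the actual Keras internals): either pass the first tensor positionally, or name the call method's first parameter inputs so the keyword matches what the wrapper's __call__ expects:

```python
# Workaround sketch: if call()'s first parameter is named `inputs`,
# the keyword form matches the wrapper's expected name and both
# calling styles succeed.
class Base:
    def __call__(self, inputs, *args, **kwargs):
        return self.call(inputs, *args, **kwargs)

class MyAttention(Base):
    def call(self, inputs, v=None, k=None):
        return (inputs, v, k)

m = MyAttention()
print(m(1, v=2, k=3))         # positional first arg: (1, 2, 3)
print(m(inputs=1, v=2, k=3))  # keyword also works: (1, 2, 3)
```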

@gadagashwini-zz gadagashwini-zz self-assigned this Jan 16, 2020
@gadagashwini-zz gadagashwini-zz added TF 2.0 Issues relating to TensorFlow 2.0 comp:keras Keras related issues type:bug Bug labels Jan 16, 2020
@gadagashwini-zz
Contributor

I could replicate the issue with TF 2.0.
Please find the gist here. Thanks!

@jvishnuvardhan jvishnuvardhan added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Jan 16, 2020
@qlzh727
Member

qlzh727 commented Feb 14, 2020

I can't reproduce this error with the latest tf-nightly (2.2.0.dev20200203). The issue was probably fixed by recent changes.

@qlzh727 qlzh727 closed this as completed Feb 14, 2020
