Attention Mechanism not working #56
Closed
SaharaAli16 opened this issue May 27, 2021 · 10 comments

SaharaAli16 commented May 27, 2021

Hi,
I have added an attention layer (following the example) to my simple LSTM network shown below.

# Imports assumed from context (Keras plus the `attention` package from this repo):
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense
from attention import Attention

timestep = timesteps  # `timesteps` is defined earlier in the notebook
features = 11
model = Sequential()
model.add(LSTM(64, input_shape=(timestep, features), return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(32, return_sequences=True))
model.add(LSTM(16, return_sequences=True))
model.add(Attention(32))
model.add(Dense(32))
model.add(Dense(16))
model.add(Dense(1))
print(model.summary())
The code worked fine up until last week, and the model summary showed the attention layer's details like this:
(screenshot: model summary including the attention layer)

However, now running the same code gives me a weird error.
ValueError: tf.function-decorated function tried to create variables on non-first call.

I also noticed that the model summary has changed:
(screenshot: changed model summary)

I am tight on time due to an upcoming deadline, so any assistance would be highly appreciated.
P.S. This was a fully working model that stopped working all of a sudden, for no apparent reason.

philipperemy (Owner) commented

Try downgrading your TensorFlow version.

SaharaAli16 commented May 28, 2021

I changed the way I was defining the model, without downgrading TensorFlow, and it started working again. New model definition:

# Imports assumed from context:
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, LSTM, Dropout, Dense
from attention import Attention

timestep = timesteps  # `timesteps` is defined earlier in the notebook
features = 11

model_input = Input(shape=(timestep, features))
x = LSTM(64, return_sequences=True)(model_input)
x = Dropout(0.2)(x)
x = LSTM(32, return_sequences=True)(x)
x = LSTM(16, return_sequences=True)(x)
x = Attention(32)(x)
x = Dense(32)(x)
x = Dense(16)(x)
x = Dense(1)(x)
model = Model(model_input, x)
print(model.summary())

philipperemy (Owner) commented

Great!

SaharaAli16 commented May 30, 2021

Quick follow-up question: can you tell me how to downgrade TensorFlow to 2.3? The current version in Colab is 2.5, and I am hitting the reported issue again, even with the new model definition.
I know %tensorflow_version 2.x cannot downgrade TF to 2.3.

SaharaAli16 reopened this May 30, 2021
philipperemy (Owner) commented

I think this should work:

!pip install tensorflow==2.3

Like this:
(screenshot: the pip install running in a Colab cell)
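
A minimal sketch of the full check, assuming a Colab notebook (the restart step reflects how Colab picks up reinstalled packages):

!pip install tensorflow==2.3

# After the install, restart the Colab runtime so the new version is
# actually loaded, then verify:
import tensorflow as tf
print(tf.__version__)  # expected: 2.3.x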

SaharaAli16 (Author) commented

Alright, so that worked. Next up: I cannot use multiple Attention layers in one ensembled model. I have model1 with one attention layer and model2 with another, but when I concatenate the two models, I get this error:
ValueError: The name "last_hidden_state" is used 2 times in the model. All layer names should be unique.
I believe this is because the attention layer itself contains multiple inner/nested layers, and a model cannot have two layers with the same name. I tried renaming the attention layer, but since the name applies only to the wrapper, the renaming didn't help and the error persists.
Any workaround for this?

philipperemy (Owner) commented

@SaharaAli16 yes, you have to remove the hard-coded names inside the layer: https://github.com/philipperemy/keras-attention-mechanism/blob/0f8b440e8e74fb25309b2d391f7280bf4f13129a/attention/attention.py#L24. Otherwise Keras will complain that the names already exist when you instantiate a second Attention instance.
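
To illustrate the workaround, here is a minimal sketch of per-block name prefixing. The attention_block helper, its layer names, and the simplified Luong-style weighted sum are all illustrative, not the library's actual implementation:

import tensorflow as tf
from tensorflow.keras import layers

def attention_block(x, prefix):
    # x: LSTM outputs of shape (batch, timesteps, features).
    # Every inner layer carries a unique `prefix`, so two attention
    # blocks can coexist in one merged model without name collisions.
    score = layers.Dense(1, name=f"{prefix}_score")(x)                 # (batch, T, 1)
    weights = layers.Softmax(axis=1, name=f"{prefix}_weights")(score)  # attention over time
    weighted = layers.Multiply(name=f"{prefix}_weighted")([x, weights])
    # Sum over the time axis to get one context vector per sample.
    return layers.Lambda(lambda t: tf.reduce_sum(t, axis=1),
                         name=f"{prefix}_context")(weighted)

Building model1 with attention_block(x, "att1") and model2 with attention_block(x, "att2") sidesteps the duplicate-name error when the two models are concatenated.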

shlomi-schwartz commented

The suggested setup:

timestep = timesteps
features = 11

model_input = Input(shape=(timestep,features))
x = LSTM(64, return_sequences=True)(model_input)
x = Dropout(0.2)(x)
x = LSTM(32, return_sequences=True)(x)
x = LSTM(16, return_sequences=True)(x)
x = Attention(32)(x)
x = Dense(32)(x)
x = Dense(16)(x)
x = Dense(1)(x)
model = Model(model_input, x)
print(model.summary())

This no longer works with TF 2.7.0.

Error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-88e2c30c5093> in <module>()
      7 x = LSTM(32, return_sequences=True)(x)
      8 x = LSTM(16, return_sequences=True)(x)
----> 9 x = Attention(32)(x)
     10 x = Dense(32)(x)
     11 x = Dense(16)(x)

1 frames
/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py in __init__(self, trainable, name, dtype, dynamic, **kwargs)
    339              trainable.dtype is tf.bool)):
    340       raise TypeError(
--> 341           'Expected `trainable` argument to be a boolean, '
    342           f'but got: {trainable}')
    343     self._trainable = trainable

TypeError: Expected `trainable` argument to be a boolean, but got: 32

SaharaAli16 (Author) commented

I would suggest copying the source code into your own project and using it directly. That should work.

philipperemy (Owner) commented

Yes, this issue was fixed in the latest release (4.1) of the attention mechanism.

pip install attention --upgrade

will solve it.
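
For completeness, a sketch of calling the layer after the upgrade. Passing units by keyword is an assumption based on the traceback above (where the bare positional 32 ended up in the base Layer's trainable argument); check the installed version's signature:

# After `pip install attention --upgrade` (attention >= 4.1):
from attention import Attention

x = Attention(units=32)(x)  # keyword argument avoids the positional clash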
