
ValueError: Exception encountered when calling RNN.call() - Undefined shapes are not supported #19482

Closed
mpetteno opened this issue Apr 10, 2024 · 3 comments
@mpetteno

Hi everyone,

I have updated Keras to the latest release (3.2.1).
Looking at this snippet:

model_a = keras.Sequential([
    keras.Input(shape=(5, 10)),
    keras.layers.LSTM(10, return_sequences=True),
    keras.layers.LSTM(10, return_state=True)
])
model_a.summary()
model_b = keras.Sequential([
    keras.Input(shape=(5, 10)),
    keras.layers.RNN([keras.layers.LSTMCell(10), keras.layers.LSTMCell(10)], return_state=True)
])
model_b.summary()

If I print the summaries without return_state=True, both models have the same number of parameters:

Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ lstm (LSTM)                     │ (None, 5, 10)          │           840 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ lstm_1 (LSTM)                   │ (None, 10)             │           840 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 1,680 (6.56 KB)
 Trainable params: 1,680 (6.56 KB)
 Non-trainable params: 0 (0.00 B)
Model: "sequential_1"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ rnn (RNN)                       │ (None, 10)             │         1,680 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 1,680 (6.56 KB)
 Trainable params: 1,680 (6.56 KB)
 Non-trainable params: 0 (0.00 B)

But with the flag set I get this error:

Traceback (most recent call last):
  File "test.py", line 164, in test_rnn
    model_b = keras.Sequential([
              ^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/keras/src/models/sequential.py", line 74, in __init__
    self._maybe_rebuild()
  File ".venv/lib/python3.11/site-packages/keras/src/models/sequential.py", line 139, in _maybe_rebuild
    self.build(input_shape)
  File ".venv/lib/python3.11/site-packages/keras/src/layers/layer.py", line 222, in build_wrapper
    original_build_method(*args, **kwargs)
  File ".venv/lib/python3.11/site-packages/keras/src/models/sequential.py", line 180, in build
    x = layer(x)
        ^^^^^^^^
  File ".venv/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py", line 122, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File ".venv/lib/python3.11/site-packages/optree/ops.py", line 594, in tree_map
    return treespec.unflatten(map(func, *flat_args))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: Exception encountered when calling RNN.call().

Undefined shapes are not supported.

Arguments received by RNN.call():
• args=('<KerasTensor shape=(None, 5, 10), dtype=float32, sparse=None, name=keras_tensor_5>',)
• kwargs={'training': 'False', 'mask': 'None'}

Is this a Keras issue?

From what I understand, the main difference should be that model_b returns the states of both LSTM layers while model_a returns only the final ones (as expected). But in the stacked implementation of model_b, are the states of the first cell used to initialize the states of the second one?

Thanks for your support again.

@fchollet
Member

From what I understand the main difference should be that model_b returns the states of both LSTM layers while model_a returns only the final ones (as expected).

Yes, that's right. The second model would return 4 state tensors (2 per cell).

But in the stacked implementation of model_b are the states of the first layer used to initialize the states of the second one?

No, states are initialized at zero by each cell. To get a non-zero state you would have to pass the initial state when calling the layer.
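For example, a minimal sketch of passing the first layer's final states as the second layer's initial state (layer sizes match the snippets above):

```python
import keras

# Sketch: feed the first LSTM's final [h, c] states into the second LSTM,
# instead of letting the second layer start from zeros.
inputs = keras.Input(shape=(5, 10))
seq, state_h, state_c = keras.layers.LSTM(
    10, return_sequences=True, return_state=True
)(inputs)
# initial_state seeds this layer with the states from the previous one.
out = keras.layers.LSTM(10)(seq, initial_state=[state_h, state_c])
model = keras.Model(inputs, out)
```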

Is this a Keras issue?

Yes, that's actually a bug. I've fixed it at HEAD. Check that it works for you.

Note that since your layer returns (outputs, [cell_1_state_0, cell_1_state_1], [cell_2_state_0, cell_2_state_1]), you cannot use it with a Sequential model. Instead you could do something like:

inputs = keras.Input(shape=(5, 10))
outputs, cell_1_states, cell_2_states = keras.layers.RNN(
    [keras.layers.LSTMCell(10), keras.layers.LSTMCell(10)],
    return_state=True,
)(inputs)
model = keras.Model(inputs, [outputs] + cell_1_states + cell_2_states)
model.summary()

@mpetteno
Author

mpetteno commented Apr 11, 2024

Hi @fchollet, thanks for the quick fix, but I can't find it at the HEAD of the master branch.

So model_a and model_b are actually equivalent in the first snippet? And will they behave differently (even though they have the same number of parameters) if I do something like:

import keras
from keras.layers import RNN, LSTM, LSTMCell

inputs = keras.Input(shape=(5, 10))
first_lstm_layer_out, *cell_states = LSTM(10, return_sequences=True, return_state=True)(inputs)
second_lstm_layer_out = LSTM(10)(first_lstm_layer_out, initial_state=cell_states)
model_a = keras.Model(inputs, second_lstm_layer_out)
model_a.summary()

inputs = keras.Input(shape=(5, 10))
stacked_lstm_outputs = RNN([LSTMCell(10), LSTMCell(10)])(inputs)
model_b = keras.Model(inputs, stacked_lstm_outputs)
model_b.summary()

@fchollet
Member

Yes, there's only one gotcha: Functional model inputs/outputs must be flat structures, and here stacked_lstm_outputs is nested. You have to flatten it (as in my example above). If you want to keep it structured, write a subclassed model instead.
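A minimal sketch of that subclassed-model route, assuming the nested (outputs, cell_1_states, cell_2_states) structure returned with return_state=True; the class and variable names here are illustrative:

```python
import keras

# Sketch: a subclassed model may return nested structures that a
# Functional model cannot.
class StackedLSTM(keras.Model):
    def __init__(self, units=10, **kwargs):
        super().__init__(**kwargs)
        self.rnn = keras.layers.RNN(
            [keras.layers.LSTMCell(units), keras.layers.LSTMCell(units)],
            return_state=True,
        )

    def call(self, inputs):
        # Keep the per-cell state lists nested instead of flattening them.
        outputs, cell_1_states, cell_2_states = self.rnn(inputs)
        return outputs, cell_1_states, cell_2_states
```

Calling `StackedLSTM()(x)` then yields the output tensor plus the two state lists directly, without flattening them into model outputs.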
