rnn with initial_state model can't be loaded with load_model #32

Closed
xing-w opened this issue Aug 31, 2023 · 5 comments
xing-w commented Aug 31, 2023

Issue type

Bug

Have you reproduced the bug with TensorFlow Nightly?

Yes

Source

source

TensorFlow version

2.13.0

Custom code

Yes

OS platform and distribution

No response

Mobile device

No response

Python version

No response

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

No response

GPU model and memory

No response

Current behavior?

I have a simple RNN model built around an LSTMCell.
I want to initialize its states with initial_state_h and initial_state_c.

import numpy as np
import tensorflow as tf

batch_size = 16
units = 8
inputs = tf.keras.layers.Input(shape=(20, 5), batch_size=batch_size)
lstm_cell_fw = tf.keras.layers.LSTMCell(units)

# Random initial hidden (h) and cell (c) states, one row per sample in the batch
initial_state_h = tf.random.normal(shape=(batch_size, units), mean=0., stddev=10., dtype=tf.dtypes.float32)
initial_state_c = tf.random.normal(shape=(batch_size, units), mean=0., stddev=10., dtype=tf.dtypes.float32)

# Stateful RNN wrapper that returns the last output plus the final h and c states
lstm_layer_fw = tf.keras.layers.RNN(lstm_cell_fw, stateful=True, return_state=True, return_sequences=False)
outputs, states_h_fw, states_c_fw = lstm_layer_fw(inputs, initial_state=[initial_state_h, initial_state_c])

# Classification head on top of the last RNN output
lstm_dense1 = tf.keras.layers.Dense(16, activation='relu')
lstm_dense2 = tf.keras.layers.Dense(2, activation='softmax')
out = lstm_dense2(lstm_dense1(outputs))

model = tf.keras.models.Model(inputs, out)

After compiling and training, I save the model with model.save('my_model_test.keras').

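model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()

# Random training data: 96 samples, a multiple of batch_size as the stateful layer requires
xTrain = np.random.rand(96, 20, 5)
yTrain = np.random.rand(96, 2)

# Ten passes over the data, one fit() call per pass
for _ in range(10):
  model.fit(xTrain, yTrain, batch_size=batch_size)

model.save('my_model_test.keras')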

But when I try to load it with load_model = tf.keras.models.load_model('my_model_test.keras'), it raises this error:

13 frames
/usr/local/lib/python3.10/dist-packages/keras/src/backend.py in int_shape(x)
   1530     """
   1531     try:
-> 1532         shape = x.shape
   1533         if not isinstance(shape, tuple):
   1534             shape = tuple(shape.as_list())

AttributeError: 'float' object has no attribute 'shape'

I also tried saving in other formats (.h5, .json, etc.). All give the same error.

However, if I call the layer without initial_state, as in outputs, states_h_fw, states_c_fw = lstm_layer_fw(inputs), everything works and load_model has no problem.

Standalone code to reproduce the issue

https://colab.research.google.com/gist/sushreebarsa/df202f7ea6ad3c85bdf4184cc8e1c9a1/rnn_save_model.ipynb

Relevant log output

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-7-8e0130abf25e> in <cell line: 1>()
----> 1 load_model = tf.keras.models.load_model('my_model_test.keras')

13 frames
/usr/local/lib/python3.10/dist-packages/keras/src/backend.py in int_shape(x)
   1530     """
   1531     try:
-> 1532         shape = x.shape
   1533         if not isinstance(shape, tuple):
   1534             shape = tuple(shape.as_list())

AttributeError: 'float' object has no attribute 'shape'
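
For readers unfamiliar with this message, here is a minimal sketch of what the traceback shows. `int_shape_sketch` is a hypothetical stand-in that copies only the three lines visible above; it is not the Keras function itself. The idea that the saved config ends up holding the initial states as nested lists of plain floats is an assumption used for illustration and is not confirmed anywhere in this thread.

```python
import numpy as np


def int_shape_sketch(x):
    """Hypothetical stand-in mirroring only the lines shown in the traceback."""
    shape = x.shape
    if not isinstance(shape, tuple):
        shape = tuple(shape.as_list())
    return shape


# An array-like state has a .shape attribute, so the lookup succeeds.
state = np.zeros((16, 8), dtype="float32")
array_shape = int_shape_sketch(state)

# Assumption: after a round trip through a serialized config, the state could
# come back as nested Python lists, whose leaves are plain floats.
as_nested_lists = state.tolist()
leaf = as_nested_lists[0][0]

try:
    int_shape_sketch(leaf)
    message = None
except AttributeError as err:
    # A Python float has no .shape attribute, which matches the reported error.
    message = str(err)

print(array_shape)
print(message)
```

If that assumption holds, the failure would be specific to models whose layer call received concrete tensors as initial_state, which fits the observation that the model loads fine without initial_state.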
@tilakrayal
Collaborator

@xing-w,
Please take a look at this issue, where a similar feature has been proposed and which is still open. Also, please follow the similar issue that has been raised to get updates. Thank you!

@xing-w
Author

xing-w commented Sep 5, 2023

I think I am facing the same issue with load_model(). Right now, I'm using save_weights() and load_weights() to work around this problem. I hope it will be fixed soon.
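
A minimal sketch of that workaround, assuming the architecture is rebuilt in code rather than restored from the saved config. `build_model`, the file names, and the choice to store the initial states in a separate `.npz` file are hypothetical and not taken from the thread. The states are stored separately here because they are constants passed to the layer call rather than layer weights, so a weights file alone would presumably not preserve them.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

BATCH_SIZE, UNITS = 16, 8


def build_model(state_h, state_c):
    """Hypothetical helper: rebuilds the model from the issue in code."""
    inputs = tf.keras.layers.Input(shape=(20, 5), batch_size=BATCH_SIZE)
    rnn = tf.keras.layers.RNN(
        tf.keras.layers.LSTMCell(UNITS),
        stateful=True, return_state=True, return_sequences=False,
    )
    outputs, _, _ = rnn(
        inputs, initial_state=[tf.constant(state_h), tf.constant(state_c)]
    )
    hidden = tf.keras.layers.Dense(16, activation="relu")(outputs)
    out = tf.keras.layers.Dense(2, activation="softmax")(hidden)
    return tf.keras.models.Model(inputs, out)


# Initial states kept as NumPy arrays so they can be written to disk.
rng = np.random.default_rng(0)
state_h = rng.normal(0.0, 10.0, size=(BATCH_SIZE, UNITS)).astype("float32")
state_c = rng.normal(0.0, 10.0, size=(BATCH_SIZE, UNITS)).astype("float32")

model = build_model(state_h, state_c)
# ... compile and train as in the original post ...

# Hypothetical file names inside a temporary directory.
workdir = tempfile.mkdtemp()
weights_path = os.path.join(workdir, "my_model_test.weights.h5")
states_path = os.path.join(workdir, "initial_states.npz")

model.save_weights(weights_path)             # weights only, no layer config
np.savez(states_path, h=state_h, c=state_c)  # initial states saved separately

# Later: rebuild the same architecture, then restore the weights into it.
saved = np.load(states_path)
restored = build_model(saved["h"], saved["c"])
restored.load_weights(weights_path)
```

The trade-off is that the model-building code has to be kept alongside the weights file, since nothing about the architecture is read back from disk.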

sachinprasadhs transferred this issue from keras-team/keras Sep 22, 2023
@tilakrayal
Collaborator

@xing-w,
I ran the code above on tf-nightly with Keras 3.0, and it executed without any error. Please find the gist here. Thank you!

This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions bot added the stale label Jan 26, 2024

This issue was closed because it has been inactive for 28 days. Please reopen if you'd like to work on this further.
