How to use TimeDistributed if I have multiple inputs #3057

Closed
yataozhong opened this issue Jun 24, 2016 · 23 comments
@yataozhong

TimeDistributed works fine if there is only one input, as in the example at the bottom of the page. But when there are multiple inputs, TimeDistributed does not seem to work.

Say, if my model has 3 inputs:

seq_inputs = [Input(shape=(TIME_STEPS, FEATURE_LENGTH)) for i in range(3)]
outputs = TimeDistributed(model)(seq_inputs)

the reported error is: TypeError: can only concatenate tuple (not "list") to tuple

So I changed the last line to outputs = TimeDistributed(model)(*seq_inputs), but there is still an error: TypeError: call() takes at most 3 arguments (4 given)

################# below is my code

import pdb

from keras.models import Sequential, Model
from keras.layers import Input, Dense, TimeDistributed, merge

NUM_INPUTS = 3
TIME_STEPS = 20
FEATURE_LENGTH = 784

# shared inner model applied to each input
model = Sequential()
model.add(Dense(32, input_dim=FEATURE_LENGTH))

# the multi-input model works fine on its own
inputs = [Input(shape=(FEATURE_LENGTH,)) for i in range(NUM_INPUTS)]
temps = [model(x) for x in inputs]
merged = merge(temps, mode='concat')
merged_model = Model(input=inputs, output=merged)

merged_model(inputs)  # sanity check: calling the merged model works

pdb.set_trace()

# wrapping it in TimeDistributed is what fails
seq_inputs = [Input(shape=(TIME_STEPS, FEATURE_LENGTH)) for i in range(NUM_INPUTS)]
outputs = TimeDistributed(merged_model)(*seq_inputs)

@ghost

ghost commented Aug 5, 2016

Have you found a solution yet?

@farizrahman4u
Contributor

farizrahman4u commented Aug 7, 2016

num_inputs = 3
input_dim = 784
input_length = 20
output_dim = 32

# shared inner model
model = Sequential()
model.add(Dense(output_dim, input_dim=input_dim))

# pack all inputs into a single tensor, slice each one back out, and merge the results
merged_input = Input((num_inputs, input_dim))
temps = [model(merged_input[:, x, :]) for x in range(num_inputs)]
merged = merge(temps, 'concat')
merged_model = Model(input=merged_input, output=merged)

# stack the sequence inputs along a new axis so TimeDistributed sees a single input
seq_inputs = [Input((input_length, input_dim)) for x in range(num_inputs)]
seq = [Reshape((input_length, 1, input_dim))(x) for x in seq_inputs]
seq = merge(seq, 'concat', concat_axis=2)
outputs = TimeDistributed(merged_model)(seq)
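
Presumably the outer model is then assembled from the separate sequence inputs along these lines (an editorial sketch, not part of the original comment; the optimizer and loss are placeholders):

outer_model = Model(input=seq_inputs, output=outputs)
outer_model.compile(optimizer='adam', loss='mse')
outer_model.summary()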

@ghost

ghost commented Aug 7, 2016

@farizrahman4u
Thanks, my solution looks similar.
I merge the inputs and then use Lambda layers (for slicing) to retrieve all the parts.
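
For reference, a minimal sketch of that merge-then-slice idea (the variable names follow the previous comment; the dimensions are illustrative):

from keras.layers import Input, Lambda

# pack all inputs into one tensor of shape (batch, num_inputs, input_dim) ...
packed = Input(shape=(num_inputs, input_dim))
# ... then recover each part inside the wrapped model with a Lambda slice
parts = [Lambda(lambda t, i=i: t[:, i, :], output_shape=(input_dim,))(packed)
         for i in range(num_inputs)]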

@farizrahman4u
Contributor

If you guys need it, I could add multi-input support to the TimeDistributed wrapper.

@ghost

ghost commented Aug 9, 2016

I think it is not necessary, as everything needed for this already exists. Also, this thread easily pops up when searching the Internet.

@farizrahman4u
Contributor

#3432

@nixingyang

@farizrahman4u I believe the trick of merging multiple input tensors would not work if the shapes of the input tensors differ from each other. It would be nice if there were native, robust support for using multiple input tensors with TimeDistributed.

@farizrahman4u
Contributor

They should have the same number of timesteps either way.

@daniilsorokin

@farizrahman4u I think that would be useful, also for the sake of consistency.
Or at least it should be mentioned in the TimeDistributed documentation/error message that it doesn't support multiple inputs. This thread is easy to find once you figure out the cause of the problem, but it's not obvious right away that the problem exists. It is easy to assume that you can pass multiple inputs here as you can anywhere else.

@calclavia

calclavia commented Apr 7, 2017

Is multi-input support planned? I'm currently packing multiple inputs into a single input, which is not ideal design.

@QuentinFresnel

@farizrahman4u
Thank you for your solution. I tried it, but it fails.


With the Theano backend, I had

TypeError: ('Not a Keras tensor:', Subtensor{::, int64, ::}.0)

because of this line:

...
temps = [model(merged_input[:, x, :]) for x in range(num_inputs)]
...

With the TensorFlow backend, I had

AttributeError: 'NoneType' object has no attribute 'inbound_nodes'

because of this line:

...
merged_model = Model(input=merged_input, output=merged)
...

I am new to Keras and use:

keras 2.0.2
theano 0.10.0dev1.dev-RELEASE
tensorflow 1.1.0
ubuntu 16.04

@ChiaraMasiero

Any update on multi-input support for TimeDistributed?
I would like to implement a hierarchical model to classify each sentence based on the context information provided by the whole document. Thus, I designed a model where the first recurrent layer works at the sentence level and the second one at the document level. I want to include further sentence-level information (e.g. sentence type or category), concatenate it to the output of the first layer, and feed the resulting augmented tensor to the second recurrent layer. Here is my code:

# input sentence
in_sentence = Input(shape=(MAX_LENGTH,), dtype='int32')
# additional sentence-level information (fixed-length array, already "embedded")
in_info = Input(shape=(INFO_LENGTH,), dtype='float32')

# first level (sentence-level): word embedding + GRU
embedded_sentence = Embedding(len(vocab) + 1,
                              EMBEDDING_DIM,
                              weights=[embedding_matrix],
                              trainable=False)(in_sentence)
recurrent_sentence = GRU(hidden_dim_1)(embedded_sentence)
recurrent_sentence_and_info = concatenate([recurrent_sentence, in_info])
encoded_model = Model([in_sentence, in_info], recurrent_sentence_and_info)

# second level (document-level)
sequence_input = Input(shape=(MAX_SENTENCES, MAX_LENGTH), dtype='int32')
info_seq_input = Input(shape=(MAX_SENTENCES, INFO_LENGTH), dtype='float32')
seq_encoded = TimeDistributed(encoded_model)([sequence_input, info_seq_input])

# encode the whole document
seq_encoded = GRU(hidden_dim_2, return_sequences=True)(seq_encoded)

# per-sentence prediction
prediction = Dense(NUM_CLASSES, activation='softmax')(seq_encoded)
model = Model([sequence_input, info_seq_input], prediction)

For the sake of completeness, MAX_LENGTH is different from INFO_LENGTH, in general.
When I run
seq_encoded = TimeDistributed(encoded_model)([sequence_input, info_seq_input])
I get the following assertion error: assert len(input_shape) >= 3

Without the additional sentence-level information, TimeDistributed has a single input and everything works fine. Is there any workaround to include multiple inputs?
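
One possible workaround (an untested sketch reusing the names from the code above): wrap only the single-input sentence encoder in TimeDistributed and concatenate the per-sentence info outside the wrapper, since both tensors then share the (batch, MAX_SENTENCES, ...) layout.

sentence_model = Model(in_sentence, recurrent_sentence)
# per-sentence encoding: (batch, MAX_SENTENCES, hidden_dim_1)
seq_encoded = TimeDistributed(sentence_model)(sequence_input)
# append the per-sentence info along the feature axis:
# (batch, MAX_SENTENCES, hidden_dim_1 + INFO_LENGTH)
seq_encoded = concatenate([seq_encoded, info_seq_input])
seq_encoded = GRU(hidden_dim_2, return_sequences=True)(seq_encoded)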

@stale

stale bot commented Sep 25, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

@ghost

ghost commented Apr 28, 2018

I am using bucketing to group together batches of different lengths. This is time series data where I have multiple time series of the same length within a batch (but variable length across batches) as input to the different layers at the beginning of the model. Thus the input shape for each input layer is (None, 1), because I only have one column of data per input. How can I apply @farizrahman4u's original solution without a fixed input length?

I've tried submitting None as a dimension and get the following error:

ValueError: Tried to convert 'shape' to a tensor and failed. Error: None values not supported.
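
If the only blocker is the fixed length inside the Reshape, one possible adaptation of that solution (a sketch under that assumption, untested with bucketed batches) is to add the extra axis through the backend, which tolerates a None time dimension:

from keras import backend as K
from keras.layers import Lambda

# (batch, None, 1) -> (batch, None, 1, 1) without naming the sequence length
seq = [Lambda(lambda t: K.expand_dims(t, axis=2))(x) for x in seq_inputs]
seq = merge(seq, 'concat', concat_axis=2)  # Concatenate(axis=2) in Keras 2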

@karenyun

karenyun commented May 1, 2018

If the multiple inputs to TimeDistributed have different shapes (say there are three inputs A, B, and C, where only A is a 5D tensor and B and C are not), but I have to pass them together to a custom_conv function through a Lambda layer that is wrapped by TimeDistributed, would it be possible to support passing a list of inputs with different shapes?
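
For what it is worth, a Lambda layer by itself can be called on a list of tensors with different shapes; the restriction discussed in this thread is in the TimeDistributed wrapper. A minimal sketch (custom_conv and the tensors A, B, C stand in for the ones described above):

# the wrapped function receives the list and can treat each tensor differently
out = Lambda(lambda ts: custom_conv(ts[0], ts[1], ts[2]))([A, B, C])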

@dmadeka
Contributor

dmadeka commented Jun 27, 2018

@karenyun Can you elaborate? How do you pass a 5D tensor and 2 non-5D tensors, exactly?

@iretiayo

iretiayo commented Sep 9, 2018

Here is a simple piece of code that shows the problem of trying to combine a multi-input model with an LSTM. It fails on the TimeDistributed line. Any ideas on how to fix it?

from keras.models import Model
from keras.layers import Input, Concatenate
from keras.layers import TimeDistributed, Conv2D, Flatten, Dense
from keras.layers.recurrent import LSTM

sequence_len = 50
input_image_dim = (128,) * 2 + (3,)
input_vector_dim = (5,)
output_dim = 4

# simple multi-input model: one image branch, one vector branch
input_x1 = Input(shape=input_image_dim, name='image_input')
x = Conv2D(128, (4, 4), strides=5, activation='relu')(input_x1)
x1 = Flatten()(x)

input_x2 = Input(shape=input_vector_dim, name='vector_input')
x2 = Dense(32)(input_x2)

x = Concatenate()([x1, x2])
x = Dense(32)(x)
model = Model([input_x1, input_x2], x, name='pre_lstm')
print(model.summary())

# pass the model into an LSTM, one (image, vector) pair per timestep
input_x = Input(shape=(sequence_len,) + input_image_dim, name='encoder_image_input_time_dist')
input_v = Input(shape=(sequence_len,) + input_vector_dim, name='encoder_vector_input_time_dist')

# this is the line that fails: TimeDistributed does not accept a list of inputs
x = TimeDistributed(model)([input_x, input_v])
x = LSTM(256, return_sequences=True, dropout=0.5)(x)
x = Dense(32)(x)
model = Model([input_x, input_v], x, name='full_model')
print(model.summary())

@raharth

raharth commented Oct 13, 2018

@iretiayo Have you found any solution to your problem? I'm currently looking at the exact same architectural problem. Like you, I have a 2D vector and an image as input, which are then fed to an LSTM. If you have a solution using a different approach, it would also be great if you could share it! My only idea of how to solve this is to use T input networks with shared weights, which are then used as a sequence to feed to the LSTM layer.
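
For the record, that shared-weights idea can be sketched on top of @iretiayo's snippet roughly like this (illustrative only, untested; model here is the multi-input 'pre_lstm' model from that snippet): apply the same sub-model to every timestep by slicing, then stack the per-timestep outputs back into a sequence before the LSTM.

from keras import backend as K
from keras.layers import Lambda, Concatenate

step_outputs = []
for t in range(sequence_len):
    img_t = Lambda(lambda z, t=t: z[:, t])(input_x)   # (batch, 128, 128, 3)
    vec_t = Lambda(lambda z, t=t: z[:, t])(input_v)   # (batch, 5)
    out_t = model([img_t, vec_t])                     # same weights at every timestep
    step_outputs.append(Lambda(lambda z: K.expand_dims(z, 1))(out_t))
x = Concatenate(axis=1)(step_outputs)                 # (batch, sequence_len, 32)
x = LSTM(256, return_sequences=True, dropout=0.5)(x)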

@mohammadAbdolhosseiniMoghaddam

@raharth Have you found any solution?
Here is the part of my code that throws the error:

@raharth

raharth commented Feb 15, 2019

@raharth Have you found any solution?
Here is the part of my code that throws the error:

Unfortunately I don't see your code. As far as I remember, I actually used shared weights to solve it, but it was a whole mess and really hacky. In the end I decided to build the same architecture in PyTorch, which is way more flexible and cleaner for an architecture like this. Since it doesn't compile the graph, it doesn't care what you did with the tensor before feeding it to a specific layer, so you just need to make sure that the shape matches what the layer expects.

@mohammadAbdolhosseiniMoghaddam

@farizrahman4u Thanks a lot for the proposed solution. I tried the Lambda layer, but it seems the nested model can't be trained when it is embedded in a Lambda layer. Do you have any suggestions regarding this issue?
Here you can find the snippet of my code where I am stuck. Any help is highly appreciated.

@chrishmorris

RepeatVector can help, e.g. (using static shapes via K.int_shape, since Dense and RepeatVector expect plain integers):

from keras import backend as K

TimeDistributed(Dense(K.int_shape(word_embeddings)[2]))(Concatenate()([
    word_embeddings,
    RepeatVector(K.int_shape(word_embeddings)[1])(sentence_embedding)
]))

@losDaniel

I was able to solve this problem using the RepeatVector layer.

from keras.models import Model
from keras.layers import Input, Dense, BatchNormalization, LSTM, TimeDistributed
from keras.layers import Concatenate, RepeatVector
...
core_input_1 = Input(shape=(self.core_timesteps, self.core_input_1_dim), name='core_input_1')
core_branch_1 = BatchNormalization(momentum=0.0, name='core_1_bn')(core_input_1)
core_branch_1 = LSTM(self.core_nodes[0], activation='relu', name='core_1_lstm_1', return_sequences=True)(core_branch_1)
core_branch_1 = LSTM(self.core_nodes[1], activation='relu', name='core_1_lstm_2')(core_branch_1)

core_input_2 = Input(shape=(self.core_timesteps, self.core_input_2_dim), name='core_input_2')
core_branch_2 = BatchNormalization(momentum=0.0, name='core_2_bn')(core_input_2)
core_branch_2 = LSTM(self.core_nodes[0], activation='relu', name='core_2_lstm_1', return_sequences=True)(core_branch_2)
core_branch_2 = LSTM(self.core_nodes[1], activation='relu', name='core_2_lstm_2')(core_branch_2)

merged = Concatenate()([core_branch_1, core_branch_2])

# repeat the merged vector once per output timestep
full_branch = RepeatVector(self.output_timesteps)(merged)
full_branch = LSTM(self.core_nodes[1], activation='relu', name='final_lstm', return_sequences=True)(full_branch)

# two per-timestep Dense layers, with distinct names so they don't clash
full_branch = TimeDistributed(Dense(self.output_dim, activation='relu', name='td_dense_hidden'))(full_branch)
full_branch = TimeDistributed(Dense(self.output_dim, name='td_dense_out'))(full_branch)

Note that return_sequences before the concat is False, and that the RepeatVector layer repeats the concatenated vector for the same number of timesteps as I want output by the final TimeDistributed layer. This is a multi-variate, multi-step forecasting model.
