
Adding a Dropout to Seq2Seq gives a MissingInputError #118

Closed
KhaledSharif opened this issue Nov 1, 2016 · 7 comments
@KhaledSharif

Running a Seq2Seq model without any dropout (setting the parameter dropout=0.0) works fine in both training and prediction. However, setting it to anything other than 0.0 results in an error during prediction (training works fine). Below is a code snippet that generates the error:

print(training_input.shape, training_output.shape, testing_input.shape, testing_output.shape)

model = Seq2Seq(output_dim=training_output.shape[2], output_length=training_output.shape[1],
                input_shape=(training_input.shape[1], training_input.shape[2]),
                depth=1, dropout=0.1)

model.compile(loss='mse', optimizer='sgd')

model.fit(training_input, training_output, nb_epoch=1, batch_size=512)
p = model.predict(testing_input)

And below is the output with the error produced (training works but prediction fails):

Using Theano backend.
Using gpu device 0: GeForce GTX 1080 (CNMeM is disabled, cuDNN not available)

(31304, 864, 6) (31304, 288, 6) (7253, 864, 6) (7253, 288, 6)

Epoch 1/1
31304/31304 [==============================] - 137s - loss: 0.1878   


Traceback (most recent call last):
  File "/home/khaled/PycharmProjects/WonderStations/sequence.py", line 83, in <module>
    p = model.predict(testing_input)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 1176, in predict
    self._make_predict_function()
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 734, in _make_predict_function
    **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/theano_backend.py", line 727, in function
    return Function(inputs, outputs, updates=updates, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/theano_backend.py", line 713, in __init__
    **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/theano/compile/function.py", line 326, in function
    output_keys=output_keys)
  File "/usr/local/lib/python3.5/dist-packages/theano/compile/pfunc.py", line 486, in pfunc
    output_keys=output_keys)
  File "/usr/local/lib/python3.5/dist-packages/theano/compile/function_module.py", line 1776, in orig_function
    output_keys=output_keys).create(
  File "/usr/local/lib/python3.5/dist-packages/theano/compile/function_module.py", line 1430, in __init__
    accept_inplace)
  File "/usr/local/lib/python3.5/dist-packages/theano/compile/function_module.py", line 176, in std_fgraph
    update_mapping=update_mapping)
  File "/usr/local/lib/python3.5/dist-packages/theano/gof/fg.py", line 180, in __init__
    self.__import_r__(output, reason="init")
  File "/usr/local/lib/python3.5/dist-packages/theano/gof/fg.py", line 351, in __import_r__
    self.__import__(variable.owner, reason=reason)
  File "/usr/local/lib/python3.5/dist-packages/theano/gof/fg.py", line 396, in __import__
    variable=r)
theano.gof.fg.MissingInputError: An input of the graph, used to compute Elemwise{add,no_inplace}(<TensorType(float32, matrix)>, <TensorType(float32, matrix)>), was not provided and not given a value.Use the Theano flag exception_verbosity='high',for more information on this error.

Backtrace when the variable is created:
  File "/home/khaled/PycharmProjects/WonderStations/sequence.py", line 77, in <module>
    depth=1, dropout=0.1)
  File "/usr/local/lib/python3.5/dist-packages/seq2seq/models.py", line 171, in Seq2Seq
    encoded_seq = encoder(encoded_seq)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/topology.py", line 514, in __call__
    self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/topology.py", line 572, in add_inbound_node
    Node.create_node(self, inbound_layers, node_indices, tensor_indices)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/topology.py", line 149, in create_node
    output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))
  File "/usr/local/lib/python3.5/dist-packages/recurrentshop/engine.py", line 311, in call
    last_output_1, outputs_1, states_1, updates = rnn(self.step, x, initial_states, go_backwards=self.go_backwards, mask=mask, unroll=unroll, input_length=input_shape[1])
  File "/usr/local/lib/python3.5/dist-packages/recurrentshop/backend.py", line 120, in rnn
    go_backwards=go_backwards)


Process finished with exit code 1

The environment is Python 3.5, running on Keras 1.1.0 and Theano 0.9.0.dev3.

@farizrahman4u
Owner

This is a bug in Theano.
Workaround:

1. Create two models: one with dropout and one without (dropout=0.).
2. Train the first one.
3. Copy the weights from the trained model to the second model (the one without dropout), using get_weights and set_weights.
4. Predict using the second model.
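The four steps above can be sketched as follows. This is a minimal, self-contained illustration of the weight-copy idea only: `TinyModel` is a hypothetical stand-in for the two `seq2seq.Seq2Seq` models, exposing the same `get_weights()`/`set_weights()` interface that Keras models provide.

```python
# Minimal sketch of the weight-copy workaround. TinyModel is a
# hypothetical stand-in for seq2seq.Seq2Seq; real Keras models expose
# the same get_weights()/set_weights() interface used here.

class TinyModel:
    def __init__(self, dropout=0.0):
        self.dropout = dropout
        self.weights = [0.0, 0.0]  # placeholder for the weight arrays

    def get_weights(self):
        return list(self.weights)

    def set_weights(self, weights):
        self.weights = list(weights)

# 1. Two models with identical architecture: one with dropout
#    (for training) and one without (for prediction).
train_model = TinyModel(dropout=0.1)
pred_model = TinyModel(dropout=0.0)

# 2. Train the first model (stand-in for model.fit(...)).
train_model.weights = [0.5, -1.2]

# 3. Copy the trained weights into the dropout-free model.
pred_model.set_weights(train_model.get_weights())

# 4. Predict with the dropout-free model (stand-in for model.predict).
print(pred_model.get_weights())  # [0.5, -1.2]
```

In the real case, steps 2 and 4 would be `model.fit(...)` on the dropout model and `model.predict(...)` on the dropout-free copy; the copy never builds the prediction graph with the dropout nodes, so the MissingInputError is avoided.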

@KhaledSharif
Author

Thanks for the quick reply. This seems to solve the issue, at least temporarily until the bug is fixed in Theano.

@bityangke

Why does my fit function give this error when dropout > 0?
And for the solution you described, could you please give a simple code example here?
I would be very grateful. @farizrahman4u @KhaledSharif

@TillBeemelmanns

Same problem here when I use the fit function and dropout > 0.

@TillBeemelmanns

TillBeemelmanns commented Dec 21, 2016

This problem is triggered by keras.callbacks.EarlyStopping, because the callback evaluates the model on the validation split.

This script crashes for me with theano.gof.fg.MissingInputError as above:

import seq2seq
import numpy as np
from keras.callbacks import EarlyStopping

x_Train = np.random.random((200,8,2))
y_Train = np.random.random((200,8,2))

model = seq2seq.Seq2Seq(input_dim=2,
                        input_length=8,
                        output_dim=2,
                        output_length=8,
                        hidden_dim=128,
                        dropout=0.2,
                        depth=1)

model.compile('adam', 'mse', metrics=['accuracy'])

early_stopping = EarlyStopping(monitor='val_loss', patience=5, verbose=1, mode='auto')
history = model.fit(x_Train,
                    y_Train,
                    shuffle=True,
                    batch_size=32,
                    nb_epoch=2,
                    verbose=1,
                    validation_split=0.1,
                    callbacks=[early_stopping])

whereas this one runs without error:

import seq2seq
import numpy as np
from keras.callbacks import EarlyStopping

x_Train = np.random.random((200,8,2))
y_Train = np.random.random((200,8,2))

model = seq2seq.Seq2Seq(input_dim=2,
                        input_length=8,
                        output_dim=2,
                        output_length=8,
                        hidden_dim=128,
                        dropout=0.2,
                        peek=False,
                        depth=1)

model.compile('adam', 'mse', metrics=['accuracy'])

history = model.fit(x_Train,
                    y_Train,
                    shuffle=True,
                    batch_size=32,
                    nb_epoch=2,
                    verbose=1)

So avoid using EarlyStopping until this bug is fixed, and find a good nb_epoch by hyperparameter tuning instead.

@TillBeemelmanns

@bityangke, you can use something like this after training and before evaluating (here model_temp is a second model with the same architecture as model but with dropout=0.):

        weights = model.get_weights()    # trained model (dropout > 0)
        model_temp.set_weights(weights)  # identical model with dropout=0.
        del model
        model = model_temp

@bityangke

@Depulsor Thanks!
You mean that we need to do this after each epoch ends?
