
"Dimensions must be equal" exception when using Dense layer on input with a rank greater than 2 #10736

Closed
stiffme opened this issue Jul 20, 2018 · 3 comments
Labels: type:bug/performance, type:docs (Need to modify the documentation)

Comments

stiffme commented Jul 20, 2018

Keras: 2.2.0
TensorFlow-GPU: 1.8.0 / 1.9.0 (both versions show the same issue)

Problem: when a Dense layer is applied to an input with rank greater than 2, TensorFlow throws a "Dimensions must be equal" exception.

If use_bias=False is set on the Dense layer, it works.
If the input is passed through Reshape before the Dense layer, it works.

Given the symptoms above, something seems to be wrong in how the bias is handled for inputs of rank greater than 2. According to the documentation: "Note: if the input to the layer has a rank greater than 2, then it is flattened prior to the initial dot product with kernel."

Here is the short script to reproduce the issue:

from keras import Input, Model
from keras.layers import Dense, Reshape  # Reshape is one of the workarounds


def build_model():
    # Rank-3 input: per-sample shape (30, 40)
    X = Input(shape=(30, 40))
    output = Dense(10, activation='tanh')(X)  # raises during bias_add
    model = Model(inputs=X, outputs=output)
    return model


def main():
    build_model()


if __name__ == "__main__":
    main()

Here is the exception:

Using TensorFlow backend.
Traceback (most recent call last):
  File "/home/stiffme/PycharmProjects/neural_network_transfer/venv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1567, in _create_c_op
    c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimensions must be equal, but are 30 and 10 for 'dense_1/add' (op: 'Add') with input shapes: [?,30,10], [1,10,1].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/stiffme/PycharmProjects/Dense_Concate/main.py", line 15, in <module>
    main()
  File "/home/stiffme/PycharmProjects/Dense_Concate/main.py", line 11, in main
    build_model()
  File "/home/stiffme/PycharmProjects/Dense_Concate/main.py", line 6, in build_model
    output = Dense(10, activation='tanh')(X)
  File "/home/stiffme/PycharmProjects/neural_network_transfer/venv/lib/python3.6/site-packages/keras/engine/topology.py", line 619, in __call__
    output = self.call(inputs, **kwargs)
  File "/home/stiffme/PycharmProjects/neural_network_transfer/venv/lib/python3.6/site-packages/keras/layers/core.py", line 879, in call
    output = K.bias_add(output, self.bias)
  File "/home/stiffme/PycharmProjects/neural_network_transfer/venv/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 3781, in bias_add
    x += reshape(bias, (1, bias_shape[0], 1))
  File "/home/stiffme/PycharmProjects/neural_network_transfer/venv/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 979, in binary_op_wrapper
    return func(x, y, name=name)
  File "/home/stiffme/PycharmProjects/neural_network_transfer/venv/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py", line 297, in add
    "Add", x=x, y=y, name=name)
  File "/home/stiffme/PycharmProjects/neural_network_transfer/venv/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/stiffme/PycharmProjects/neural_network_transfer/venv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
    op_def=op_def)
  File "/home/stiffme/PycharmProjects/neural_network_transfer/venv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1734, in __init__
    control_input_ops)
  File "/home/stiffme/PycharmProjects/neural_network_transfer/venv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1570, in _create_c_op
    raise ValueError(str(e))
ValueError: Dimensions must be equal, but are 30 and 10 for 'dense_1/add' (op: 'Add') with input shapes: [?,30,10], [1,10,1].
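The failing add can be reproduced outside Keras with plain NumPy broadcasting (a minimal sketch, with an assumed batch size of 2): a bias reshaped to (1, 10, 1), as in the `x += reshape(bias, (1, bias_shape[0], 1))` line of the traceback, cannot broadcast against a (batch, 30, 10) activation, whereas reshaping it over the last axis broadcasts cleanly.

```python
import numpy as np

output = np.zeros((2, 30, 10))  # stand-in for the Dense output, batch size 2
bias = np.zeros((10,))

# Shape used by the failing code path: (1, bias_shape[0], 1)
try:
    output + bias.reshape(1, 10, 1)
except ValueError as e:
    # Axis 1 is 30 vs 10 -> incompatible, mirroring "Dimensions must be equal"
    print("broadcast failed:", e)

# Broadcasting the bias over the last axis works
result = output + bias.reshape(1, 1, 10)
print(result.shape)  # (2, 30, 10)
```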

Dref360 commented Jul 20, 2018

Contrary to the documentation, we don't actually flatten the input; the kernel is applied to the last axis independently.

We have a couple of options, but some of them are breaking changes:

  • Update the documentation, making it clear that the input is not actually flattened.
  • Deprecate this behavior, forcing the user to use TimeDistributed.
  • Actually flatten the input as the documentation says, but that is a breaking change.

Since this would be an API-breaking change, what do you think @fchollet?
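The gap between the two behaviours can be sketched in NumPy (shapes follow the repro script above; the kernel arrays are illustrative stand-ins for the Dense weights):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 30, 40))  # (batch, 30, 40), as in the repro

# Actual Keras behaviour: kernel applied to the last axis independently
kernel = rng.standard_normal((40, 10))
last_axis = x @ kernel                # -> (4, 30, 10)

# Behaviour the documentation describes: flatten, then one dot product
kernel_flat = rng.standard_normal((30 * 40, 10))
flattened = x.reshape(4, -1) @ kernel_flat  # -> (4, 10)

print(last_axis.shape, flattened.shape)
```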

Dref360 added the type:bug/performance and type:docs labels on Jul 20, 2018
@tRosenflanz

So is it true that the current behaviour of the Dense layer is identical to TimeDistributed(Dense) for rank-3 tensors, but different for rank 4+? Although it is up to François, as a user I could get behind the 2nd option, especially if the TimeDistributed wrapper could be reworked to take an axis argument that defaults to 1 (which would not break the existing behaviour of TimeDistributed) but could be set to higher values to match the current Dense behaviour.

@alex-lt-kong

Hi @Dref360, I encountered the same issue, and after reading your explanation I am still a bit confused. By "it's applied on the last axis independently", my understanding is this:

Suppose my samples are all 5x2 matrices and I have a Dense layer with 8 neurons:

  • the layer is applied to each sample 5 times;
  • each time, it is applied to a 1x2 vector, i.e., it fully connects the 2 input nodes to 8 output nodes;
  • the output of each application is a 1x8 vector;
  • the final output is a 5x8 matrix/tensor.

Is this understanding correct?
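For what it's worth, that reading can be checked numerically with NumPy (the kernel and bias arrays below are illustrative, not the layer's actual weights): applying a 2x8 kernel to each of the 5 rows separately gives the same 5x8 result as a single matrix product on the last axis.

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.standard_normal((5, 2))  # one 5x2 sample
kernel = rng.standard_normal((2, 8))  # stand-in Dense(8) kernel for 2 inputs
bias = rng.standard_normal((8,))

# Row by row: each 1x2 vector fully connected to 8 output nodes
row_by_row = np.stack([row @ kernel + bias for row in sample])

# Single application on the last axis
at_once = sample @ kernel + bias      # -> (5, 8)

print(np.allclose(row_by_row, at_once))  # True
```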
