
LSTM from pytorch to tensorflow: "Squeeze" messes up rank #1383

rumschuettel opened this issue Sep 5, 2018 · 7 comments

rumschuettel commented Sep 5, 2018

Hi! I'm trying to export a pytorch (installed from the master branch) LSTM to ONNX (1.3.0), and then import it into tensorflow (tf-nightly).

Exporting works. The relevant graph bit looks like

  %68 : Dynamic = onnx::Slice[axes=[0], ends=[2], starts=[1]](%1), scope: foo/LSTM[lstm]
  %69 : Dynamic = onnx::Slice[axes=[0], ends=[2], starts=[1]](%2), scope: foo/LSTM[lstm]
  %70 : Dynamic, %71 : Dynamic, %72 : Dynamic = onnx::LSTM[hidden_size=200](%47, %65, %66, %67, %21, %68, %69), scope: foo/LSTM[lstm]
  %73 : Dynamic = onnx::Squeeze[axes=[1]](%70), scope: foo/LSTM[lstm]
  %74 : Dynamic = onnx::Slice[axes=[0], ends=[200], starts=[0]](%11), scope: foo/LSTM[lstm]
  %75 : Dynamic = onnx::Slice[axes=[0], ends=[800], starts=[600]](%11), scope: foo/LSTM[lstm]

Inspecting said file after onnx.load gives me identical output, so I assume the export from pytorch and the import into onnx both work:

  %68 = Slice[axes = [0], ends = [2], starts = [1]](%1)
  %69 = Slice[axes = [0], ends = [2], starts = [1]](%2)
  %70, %71, %72 = LSTM[hidden_size = 200](%47, %65, %66, %67, %, %68, %69)
  %73 = Squeeze[axes = [1]](%70)
  %74 = Slice[axes = [0], ends = [200], starts = [0]](%11)
  %75 = Slice[axes = [0], ends = [800], starts = [600]](%11)

So far so good. Unfortunately, importing with onnx_tf.backend.prepare yields the following error:

File "./", line 28, in <module>
    tf_rep = prepare(model)
  File "/home/foo/opt/anaconda5/envs/nlp-onnx/lib/python3.6/site-packages/onnx_tf/", line 348, in prepare
    model.graph, opset=model.opset_import[0].version))
  File "/home/foo/opt/anaconda5/envs/nlp-onnx/lib/python3.6/site-packages/onnx_tf/", line 324, in onnx_graph_to_tensorflow_net
    node, tensor_dict, opset=opset)
  File "/home/foo/opt/anaconda5/envs/nlp-onnx/lib/python3.6/site-packages/onnx_tf/", line 407, in _onnx_node_to_tensorflow_op
    return method_to_call(node, input_dict)
  File "/home/foo/opt/anaconda5/envs/nlp-onnx/lib/python3.6/site-packages/onnx_tf/backends/", line 713, in handle_l_s_t_m
    cell, input_dict[node.inputs[0]], time_major=True, dtype=tf.float32)
  File "/home/foo/opt/anaconda5/envs/nlp-onnx/lib/python3.6/site-packages/tensorflow/python/ops/", line 664, in dynamic_rnn
  File "/home/foo/opt/anaconda5/envs/nlp-onnx/lib/python3.6/site-packages/tensorflow/python/ops/", line 727, in _dynamic_rnn_loop
    for input_ in flat_input)
  File "/home/foo/opt/anaconda5/envs/nlp-onnx/lib/python3.6/site-packages/tensorflow/python/ops/", line 727, in <genexpr>
    for input_ in flat_input)
  File "/home/foo/opt/anaconda5/envs/nlp-onnx/lib/python3.6/site-packages/tensorflow/python/framework/", line 765, in with_rank_at_least
    raise ValueError("Shape %s must have rank at least %d" % (self, rank))
ValueError: Shape (100, 200) must have rank at least 3

Indeed, the Squeeze call seems to eliminate the batch dimension, as exporting from pytorch with batch_size>1 results in an error being raised because Squeeze cannot eliminate a dimension with size not equal to 1.
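A minimal sketch of that failure mode, using numpy's squeeze as a stand-in for the ONNX Squeeze op (the shapes are illustrative, based on the `hidden_size=200` LSTM above; which axis the exporter actually targets is an assumption):

```python
import numpy as np

# ONNX LSTM output Y is laid out as
# (seq_length, num_directions, batch_size, hidden_size).
y = np.zeros((100, 1, 1, 200))        # batch_size == 1
# Squeeze[axes=[1]] silently succeeds when the axis has size 1:
print(np.squeeze(y, axis=1).shape)    # (100, 1, 200)

# But squeezing a hard-coded axis whose size is no longer 1 raises,
# which would match the export-time error seen with batch_size > 1.
y_batched = np.zeros((100, 2, 1, 200))  # pretend the targeted axis now has size 2
try:
    np.squeeze(y_batched, axis=1)
except ValueError as err:
    print("cannot squeeze:", err)
```

This is why a Squeeze with fixed `axes` baked into the exported graph only works for the exact batch size the model was traced with.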

Any idea what's going on here?
Let me know if you need more information.

Thanks a bunch!
/ J


@tjingrant @fumihwh would you like to take a look?


fumihwh commented Sep 6, 2018

Seems you are using an old version of onnx-tf.
Could you try the master branch?


rumschuettel commented Sep 6, 2018

Ok, I installed (in a fresh conda environment) tf-nightly, onnx from source, and onnx-tf from source; the graph after onnx.load looks identical to the one printed above, but the error now occurs somewhere else:

Fail to get since_version of Expand in domain `` with max_inclusive_version=7. Set to 1.
Traceback (most recent call last):
  File "/home/foo/opt/anaconda5/envs/nlp-onnx/lib/python3.6/site-packages/tensorflow/python/framework/", line 1627, in _create_c_op
    c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape must be rank 2 but is rank 3 for 'MatMul' (op: 'MatMul') with input shapes: [100,1,200], [200,27].

During handling of the above exception, another exception occurred:

  File "./", line 28, in <module>
    tf_rep = prepare(model)
  File "/home/foo/opt/onnx-tensorflow/onnx_tf/", line 76, in prepare
    return cls.onnx_model_to_tensorflow_rep(model, strict)
  File "/home/foo/opt/onnx-tensorflow/onnx_tf/", line 87, in onnx_model_to_tensorflow_rep
    return cls._onnx_graph_to_tensorflow_rep(model.graph, model.opset_import, strict)
  File "/home/foo/opt/onnx-tensorflow/onnx_tf/", line 141, in _onnx_graph_to_tensorflow_rep
    onnx_node, tensor_dict, handlers, opset=opset, strict=strict)
  File "/home/foo/opt/onnx-tensorflow/onnx_tf/", line 236, in _onnx_node_to_tensorflow_op
    return handler.handle(node, tensor_dict=tensor_dict, strict=strict)
  File "/home/foo/opt/onnx-tensorflow/onnx_tf/handlers/", line 60, in handle
    return ver_handle(node, **kwargs)
  File "/home/foo/opt/onnx-tensorflow/onnx_tf/handlers/backend/", line 14, in version_1
    return [cls.make_tensor_from_onnx_node(node, **kwargs)]
  File "/home/foo/opt/onnx-tensorflow/onnx_tf/handlers/", line 111, in make_tensor_from_onnx_node
    return cls._run_tf_func(tf_func, inputs, attrs)
  File "/home/foo/opt/onnx-tensorflow/onnx_tf/handlers/", line 180, in _run_tf_func
    **dict([(p, attrs[p]) for p in params if p in attrs]))
  File "/home/foo/opt/anaconda5/envs/nlp-onnx/lib/python3.6/site-packages/tensorflow/python/ops/", line 2053, in matmul
    a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
  File "/home/foo/opt/anaconda5/envs/nlp-onnx/lib/python3.6/site-packages/tensorflow/python/ops/", line 4560, in mat_mul
  File "/home/foo/opt/anaconda5/envs/nlp-onnx/lib/python3.6/site-packages/tensorflow/python/framework/", line 787, in _apply_op_helper
  File "/home/foo/opt/anaconda5/envs/nlp-onnx/lib/python3.6/site-packages/tensorflow/python/util/", line 488, in new_func
    return func(*args, **kwargs)
  File "/home/foo/opt/anaconda5/envs/nlp-onnx/lib/python3.6/site-packages/tensorflow/python/framework/", line 3273, in create_op
  File "/home/foo/opt/anaconda5/envs/nlp-onnx/lib/python3.6/site-packages/tensorflow/python/framework/", line 1791, in __init__
  File "/home/foo/opt/anaconda5/envs/nlp-onnx/lib/python3.6/site-packages/tensorflow/python/framework/", line 1630, in _create_c_op
    raise ValueError(str(e))
ValueError: Shape must be rank 2 but is rank 3 for 'MatMul' (op: 'MatMul') with input shapes: [100,1,200], [200,27].

The only MatMul layer in the graph is towards the end, where we have

  %121 = Slice[axes = [0], ends = [4], starts = [3]](%2)
  %122, %123, %124 = LSTM[hidden_size = 200](%99, %117, %118, %119, %, %120, %121)
  %125 = Squeeze[axes = [1]](%122)
  %126 = Concat[axis = 0](%45, %71, %97, %123)
  %127 = Concat[axis = 0](%46, %72, %98, %124)
  %128 = Transpose[perm = [1, 0]](%19)
  %129 = MatMul(%125, %128)
  %130 = Add(%129, %20)
  %131 = LogSoftmax[axis = 2](%130)
  return %131, %126, %127
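To make the shape mismatch concrete, here is a numpy reconstruction of the failing step (shapes taken from the traceback; whether squeezing the leftover size-1 axis is the right semantic fix for this graph is an assumption):

```python
import numpy as np

# %125 (LSTM output after Squeeze[axes=[1]]) arrives as [100, 1, 200];
# %128 (the transposed linear-layer weight) is [200, 27].
lstm_out = np.zeros((100, 1, 200))
weight_t = np.zeros((200, 27))

# TensorFlow's plain MatMul op rejects a rank-3 left operand outright
# ("Shape must be rank 2 but is rank 3"). Dropping the leftover
# size-1 batch axis first makes the shapes line up:
flat = np.squeeze(lstm_out, axis=1)  # -> (100, 200)
logits = flat @ weight_t             # -> (100, 27)
print(logits.shape)
```

In other words, the Squeeze in the exported graph removed only the `num_directions` axis, leaving the size-1 batch axis in place, and the backend's 2-D MatMul cannot broadcast over it.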

So I dug a little deeper. The master branch of onnx-tf still says it only works with onnx@1.1.2, so I installed that version with pip install "onnx==1.1.2", but that raised the same exception, plus some additional UserWarnings about unknown operations; so that's no better.

Any thoughts? It seems like we're almost at the root of the problem :) I appreciate your help, thanks a lot!

/ J


Ok, I fixed this the following way:

Instead of calling my model with a batch size of 1, I don't give it a batch at all, just a single vector.
Before feeding it into the LSTM, I unsqueeze(1) a fake batch dimension in; after that, and before the following linear layers, I squeeze() the dimension away again.
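A minimal sketch of that workaround, using numpy's expand_dims/squeeze as a stand-in for torch's unsqueeze/squeeze (the forward function, shapes, and the pass-through in place of the actual LSTM are all hypothetical):

```python
import numpy as np

def forward(x):
    """Hypothetical forward pass over a single, unbatched sequence.

    x: (seq_len, input_size), with no batch dimension at all.
    """
    # Insert a fake batch dimension before the LSTM
    # (torch: x.unsqueeze(1) for a (seq, batch, feat) layout).
    x = np.expand_dims(x, axis=1)        # -> (seq_len, 1, input_size)

    lstm_out = x                         # stand-in for self.lstm(x)[0]

    # Drop the fake batch dimension again before the linear layers
    # (torch: lstm_out.squeeze(1)).
    return np.squeeze(lstm_out, axis=1)  # -> (seq_len, input_size)

print(forward(np.zeros((100, 50))).shape)  # (100, 50)
```

Because the fake dimension is inserted and removed inside the model, the traced ONNX graph no longer depends on a particular batch size.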


Thanks a lot for your help anyway! If this is a bug or something to be improved in ONNX, let me know if I can be of help.

/ J


Just as a comment: the error

Fail to get since_version of Expand in domain `` with max_inclusive_version=7. Set to 1.

still occurs, in case you want to check that bit.

@rumschuettel rumschuettel reopened this Sep 6, 2018

fumihwh commented Sep 6, 2018

It's not an error (or at least it will not affect the backend, onnx->tensorflow).
The reason you get this is that your onnx model's opset is 7. We cannot find the Expand schema because it was added in opset 8.


stale bot commented Mar 29, 2022

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale label Mar 29, 2022
@stale stale bot closed this as completed Apr 19, 2022