This repository has been archived by the owner on Dec 11, 2023. It is now read-only.

InvalidArgumentError: Mismatch between Graphs #415

Open
donaparker01 opened this issue Nov 14, 2018 · 16 comments

Comments

@donaparker01

I followed the README exactly for English-to-German 8-layer inference with the current setup:
Ubuntu 16.04
TensorFlow 1.9

but I receive the following error:

InvalidArgumentError (see above for traceback): Restoring from checkpoint
failed. This is most likely due to a mismatch between the current graph and
the graph from the checkpoint. Please ensure that you have not altered the
graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [36549,1024] rhs
shape= [36548,1024]
[[Node: save/Assign_26 = Assign[T=DT_FLOAT, _class=
["loc:@embeddings/encoder/embedding_encoder"], use_locking=true,
validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"]
(embeddings/encoder/embedding_encoder, save/RestoreV2:26)]]

Can someone advise me on how to resolve the issue?
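
One way to see which shape the checkpoint itself expects is to list its variables. Below is a minimal diagnostic sketch (not from this repo); the checkpoint prefix is an assumption based on the downloaded ende_gnmt_model_8_layer archive, so adjust it to your local path:

```python
# Minimal diagnostic sketch (assumed checkpoint prefix): list the embedding
# variables in the pretrained checkpoint to see which vocabulary size it expects.
import tensorflow as tf

ckpt_prefix = "ende_gnmt_model_8_layer/translate.ckpt"  # assumption: adjust to your download
for name, shape in tf.train.list_variables(ckpt_prefix):
    if "embedding" in name:
        print(name, shape)
# Per the error above, this should report
# embeddings/encoder/embedding_encoder with shape [36548, 1024].
```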

@qwerybot

qwerybot commented Dec 5, 2018

I'm getting the same error with a similar setup; I'm using TensorFlow 1.12. Did you ever figure out a solution?

@heyuanw

heyuanw commented Dec 27, 2018

I'm getting the same error with a similar setup; I'm using tensorflow-nightly. Did you ever figure out a solution?

@guyco87

guyco87 commented Feb 1, 2019

I found the problem, although I'm not sure exactly how to solve it.
There's a mismatch between the checkpoint graph's embedding layer size and the vocabulary size (both src and target). The wmt16/vocab.bpe.32000 file has 36549 lines, while the saved model's input layer size is [36548, num_units].

I know it's not a real solution, but I added the following lines at line 503 of nmt/nmt.py and was able to run the model:
src_vocab_size -= 1
tgt_vocab_size -= 1

I suspect the problem lies in the script nmt/scripts/wmt16_en_de.sh, where wmt16/vocab.bpe.32000 is created.
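
To reproduce this check on your own setup, a small sketch along these lines compares the two sizes directly; both paths are assumptions, so point them at wherever wmt16_en_de.sh wrote the vocab and where the pretrained checkpoint was unpacked:

```python
# Sketch: compare the generated vocab size against the checkpoint's embedding rows.
# Both paths are assumptions; adjust them to your local layout.
import tensorflow as tf

vocab_file = "wmt16/vocab.bpe.32000"                     # produced by wmt16_en_de.sh
ckpt_prefix = "ende_gnmt_model_8_layer/translate.ckpt"   # pretrained checkpoint prefix

with open(vocab_file, encoding="utf-8") as f:
    vocab_lines = sum(1 for _ in f)

shapes = dict(tf.train.list_variables(ckpt_prefix))
embedding_rows = shapes["embeddings/encoder/embedding_encoder"][0]

print("vocab lines:", vocab_lines)        # 36549 in this thread
print("embedding rows:", embedding_rows)  # 36548 in the pretrained model
```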

@christ1ne

I ended up just training my own GNMT model. Afterwards, you can run the inference code with your own model without issue. I suspect the vocab file generation code used for the pretrained model was somehow different.

@qwerybot

qwerybot commented Feb 4, 2019

I used the same solution as christ1ne.
I tried a similar solution to guyco87's, but my resulting BLEU scores were very low, e.g. 6.32, which indicates that many lines of the vocab file have moved around, in addition to something being added (or removed).

It seems the generation of the vocab file is either non-deterministic or has changed since the model was pretrained. If we could find a copy of the vocab file that was used to train the models, it would work. Unfortunately I've had no luck tracking one down.
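
This also explains why dropping a single entry is not enough: if tokens have shifted position, every shifted token maps to the wrong embedding row. A quick sketch makes the difference visible; both filenames are hypothetical placeholders, so substitute whichever copies you are comparing (e.g. the current output of wmt16_en_de.sh and the BPE file from issue #85):

```python
# Sketch: compare two vocab files to see whether entries were merely appended
# or actually reordered. Filenames are hypothetical placeholders.
def read_vocab(path):
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]

old_vocab = read_vocab("vocab.bpe.32000.old")   # e.g. a candidate original vocab
new_vocab = read_vocab("vocab.bpe.32000.new")   # e.g. the freshly regenerated vocab

only_old = set(old_vocab) - set(new_vocab)
only_new = set(new_vocab) - set(old_vocab)
shifted = sum(1 for i in range(min(len(old_vocab), len(new_vocab)))
              if old_vocab[i] != new_vocab[i])

print("tokens only in old:", len(only_old))
print("tokens only in new:", len(only_new))
print("positions where the token differs:", shifted)
# A large 'shifted' count means token IDs no longer line up with the checkpoint's
# embedding rows, which would explain the very low BLEU after trimming one line.
```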

@SergeCraft

I got the same problem when trying to export an inference graph with the TensorFlow Object Detection API, but I solved it. I had chosen the wrong pipeline.config file for the export. It has to be exactly the same pipeline.config file that was used to train the model.

@nithya4

nithya4 commented Feb 7, 2019

@BenTaylor3115
Using the current scripts to generate the vocab, I get 36549 lines in my vocab.bpe.32000 file.

Using the BPE file linked in issue #85, I ran the BPE portion and it generated 37008 lines.
It does look like the generation has changed since the model was trained.
But the model expects a tensor of shape [36548, 1024], and I'm not sure where that size comes from.

@chih-hong

I got the same problem. How do I solve it?
Assign requires shapes of both tensors to match. lhs shape= [1024,2048] rhs shape= [2048]
[[node save/Assign_381 (defined at /home/sca_test/bazel-bin/im2txt/run_inference.runfiles/main/im2txt/inference_utils/inference_wrapper_base.py:116) = Assign[T=DT_FLOAT, _class=["loc:@lstm/basic_lstm_cell/kernel"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](lstm/basic_lstm_cell/kernel, save/RestoreV2:381)]]

@gaebor

gaebor commented Apr 13, 2019

I had the same problem with the provided model: http://download.tensorflow.org/models/nmt/10122017/ende_gnmt_model_8_layer.zip
I used the script: https://github.com/tensorflow/nmt/blob/master/nmt/scripts/wmt16_en_de.sh
The resulting vocabulary had 36549 entries while the pre-trained model expects 36548!

 Assign requires shapes of both tensors to match. lhs shape= [36549,1024] rhs shape= [36548,1024]
     [[node save/Assign_7 (defined at /mnt/store/gaebor/NMT_demo/nmt/model.py:101) ]]

I suspect that the BPE algorithm has changed and the script now extracts different wordpieces than it did in late 2017.

As a fix, could you provide a vocabulary for the trained models? I simply cannot train my own, because I don't have enough GPU :(

@gaebor

gaebor commented Apr 14, 2019

FYI, I rolled back mosesdecoder to commit 5b9a6da9a4065b776d1dffedbd847be565c436ef and subword-nmt to 3d28265d779e9c6cbb39b41ba54b2054aa435005.
The resulting vocabulary was the right size, so at least the checkpoint worked, but the test BLEU was only 10.6,
so...

@christ1ne


If you are looking for a trained model, please feel free to see if the following link works for you: https://github.com/mlperf/inference/tree/master/cloud/translation/gnmt/tensorflow

@gaebor

gaebor commented Apr 15, 2019

Thanks @christ1ne, but that has the same problem as the one I tried: without the vocabulary, the models are useless.

@christ1ne

@gaebor the vocab generation is in download_dataset.sh at https://github.com/mlperf/training/tree/master/rnn_translator

@gaebor

gaebor commented Apr 15, 2019

I'd be damned if I haven't tried it, see: #415 (comment)

@christ1ne

christ1ne commented Apr 15, 2019

The vocab generation scripts are different at MLPerf and at the TF nmt repo.
