🐛 Bug
Note to reviewers: this is labeled as "needs triage" but I already wrote and tested a fix: see the bottom of this issue for a link to the branch.
The training step in the joint speech-to-text example hits a Python exception due to mismatched tensor sizes in a matrix multiplication: the decoder embedding size does not match the encoder embedding size.
To Reproduce

Run the training under "Jointly trained model from scratch." One difference from the documented setup: I did not use the parallel text data.

Error output is as follows:
Traceback (most recent call last):
.....
File "/home/berckmann/si2/fairseq/fairseq/models/transformer/transformer_decoder.py", line 216, in forward
x, extra = self.extract_features(
File "/home/berckmann/si2/fairseq/fairseq/models/transformer/transformer_decoder.py", line 238, in extract_features
return self.extract_features_scriptable(
File "/home/berckmann/si2/fairseq/fairseq/models/transformer/transformer_decoder.py", line 340, in extract_features_scriptable
x, layer_attn, _ = layer(
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/berckmann/si2/fairseq/fairseq/modules/transformer_layer.py", line 388, in forward
x, attn = self.encoder_attn(
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/berckmann/si2/fairseq/fairseq/modules/multihead_attention.py", line 216, in forward
k = self.k_proj(key)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/linear.py", line 96, in forward
return F.linear(input, self.weight, self.bias)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py", line 1847, in linear
return torch._C._nn.linear(input, weight, bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (666x256 and 512x256)
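For context, the failure can be reproduced in isolation. `F.linear` multiplies the input by the transposed weight, so a cross-attention `k_proj` built to accept 512-dim keys fails on 256-dim encoder states. A minimal sketch using the dimensions from the error above (the variable names are illustrative, not the fairseq internals):

```python
import torch
import torch.nn as nn

# k_proj as built by the decoder: expects 512-dim keys (in_features=512),
# projecting down to the 256-dim attention space.
k_proj = nn.Linear(512, 256)

# Encoder states actually produced: 666 positions of 256-dim features.
encoder_out = torch.randn(666, 256)

try:
    k_proj(encoder_out)
    error = None
except RuntimeError as e:
    # Reproduces: mat1 and mat2 shapes cannot be multiplied (666x256 and 512x256)
    error = str(e)

print(error)
```

The error surfaces only when a batch reaches the cross-attention layer, which is why construction succeeds and the crash happens mid-training.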
Code sample
These are the options on the training command that triggered the issue:
--save-dir ${save_dir} \
--num-workers 4 \
--task speech_text_joint_to_text \
--arch dualinputs2ttransformer_s \
--user-dir examples/speech_text_joint_to_text \
--max-epoch 100 --update-mix-data \
--optimizer adam --lr-scheduler inverse_sqrt \
--lr 0.001 --update-freq 8 --clip-norm 10.0 \
--criterion guided_label_smoothed_cross_entropy_with_accuracy \
--label-smoothing 0.1 --max-tokens $max_token_cnt --max-tokens-text $max_token_cnt \
--max-positions-text 400 --seed 2 --speech-encoder-layers 12 \
--text-encoder-layers 6 --encoder-shared-layers 6 --decoder-layers 6 \
--dropout 0.1 --warmup-updates 20000 \
--text-sample-ratio 0.25 \
--text-input-cost-ratio 0.5 --enc-grad-mult 2.0 --add-speech-eos \
--log-format json --langpairs en-de --noise-token '▁NOISE' \
--mask-text-ratio 0.0 --max-tokens-valid 20000 --ddp-backend no_c10d \
--log-interval 100 --data-buffer-size 50 --config-yaml config.yaml \
--keep-last-epochs 10 --valid-subset dev_st --train-subset train_st \
--tensorboard-logdir logs/tensorb_log
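Whether the branch takes exactly this route I'll leave to the diff, but the general remedy for this class of mismatch is to build the key/value projections against the encoder width rather than the decoder width. Plain PyTorch exposes this via the `kdim`/`vdim` arguments to multi-head attention; a minimal sketch with the 256/512 widths implied by the error shapes (illustrative, not the fairseq wiring):

```python
import torch
import torch.nn as nn

encoder_dim, decoder_dim = 256, 512  # widths implied by the error above

# Without kdim/vdim, cross-attention assumes keys/values share the query
# width (512) and would fail on 256-dim encoder states. Passing the
# encoder width explicitly builds k/v projections of the right shape.
cross_attn = nn.MultiheadAttention(
    embed_dim=decoder_dim,
    num_heads=8,
    kdim=encoder_dim,
    vdim=encoder_dim,
)

tgt = torch.randn(10, 1, decoder_dim)    # (tgt_len, batch, decoder_dim)
mem = torch.randn(666, 1, encoder_dim)   # (src_len, batch, encoder_dim)

out, attn_weights = cross_attn(tgt, mem, mem)
print(out.shape)  # torch.Size([10, 1, 512])
```

fairseq's own `MultiheadAttention` accepts analogous `kdim`/`vdim` parameters, so the decoder's encoder-attention layers can be constructed against the encoder embedding dimension when the two differ.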
Expected behavior
Training should run to completion without raising an exception.
Environment
How you installed fairseq (pip, source): source

Additional context
I already wrote code for the fix, which resolved the problem for me locally:
https://github.com/tberckmann/fairseq/tree/joint_s2t_fixes
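Independent of the specific fix, this mismatch could also be caught at model-construction time rather than mid-batch. A hypothetical guard (the function name and call sites are illustrative, not fairseq API):

```python
def check_cross_attention_dims(encoder_embed_dim: int, decoder_kdim: int) -> None:
    """Fail fast if the decoder's cross-attention key projection was
    built for a different width than the encoder actually produces."""
    if encoder_embed_dim != decoder_kdim:
        raise ValueError(
            f"Cross-attention dimension mismatch: encoder outputs "
            f"{encoder_embed_dim}-dim states but k_proj expects "
            f"{decoder_kdim}-dim input; build the decoder's kdim/vdim "
            f"from the encoder embedding dimension."
        )

# Widths from the error above: passes only when the two sides agree.
check_cross_attention_dims(256, 256)      # OK, returns None
try:
    check_cross_attention_dims(256, 512)  # the mismatch in this issue
    raised = False
except ValueError:
    raised = True
print(raised)
```

A check like this would turn a cryptic `RuntimeError` deep inside `F.linear` into an actionable error at model-build time.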