RuntimeError: The expanded size of the tensor (12) must match the existing size (84) at non-singleton dimension 2. Target sizes: [64, 80, 12]. Tensor sizes: [64, 1, 84] #450

Closed
lpierron opened this issue Apr 23, 2021 · 1 comment

Comments

@lpierron

I have the same problem with v0.0.12:

 CUDA_VISIBLE_DEVICES="0" python ../../TTS/bin/train_tacotron.py --config_path model_config.json
 > Using CUDA:  True
 > Number of GPUs:  1
 > Git Hash: 59ab268
 > Experiment folder: /home/lpierron/Mozilla_TTS/CORPUS_LP/Models/maiLabs-fr-dca/mailabs-fr-ddc-April-23-2021_03+38PM-59ab268
 > Setting up Audio Processor...
 | > sample_rate:22050
 | > resample:False
 | > num_mels:80
 | > min_level_db:-100
 | > frame_shift_ms:None
 | > frame_length_ms:None
 | > ref_level_db:20
 | > fft_size:1024
 | > power:1.5
 | > preemphasis:0.0
 | > griffin_lim_iters:60
 | > signal_norm:True
 | > symmetric_norm:True
 | > mel_fmin:0
 | > mel_fmax:8000.0
 | > spec_gain:1.0
 | > stft_pad_mode:reflect
 | > max_norm:4.0
 | > clip_norm:True
 | > do_trim_silence:True
 | > trim_db:60
 | > do_sound_norm:False
 | > stats_path:./scale_stats.npy
 | > log_func:<ufunc 'log10'>
 | > exp_func:<function AudioProcessor.__init__.<locals>.<lambda> at 0x7f1ef7ac6c10>
 | > hop_length:256
 | > win_length:1024
 | > /tmp/tts/by_book/female/ezwa/monsieur_lecoq/metadata.csv
 | > Found 14211 files in /tmp/tts
 > Using model: Tacotron2

 > Model has 28183506 parameters
 > Starting with inf best loss.

 > DataLoader initialization
 | > Use phonemes: True
   | > phoneme language: fr-fr
 | > Number of instances : 14069
 | > Max length sequence: 281
 | > Min length sequence: 3
 | > Avg length sequence: 105.0826640130784
 | > Num. instances discarded by max-min (max=153, min=6) seq limits: 2420
 | > Batch group size: 128.

 > EPOCH: 0/1000

 > Number of output frames: 7

 > TRAINING (2021-04-23 15:38:18)
 ! Run is removed from /home/lpierron/Mozilla_TTS/CORPUS_LP/Models/maiLabs-fr-dca/mailabs-fr-ddc-April-23-2021_03+38PM-59ab268
Traceback (most recent call last):
  File "../../TTS/bin/train_tacotron.py", line 744, in <module>
    main(args)
  File "../../TTS/bin/train_tacotron.py", line 704, in main
    train_avg_loss_dict, global_step = train(
  File "../../TTS/bin/train_tacotron.py", line 198, in train
    decoder_output, postnet_output, alignments, stop_tokens = model(
  File "/home/lpierron/miniconda3/envs/tts/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/lpierron/Mozilla_TTS/COQUI-TTS/TTS/TTS/tts/models/tacotron2.py", line 226, in forward
    decoder_outputs = decoder_outputs * output_mask.unsqueeze(1).expand_as(decoder_outputs)
RuntimeError: The expanded size of the tensor (12) must match the existing size (84) at non-singleton dimension 2.  Target sizes: [64, 80, 12].  Tensor sizes: [64, 1, 84]

I downgraded to librosa==0.6.3, but the error persists.

My configuration is attached:

model_config.json.txt

Originally posted by @lpierron in #370 (comment)
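
For readers following the traceback, here is a minimal sketch of the shape rule the error message describes. It is plain Python rather than PyTorch, and the helper name `expand_mismatch` is made up for illustration. The shapes are copied from the error text. The observation that the two time axes differ by a factor of 7, the same number the log prints as "Number of output frames", is one reading of the log and not something confirmed in this thread.

```python
def expand_mismatch(src, target):
    """Return the first dimension where `src` cannot be expanded to `target`, or None.

    Follows the rule quoted in the error message: for shapes of equal rank,
    each dimension of `src` must equal the target size or be 1 (a singleton).
    `expand_mismatch` is a hypothetical helper, not part of any library.
    """
    if len(src) != len(target):
        raise ValueError("this sketch only handles shapes of equal rank")
    for dim, (have, want) in enumerate(zip(src, target)):
        if have != want and have != 1:
            return dim
    return None


# Shapes taken from the traceback: the mask after unsqueeze(1), and decoder_outputs.
mask_shape = (64, 1, 84)
decoder_shape = (64, 80, 12)

bad_dim = expand_mismatch(mask_shape, decoder_shape)
print(bad_dim)  # 2

# The two time axes differ by exactly the "Number of output frames" value in the log.
print(mask_shape[2] // decoder_shape[2])  # 7
```

Dimension 1 is fine because the mask has size 1 there. Dimension 2 fails because 84 is neither 12 nor 1.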

@erogol
Member

erogol commented Apr 23, 2021

Set "r": 7 in config.json.
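
As a rough sketch of the suggested change, the snippet below applies it to a config that has already been loaded into a Python dict. Only the key "r" and the value 7 come from the reply above. The helper name `with_reduction_factor`, the "model" key and the starting value of "r" are placeholders, since the attached config is not reproduced in this thread.

```python
import copy


def with_reduction_factor(config, r):
    """Return a copy of `config` with the key "r" set to `r`.

    Hypothetical helper for illustration; it is not part of the TTS code base.
    """
    if not isinstance(r, int) or r < 1:
        raise ValueError("r must be a positive integer")
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    updated["r"] = r
    return updated


# Placeholder config fragment: the real model_config.json is not shown in the
# thread, so every value here other than the new "r" is an assumption.
config = {"model": "Tacotron2", "r": 1}

fixed = with_reduction_factor(config, 7)
print(fixed["r"])   # 7
print(config["r"])  # 1 (the original dict is unchanged)
```

Reading and writing the file is left out on purpose. If config.json contains comments, Python's `json.load` will reject it, so editing the value by hand may be simpler.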

@erogol erogol closed this as completed Apr 23, 2021