negative training loss of wavenet #220

Open
shartoo opened this issue Sep 26, 2018 · 7 comments
Comments

shartoo commented Sep 26, 2018

I got a negative loss when training WaveNet on the THCHS-30 dataset.

Wavenet Train

###########################################################

Checkpoint_path: i:/data/voice/logs-Tacotron-2\wave_pretrained\wavenet_model.ckpt
Loading training data from: tacotron_output\gta\map.txt
Using model: Tacotron-2
Hyperparameters:
  allow_clipping_in_normalization: True
  attention_dim: 128
  attention_filters: 32
  attention_kernel: (31,)
  cin_channels: 80
  cleaners: basic_cleaners
  clip_mels_length: True
  cross_entropy_pos_weight: 1
  cumulative_weights: True
  decoder_layers: 2
  decoder_lstm_units: 1024
  embedding_dim: 512
  enc_conv_channels: 512
  enc_conv_kernel_size: (5,)
  enc_conv_num_layers: 3
  encoder_lstm_units: 256
  fmax: 7600
  fmin: 0
  frame_shift_ms: None
  freq_axis_kernel_size: 3
  gate_channels: 512
  gin_channels: -1
  griffin_lim_iters: 60
  hop_size: 275
  input_type: raw
  kernel_size: 3
  layers: 30
  leaky_alpha: 0.4
  log_scale_min: -32.23619130191664
  log_scale_min_gauss: -16.11809565095832
  mask_decoder: False
  mask_encoder: False
  max_abs_value: 4.0
  max_iters: 1000
  max_mel_frames: 900
  max_time_sec: None
  max_time_steps: 13000
  min_level_db: -100
  n_fft: 2048
  n_speakers: 1
  natural_eval: False
  normalize_for_wavenet: True
  num_freq: 1025
  num_gpus: 1
  num_mels: 80
  out_channels: 2
  outputs_per_step: 2
  postnet_channels: 512
  postnet_kernel_size: (5,)
  postnet_num_layers: 5
  power: 1.5
  predict_linear: False
  preemphasis: 0.97
  prenet_layers: [256, 256]
  quantize_channels: 65536
  ref_level_db: 20
  rescale: True
  rescaling_max: 0.999
  residual_channels: 512
  sample_rate: 22050
  signal_normalization: True
  silence_threshold: 2
  skip_out_channels: 256
  smoothing: False
  stacks: 3
  stop_at_any: True
  symmetric_mels: True
  tacotron_adam_beta1: 0.9
  tacotron_adam_beta2: 0.999
  tacotron_adam_epsilon: 1e-06
  tacotron_batch_size: 32
  tacotron_clip_gradients: False
  tacotron_data_random_state: 1234
  tacotron_decay_learning_rate: True
  tacotron_decay_rate: 0.4
  tacotron_decay_steps: 50000
  tacotron_dropout_rate: 0.5
  tacotron_final_learning_rate: 1e-05
  tacotron_initial_learning_rate: 0.001
  tacotron_random_seed: 5339
  tacotron_reg_weight: 1e-06
  tacotron_scale_regularization: True
  tacotron_start_decay: 50000
  tacotron_swap_with_cpu: False
  tacotron_synthesis_batch_size: 512
  tacotron_teacher_forcing_decay_alpha: 0.0
  tacotron_teacher_forcing_decay_steps: 280000
  tacotron_teacher_forcing_final_ratio: 0.0
  tacotron_teacher_forcing_init_ratio: 1.0
  tacotron_teacher_forcing_mode: constant
  tacotron_teacher_forcing_ratio: 1.0
  tacotron_teacher_forcing_start_decay: 10000
  tacotron_test_batches: 32
  tacotron_test_size: None
  tacotron_zoneout_rate: 0.1
  train_with_GTA: True
  trim_fft_size: 512
  trim_hop_size: 128
  trim_silence: True
  trim_top_db: 60
  upsample_conditional_features: True
  upsample_scales: [25, 11]
  use_all_gpus: False
  use_bias: True
  use_lws: False
  use_speaker_embedding: True
  wavenet_adam_beta1: 0.9
  wavenet_adam_beta2: 0.999
  wavenet_adam_epsilon: 1e-08
  wavenet_batch_size: 2
  wavenet_data_random_state: 1234
  wavenet_dropout: 0.05
  wavenet_ema_decay: 0.9999
  wavenet_learning_rate: 0.001
  wavenet_random_seed: 5339
  wavenet_swap_with_cpu: False
  wavenet_synthesis_batch_size: 8
  wavenet_test_batches: None
  wavenet_test_size: 0.02
  win_size: 1100
Initializing Wavenet model.  Dimensions (? = dynamic shape): 
  Train mode:                True
  Eval mode:                 False
  Synthesis mode:            False
  inputs:                    (?, 1, ?)
  local_condition:           (?, 80, ?)
  targets:                   (?, ?)
  outputs:                   (?, ?)
Initializing Wavenet model.  Dimensions (? = dynamic shape): 
  Train mode:                False
  Eval mode:                 True
  Synthesis mode:            False
  local_condition:           (1, 80, ?)
  targets:                   (?,)
  outputs:                   (?,)
Wavenet training set to a maximum of 6000 steps

Generated 32 train batches of size 2 in 0.089 sec

Generated 70 test batches of size 1 in 0.097 sec
Step      23 [1.688 sec/step, loss=-0.67038, avg_loss=-0.19141]
Generated 32 train batches of size 2 in 0.067 sec
Step      55 [1.475 sec/step, loss=-0.91690, avg_loss=-0.54008]
Generated 32 train batches of size 2 in 0.054 sec
Step      87 [1.414 sec/step, loss=-1.47173, avg_loss=-0.59483]
Generated 32 train batches of size 2 in 0.052 sec
Step     119 [1.313 sec/step, loss=-0.58264, avg_loss=-0.76051]
Generated 32 train batches of size 2 in 0.052 sec
Step     151 [1.309 sec/step, loss=-1.13693, avg_loss=-0.87239]
Generated 32 train batches of size 2 in 0.052 sec
Step     183 [1.310 sec/step, loss=-1.02893, avg_loss=-0.93583]
Generated 32 train batches of size 2 in 0.052 sec
Step     215 [1.312 sec/step, loss=-0.44637, avg_loss=-0.97706]
@Thien223

I had this problem when training with input_type='raw' in hparams.

Since I did not know how to fix it, I changed input_type to 'mulaw_quantize' (and quantize_channels to 256, as required) and it runs fine.

The problem might be that the loss function was not built right. If you want to track it down, I recommend taking a look at the DiscretizedMixtureLogisticLoss() function.
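As a rough sketch, the hparams.py change amounts to the following (field names taken from the hyperparameter dump above; everything else stays untouched):

```python
# hparams.py -- sketch of the two fields to change, not the full file
input_type = 'mulaw_quantize'   # was 'raw'
quantize_channels = 256         # mu-law quantization uses 256 classes (was 65536)
```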

rinleit commented Oct 1, 2018

I have had the same issue. Did you find a way to train with raw audio without running into it?

I preprocessed the dataset with input_type='raw' and got a good result training the Tacotron model. If I change input_type to 'mulaw_quantize' and train WaveNet, I suspect something may go wrong in the final result. I don't want to retrain the Tacotron model with input_type='mulaw_quantize' :) because it takes too much time.

veqtor commented Oct 1, 2018

Since you're trying to minimize negative log likelihood, negative loss isn't impossible
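For example, here is a quick standard-library check (just an illustration, not code from this repo): the density of a sharp Gaussian exceeds 1, so its negative log-likelihood dips below zero.

```python
import math

# Negative log-likelihood of a Gaussian evaluated at a point x.
# With a small enough sigma the density exceeds 1, so -log(pdf) turns negative.
def gaussian_nll(x, mu, sigma):
    return 0.5 * math.log(2 * math.pi * sigma ** 2) + (x - mu) ** 2 / (2 * sigma ** 2)

print(gaussian_nll(0.0, 0.0, 1.0))   # ~0.92, positive
print(gaussian_nll(0.0, 0.0, 0.05))  # ~-2.08, negative (sigma < 1/sqrt(2*pi))
```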

rinleit commented Oct 2, 2018

Duplicate of #186

rinleit commented Oct 2, 2018

For the WaveNet model:

After reading the code more carefully, I changed out_channels from 2 to 3 * 10 in hparams.py (following r9y9's wavenet_vocoder). That is what the DiscretizedMixtureLogisticLoss() function expects, and now I get a normal loss. :)
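For reference, the change is a one-liner (sketch only; 10 is the number of mixtures, and each logistic mixture needs 3 parameters: a mixture weight, a mean and a log scale):

```python
# hparams.py -- sketch of the single field to change
out_channels = 10 * 3   # 10 logistic mixtures x (weight, mean, log_scale); was 2
```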

@Rayhane-mamah
Owner

Hi, I'm a little late to the party! :)

While 16-bit WaveNet loss functions are discussed more generally in this comment, I am going to leave some quick notes here as well for clarity (please read the linked explanation first):

  • Using out_channels=2 selects Gaussian distribution modeling instead of a Mixture of Logistic distributions. To work with the latter, set out_channels to a multiple of 3 (out_channels=M*3), where M is the number of mixtures.
  • The Gaussian objective is Maximum Likelihood Estimation (MLE): we aim to maximize the likelihood that a sample x is drawn from N(mu, sigma), where mu and sigma are outputs of the network. Maximizing a function is equivalent to minimizing its negative (or its negative log). Since probability density values are not capped at 1, this loss is not bounded below by zero and can take negative values, which trace back to the probability density function (PDF) values (an indication of how tightly the model is fitting its distributions).
  • The link I provided discusses alternatives to the MLE optimization that rely on CDF-based approximations, which stay positive for uni-modal distributions like the Gaussian or Logistic.

In short, the negative loss values are not a bug; they are simply unusual to see :)
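To make the channel counts concrete, here is a small numpy sketch (illustration only, not this repository's code) of how the out_channels axis splits in the two cases, with the Gaussian NLL from the second bullet:

```python
import numpy as np

B, T = 2, 100  # batch size and time steps, arbitrary for the sketch

# Case 1: out_channels = 2 -> Gaussian modeling, one (mu, log_sigma) pair per sample.
raw_gauss = np.random.randn(B, 2, T)
mu, log_sigma = raw_gauss[:, 0, :], raw_gauss[:, 1, :]
targets = np.random.uniform(-1.0, 1.0, size=(B, T))
# Gaussian negative log-likelihood; nothing bounds it below by zero, so it can
# go negative once the predicted sigma gets small enough.
nll = (0.5 * np.log(2 * np.pi) + log_sigma
       + (targets - mu) ** 2 / (2 * np.exp(2 * log_sigma)))
loss = nll.mean()

# Case 2: out_channels = M * 3 -> Mixture of Logistics, 3 parameters per mixture.
M = 10
raw_mol = np.random.randn(B, M * 3, T)
logit_probs = raw_mol[:, :M, :]         # mixture weights (before softmax)
means       = raw_mol[:, M:2 * M, :]    # per-mixture means
log_scales  = raw_mol[:, 2 * M:, :]     # per-mixture log scales
```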

rinleit commented Oct 2, 2018

@Rayhane-mamah thank you! :D
