Wav2Vec2 diversity loss problem #3673

PeiyuChen1005 · 2021-07-01T13:32:50Z

❓ Questions and Help

Before asking:

search the issues.
issue wav2vec 2.0: L2 penalty on features #3315

What is your question?

I've found that the diversity loss weight in wav2vec 2.0 is 0.1. Issue #3315 already asked why this weight is so low, but no answer was given. So my first question is the same: why is such a low weight assigned to the diversity loss?
I've also tried other weights for this loss term, such as 0.5 and 1.0, and I got strange loss curves like these:
[Figure: loss curves, weight = 0.1]

[Figure: loss curves, weight = 0.5]

[Figure: loss curves, weight = 1.0]

[Figure: all curves in one plot]

The diversity loss always rises sharply in the first few epochs (and never comes back down to its initial value). Is this normal, or has something gone wrong? Is it because my training set is too small? Is there anything that can help me understand the codebook? How do I set the number of codebooks correctly?

What's your environment?

fairseq Version (e.g., 1.0 or master): master
PyTorch Version (e.g., 1.0) 1.8.1
OS (e.g., Linux): Linux(Ubuntu)
How you installed fairseq (pip, source): source
Python version: 3.8.8
CUDA/cuDNN version: 11.1
GPU models and configuration: NVIDIA-A100

Any help is appreciated~~~

alexeib · 2021-07-01T16:06:35Z

A diversity loss weight of 0.1 is enough to ensure that a large portion of the codebook is used. You can try other values and monitor code_perplexity to see what percentage of the codebook is used (the max value is num latent groups * num latent vars). The actual value of the diversity loss doesn't matter; it exists to ensure sufficient codebook use and to promote exploration in the early training phase.
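To make the "max value is num latent groups * num latent vars" remark concrete, here is a minimal sketch of a perplexity computed from hard code assignments. It is not fairseq's code: the function name `code_perplexity` and the input layout (one tuple of chosen indices per frame) are assumptions made for illustration. It sums the per-group perplexity exp(entropy), so uniform usage gives groups * vars and a collapsed codebook gives one per group.

```python
import math

def code_perplexity(assignments, num_groups, num_vars):
    """Sum over groups of exp(entropy) of the empirical code usage.

    assignments: list of frames; each frame is a tuple holding the chosen
    entry index for every group (hypothetical layout, for illustration).
    """
    n = len(assignments)
    total = 0.0
    for g in range(num_groups):
        # Count how often each entry of group g was selected.
        counts = [0] * num_vars
        for frame in assignments:
            counts[frame[g]] += 1
        # Entropy of the empirical usage distribution (natural log).
        entropy = 0.0
        for c in counts:
            if c > 0:
                p = c / n
                entropy -= p * math.log(p)
        total += math.exp(entropy)
    return total

# Toy example: 2 groups with 4 entries each, so the maximum is 8.
uniform = [(i, i) for i in range(4)]   # every entry used equally often
collapsed = [(0, 0)] * 4               # only one entry per group ever used
print(code_perplexity(uniform, 2, 4))
print(code_perplexity(collapsed, 2, 4))
```

The uniform case reaches the maximum and the collapsed case sits at the minimum (one per group), which is why a very low value signals poor codebook use.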

PeiyuChen1005 · 2021-07-01T17:10:23Z

So does that mean it doesn't matter how large the diversity loss is, as long as most of the codebook is used? And when monitoring code_perplexity, is the percentage of the codebook used = code_perplexity / (num latent groups * num latent vars)?

alexeib · 2021-07-01T17:12:56Z

Yes. Too high a coefficient can also hurt the main objective.
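A simplified sketch of how the coefficient trades off against the main objective. It assumes the diversity term has the form (G*V - perplexity) / (G*V), with G groups and V entries per group, and that it is added to the contrastive loss with a scalar weight. The function names are hypothetical, and any batch-size scaling the real criterion applies is left out.

```python
def diversity_penalty(perplexity, num_groups, num_vars):
    # Assumed form: 0 when the whole codebook is used evenly,
    # approaching 1 as usage collapses.
    total = num_groups * num_vars
    return (total - perplexity) / total

def total_loss(contrastive, perplexity, num_groups, num_vars, weight=0.1):
    # The weight only rescales the penalty relative to the main objective.
    penalty = diversity_penalty(perplexity, num_groups, num_vars)
    return contrastive + weight * penalty

# Same contrastive loss and perplexity, different weights.
for w in (0.1, 0.5, 1.0):
    print(w, total_loss(2.0, 320, 2, 320, weight=w))
```

Under these assumptions a larger weight makes the penalty a larger share of the total, so its gradient competes more strongly with the contrastive objective.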

PeiyuChen1005 · 2021-07-02T05:18:23Z

Thank you alexeib!!!!! I get it~ @alexeib

PeiyuChen1005 · 2021-07-02T11:50:21Z

I found that my code_perplexity is quite low (diversity weight 0.1: code_perplexity ~100; diversity weight 0.5: code_perplexity ~300). Can you please tell me what range of code_perplexity, or codebook percentage, is normal when the total number of codebooks is 640? @alexeib

alexeib · 2021-07-27T16:39:04Z

Anything that is not super low will generally do OK, e.g. the 100-500 range.
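As a quick check, the ratio confirmed earlier in the thread can be applied to the numbers reported here. The helper name `codebook_usage` is hypothetical, and splitting the 640 total into 2 groups of 320 entries is an assumption; only the total of 640 comes from the thread.

```python
def codebook_usage(code_perplexity, num_groups=2, num_vars=320):
    # Fraction of the codebook in use: code_perplexity / (groups * vars).
    # The 2 x 320 split is assumed; the thread only states a total of 640.
    return code_perplexity / (num_groups * num_vars)

# Values reported in the thread for two diversity weights.
for weight, ppl in [(0.1, 100), (0.5, 300)]:
    print(f"weight={weight}: {codebook_usage(ppl):.1%} of 640")

# The suggested 100-500 range expressed as a fraction of 640.
print(f"{codebook_usage(100):.1%} to {codebook_usage(500):.1%}")
```

By this measure both reported runs fall inside the range described as acceptable.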

PeiyuChen1005 added needs triage question labels Jul 1, 2021

lematt1991 assigned alexeib Jul 1, 2021

lematt1991 removed the needs triage label Jul 1, 2021

PeiyuChen1005 closed this as completed Jul 2, 2021

PeiyuChen1005 reopened this Jul 2, 2021

PeiyuChen1005 closed this as completed Aug 17, 2021
