
Revise parameters for Kaldi mfcc compatibility test #689

Open
mthrok opened this issue Jun 4, 2020 · 5 comments

Comments

@mthrok
Collaborator

mthrok commented Jun 4, 2020

Similar to #679

We should also revise the parameters for mfcc test.

See also #681

@engineerchuan
Contributor

I would like to work on this.

@mthrok
Collaborator Author

mthrok commented Jul 14, 2020

> I would like to work on this.

Hi @engineerchuan

Thanks. Do you know what good parameters for mfcc are? I am not an expert, but we can consult our collaborators.

@engineerchuan
Contributor

Not off the top of my head. Let me study it first for a day and come up with a proposal.

@engineerchuan
Contributor

Hi @mthrok,

I would like to follow this approach, and I have some questions:

  1. For testing compute-fbank-feats and compute-mfcc-feats, first extract the default argument values.
  2. As suggested in #679 (Revise parameters for Kaldi fbank compatibility test), use example datasets to gather more examples of "valid" Kaldi parameters for both fbank and mfcc.

Question 1: How should we store and keep the default values for fbank and mfcc up to date?

Recommendation - cache the default fbank values and the override values in JSON. In the future, revise them manually if Kaldi's default argument values or the example datasets' argument values change.

Question 2: How should we handle datasets that have an fbank config but no mfcc config, or vice versa?

Recommendation - only use a dataset's config for testing fbank or mfcc if it has the respective config.

Example: Switchboard has both fbank and mfcc configs, so we will use it for testing both.

Example: librispeech only stores an mfcc config, so we will not use librispeech for testing fbank.

Question 3: What should we do with generate_fbank_data.py?

Currently, generate_fbank_data.py generates random parameters, which may be invalid. We could have it make network calls (wget) to the relevant repositories, if possible, to retrieve and parse the values. It could also inspect the Kaldi source directory or run the executables with --help to parse out the default values. This sounds hacky, and maybe we should skip it for now.

@mthrok
Collaborator Author

mthrok commented Jul 16, 2020

> Question 1: How should we store and keep the default values for fbank and mfcc up to date?
>
> Recommendation - cache the default fbank values and the override values in json. In future, revise manually if kaldi default argument values or example datasets default argument values change.

I am not quite sure what you mean by "cache", but in terms of JSON data, I think providing empty arguments, {}, would result in default parameters in both the Kaldi CLI and torchaudio's implementation. That way, if Kaldi changes its default values, we will notice. Then we can add arguments pinned at the current default values, {"allow_downsample": false, "allow_upsample": false, ... }. I think the latter is what you mean by caching.
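
The two configurations described above can be sketched as follows (a minimal sketch in plain Python; the parameter names beyond `allow_downsample`/`allow_upsample` and the `build_cli_args` helper are illustrative, not the actual test harness):

```python
# Configuration 1: empty arguments, so both the Kaldi CLI and torchaudio
# fall back to whatever defaults they currently ship. If Kaldi changes a
# default, this run starts disagreeing with the pinned run below.
default_args = {}

# Configuration 2: today's defaults pinned explicitly ("cached").
pinned_args = {
    "allow_downsample": False,
    "allow_upsample": False,
}

def build_cli_args(args):
    """Turn a kwargs dict into Kaldi-style CLI flags, e.g. --allow-downsample=false."""
    flags = []
    for key, value in sorted(args.items()):
        flag = "--" + key.replace("_", "-")
        if isinstance(value, bool):
            value = "true" if value else "false"
        flags.append(f"{flag}={value}")
    return flags

print(build_cli_args(default_args))  # empty list: Kaldi uses its own defaults
print(build_cli_args(pinned_args))
```

Running both configurations side by side is what makes an upstream default change visible as a test diff.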

BTW: currently the Kaldi version used in the test CI is updated manually; I do it from time to time by building a new Docker image and pushing it. Although we plan to automate the update, we do not know when that will happen.

Also, note that there are some parameter discrepancies due to inconsistent design. Kaldi expects a full-range waveform, whereas typical torchaudio functionals expect a normalized waveform; yet the torchaudio.compliance.kaldi module expects full-range values, which confuses users (#371 (comment), #328). I think for this test case we can use load_wav with normalize=False, but you might hit something. We have an idea of making the kaldi module consistent with the rest of the codebase, but we have not planned the work items yet.
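
To make the two conventions concrete, here is a sketch (assuming 16-bit PCM, with plain Python lists rather than tensors): normalized loading yields floats in [-1.0, 1.0], while Kaldi and torchaudio.compliance.kaldi work on full-range sample values, and the two differ only by a scale of 2**15.

```python
# Scale factor between int16 full-range samples and normalized floats.
INT16_SCALE = 2 ** 15  # 32768

def to_full_range(normalized):
    """Convert normalized [-1, 1] samples to full-range (int16-scale) values."""
    return [s * INT16_SCALE for s in normalized]

def to_normalized(full_range):
    """Convert full-range samples back to normalized [-1, 1] floats."""
    return [s / INT16_SCALE for s in full_range]

samples = [0.0, 0.5, -1.0]
print(to_full_range(samples))  # [0.0, 16384.0, -32768.0]
```

Loading with normalize=False skips the division, which is why it is the right choice when feeding the compliance module or comparing against Kaldi CLI output.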

> Question 2: How should we handle when some datasets don't have fbank config or don't have mfcc config?
>
> Recommendation - we should only use configs from datasets for testing fbank or mfcc if they have the respective config.
>
> Example: Switchboard has both fbank and mfcc config, thus we will use both for testing.
>
> Example: librispeech only stores mfcc config, thus we will not use librispeech for testing fbank

Yes, that makes sense.

> Question 3: What should we do with generate_fbank_data.py?
>
> Currently generate_fbank_data.py generates random parameters, which may be invalid. We could have it make network wget calls to the relevant repositories if possible to retrieve and parse the values. It could inspect Kaldi source code directory or execute the executable path with --help to parse out default values. This sounds hacky and maybe we should skip it for now.

generate_fbank_data.py is obsolete and provides no value, so we can simply delete it. It would be nice if our tests could incorporate the latest changes on the Kaldi side automatically, but at the moment the priority is to get good coverage of valid use cases. That by itself is a great improvement.

Also, making tests depend on external resources (networking, files stored elsewhere) increases maintenance cost, so we would like to refrain from doing that. Parsing the help message of the executables is plausible because they are available, but let's defer that one. We can discuss the extra value of doing it once we have a good set of values to test.
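
If the deferred idea is ever picked up, parsing defaults from --help could look roughly like this (the help text below is a made-up sample in the usual Kaldi style of "--flag : description (type, default = value)", not real compute-mfcc-feats output):

```python
import re

# Hypothetical sample of Kaldi-style --help output; in a real run this
# would come from subprocess output of the executable.
SAMPLE_HELP = """
--allow-downsample : If true, allow the input to have a higher frequency (bool, default = false)
--num-mel-bins : Number of triangular mel-frequency bins (int, default = 23)
--frame-length : Frame length in milliseconds (float, default = 25)
"""

def parse_defaults(help_text):
    """Extract {flag: default} pairs from Kaldi-style help output."""
    pattern = re.compile(r"^(--[\w-]+)\s*:.*default = ([^)]+)\)", re.MULTILINE)
    return {flag: value for flag, value in pattern.findall(help_text)}

print(parse_defaults(SAMPLE_HELP))
```

Since this only needs the executable already present in the CI Docker image, it avoids the networking dependency, but as noted above it is worth revisiting only after a good set of test values exists.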
