
Revise parameters for Kaldi mfcc compatibility test #689

Open
mthrok opened this issue Jun 4, 2020 · 5 comments

Comments

@mthrok
Collaborator

mthrok commented Jun 4, 2020

Similar to #679

We should also revise the parameters for mfcc test.

See also #681

@engineerchuan
Contributor

I would like to work on this.

@mthrok
Collaborator Author

mthrok commented Jul 14, 2020

> I would like to work on this.

Hi @engineerchuan

Thanks. Do you know what good parameters for mfcc are? I am not an expert, but we can consult our collaborators.

@engineerchuan
Contributor

Not off the top of my head. Let me study it first for a day and come up with a proposal.

@engineerchuan
Contributor

Hi @mthrok,

I would like to follow this approach, and I have some questions:

  1. For testing compute-fbank-feats and compute-mfcc-feats, first extract the default argument values.
  2. As suggested in #679 (Revise parameters for Kaldi fbank compatibility test), use example datasets to gather more examples of "valid" Kaldi parameters for both fbank and mfcc.

Question 1: How should we store and keep the default values for fbank and mfcc up to date?

Recommendation - cache the default fbank values and the override values in JSON. In the future, revise them manually if Kaldi's default argument values or the example datasets' argument values change.

Question 2: How should we handle datasets that have an fbank config but no mfcc config, or vice versa?

Recommendation - only use a dataset's config for testing fbank or mfcc if it has the respective config.

Example: Switchboard has both fbank and mfcc configs, so we will use it for testing both.

Example: librispeech only stores an mfcc config, so we will not use librispeech for testing fbank.

Question 3: What should we do with generate_fbank_data.py?

Currently, generate_fbank_data.py generates random parameters, which may be invalid. We could have it make network calls (wget) to the relevant repositories, if possible, to retrieve and parse the values. It could also inspect the Kaldi source directory or run the executables with --help to parse out the default values. This sounds hacky, and maybe we should skip it for now.

@mthrok
Collaborator Author

mthrok commented Jul 16, 2020

> Question 1: How should we store and keep the default values for fbank and mfcc up to date?
>
> Recommendation - cache the default fbank values and the override values in json. In future, revise manually if kaldi default argument values or example datasets default argument values change.

I am not quite sure what you mean by "cache", but in terms of JSON data, I think providing empty arguments, {}, would result in default parameters in both the Kaldi CLI and torchaudio's implementation. That way, if Kaldi changes its default values, we will notice. Then we can add arguments pinned at the current default values, {"allow_downsample": false, "allow_upsample": false, ... }. I think the latter is what you mean by caching.
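
The two configurations described above can be sketched as follows (a minimal sketch in plain Python; the parameter names beyond `allow_downsample`/`allow_upsample` and the `build_cli_args` helper are illustrative, not the actual test harness):

```python
# Configuration 1: empty arguments, so both the Kaldi CLI and torchaudio
# fall back to whatever defaults they currently ship. If Kaldi changes a
# default, this run starts disagreeing with the pinned run below.
default_args = {}

# Configuration 2: today's defaults pinned explicitly ("cached").
pinned_args = {
    "allow_downsample": False,
    "allow_upsample": False,
}

def build_cli_args(args):
    """Turn a kwargs dict into Kaldi-style CLI flags, e.g. --allow-downsample=false."""
    flags = []
    for key, value in sorted(args.items()):
        flag = "--" + key.replace("_", "-")
        if isinstance(value, bool):
            value = "true" if value else "false"
        flags.append(f"{flag}={value}")
    return flags

print(build_cli_args(default_args))  # empty list: Kaldi uses its own defaults
print(build_cli_args(pinned_args))
```

Running both configurations side by side is what makes an upstream default change visible as a test diff.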

BTW: currently the Kaldi version used in the test CI is updated manually; I do it from time to time by building a new Docker image and pushing it. Although we plan to automate the update, we do not know when that will happen.

Also, note that there are some parameter discrepancies due to inconsistent design. Kaldi expects a full-range waveform, whereas typical torchaudio functionals expect a normalized waveform; yet the torchaudio.compliance.kaldi module expects full-range values, which confuses users (#371 (comment), #328). I think for this test case we can use load_wav with normalize=False, but you might hit something. We have an idea of making the kaldi module consistent with the rest of the codebase, but we have not planned the work items yet.
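
To make the two conventions concrete, here is a sketch (assuming 16-bit PCM, with plain Python lists rather than tensors): normalized loading yields floats in [-1.0, 1.0], while Kaldi and torchaudio.compliance.kaldi work on full-range sample values, and the two differ only by a scale of 2**15.

```python
# Scale factor between int16 full-range samples and normalized floats.
INT16_SCALE = 2 ** 15  # 32768

def to_full_range(normalized):
    """Convert normalized [-1, 1] samples to full-range (int16-scale) values."""
    return [s * INT16_SCALE for s in normalized]

def to_normalized(full_range):
    """Convert full-range samples back to normalized [-1, 1] floats."""
    return [s / INT16_SCALE for s in full_range]

samples = [0.0, 0.5, -1.0]
print(to_full_range(samples))  # [0.0, 16384.0, -32768.0]
```

Loading with normalize=False skips the division, which is why it is the right choice when feeding the compliance module or comparing against Kaldi CLI output.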

> Question 2: How should we handle when some datasets don't have fbank config or don't have mfcc config?
>
> Recommendation - we should only use configs from datasets for testing fbank or mfcc if they have the respective config.
>
> Example: Switchboard has both fbank and mfcc config, thus we will use both for testing.
>
> Example: librispeech only stores mfcc config, thus we will not use librispeech for testing fbank

Yes, that makes sense.

> Question 3: What should we do with generate_fbank_data.py?
>
> Currently generate_fbank_data.py generates random parameters, which may be invalid. We could have it make network wget calls to the relevant repositories if possible to retrieve and parse the values. It could inspect Kaldi source code directory or execute the executable path with --help to parse out default values. This sounds hacky and maybe we should skip it for now.

generate_fbank_data.py is obsolete and provides no value, so we can simply delete it. It would be nice if our tests could incorporate the latest changes on the Kaldi side automatically, but at the moment the priority is to get good coverage of valid use cases. That by itself is a great improvement.

Also, making tests depend on external resources (networking, files stored elsewhere) increases maintenance cost, so we would like to refrain from doing that. Parsing the help message of the executables is plausible because they are available, but let's defer that one. We can discuss the extra value of doing it once we have a good set of values to test.
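
If the deferred idea is ever picked up, parsing defaults from --help could look roughly like this (the help text below is a made-up sample in the usual Kaldi style of "--flag : description (type, default = value)", not real compute-mfcc-feats output):

```python
import re

# Hypothetical sample of Kaldi-style --help output; in a real run this
# would come from subprocess output of the executable.
SAMPLE_HELP = """
--allow-downsample : If true, allow the input to have a higher frequency (bool, default = false)
--num-mel-bins : Number of triangular mel-frequency bins (int, default = 23)
--frame-length : Frame length in milliseconds (float, default = 25)
"""

def parse_defaults(help_text):
    """Extract {flag: default} pairs from Kaldi-style help output."""
    pattern = re.compile(r"^(--[\w-]+)\s*:.*default = ([^)]+)\)", re.MULTILINE)
    return {flag: value for flag, value in pattern.findall(help_text)}

print(parse_defaults(SAMPLE_HELP))
```

Since this only needs the executable already present in the CI Docker image, it avoids the networking dependency, but as noted above it is worth revisiting only after a good set of test values exists.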
