About fairseq wav2vec2 extractor #43
Hi @xieyuankun Sorry for the inconvenience. To make life easier, please run this script to install the conda environment.
This is described in https://github.com/nii-yamagishilab/project-NN-Pytorch-scripts#python-environment. You can also follow the script and install fairseq manually.
Thanks for your quick reply @TonyWangX
I'll keep trying to fix this bug. By the way, may I ask you another question about wav2vec2 extraction? Have you ever tried the huggingface version of wav2vec2? I used the following code to extract the wav2vec2 feature.
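The exact code is not shown in this thread, but a typical huggingface-style extraction looks roughly like the sketch below. Note that `Wav2Vec2Model.from_pretrained(...)` and the model name in the comment are assumptions about the usual `transformers` workflow, not the code actually used here; a tiny randomly initialised config stands in so the snippet runs offline.

```python
import torch
from transformers import Wav2Vec2Config, Wav2Vec2Model

# In practice one would load a pretrained checkpoint, e.g.:
#   model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base-960h")
# Here a tiny random config is used only to demonstrate the call pattern.
config = Wav2Vec2Config(
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    conv_dim=(32, 32),
    conv_kernel=(10, 3),
    conv_stride=(5, 2),
)
model = Wav2Vec2Model(config).eval()

wav = torch.randn(1, 16000)  # one second of 16 kHz audio
with torch.no_grad():
    out = model(wav)

feature = out.last_hidden_state  # (batch, frames, hidden_size)
print(feature.shape)
```

With a real checkpoint the hidden size would be 768 (base) or 1024 (large) instead of the toy value above.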
However, the above wav2vec2 vector with an LCNN backbone got 12%+ EER on 19LA. So, what's wrong with this extraction process?
I guess that wav2vec_big_960h is a fine-tuned model. Could you try https://dl.fbaipublicfiles.com/fairseq/wav2vec/libri960_big.pt instead?
I haven't used huggingface wav2vec, so I cannot guess. It may be related to the above issue: if the model is fine-tuned for ASR, using the last self-attention layer may not be a good option. FYI, and not directly related to the above issue, some prefer using the output of intermediate layers of wav2vec (doi: 10.21437/Interspeech.2022-11460).
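In the huggingface API, grabbing an intermediate layer instead of the last one can be sketched as follows. The tiny random config and the choice of layer index are illustrative assumptions; with a pretrained model one would pick the layer empirically.

```python
import torch
from transformers import Wav2Vec2Config, Wav2Vec2Model

# Tiny random config so the example runs offline; swap in
# Wav2Vec2Model.from_pretrained(...) for real feature extraction.
config = Wav2Vec2Config(
    hidden_size=32,
    num_hidden_layers=4,
    num_attention_heads=2,
    intermediate_size=64,
    conv_dim=(32, 32),
    conv_kernel=(10, 3),
    conv_stride=(5, 2),
)
model = Wav2Vec2Model(config).eval()

wav = torch.randn(1, 16000)
with torch.no_grad():
    out = model(wav, output_hidden_states=True)

# hidden_states holds the CNN-encoder output plus every transformer
# layer: num_hidden_layers + 1 tensors of shape (batch, frames, hidden_size).
assert len(out.hidden_states) == config.num_hidden_layers + 1
mid_layer_feature = out.hidden_states[2]  # an intermediate layer, chosen arbitrarily
```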
Thanks! You are right!
Thank you for the feedback!
Hello, I have a problem with the wav2vec2 feature extractor.
I used your code (`class SSLModel(torch_nn.Module)`) to extract wav2vec2 features and got this error:

```
Traceback (most recent call last):
  File "/home/xieyuankun/asvspoof2019_wav2vec2_fairseq/preprocess.py", line 274, in <module>
    w2vmodel = SSLModel('/home/xieyuankun/data/wav2vec_big_960h.pt', 1024)
  File "/home/xieyuankun/asvspoof2019_wav2vec2_fairseq/preprocess.py", line 228, in __init__
    md, _, _ = fq.checkpoint_utils.load_model_ensemble_and_task([mpath])
  File "/home/xieyuankun/miniconda3/lib/python3.9/site-packages/fairseq/checkpoint_utils.py", line 473, in load_model_ensemble_and_task
    model = task.build_model(cfg.model, from_checkpoint=True)
  File "/home/xieyuankun/miniconda3/lib/python3.9/site-packages/fairseq/tasks/audio_pretraining.py", line 197, in build_model
    model = super().build_model(model_cfg, from_checkpoint)
  File "/home/xieyuankun/miniconda3/lib/python3.9/site-packages/fairseq/tasks/fairseq_task.py", line 338, in build_model
    model = models.build_model(cfg, self, from_checkpoint)
  File "/home/xieyuankun/miniconda3/lib/python3.9/site-packages/fairseq/models/__init__.py", line 106, in build_model
    return model.build_model(cfg, task)
  File "/home/xieyuankun/miniconda3/lib/python3.9/site-packages/fairseq/models/wav2vec/wav2vec2_asr.py", line 208, in build_model
    w2v_encoder = Wav2VecEncoder(cfg, len(task.target_dictionary))
TypeError: object of type 'NoneType' has no len()
```
It seems to be a problem in fairseq: `fq.checkpoint_utils.load_model_ensemble_and_task([mpath])` does not set up the task's target dictionary, so `task.target_dictionary` is `None` when `Wav2VecEncoder(cfg, len(task.target_dictionary))` is called.
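Consistent with the suggestion above, the error comes from loading an ASR-fine-tuned checkpoint (`wav2vec_big_960h.pt`), whose `Wav2VecEncoder` needs a target dictionary, through the pretraining task. A sketch of loading the *pretrained* checkpoint instead, assuming `libri960_big.pt` has been downloaded to the path shown (the path is hypothetical):

```python
import torch
import fairseq

# Pretrained (not ASR-fine-tuned) wav2vec 2.0 checkpoint; a fine-tuned
# checkpoint such as wav2vec_big_960h.pt would fail here because its
# model class requires task.target_dictionary.
ckpt = "/path/to/libri960_big.pt"  # hypothetical path
models, cfg, task = fairseq.checkpoint_utils.load_model_ensemble_and_task([ckpt])
model = models[0].eval()

wav = torch.randn(1, 16000)
with torch.no_grad():
    # features_only=True skips the quantization/contrastive heads and
    # returns the transformer output in result["x"]: (batch, frames, dim).
    result = model(wav, mask=False, features_only=True)
features = result["x"]
```

This snippet cannot be run without the checkpoint file, so treat it as a call-pattern sketch rather than tested code.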