About fairseq wav2vec2 extractor #43

Closed
xieyuankun opened this issue Nov 14, 2022 · 6 comments

Comments

@xieyuankun

Hello, I have a problem with the wav2vec2 feature extractor.
I use your class SSLModel(torch_nn.Module) to extract wav2vec2 features and get this error:
Traceback (most recent call last):
  File "/home/xieyuankun/asvspoof2019_wav2vec2_fairseq/preprocess.py", line 274, in <module>
    w2vmodel = SSLModel('/home/xieyuankun/data/wav2vec_big_960h.pt', 1024)
  File "/home/xieyuankun/asvspoof2019_wav2vec2_fairseq/preprocess.py", line 228, in __init__
    md, _, _ = fq.checkpoint_utils.load_model_ensemble_and_task([mpath])
  File "/home/xieyuankun/miniconda3/lib/python3.9/site-packages/fairseq/checkpoint_utils.py", line 473, in load_model_ensemble_and_task
    model = task.build_model(cfg.model, from_checkpoint=True)
  File "/home/xieyuankun/miniconda3/lib/python3.9/site-packages/fairseq/tasks/audio_pretraining.py", line 197, in build_model
    model = super().build_model(model_cfg, from_checkpoint)
  File "/home/xieyuankun/miniconda3/lib/python3.9/site-packages/fairseq/tasks/fairseq_task.py", line 338, in build_model
    model = models.build_model(cfg, self, from_checkpoint)
  File "/home/xieyuankun/miniconda3/lib/python3.9/site-packages/fairseq/models/__init__.py", line 106, in build_model
    return model.build_model(cfg, task)
  File "/home/xieyuankun/miniconda3/lib/python3.9/site-packages/fairseq/models/wav2vec/wav2vec2_asr.py", line 208, in build_model
    w2v_encoder = Wav2VecEncoder(cfg, len(task.target_dictionary))
TypeError: object of type 'NoneType' has no len()
It seems to be a problem inside fairseq's fq.checkpoint_utils.load_model_ensemble_and_task([mpath]): the task's target dictionary is never set up, so task.target_dictionary is None when Wav2VecEncoder(cfg, len(task.target_dictionary)) is called.

@TonyWangX
Member

TonyWangX commented Nov 14, 2022

Hi @xieyuankun

Sorry for the inconvenience.
I use the fairseq code from their GitHub repository, not the fairseq package from pip.
The fairseq GitHub code is updated regularly, so I need to use a specific commit from that repository.

To make life easier, please run this script to install the conda environment.
It will install fairseq and the other dependencies.
https://github.com/nii-yamagishilab/project-NN-Pytorch-scripts/blob/master/env-fs-install.sh

# make sure other conda envs are not loaded
bash env-fs-install.sh

# load
conda activate fairseq-pip2

This is written in https://github.com/nii-yamagishilab/project-NN-Pytorch-scripts#python-environment

You can also follow the script and install fairseq manually.

@xieyuankun
Author

Thanks for your quick reply, @TonyWangX.
Unfortunately, I just used the fairseq version from your env-fs-install.sh and got the same error.

File "preprocess.py", line 274, in
w2vmodel = SSLModel('/home/xieyuankun/data/wav2vec_big_960h.pt', 1024)
File "preprocess.py", line 228, in init
md, _, _ = checkpoint_utils.load_model_ensemble_and_task([mpath])
"/home/xieyuankun/asvspoof2019_wav2vec2_fairseq/fairseq/checkpoint_utils.py", line 457, in load_model_ensemble_and_task
model = task.build_model(cfg.model)
"/home/xieyuankun/asvspoof2019_wav2vec2_fairseq/fairseq/tasks/audio_pretraining.py", line 198, in build_model
model = super().build_model(model_cfg)
"/home/xieyuankun/asvspoof2019_wav2vec2_fairseq/fairseq/tasks/fairseq_task.py", line 320, in build_model
model = models.build_model(cfg, self)
"/home/xieyuankun/asvspoof2019_wav2vec2_fairseq/fairseq/models/init.py", line 107, in build_model
return model.build_model(cfg, task)
"/home/xieyuankun/asvspoof2019_wav2vec2_fairseq/fairseq/models/wav2vec/wav2vec2_asr.py", line 176, in build_model
w2v_encoder = Wav2VecEncoder(cfg, len(task.target_dictionary))
TypeError: object of type 'NoneType' has no len()

I'll keep trying to fix this bug. By the way, may I ask you another question about wav2vec2 extraction? Have you ever tried the HuggingFace version of wav2vec2? I have used the following code to extract wav2vec2 features.

  import soundfile as sf
  from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

  # sf.read returns (waveform, sample_rate)
  waveform, sr = sf.read(path)
  feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-large-960h")
  model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-large-960h").cuda()

  input_values = feature_extractor(waveform, sampling_rate=16000, return_tensors="pt").input_values.cuda()
  wav2vec2_vector = model(input_values)['last_hidden_state']

However, the above wav2vec2_vector with an LCNN backbone got 12%+ EER on 19LA (ASVspoof 2019 LA). So, what's wrong with this extraction process?

@TonyWangX
Member

TonyWangX commented Nov 14, 2022

I guess that wav2vec_big_960h is a fine-tuned model.
facebookresearch/fairseq#3831

Could you try https://dl.fbaipublicfiles.com/fairseq/wav2vec/libri960_big.pt instead?
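In case it helps, here is a minimal sketch of loading the pre-trained checkpoint through fairseq's checkpoint utilities; the path is a placeholder, and the exact extract_features signature may differ slightly between fairseq versions:

import torch
from fairseq import checkpoint_utils

# placeholder path to the *pre-trained* wav2vec 2.0 checkpoint (libri960_big.pt)
ckpt_path = '/path/to/libri960_big.pt'

# the pre-training checkpoint has no ASR target dictionary requirement, so this call
# should not hit the len(task.target_dictionary) error seen with wav2vec_big_960h.pt
models, cfg, task = checkpoint_utils.load_model_ensemble_and_task([ckpt_path])
model = models[0]
model.eval()

# dummy 1-second waveform at 16 kHz, shape (batch, samples)
wav = torch.randn(1, 16000)

with torch.no_grad():
    out = model.extract_features(wav, padding_mask=None, mask=False)
    features = out['x']   # (batch, frames, feature_dim)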

@TonyWangX
Member

I haven't used the HuggingFace wav2vec, hence I cannot guess.

Maybe related to the above issue: if the model is fine-tuned for ASR, using the output of the last self-attention layer may not be a good option.

FYI, not directly related to the above issue, some prefer using the output of intermediate layers of wav2vec (DOI: 10.21437/Interspeech.2022-11460).
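
Not tested on my side, but with the HuggingFace model a sketch of reading an intermediate layer could look like the following (the layer index 12 is an arbitrary example, not a recommendation):

import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-large-960h")
model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-large-960h")
model.eval()

# dummy 1-second waveform at 16 kHz
waveform = torch.randn(16000).numpy()
inputs = feature_extractor(waveform, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    # output_hidden_states=True returns the feature-projection output plus every transformer layer
    outputs = model(inputs.input_values, output_hidden_states=True)

# hidden_states[0] is the feature-projection output;
# hidden_states[i] (i >= 1) is the output of transformer layer i
intermediate = outputs.hidden_states[12]   # (batch, frames, hidden_dim)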

@xieyuankun
Author

I guess that wav2vec_big_960h is a fine-tuned model. facebookresearch/fairseq#3831

Could you try https://dl.fbaipublicfiles.com/fairseq/wav2vec/libri960_big.pt instead?

Thanks! You are right!
I read the source code for a long time, but I didn't expect the checkpoint itself to be the wrong one.
I wish you good luck with your research and look forward to your future work.

@TonyWangX
Member

Thank you for the feedback!
Noted as a potential issue to keep in mind ; )
