
configuration files for the "no pre-training" setups #15

Closed
YUCHEN005 opened this issue Feb 3, 2022 · 10 comments
@YUCHEN005

Hi authors,

Thank you for sharing such nice research work! Thanks to your previous help, I have successfully completed the LRS3 and MUSAN data preparation.

I am now interested in directly fine-tuning the AVSR system (without pre-training, because of computing resource limits), and hope to reproduce the following highlighted systems in the paper "Robust Self-Supervised Audio-Visual Speech Recognition":

(screenshot: highlighted rows of the paper's results table)

I wonder if you can share the config files for these four systems?

If that's inconvenient, could you give me some guidance on how to modify the existing config files in the directory conf/av-finetune/? I am new to this field and don't know how to do this; hope you can help me :)

Thank you very much!!

@chevalierNoir
Contributor

Hi,

To train a model from scratch, you can simply use the config files in conf/av-finetune and set model.no_pretrained_weights to true and model.freeze_finetune_updates to 0 in the configuration. Also make sure you have downloaded a pre-trained model and that its path is set in model.w2v_path, since the model configuration is still read from that checkpoint even when its weights are not loaded.

The best hyperparameters for training from scratch might also differ from those with pre-training, so you may need to tune hyperparameters a bit such as number of updates, warmup steps, etc.
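A minimal sketch of those overrides inside a copy of an existing fine-tuning config (the file path and checkpoint name below are placeholders, not the repository's actual names):

```yaml
# Excerpt of a conf/av-finetune config adapted for training from scratch
model:
  w2v_path: /path/to/downloaded_checkpoint.pt  # still read for the model config
  no_pretrained_weights: true                  # random init, no pre-trained weights
  freeze_finetune_updates: 0                   # never freeze the encoder
```

The same three keys can also be passed as command-line overrides to fairseq-hydra-train (e.g. model.no_pretrained_weights=true) instead of editing the file.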

@YUCHEN005
Author

Thank you for the details! I have created the config file and it seems to load successfully, but then this error occurs:

(avhubert) [huyuchen@ntu-sce-headnode avhubert]$ fairseq-hydra-train --config-dir /home3/huyuchen/pytorch_workplace/av_hubert/avhubert/conf/av-finetune --config-name base_noise_30h.yaml
Traceback (most recent call last):
File "/home3/huyuchen/anaconda3/envs/avhubert/lib/python3.7/site-packages/hydra/_internal/utils.py", line 198, in run_and_report
return func()
File "/home3/huyuchen/anaconda3/envs/avhubert/lib/python3.7/site-packages/hydra/_internal/utils.py", line 350, in <lambda>
overrides=args.overrides,
File "/home3/huyuchen/anaconda3/envs/avhubert/lib/python3.7/site-packages/hydra/_internal/hydra.py", line 112, in run
configure_logging=with_log_configuration,
File "/home3/huyuchen/anaconda3/envs/avhubert/lib/python3.7/site-packages/hydra/core/utils.py", line 129, in run_job
ret.return_value = task_function(task_cfg)
File "/home3/huyuchen/pytorch_workplace/av_hubert/fairseq/fairseq_cli/hydra_train.py", line 45, in hydra_main
distributed_utils.call_main(cfg, pre_main)
File "/home3/huyuchen/pytorch_workplace/av_hubert/fairseq/fairseq/distributed/utils.py", line 369, in call_main
main(cfg, **kwargs)
File "/home3/huyuchen/pytorch_workplace/av_hubert/fairseq/fairseq_cli/train.py", line 53, in main
utils.import_user_module(cfg.common)
File "/home3/huyuchen/pytorch_workplace/av_hubert/fairseq/fairseq/utils.py", line 482, in import_user_module
importlib.import_module(module_name)
File "/home3/huyuchen/anaconda3/envs/avhubert/lib/python3.7/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
File "<frozen importlib._bootstrap>", line 983, in _find_and_load
File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 728, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home3/huyuchen/pytorch_workplace/av_hubert/avhubert/__init__.py", line 6, in <module>
from .hubert import * # noqa
File "/home3/huyuchen/pytorch_workplace/av_hubert/avhubert/hubert.py", line 46, in <module>
from .hubert_pretraining import (
File "/home3/huyuchen/pytorch_workplace/av_hubert/avhubert/hubert_pretraining.py", line 30, in <module>
from .hubert_dataset import AVHubertDataset
File "/home3/huyuchen/pytorch_workplace/av_hubert/avhubert/hubert_dataset.py", line 34, in <module>
from . import utils as custom_utils
File "/home3/huyuchen/pytorch_workplace/av_hubert/avhubert/utils.py", line 7, in <module>
import cv2
File "/home3/huyuchen/anaconda3/envs/avhubert/lib/python3.7/site-packages/cv2/__init__.py", line 4, in <module>
from .cv2 import *
ImportError: dlopen: cannot load any more object with static TLS
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home3/huyuchen/anaconda3/envs/avhubert/bin/fairseq-hydra-train", line 33, in <module>
sys.exit(load_entry_point('fairseq', 'console_scripts', 'fairseq-hydra-train')())
File "/home3/huyuchen/pytorch_workplace/av_hubert/fairseq/fairseq_cli/hydra_train.py", line 76, in cli_main
hydra_main()
File "/home3/huyuchen/anaconda3/envs/avhubert/lib/python3.7/site-packages/hydra/main.py", line 37, in decorated_main
strict=strict,
File "/home3/huyuchen/anaconda3/envs/avhubert/lib/python3.7/site-packages/hydra/_internal/utils.py", line 347, in _run_hydra
lambda: hydra.run(
File "/home3/huyuchen/anaconda3/envs/avhubert/lib/python3.7/site-packages/hydra/_internal/utils.py", line 237, in run_and_report
assert mdl is not None
AssertionError

I googled and found that the problem lies in the import order of cv2 and torch in /home3/huyuchen/pytorch_workplace/av_hubert/avhubert/utils.py, but no matter how I modify it, the bug stays the same. Do you have any idea how to solve it?

@YUCHEN005
Copy link
Author

Hi, I have solved this issue: cv2 should not be imported globally; it should be imported inside each method.
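The workaround can be sketched as follows (load_frames is a hypothetical helper, not the repository's actual function): deferring the cv2 import into the function body means torch's shared libraries are loaded first at process start, which avoids the "cannot load any more object with static TLS" dlopen error.

```python
def load_frames(video_path):
    """Read all frames of a video as a list of BGR numpy arrays."""
    import cv2  # local import: dlopen only happens when the function is called

    cap = cv2.VideoCapture(video_path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)
    cap.release()
    return frames
```

Defining the function does not trigger the import; cv2 is only loaded on the first call, by which time torch has already been imported by fairseq.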

I found another issue that deserves your notice: the class AVHubertPretrainingTask has a method load_dictionaries which loads a dictionary from "{label_dir}/dict.{label}.txt", where label is mfcc (first iteration) or km (later iterations). However, after applying k-means on MFCC features to obtain the pre-training labels, I don't see any dict.mfcc.txt. Are there any steps omitted?

@chevalierNoir
Contributor

Hi,

Thank you for reporting this issue. The dict.{label}.txt is only generated automatically if one uses submit_cluster.py for clustering. Otherwise, one needs to create it manually with: for i in $(seq 1 $((n_cluster-1))); do echo $i 10000; done > $lab_dir/dict.{label}.txt. We will add that to the readme.
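Spelled out with concrete values (100 clusters, the km label, and the current directory as lab_dir are assumptions for illustration; 10000 is a dummy frequency count that fairseq dictionaries require):

```shell
# Create a fairseq dictionary file for k-means cluster labels.
n_cluster=100
lab_dir=.
for i in $(seq 1 $((n_cluster-1))); do echo "$i 10000"; done > "$lab_dir/dict.km.txt"
```

This writes one "id count" pair per line, one line per cluster id produced by the clustering step.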

@YUCHEN005
Author

yep, thanks a lot!

@YUCHEN005
Author

Hi authors, I wonder if all the systems in Table 3 are fine-tuned on clean data (instead of noisy data)?

@chevalierNoir
Contributor

Hi,

The models in table 3 are fine-tuned on noisy data.

@YUCHEN005
Author

Thank you for quick reply.

But I am confused: if the models in Table 3 are fine-tuned on noisy data, then comparing Table 3(a) with Table 4(d) above, under the same training setting (no pre-training + 30h noisy-data fine-tuning), the LARGE model performs worse than the BASE model by 15%, which seems abnormal :(

@chevalierNoir
Contributor

Hi,

This is probably due to overfitting of the large model in this low-resource setting.

@YUCHEN005
Author

Oh I see, thanks a lot.
