Thanks for the awesome work! I am wondering if it is possible to make AV-HuBERT work for other languages, e.g., Chinese.
I notice that there is a multilingual version in the paper. Is it compatible with different languages? Otherwise, could you provide any suggestions, assuming there is a Chinese lip movement dataset.
Thanks!
@cooelf Yes, using AV-HuBERT for other languages should also work. You can take a pre-trained checkpoint (large or base) and fine-tune it on a Chinese lip-reading dataset following the fine-tuning command, and refer to this for how to prepare the data. Alternatively, pre-training a Chinese version of AV-HuBERT from scratch is also doable if you have a sufficiently large amount of audio-visual data.
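For reference, fine-tuning in this repo follows the standard `fairseq-hydra-train` pattern. A minimal sketch of what fine-tuning on a new-language dataset might look like, assuming the data has already been prepared in the expected manifest/label format (all paths and the config name below are placeholders, not actual values from this repo):

```shell
# Sketch only: fine-tune a pre-trained AV-HuBERT checkpoint on a custom
# (e.g., Chinese) lip-reading dataset via fairseq's hydra entry point.
# Placeholder values -- substitute your own paths and config name.
cd avhubert

fairseq-hydra-train \
  --config-dir ./conf/ \
  --config-name <finetune-config-name> \
  task.data=/path/to/chinese/data \
  task.label_dir=/path/to/chinese/labels \
  model.w2v_path=/path/to/pretrained/checkpoint.pt \
  hydra.run.dir=/path/to/experiment/finetune/ \
  common.user_dir=`pwd`
```

The key idea is that only the data, label, and checkpoint paths change; the pre-trained model itself is language-agnostic at the feature level, so the fine-tuning recipe carries over to a new language unchanged.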
We mentioned a multilingually pre-trained AV-HuBERT in the paper, but that model was not released because it is not as good as the English-only one on the LRS3 benchmark. FYI, we did fine-tune AV-HuBERT multilingually in our follow-up work, and you can find the model checkpoints in this repo.