Language issues [No default align-model for language: gu] #21

Closed
alloc7260 opened this issue Apr 11, 2023 · 7 comments

@alloc7260

Error :
Detected language: Gujarati
100%|██████████| 8533/8533 [00:16<00:00, 512.95frames/s]
There is no default alignment model set for this language (gu). Please find a wav2vec2.0 model finetuned on this language in https://huggingface.co/models, then pass the model name in --align_model [MODEL_NAME]


ValueError Traceback (most recent call last)
in <cell line: 8>()
6
7 device = "cuda"
----> 8 alignment_model, metadata = whisperx.load_align_model(
9 language_code=whisper_results["language"], device=device
10 )

/usr/local/lib/python3.9/dist-packages/whisperx/alignment.py in load_align_model(language_code, device, model_name)
51 print(f"There is no default alignment model set for this language ({language_code}).
52 Please find a wav2vec2.0 model finetuned on this language in https://huggingface.co/models, then pass the model name in --align_model [MODEL_NAME]")
---> 53 raise ValueError(f"No default align-model for language: {language_code}")
54
55 if model_name in torchaudio.pipelines.__all__:

ValueError: No default align-model for language: gu

@MahmoudAshraf97
Owner

Not all languages are supported right now. I'm actively working on supporting more languages.

@alloc7260
Author

I am also willing to contribute to this.
I just wanted a little guidance.

@alloc7260
Author

Can you tell me how many languages are supported right now?

@MahmoudAshraf97
Owner

Right now, word timestamps are generated using WhisperX. For languages that WhisperX does not support, they can be generated using Whisper's Dynamic Time Warping; you can find tutorials for that in the original Whisper repo. The supported languages are listed in the code.
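
For reference, the Whisper fallback mentioned above might look roughly like the sketch below. It assumes the `openai-whisper` package, whose `transcribe` accepts `word_timestamps=True` and then attaches a `words` list to each segment. The `collect_words` helper, the model size, and the audio file name are hypothetical and only serve as illustration.

```python
from typing import Any, Dict, List


def collect_words(result: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten per-segment word timings from a Whisper result into one list."""
    words = []
    for segment in result.get("segments", []):
        # "words" is only present when transcribe() ran with word_timestamps=True
        for w in segment.get("words", []):
            words.append(
                {"word": w["word"].strip(), "start": w["start"], "end": w["end"]}
            )
    return words


# Usage (assumes openai-whisper is installed; "audio.wav" is a hypothetical file):
# import whisper
# model = whisper.load_model("small")
# result = model.transcribe("audio.wav", word_timestamps=True)
# for w in collect_words(result):
#     print(f'{w["start"]:.2f}-{w["end"]:.2f} {w["word"]}')
```

This path does not need a language-specific alignment model, so it should work for any language Whisper itself can transcribe, though timing accuracy may differ from forced alignment.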

@alloc7260
Author

# Name of a wav2vec2 model on Hugging Face to use for alignment
mn = "skylord/wav2vec2-large-xlsr-hindi"  #@param
alignment_model, metadata = whisperx.load_align_model(
    language_code=whisper_results["language"], device=device, model_name=mn
)

I changed this line.
It is used to load a language-specific model from Hugging Face.

Many models are available there for many languages.

Pick a model name that suits your language, put it in the mn variable,

and continue running...

WER will vary depending on the model you choose.
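
The workaround above could also be wrapped in a small helper that tries the default alignment model first and only then falls back to a hand-picked one. This is a sketch under assumptions: `CUSTOM_ALIGN_MODELS`, `load_alignment`, and the injected `loader` argument are hypothetical names, the loader is assumed to raise `ValueError` for unsupported languages as in the traceback above, and the only override listed is the Hindi checkpoint quoted in this comment (not a Gujarati-specific model).

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical per-language overrides; the value is the checkpoint quoted above.
CUSTOM_ALIGN_MODELS: Dict[str, str] = {"gu": "skylord/wav2vec2-large-xlsr-hindi"}


def load_alignment(
    loader: Callable[..., Any],
    language_code: str,
    device: str,
    overrides: Optional[Dict[str, str]] = None,
) -> Any:
    """Try the default alignment model first, then a user-chosen override."""
    overrides = CUSTOM_ALIGN_MODELS if overrides is None else overrides
    try:
        return loader(language_code=language_code, device=device)
    except ValueError:
        model_name = overrides.get(language_code)
        if model_name is None:
            raise  # no override known either; surface the original error
        return loader(
            language_code=language_code, device=device, model_name=model_name
        )


# Usage (assumes whisperx is installed and whisper_results exists):
# alignment_model, metadata = load_alignment(
#     whisperx.load_align_model, whisper_results["language"], "cuda"
# )
```

Passing the loader in as an argument keeps the helper independent of any particular whisperx version and makes it easy to try with a stub.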

@MahmoudAshraf97
Owner

You can modify this in the WhisperX repo; we import the supported languages from there.

@MahmoudAshraf97
Owner

@alloc7260 Hello, all languages that are supported in Whisper are now supported in the code.
