
Whisper v3 #560

Open
RaulKite opened this issue Nov 6, 2023 · 22 comments

Comments

@RaulKite

RaulKite commented Nov 6, 2023

How do I use the new whisper large-v3 model with WhisperX?

Is it enough to delete the current large model and download it again, or is an update needed?

Thanks

@olivierduponttav

olivierduponttav commented Nov 6, 2023 via email

@petiatil

petiatil commented Nov 6, 2023

Wouldn't you just replace "large-v2" with "large-v3"?

@xrishox

xrishox commented Nov 7, 2023

ValueError: Invalid model size 'large-v3', expected one of: tiny.en, tiny, base.en, base, small.en, small, medium.en, medium, large-v1, large-v2, large
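This error means the installed faster-whisper predates large-v3: its model registry only knows the sizes listed in the message. A hedged sketch of a fallback guard (the helper name is hypothetical, not part of the whisperx API):

```python
# Model sizes accepted by pre-large-v3 faster-whisper builds
# (the list comes straight from the error message above).
SUPPORTED = {
    "tiny.en", "tiny", "base.en", "base", "small.en", "small",
    "medium.en", "medium", "large-v1", "large-v2", "large",
}

def pick_model(requested: str, supported=frozenset(SUPPORTED)) -> str:
    """Hypothetical guard: fall back to large-v2 until the install is upgraded."""
    return requested if requested in supported else "large-v2"
```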

@Hokyjack

Hokyjack commented Nov 7, 2023

@petiatil

petiatil commented Nov 7, 2023

Has anyone explored whether the large-v3 model link can be integrated for dynamic downloading like the rest of the models (instead of requiring a pre-download)? If possible, please point out which file or files in the repository would need to be modified to achieve this.

@picheny-nyu

For use in whisperx, I think the large-v3 model needs to be converted to the "faster-whisper" format. It looks straightforward, but it is perhaps best done by the original authors, as something always goes wrong and requires some debugging.
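A sketch of that conversion step, assuming a ctranslate2 and transformers install new enough to know the large-v3 architecture; the output directory name is arbitrary, and the call is guarded behind an environment variable because it downloads several GB of weights:

```python
# Hedged sketch: convert the Hugging Face checkpoint to the CTranslate2
# format that faster-whisper loads.
import importlib.util
import os

def convert_large_v3(output_dir: str = "whisper-large-v3-ct2") -> None:
    if importlib.util.find_spec("ctranslate2") is None:
        raise RuntimeError("ctranslate2 is not installed")
    from ctranslate2.converters import TransformersConverter
    converter = TransformersConverter("openai/whisper-large-v3")
    converter.convert(output_dir, quantization="float16")

if __name__ == "__main__" and os.environ.get("RUN_CONVERSION") == "1":
    convert_large_v3()  # downloads the full checkpoint on first run
```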

@alexivdv

alexivdv commented Nov 7, 2023

There is already an open pull request in faster-whisper to support this (SYSTRAN/faster-whisper#548), including the model converted to the CTranslate2 format. Note that the feature size also increased from 80 to 128.
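The feature-size change is worth calling out, since it is what older loaders choke on. Roughly, as a minimal sketch (the helper is illustrative, not a real whisperx function):

```python
# large-v3 computes 128-bin mel spectrograms; all earlier Whisper
# checkpoints use 80 bins.
def n_mels_for(model_name: str) -> int:
    return 128 if model_name == "large-v3" else 80
```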

@colemanhindes

Just need to wait for OpenAI to release it on Hugging Face and it will be easy to implement.


Looks like it has been added to Hugging Face: https://huggingface.co/openai/whisper-large-v3

@picheny-nyu

I tried to run the above and get the following error:

File "/ext3/miniconda3/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2065, in _from_pretrained
raise ValueError(
ValueError: Non-consecutive added token '<|0.02|>' found. Should have index 50365 but has index 50366 in saved vocabulary.

Has anyone else seen this?

@vilsonrodrigues

I tried to run the above and get the following error:

File "/ext3/miniconda3/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2065, in _from_pretrained raise ValueError( ValueError: Non-consecutive added token '<|0.02|>' found. Should have index 50365 but has index 50366 in saved vocabulary.

Has anyone else seen this?

OpenAI added a new language, so there is a new token.
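That one-token shift matches the traceback: the new language token is inserted before the timestamp range, so '<|0.02|>' (and everything after it) moves up by one index, and older tokenizer code refuses to load the vocabulary. A minimal illustration with the indices taken from the error message:

```python
# Indices from the error message above.
expected_index = 50365  # where old tokenizer code looks for '<|0.02|>'
actual_index = 50366    # where it actually sits in the large-v3 vocabulary

# Exactly one new token (the added language) shifted the whole range.
shift = actual_index - expected_index
print(shift)  # 1
```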

@picheny-nyu

I got the huggingface large-v3 working by upgrading the transformers package. Apparently there is new tokenization code (sigh). However, I don't think there is a new version of faster-whisper yet. When there is, can we just get it with a pip install whisperx --upgrade type of command, or must we upgrade the faster_whisper package manually ourselves?

@vilsonrodrigues

I got the huggingface large-v3 working by upgrading the transformers package. Apparently there is new tokenization code (sigh). However, I don't think there is a new version of faster-whisper yet. When there is, can we just get it with a pip install whisperx --upgrade type of command, or must we upgrade the faster_whisper package manually ourselves?

There are already 2 PRs in faster-whisper, but the official maintainer is no longer providing direct support, so we have to wait for another maintainer:

https://github.com/guillaumekln/faster-whisper/pulls

@petiatil

petiatil commented Nov 8, 2023

My use case (in case anyone has insight on whether I can manually update WhisperX v2 to integrate large-v3):

  • I am using WhisperX v2 for now due to a recent issue when upgrading.
  • Because of some manual changes I made to whisperx for my use case, I would ideally update manually to allow use of 'large-v3' (to avoid having to upgrade and re-apply those changes).
    • So if updating to enable 'large-v3' is straightforward and unlikely to cause unforeseen issues, I'd appreciate any advice on what to update.

@wahahaer

wahahaer commented Nov 9, 2023

wait for the update

@SebaM90

SebaM90 commented Nov 11, 2023

CTranslate2 (the fast inference engine used by faster-whisper) has been updated to support Whisper large-v3:
https://github.com/OpenNMT/CTranslate2/releases/tag/v3.21.0

@MahmoudAshraf97
Contributor

large-v3 should be supported after #599 is merged

@s-h-a-d-o-w

@MahmoudAshraf97
Would you mind confirming that it should work when using the main branch via pip?

Because when I just tried it, I first encountered this problem: #444
Then after applying the workaround mentioned there, I got this: SYSTRAN/faster-whisper#547 (And unfortunately, I don't see how I could work around that.)

@MahmoudAshraf97
Contributor

@s-h-a-d-o-w I thoroughly tested it before submitting this PR. Upgrade whisperx, faster-whisper, and ctranslate2 and try again.
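For reference, that upgrade sequence as a sketch (guarded behind an environment variable so it is not run by accident; the --force-reinstall fallback reflects a report later in this thread):

```shell
# Hedged sketch: upgrade the three packages involved in large-v3 support.
if [ "${RUN_UPGRADE:-0}" = "1" ]; then
  pip install --upgrade whisperx faster-whisper ctranslate2
  # If "large-v3" is still rejected afterwards:
  pip install --upgrade --force-reinstall whisperx
fi
```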

@picheny-nyu

large-v3 now works for me, but I did have to do a force-reinstall in addition to the upgrade.

@li-henan

li-henan commented Feb 1, 2024

large-v3 it can be downloaded here: https://openaipublic.azureedge.net/main/whisper/models/e5b1a55b89c1367dacf97e3e19bfd829a01529dbfdeefa8caeb59b3f1b81dadb/large-v3.pt

Dear friend, could I ask how to use it after downloading the large-v3.pt?

@li-henan

li-henan commented Feb 1, 2024

large-v3 now works for me, but I did have to do a force-reinstall in addition to the upgrade.

Dear friend, could I ask how you install and use it after downloading the large-v3.pt?

@picheny-nyu

You don't need to download it; you can just refer to it as the "large-v3" model the same way you do for "medium", "large", or "large-v2" once the code is updated. It will download in the background.
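A sketch of that usage, assuming an upgraded whisperx install and a CUDA device; the audio path is a placeholder:

```python
# Hedged sketch: request "large-v3" by name like any other size; the
# converted weights are fetched automatically on first use, so no manual
# .pt download is needed.
import importlib.util

MODEL_NAME = "large-v3"  # previously "large-v2"

def transcribe(audio_path: str):
    if importlib.util.find_spec("whisperx") is None:
        raise RuntimeError("whisperx is not installed")
    import whisperx
    model = whisperx.load_model(MODEL_NAME, device="cuda", compute_type="float16")
    return model.transcribe(audio_path)
```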
