need a simple python script to run WhisperSpeech locally to compare to bark #67
You can pretty much take the necessary lines from the notebook itself:

```python
from whisperspeech.pipeline import Pipeline

pipe = Pipeline(s2a_ref='collabora/whisperspeech:s2a-q4-tiny-en+pl.model', torch_compile=True)
pipe.generate_to_file("output.wav", "Hello from WhisperSpeech.")
```

You need to install …
Thanks for the quick response. Here's my entire script so far:

However, I keep getting an error saying that `torch.save` can't get the correct "backend", which apparently originates from torchaudio. For reference, I'm using PyTorch 2.1.2 and CUDA 11.8, if that matters. Lastly, I'm running Windows 10, not Linux.
The traceback indicates that the error originates from the line reading:

The next step in the traceback refers to `utils.py` within the torchaudio library, line 311:

The `dispatcher` method it calls is as follows:
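(The actual torchaudio source didn't survive the copy-paste above. As a rough, stdlib-only illustration of what a backend dispatcher like this does, and of the failure mode being discussed: it walks a list of candidate audio backends and raises a `RuntimeError` when none of them is usable, which is what happens on Windows setups where no backend is installed. This is a sketch of the pattern, not torchaudio's real code.)

```python
# Illustrative sketch only -- NOT the actual torchaudio source. It mimics
# the dispatcher pattern described above: try each candidate backend and
# raise a RuntimeError when none is usable.
def dispatch_save_backend(available_backends):
    """Return the first usable backend, mirroring the dispatcher logic."""
    for backend in available_backends:
        if backend is not None:
            return backend
    # This is the failure mode reported above on Windows:
    raise RuntimeError("Couldn't find an appropriate backend to handle the file.")
```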
The runtime error it raises matches what I'm getting:

That's as far as I got without further reverse engineering how the code invocations call one another. Any ideas?
That's quite strange. There seems to be a pretty long discussion (3+ years) about Windows torchaudio support here: pytorch/audio#425. The docs say that … Maybe you could add …
This script worked, albeit it's somewhat different functionally: it merely saves the .wav file:

HOWEVER, please note that I altered the …

As I understand it, instead of using …
Wow, great that you got it to work. We'd love to integrate this solution into WhisperSpeech itself so it works out of the box on Windows. Before we do that, you could use …
Withdrawn message... I found the model files I was asking for.
Here's the completed script. It was necessary to move to the CPU, as explained in the comments:
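(The script itself was lost in the copy-paste. Since the fix discussed above was to avoid the torchaudio save path, here is a hedged, stdlib-only sketch of one way to write the audio without torchaudio: convert the samples to 16-bit PCM and write them with Python's `wave` module. In the real script you would first move the generated tensor off the GPU, e.g. with `audio.cpu().numpy()`; the sample rate below is an assumption.)

```python
import struct
import wave

def save_wav(path, samples, sample_rate=24000):
    """Write mono float samples in [-1.0, 1.0] to a 16-bit PCM .wav file.

    Stdlib-only stand-in for torchaudio.save(), which fails to find a
    backend on some Windows setups. With WhisperSpeech you would first
    move the generated tensor to the CPU (e.g. audio.cpu().numpy()).
    The 24 kHz default rate here is an assumption, not a confirmed value.
    """
    pcm = b"".join(
        struct.pack("<h", max(-32768, min(32767, int(s * 32767))))
        for s in samples
    )
    with wave.open(path, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 16-bit samples
        f.setframerate(sample_rate)
        f.writeframes(pcm)
```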
These names come from the file names in this repo: https://huggingface.co/collabora/whisperspeech/tree/main. You can try out the other …
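(As the comment above says, the `s2a_ref` strings are just `repo:filename` references into that Hugging Face repo. A hypothetical helper, for illustration only, showing how the one name quoted earlier in this thread decomposes; the valid combinations are whatever files actually exist in the repo, so check the file listing before trusting any generated name.)

```python
def s2a_ref(size, langs="en+pl", quantization="q4", repo="collabora/whisperspeech"):
    """Build an s2a_ref string of the 'repo:filename' form used by Pipeline.

    Hypothetical helper for illustration only. Valid file names are listed at
    https://huggingface.co/collabora/whisperspeech/tree/main -- not every
    combination of arguments corresponds to a real file.
    """
    return f"{repo}:s2a-{quantization}-{size}-{langs}.model"
```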
Does anyone know what the script is for one-shot voice cloning? This one from Colab is also not working: `pipe.generate("""…`
What is the error you see?
@Joosheen I've seen that loading the sample directly from a URL did not seem to work on Windows. Could you try downloading the file, putting it in the same folder as the script, and modifying the command to use …
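(A small sketch of the suggestion above: prefer a local copy of the speaker sample and only fall back to downloading it. This is a hypothetical convenience wrapper, not part of WhisperSpeech; the real `speaker` argument just takes a path or URL directly.)

```python
import os
import urllib.request

def resolve_speaker(ref, local_dir="."):
    """Return a local file path for a speaker reference sample.

    If `ref` already exists as a local file, use it as-is; otherwise
    treat it as a URL and download it next to the script. Hypothetical
    helper for illustration -- on Windows, passing a local path to
    pipe.generate(..., speaker=...) was the workaround suggested above.
    """
    if os.path.exists(ref):
        return ref
    local_path = os.path.join(local_dir, os.path.basename(ref))
    if not os.path.exists(local_path):
        urllib.request.urlretrieve(ref, local_path)
    return local_path
```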
(Translated from Polish:)

```
OSError: [WinError 1314] A required privilege is not held by the client: 'C:\Users\AJusi\.cache\huggingface\hub\models--speechbrain--spkrec-ecapa-voxceleb\snapshots\5c0be3875fda05e81f3c004ed8c7c06be308de1e\hyperparams.yaml' -> '~\.cache\speechbrain\hyperparams.yaml'
```
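(WinError 1314 is Windows refusing to create a symbolic link: the SpeechBrain cache tries to symlink files from the Hugging Face cache, and unelevated Windows processes lack that privilege unless Developer Mode is enabled. A sketch of the usual workaround pattern, symlink with a copy fallback; this is illustrative only, not SpeechBrain's actual code.)

```python
import os
import shutil

def link_or_copy(src, dst):
    """Symlink dst -> src, falling back to a plain file copy when the OS
    refuses (e.g. WinError 1314 on Windows without Developer Mode or an
    elevated shell). Illustrative sketch of the workaround pattern only.
    """
    try:
        os.symlink(src, dst)
    except OSError:
        # Copying uses more disk space but needs no special privilege.
        shutil.copyfile(src, dst)
```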
OK, I saved the file in the same folder as the script and ran:

```python
from whisperspeech.pipeline import Pipeline

pipe = Pipeline(s2a_ref='collabora/whisperspeech:s2a-q4-tiny-en+pl.model')
pipe.generate("""
```

and got the same error. I'm a newbie, so I don't know if I'm doing something really wrong.
I couldn't get it to work either. I ran into some kind of error about being unable to load a .ogg file, though I don't have the error text anymore. I gave up after I created my basic script to test WhisperSpeech. But for what it's worth, I can confirm that the code snippet suggesting to just add the "speaker" parameter didn't work for some reason.
There's some more discussion here: #72. This issue was more geared towards getting a simple working script, so I'd recommend that the devs close this issue and we continue our discussion at the other one for ease of reference.
Closing since it's been addressed. Thanks! |
I really want to test this out and compare it to Bark, but its implementation seems convoluted to someone like me who's never used Google Colab notebooks. I didn't see any straight Python scripts in the repository that I could tweak to get it working. I'm running a 4090 and am very familiar with OpenAI's Whisper, ctranslate2's implementation (of which WhisperX is an offshoot), the Transformers libraries, etc. I DO NOT use llama.cpp but suppose I could if need be.

Basically, are there, or can they be provided, straight Python scripts?