AllTalk v2 BETA Download Details & Discussion #245
Replies: 18 comments 34 replies
-
Installing and testing now. Amazing job. Will post results/comments soon.
-
All good here (Windows 10). Thank you for your hard work. It looks great! Love being able to easily switch between finetuned XTTS models.
-
Thanks for your work. Testing it on RunPod Ubuntu. Installation worked fine, but running it I get
The "Running in Docker" message is strange, as I don't have Docker installed. After manually editing script.py I got it to work.
Unfortunately I can't get DeepSpeed to work. It says it is installed
I also can't get the Gradio UI to work, since RunPod creates a Cloudflare Tunnel and, as far as I know, there is no way to specify a custom API/Gradio domain during the AllTalk setup.
I still hope a future version of AllTalk can integrate nicely with running the application in the cloud (RunPod, Colab, etc.).
-
Hey, thanks for the beta! One issue I noticed is that I can't access the Gradio page from other devices on the network. The API page and the TTS generator page are accessible using 192.168... but Gradio is not. I'm not sure if this is an issue on my end. Everything is accessible from the host computer on 127.0 etc.
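One way to narrow down a report like this is to test each page's port against both the loopback address and the LAN address from the host itself. The following is a minimal sketch using only the Python standard library; the helper name `is_reachable` is made up for illustration, and the host and port values you pass in are your own (the thread does not state which ports AllTalk uses). If a port answers on 127.0.0.1 but not on the 192.168... address, the service is probably listening on loopback only, although a firewall can produce the same symptom.

```python
import socket


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration; it only checks that something
    accepts connections, not that the page behind it works.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, call `is_reachable("127.0.0.1", port)` and then `is_reachable(lan_ip, port)` with the same port, where `port` and `lan_ip` are the values from your own setup, and compare the two results.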
-
Hi, thank you for the great work. I was trying to create a dataset for the Arabic language, however, I'm getting the following error:
If I switch the language to English (but still use Arabic audio files), it works fine, generates wavs correctly, and actually translates the sentences in metadata_train & metadata_eval to English. So it's understanding the language fine, but there seems to be an issue with file/sentence generation in native Arabic.
-
I ran into a couple of issues running the beta as an Oobabooga plugin. The first is that on first startup, it looked for
In my specific case, it should have looked for that file in
Editing the script_path variable with the correct path fixed the issue on the next startup.
The second issue is trickier and I haven't been able to figure it out. Despite seemingly having all the requirements installed correctly, the app complains that there is a missing Gradio "system" module:
Note that I've tried installing the requirements first through atsetup.bat, and subsequently via
Here are the relevant sections from running the diagnostics:
(Another side note: the message above mentions "you should see 'text-generation-webui' listed in the path of the above folders." That is no longer correct, because dashes in the folder name now cause AllTalk to throw an error on startup. This could be confusing to some users.)
Let me know if you'd like me to try anything specific to troubleshoot.
-
From my experience with v1, and looking at the screenshot of v2, this is going to be phenomenal. So glad you're doing all of this. Thank you!
-
Hello, the second version is really great. By the way, is it possible for you to make it serve multiple clients simultaneously rather than sequentially as the requests come in? In other words, can it be asynchronous? Only if there are enough resources, of course, but could it be done even through Docker? Thanks.
-
Hey, V2 is great. Here are some suggestions to further streamline the user interface. The contents of the AllTalk v2 Beta, Generate Help, API Endpoints & Dev, and About This Project tabs should all be moved under the Documentation and Help section. TTS-generation settings can also be shifted under Global Settings. Please consider.
-
Hey, great work on the project. I've been using v1 for a few days now and have started moving towards v2 with my project totally-real-news-bot so I can use RVC models. I'm using AllTalk v2 in TGWUI mode. I'm using the API with Piper TTS and that works great: generation is multiple times faster than Coqui (my PC build is Chinese e-waste). But when I use RVC, it seems to start the conversion process, it finds a .pth model, and VRAM usage goes up, but then my CPU shoots to 100% like it's processing something. Is it possible RVC conversion is in CPU mode, or could my setup be incorrectly configured?
-
I tested the project and it's amazing! It works perfectly with English.
-
I've been using v2 for a while now and it's fantastic. I use the standalone version, usually with SillyTavern.
I did have some difficulty getting the SillyTavern settings to work. Something about having so many voice/narrator dropdowns and having to match them with selections in the webui is confusing to me. It would be great to have some way to save some presets. For example, selecting preset A would populate the AllTalk character and narrator and the RVC character and narrator.
Can't wait for the large generator to be added to the main webui. That is my main wish, together with being able to import .txt files into it or, better yet, process an entire folder of .txt files.
Wow!! Having RVC applied at the same time is so much less hassle than exporting a .wav and then running it through the RVC webui manually...
Last thing: are you aware of any XTTS2 finetunes for accents or gender (in English)? I've googled a lot and been to websites that claim to host tons of models but have found only a handful of XTTS2 finetunes, and not a single interesting one. Thx!
-
Would you be looking to add MeloTTS to v2 at some point? Seems like one of the better (and faster) TTS models that you can also train locally.
-
Can't get it to work :(
-
On a fresh install of Debian 12 with an Nvidia GPU, I followed the setup instructions: cloned the beta branch, ran
I noticed some Python requirements were missing when I tried to run it, so I tried to reapply/reinstall requirements, and got this error:
In
-
On Debian 12 (fresh install), after installing the proper Nvidia drivers, launching AllTalk Beta resulted in the following error:
The fix was to downgrade PyTorch to 2.2.0 by activating the conda env and executing:
conda install pytorch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 pytorch-cuda=12.1 -c pytorch -c nvidia
I also had to make sure the following lines were at the bottom of my
export CUDA_HOME=/usr/local/cuda-12.1
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64
With that, the beta is working with CUDA+DeepSpeed!
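To confirm that those three exports actually took effect in the shell that launches AllTalk, something like the following could be run from inside the activated conda env. This is only a sketch using the Python standard library; the function name `check_cuda_env` is made up for illustration and is not part of AllTalk. It checks that the variables agree with each other, not that CUDA itself works.

```python
import os
from pathlib import Path


def check_cuda_env(env=None):
    """Return a list of problems found in the CUDA-related variables.

    Hypothetical helper for illustration. An empty list means CUDA_HOME is
    set and both PATH and LD_LIBRARY_PATH point into it.
    """
    env = os.environ if env is None else env
    problems = []

    cuda_home = env.get("CUDA_HOME")
    if not cuda_home:
        problems.append("CUDA_HOME is not set")
        return problems

    # PATH should contain $CUDA_HOME/bin so tools such as nvcc are found.
    path_entries = env.get("PATH", "").split(os.pathsep)
    if str(Path(cuda_home) / "bin") not in path_entries:
        problems.append("PATH does not contain $CUDA_HOME/bin")

    # LD_LIBRARY_PATH should contain $CUDA_HOME/lib64 so shared libraries resolve.
    lib_entries = env.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    if str(Path(cuda_home) / "lib64") not in lib_entries:
        problems.append("LD_LIBRARY_PATH does not contain $CUDA_HOME/lib64")

    return problems


if __name__ == "__main__":
    issues = check_cuda_env()
    print("CUDA environment looks consistent" if not issues else "\n".join(issues))
```

The comparison is a plain string match on each entry, so a trailing slash in PATH or LD_LIBRARY_PATH would be reported as a mismatch even though the shell would accept it.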
-
Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory
This error has been a nightmare. I'm trying to install on Ubuntu 22.04. I can get the webui up and going, but then I hit the create dataset button and it throws this error. I have tried all the PyTorch builds with the 121 URL and the 118 URL, and I have tried both pip and conda. I have reinstalled CUDA and cuDNN. I am running out of ideas.
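A quick way to see whether the dynamic loader can find that library at all, independent of AllTalk and PyTorch, is to try opening it with `ctypes`. This is a minimal sketch with a made-up helper name (`can_load`). It has to be run from the same Python environment and the same shell (so the same LD_LIBRARY_PATH) that launches AllTalk, otherwise the result says nothing about the failing process.

```python
import ctypes


def can_load(library_name):
    """Try to open a shared library by name.

    Hypothetical helper for illustration. Returns (True, None) on success,
    or (False, reason) where reason is the loader's error message.
    """
    try:
        ctypes.CDLL(library_name)
    except OSError as exc:
        return False, str(exc)
    return True, None


if __name__ == "__main__":
    ok, reason = can_load("libcudnn_ops_infer.so.8")
    print("loadable" if ok else f"not loadable: {reason}")
```

If this reports the same "cannot open shared object file" message, the problem is with where the library is installed or with the library search path rather than with the dataset step itself.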
-
I'm intending this to be a simple place for people to discuss the v2 BETA. If there is something you want to discuss or a small technical issue, we can do that here. If you have a big technical issue, where there will probably be crashes/logs and lots to discuss, let's open that in a ticket.
For those who want to try it, the BETA is here https://github.com/erew123/alltalk_tts/tree/alltalkbeta (sorry, the instructions are a bit rough and ready at the moment).
This is not a direct update over V1, as you would need to delete/rebuild some of the Python environment (it's completely possible, I just don't have time to explain ATM).
Since upload I have tested a fresh download/setup on Windows as a Standalone installation. That seems to be fine.
Over the last 2-3 days I had to change some code for when AllTalk is installed into Text-generation-webui's Python environment. I'm reasonably sure that all should be OK if you install AllTalk that way, but I've not managed to put in hours of testing.
With the Linux installation, I had a lot of difficulties with Nvidia's own packages breaking other Nvidia packages when they installed {insert swear words here}. As such, this took me about 40-50 installation tests on Linux and writing my own fix to repair some symlinks during the installation. Generally all should be good with Linux installs, but now you know why I was delayed (40-50 installs x 20 minutes each + troubleshooting). There was no point pushing something out that may cause problems down the line.
Otherwise:
Now that the core of AllTalk is rebuilt, I hope to just make sure things are stable, fix any issues or clear up any documentation before adding in other features/engines etc that I've not had time to yet.
In effect though, the code base should be mostly stable now, so if people want to look at adding/changing any code, that should be ok.
FYI, for those who are interested in trying to get AMD, Intel or Mac acceleration working, you're welcome to give it a go. Obviously different TTS engines' code will be capable of supporting different things. However, all the AllTalk code that speaks to any TTS engine is now broken out per engine, which you can find in /system/tts_engines/{enginename}/ and you can simply work on one of those OR even just copy one and add it to the tts_engines.json list as an engine to test/work with. There will be no need to touch or interact with any other code.
I welcome general feedback here, but if you have a big technical problem, let's do that through a ticket and try to keep this cleaner here for general discussion.
The Feature Requests List of things I haven't gotten around to is here
I've literally spent X days in front of a computer screen 10+ hours each day, so do excuse me taking a day or two's break (aka, my responses may be slow).
One Piper voice (that I know of), en_US-ryan-xxxxxx.onnx, has issues. It's a known issue and not AllTalk. It sometimes works and sometimes speaks a garbled mess.
Thanks