Problem with TTS in 2.8 #1707

Closed
Jasonthefirst opened this issue Feb 14, 2024 · 17 comments · Fixed by #1711 or #1713
Labels: bug (Something isn't working)

Comments

@Jasonthefirst

We are using LocalAI in Docker but have problems with all of the TTS models described in the TTS in LocalAI documentation.

When calling the following curl command:

curl http://localhost:8080/tts -H "Content-Type: application/json" -d '{ "backend": "bark", "input":"Hello!" }' | aplay

we get the following error:

stderr OSError: /opt/conda/envs/transformers/lib/python3.11/site-packages/torchaudio/lib/libtorchaudio.so: undefined symbol: _ZN2at4_ops10zeros_like4callERKNS_6TensorEN3c108optionalINS5_10ScalarTypeEEENS6_INS5_6LayoutEEENS6_INS5_6DeviceEEENS6_IbEENS6_INS5_12MemoryFormatEEE

This error is thrown with bark, coqui and Vall-E-X. Piper works.
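
(For context: an undefined at::_ops symbol in libtorchaudio.so is the typical signature of a torchaudio build that does not match the installed torch release. A minimal diagnostic sketch, assuming shell access to the container's conda environment from the traceback, not something from the original report:)

# Diagnostic sketch: run with /opt/conda/envs/transformers/bin/python inside
# the container to see which torch and torchaudio releases are installed.
# The undefined-symbol OSError above fires on the torchaudio import when the
# two builds do not match.
import torch          # loads libtorch first
import torchaudio     # this import is where the error from the report occurs

print("torch     :", torch.__version__)
print("torchaudio:", torchaudio.__version__)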

LocalAI version:
v2.8.0-cublas-cuda12-ffmpeg

Environment, CPU architecture, OS, and Version:
Linux aifb-bis-mlpc 5.15.0-92-generic #102-Ubuntu SMP Wed Jan 10 09:33:48 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

To Reproduce
Run the v2.8.0-cublas-cuda12-ffmpeg LocalAI image on a server and execute the curl command above.

Expected behavior
LocalAI shouldn't return an error but a TTS audio file.

Logs
I attached a log file: _Shared_LocalAI_logs.txt

Jasonthefirst added the bug and unconfirmed labels Feb 14, 2024
@mudler (Owner) commented Feb 14, 2024

This should already be fixed in the master images - @Jasonthefirst, could you please test it out?

@golgeek (Collaborator) commented Feb 14, 2024

@mudler I'm not sure if your fix was also deployed for vllm, but I just tried loading a model with the vllm backend on the master image and got the same error:

3:11PM DBG GRPC(casperhansen/mixtral-instruct-awq-127.0.0.1:45981): stderr Traceback (most recent call last):
3:11PM DBG GRPC(casperhansen/mixtral-instruct-awq-127.0.0.1:45981): stderr   File "/build/backend/python/vllm/backend_vllm.py", line 13, in <module>
3:11PM DBG GRPC(casperhansen/mixtral-instruct-awq-127.0.0.1:45981): stderr     from vllm import LLM, SamplingParams
3:11PM DBG GRPC(casperhansen/mixtral-instruct-awq-127.0.0.1:45981): stderr   File "/opt/conda/envs/transformers/lib/python3.11/site-packages/vllm/__init__.py", line 3, in <module>
3:11PM DBG GRPC(casperhansen/mixtral-instruct-awq-127.0.0.1:45981): stderr     from vllm.engine.arg_utils import AsyncEngineArgs, EngineArgs
3:11PM DBG GRPC(casperhansen/mixtral-instruct-awq-127.0.0.1:45981): stderr   File "/opt/conda/envs/transformers/lib/python3.11/site-packages/vllm/engine/arg_utils.py", line 6, in <module>
3:11PM DBG GRPC(casperhansen/mixtral-instruct-awq-127.0.0.1:45981): stderr     from vllm.config import (CacheConfig, ModelConfig, ParallelConfig,
3:11PM DBG GRPC(casperhansen/mixtral-instruct-awq-127.0.0.1:45981): stderr   File "/opt/conda/envs/transformers/lib/python3.11/site-packages/vllm/config.py", line 9, in <module>
3:11PM DBG GRPC(casperhansen/mixtral-instruct-awq-127.0.0.1:45981): stderr     from vllm.utils import get_cpu_memory, is_hip
3:11PM DBG GRPC(casperhansen/mixtral-instruct-awq-127.0.0.1:45981): stderr   File "/opt/conda/envs/transformers/lib/python3.11/site-packages/vllm/utils.py", line 11, in <module>
3:11PM DBG GRPC(casperhansen/mixtral-instruct-awq-127.0.0.1:45981): stderr     from vllm._C import cuda_utils
3:11PM DBG GRPC(casperhansen/mixtral-instruct-awq-127.0.0.1:45981): stderr ImportError: /opt/conda/envs/transformers/lib/python3.11/site-packages/vllm/_C.cpython-311-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops15to_dtype_layout4callERKNS_6TensorEN3c108optionalINS5_10ScalarTypeEEENS6_INS5_6LayoutEEENS6_INS5_6DeviceEEENS6_IbEEbbNS6_INS5_12MemoryFormatEEE
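
(The same class of check applies here, sketched under the same assumptions as above, since the undefined symbol in vllm/_C also points at a torch operator:)

# Sketch only: a vllm wheel compiled against a different torch release fails
# at import time with an undefined torch operator symbol, exactly like the
# traceback above.
import torch
print("torch:", torch.__version__)

import vllm            # the ImportError above is raised during this import
print("vllm :", vllm.__version__)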

@mudler (Owner) commented Feb 14, 2024

@golgeek I've only tried with TTS models (vall-e-x specifically); can you confirm that? Please open up another issue for vLLM.

@Jasonthefirst (Author)

We tested it with the master branch (master-cublas-cuda12-ffmpeg). As input we used the standard curl command and got the following error:

curl:

curl http://localhost:8080/tts -H "Content-Type: application/json" -d '{
   "backend": "bark",
   "input":"Hello!"
}' | aplay
4:32PM DBG Loading model in memory from file: /models/model_configuration
4:32PM DBG Loading Model  with gRPC (file: /models/model_configuration) (backend: hello!): {backendString:Hello! model: threads:0 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc000230800 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama:/build/backend/python/exllama/run.sh exllama2:/build/backend/python/exllama2/run.sh huggingface-embeddings:/build/backend/python/sentencetransformers/run.sh mamba:/build/backend/python/mamba/run.sh petals:/build/backend/python/petals/run.sh sentencetransformers:/build/backend/python/sentencetransformers/run.sh transformers:/build/backend/python/transformers/run.sh transformers-musicgen:/build/backend/python/transformers-musicgen/run.sh vall-e-x:/build/backend/python/vall-e-x/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:true parallelRequests:true}
[172.24.0.1]:37026 500 - POST /tts

@golgeek (Collaborator) commented Feb 14, 2024

> @golgeek I've only tried with TTS models (vall-e-x specifically); can you confirm that? Please open up another issue for vLLM.

Sorry, the error seemed suspiciously similar, so I thought it might have the same origin.

I was coming back to report the same as @Jasonthefirst.

And I opened #1710 for vLLM.

@mudler (Owner) commented Feb 14, 2024

@golgeek / @Jasonthefirst any chance you can give #1711 a shot?

mudler reopened this Feb 14, 2024
@mudler (Owner) commented Feb 14, 2024

Waiting for feedback. I merged the PR, so master images are going to be built soon and we can try it out much more easily by just consuming the master images.

@golgeek (Collaborator) commented Feb 14, 2024

Sorry that it took me forever to realize that the images weren't pushed, and then an equal amount of time to build a Docker image from your branch.

I just ran a quick test for vLLM and the model loaded successfully, so I'd say it's fixed, but it's probably better to wait a bit more and confirm with the images from the master branch.

@golgeek (Collaborator) commented Feb 15, 2024

I ran some tests again with master images, and can confirm that #1710 is fixed (just closed the issue, thanks a lot!).

As for this issue specifically, I tested with master-cublas-cuda12-ffmpeg (sha256:de26b09328fea0bd57ff2e14ae28ba9a54ca489a1bc96208131f8d4c1d494672), and while the initial error is definitely fixed, there still seems to be something wrong. When curling:

curl http://localhost:8080/tts -H "Content-Type: application/json" -d '{"backend": "bark","input":"Hello!"}'
{"error":{"code":500,"message":"grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/hello!. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS","type":""}}

I'm getting:

3:03PM DBG Request for model:
3:03PM INF Loading model with backend Hello!
3:03PM DBG Loading model in memory from file: /build/models
3:03PM DBG Loading Model  with gRPC (file: /build/models) (backend: hello!): {backendString:Hello! model: threads:0 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc0001f6000 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama:/build/backend/python/exllama/run.sh exllama2:/build/backend/python/exllama2/run.sh huggingface-embeddings:/build/backend/python/sentencetransformers/run.sh mamba:/build/backend/python/mamba/run.sh petals:/build/backend/python/petals/run.sh sentencetransformers:/build/backend/python/sentencetransformers/run.sh transformers:/build/backend/python/transformers/run.sh transformers-musicgen:/build/backend/python/transformers-musicgen/run.sh vall-e-x:/build/backend/python/vall-e-x/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}

It appears there might be a mix-up between the backend and input fields, as LocalAI tries to load the Hello! backend, though the TTSEndpoint code looks totally legit.
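
(To illustrate what the log suggests - a sketch only, with hypothetical names, since LocalAI itself is written in Go: the value of the request's input field ends up being used as the backend name, which is why a gRPC backend called "hello!" is looked up.)

# Illustrative sketch only; the field names come from the request body used in
# this thread, everything else is hypothetical and not LocalAI's actual code.
import json

body = json.loads('{"backend": "bark", "input": "Hello!"}')

# Intended wiring:
backend_name = body["backend"]    # -> "bark"
text_to_speak = body["input"]     # -> "Hello!"

# The regression behaves like this instead:
backend_name = body["input"]      # -> "Hello!", matching "backend: hello!" in the log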

@mudler (Owner) commented Feb 15, 2024

> It appears there might be a mix-up between the backend and input fields, as LocalAI tries to load the Hello! backend, though the TTSEndpoint code looks totally legit.

ouch, good catch, this is a regression introduced in #1692.

mudler added a commit that referenced this issue Feb 15, 2024
fixes #1707 

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
@golgeek (Collaborator) commented Feb 15, 2024

lol, I read that line at least four times before writing that it looked legit.

mudler added a commit that referenced this issue Feb 15, 2024
fixes #1707

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
@mudler (Owner) commented Feb 15, 2024

My muscle memory still uses "fixes" - and GH automatically closes the issue :)

@mudler (Owner) commented Feb 16, 2024

v2.8.2 images have been released with all the fixes. @golgeek / @Jasonthefirst, could you test it?

@golgeek (Collaborator) commented Feb 16, 2024

Just tested, the v2.8.2 image worked flawlessly! Thanks @mudler!

4:11PM DBG GRPC(-127.0.0.1:40203): stderr tts for
4:11PM DBG GRPC(-127.0.0.1:40203): stderr text: "Hello!"
4:11PM DBG GRPC(-127.0.0.1:40203): stderr dst: "/tmp/generated/audio/piper.wav"
4:11PM DBG GRPC(-127.0.0.1:40203): stderr
[172.18.0.1]:37598 200 - POST /tts
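
(For reference, an equivalent client-side call as a small Python sketch rather than the curl | aplay pipe used earlier in the thread; the requests dependency, the timeout value, and the output filename are assumptions.)

# Sketch of the same /tts call from Python; assumes the "requests" package and
# a LocalAI instance on localhost:8080, as used throughout this thread. The
# response body is the generated audio, written to a file instead of being
# piped to aplay.
import requests

resp = requests.post(
    "http://localhost:8080/tts",
    json={"backend": "bark", "input": "Hello!"},
    timeout=300,
)
resp.raise_for_status()

with open("hello.wav", "wb") as f:
    f.write(resp.content)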

@mudler (Owner) commented Feb 16, 2024

> Just tested, the v2.8.2 image worked flawlessly! Thanks @mudler!

cool, thanks for checking it out!

mudler closed this as completed Feb 16, 2024
@Jasonthefirst (Author)

Thank you so much - (nearly) everything works now.
We still have a problem with Musicgen (https://localai.io/features/text-to-audio/#transformers-musicgen), which throws the error "no module named 'google'", and with selecting different speakers with bark, where we get "sendfile: file /tmp/generated/audio/piper_28.wav not found".

But besides that it is awesome, and the speed at which this got fixed is great as well. We really appreciate it.

@mudler (Owner) commented Feb 19, 2024

> We still have a problem with Musicgen (no module named 'google') and with selecting different speakers with bark (sendfile: file /tmp/generated/audio/piper_28.wav not found).

Please open separate tickets for those with full logs and steps to reproduce, thanks!
