
[Bug] HF transformers update breaks XTTS streaming #31

Closed
eginhard opened this issue May 28, 2024 · 3 comments · Fixed by #46
Assignees: eginhard
Labels: bug (Something isn't working), good first issue (Good for newcomers)

Comments

@eginhard
Member

Describe the bug

huggingface/transformers#30624 broke the XTTS streaming code (https://github.com/idiap/coqui-ai-TTS/blob/df088e99dfda8976c47235626d3afb7d7d70fea2/TTS/tts/layers/xtts/stream_generator.py) as reported in huggingface/transformers#31040. The streaming code needs to be updated accordingly.

To Reproduce

https://coqui-tts.readthedocs.io/en/latest/models/xtts.html#streaming-manually

Expected behavior

No response

Logs

Loading model...
DEBUG:fsspec.local:open file: ~/.local/share/tts/tts_models--multilingual--multi-dataset--xtts_v2/model.pth
Computing speaker latents...
Inference...
~/applications/miniconda3/envs/coqui-3.12/lib/python3.12/site-packages/TTS/tts/layers/xtts/stream_generator.py:138: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use a generation configuration file (see https://huggingface.co/docs/transformers/main_classes/text_generation)
  warnings.warn(
Traceback (most recent call last):
  File "~/projects/testing/xtts.py", line 33, in <module>
    for i, chunk in enumerate(chunks):
  File "~/applications/miniconda3/envs/coqui-3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
               ^^^^^^^^^^^^^^
  File "~/applications/miniconda3/envs/coqui-3.12/lib/python3.12/site-packages/TTS/tts/models/xtts.py", line 657, in inference_stream
    gpt_generator = self.gpt.get_generator(
                    ^^^^^^^^^^^^^^^^^^^^^^^
  File "~/applications/miniconda3/envs/coqui-3.12/lib/python3.12/site-packages/TTS/tts/layers/xtts/gpt.py", line 602, in get_generator
    return self.gpt_inference.generate_stream(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/applications/miniconda3/envs/coqui-3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "~/applications/miniconda3/envs/coqui-3.12/lib/python3.12/site-packages/TTS/tts/layers/xtts/stream_generator.py", line 186, in generate
    model_kwargs["attention_mask"] = self._prepare_attention_mask_for_generation(
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/applications/miniconda3/envs/coqui-3.12/lib/python3.12/site-packages/transformers/generation/utils.py", line 473, in _prepare_attention_mask_for_generation
    torch.isin(elements=inputs, test_elements=pad_token_id).any()
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: isin() received an invalid combination of arguments - got (test_elements=int, elements=Tensor, ), but expected one of:
 * (Tensor elements, Tensor test_elements, *, bool assume_unique, bool invert, Tensor out)
 * (Number element, Tensor test_elements, *, bool assume_unique, bool invert, Tensor out)
 * (Tensor elements, Number test_element, *, bool assume_unique, bool invert, Tensor out)
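
For context, the TypeError reduces to how torch.isin handles its keyword arguments: when test_elements is passed by keyword, it must be a tensor, while the old streaming code forwards a plain int pad_token_id. A minimal illustration (not part of the original report):

import torch

inputs = torch.tensor([[5, 6, 7]])
pad_token_id = 0  # plain Python int, as forwarded by the old streaming code

# torch.isin(elements=inputs, test_elements=pad_token_id)  # raises the TypeError above
mask = torch.isin(elements=inputs, test_elements=torch.tensor(pad_token_id))  # works
print(mask)  # tensor([[False, False, False]])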

Environment

{
    "CUDA": {
        "GPU": [],
        "available": false,
        "version": "12.1"
    },
    "Packages": {
        "PyTorch_debug": false,
        "PyTorch_version": "2.3.0+cu121",
        "TTS": "0.24.0",
        "numpy": "1.26.4"
    },
    "System": {
        "OS": "Linux",
        "architecture": [
            "64bit",
            "ELF"
        ],
        "processor": "x86_64",
        "python": "3.12.3",
        "version": "#35~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue May  7 09:00:52 UTC 2"
    }
}

Additional context

No response

eginhard added the bug and good first issue labels on May 28, 2024
eginhard added a commit that referenced this issue May 28, 2024
transformers>=4.41 break XTTS streaming, see #31
@eginhard
Member Author

coqui-tts version 0.24.1 limits transformers to lower versions as a temporary fix until someone properly updates the streaming code.
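
A quick way to verify the pin took effect (a minimal check, assuming the temporary limit keeps transformers below 4.41, as the commit referenced above suggests):

import transformers
from packaging.version import Version

# XTTS streaming is known to break with transformers >= 4.41 until
# stream_generator.py is updated, so fail fast if a newer version is installed.
assert Version(transformers.__version__) < Version("4.41"), (
    f"transformers {transformers.__version__} is too new for XTTS streaming"
)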

@pseudotensor

https://github.com/h2oai/h2ogpt/blob/main/docs/xtt.patch

@eginhard
Member Author

@pseudotensor Cool, thanks! Do you want to send a PR?

eginhard self-assigned this on Jun 15, 2024
eginhard added a commit that referenced this issue Jun 16, 2024
….41.1

Fixes #31. The handling of special tokens in `transformers` was changed in
huggingface/transformers#30624 and
huggingface/transformers#30746. This updates the XTTS
streaming code accordingly.
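
For readers following along, the direction of the fix is roughly to hand the transformers helpers tensors instead of plain ints. A rough sketch under that assumption (the _token_id_to_tensor helper below is hypothetical and not necessarily how the patch that landed in #46 is written):

import torch
from typing import Optional, Union

def _token_id_to_tensor(
    token_id: Optional[Union[int, list]], device: torch.device
) -> Optional[torch.Tensor]:
    # transformers >= 4.41 handles special tokens as tensors internally, so
    # convert the plain int (or list of ints) kept by the streaming code before
    # passing it to helpers such as _prepare_attention_mask_for_generation.
    if token_id is None:
        return None
    return torch.tensor(token_id, device=device)

In stream_generator.py, the call shown in the traceback would then receive tensors, e.g. self._prepare_attention_mask_for_generation(inputs_tensor, _token_id_to_tensor(generation_config.pad_token_id, inputs_tensor.device), _token_id_to_tensor(generation_config.eos_token_id, inputs_tensor.device)), where inputs_tensor and generation_config are assumed local names in the surrounding generate() code.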