Support text-to-speech in pipeline function and in Optimum #22487

Closed
josephrocca opened this issue Mar 31, 2023 · 11 comments · Fixed by #24952
Labels
Core: Pipeline (Internals of the library; Pipeline.) · Feature request (Request for a new feature)

Comments

@josephrocca
Contributor

josephrocca commented Mar 31, 2023

Feature request

SpeechT5 was recently added to Transformers:

It would be great if text-to-speech could be supported across the Transformers stack.

Motivation

@xenova bumped into this as an issue when trying to get SpeechT5 working in the browser (Transformers.js).

Your contribution

Probably unable to help with this at the moment.

@sgugger
Collaborator

sgugger commented Mar 31, 2023

cc @sanchit-gandhi

@sanchit-gandhi
Contributor

Indeed, a TTS pipeline would be super helpful to run SpeechT5. We're currently planning on waiting till we have 1-2 more TTS models in the library before pushing ahead with a TTS pipeline, in order to verify that the pipeline is generalisable and gives a benefit over loading a single model + processor.

cc @hollance

@josephrocca
Contributor Author

Any viable contenders for the other 1-2 models? https://paperswithcode.com/task/text-to-speech-synthesis

@mayankagarwals
Contributor

Hey, I'd be more than happy to take up this task if we can decide on the other 1-2 models

@xenova
Contributor

xenova commented Apr 6, 2023

> Hey, I'd be more than happy to take up this task if we can decide on the other 1-2 models

We can probably just select the most popular models from the hub: https://huggingface.co/models?pipeline_tag=text-to-speech&sort=downloads

@hollance
Contributor

hollance commented Apr 7, 2023

There is an open PR for FastSpeech2. I think this is a good new model to add. If anyone is interested in taking that PR to completion, that would be awesome!

@xenova
Contributor

xenova commented Apr 18, 2023

> Hey, I'd be more than happy to take up this task if we can decide on the other 1-2 models

Let me know if you need any help! I’m excited for this to be added 🔥

@xenova
Contributor

xenova commented Apr 27, 2023

Here's another model which could fall into the text-to-speech category: #23036

@jozefchutka

Just added one more: #23050

@bil-ash

bil-ash commented Jul 22, 2023

Please add support to the TTS pipeline for the mms-tts model, as mentioned in the issue above.

@xenova
Contributor

xenova commented Jul 22, 2023

Good news! This is currently being worked on: #24952 🚀🔥
