Long-form synthesis #9

fakerybakery · 2024-04-10T23:27:46Z

Hi,
Congrats on the release!! Is long form synthesis planned?
Thank you!

sanchit-gandhi · 2024-04-11T11:07:21Z

Currently we train on audio clips of at most 30 seconds. With @ylacombe we're looking at increasing the context length to handle longer audio. ALiBi embeddings (or a variant thereof) look promising for this: https://arxiv.org/abs/2108.12409

As future work, it would be amazing if you could feed an entire chapter of an audiobook to the model and have it learn the prosody and intonation directly from training examples (with no guidance from the text prompt).
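
For readers unfamiliar with ALiBi, here is a minimal sketch of the linear attention bias described in the linked paper. It is a generic illustration only, not how this project implements or plans to implement it: the function names `alibi_slopes` and `alibi_bias` are made up for this example, and the sketch assumes a power-of-two number of heads and causal attention.

```python
def alibi_slopes(num_heads: int) -> list[float]:
    """Per-head slopes: a geometric sequence starting at 2^(-8/num_heads).

    Assumption: num_heads is a power of two (the simple case in the paper).
    """
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch only handles power-of-two head counts")
    start = 2.0 ** (-8.0 / num_heads)
    return [start ** (h + 1) for h in range(num_heads)]


def alibi_bias(num_heads: int, seq_len: int) -> list[list[list[float]]]:
    """Return bias[h][i][j] to add to the attention scores before softmax.

    For key j at or before query i the bias is -slope_h * (i - j), so more
    distant keys are penalised linearly. Future keys (j > i) get -inf,
    which acts as the causal mask.
    """
    bias = []
    for slope in alibi_slopes(num_heads):
        head = [
            [-slope * (i - j) if j <= i else float("-inf") for j in range(seq_len)]
            for i in range(seq_len)
        ]
        bias.append(head)
    return bias
```

Because the penalty depends only on the distance `i - j` and not on a learned position table, the same bias can be computed for sequence lengths longer than those seen in training, which is why it is attractive for extending context.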

fakerybakery · 2024-04-11T17:19:33Z

That would be nice. I was wondering if it would be possible to use chunking, and have previous chunks as context, to make the speech sound natural with different speakers. (This would be nice for audiobooks with multiple characters.)
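
To make the chunking idea concrete, here is a rough sketch of the text side only: split the input at sentence boundaries, pack sentences into chunks, and pair each chunk with the previous one as context. The names `split_into_chunks` and `chunks_with_context` and the `max_chars` limit are hypothetical, and how a model would actually consume the context is left open.

```python
import re


def split_into_chunks(text: str, max_chars: int = 200) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars.

    A single sentence longer than max_chars is kept as its own chunk.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def chunks_with_context(text: str, max_chars: int = 200) -> list[tuple[str, str]]:
    """Return (previous_chunk, chunk) pairs; the first chunk has empty context."""
    chunks = split_into_chunks(text, max_chars)
    return [(chunks[i - 1] if i else "", chunk) for i, chunk in enumerate(chunks)]
```

Each pair could then be passed to whatever synthesis call is available, with the previous chunk (or its generated audio) used as conditioning.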

lmxue · 2024-05-02T08:58:46Z

Are there any updates about long-form speech synthesis? I'm looking forward to the results.
Also, the future work you mentioned sounds most applicable to audiobooks, but I'm curious what the voice would be like. Would it be a pre-defined voice?

bkutasi mentioned this issue Apr 17, 2024

Model stumbling on its words #24

Open
