Add support for Chatterbox

### Model description

Chatterbox is a multilingual, zero-shot Text-to-Speech (TTS) model designed for flexible voice synthesis across a wide range of languages without task-specific fine-tuning. It is built on a 0.5B-parameter, Llama-based architecture.

More information is available on [deepwiki](https://deepwiki.com/resemble-ai/chatterbox/1-overview).

### Prerequisites

- [ ] The model is supported in Transformers (i.e., listed [here](https://huggingface.co/docs/transformers/index#supported-models-and-frameworks))
- [ ] The model can be exported to ONNX with Optimum (i.e., listed [here](https://huggingface.co/docs/optimum/main/en/exporters/onnx/overview))

### Additional information

The main repo along with some examples can be found [here](https://github.com/resemble-ai/chatterbox).
The `onnx-community` organization has already published ONNX exports of the pre-trained model [here](https://huggingface.co/onnx-community/chatterbox-ONNX).
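
To see which ONNX files that export actually contains, something like the following might help. It is only a rough sketch: it assumes the public Hugging Face Hub endpoint `/api/models/<repo_id>` returns a JSON object with a `siblings` list whose entries carry an `rfilename` field, and the helper names (`onnx_files`, `fetch_model_info`, `main`) are made up for illustration.

```python
import json
from urllib.request import urlopen

# Repository mentioned above; the API URL layout is an assumption about the Hub.
REPO_ID = "onnx-community/chatterbox-ONNX"
API_URL = f"https://huggingface.co/api/models/{REPO_ID}"


def onnx_files(model_info: dict) -> list[str]:
    """Return the sorted paths of all .onnx files in a model-info payload."""
    siblings = model_info.get("siblings") or []
    names = [entry.get("rfilename", "") for entry in siblings]
    return sorted(name for name in names if name.endswith(".onnx"))


def fetch_model_info(url: str = API_URL) -> dict:
    """Download and decode the model-info JSON (needs network access)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def main() -> None:
    # Print one exported ONNX file path per line.
    for path in onnx_files(fetch_model_info()):
        print(path)
```

Calling `main()` prints whatever `.onnx` paths the repository currently lists, which should give a first idea of how the model is split into sub-graphs.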

### Your contribution

With appropriate guidance, I can contribute directly to the implementation of this feature.

Add support for Chatterbox #1434